The Cambridge Analytica case: What’s a data scientist to do?

The Cambridge Analytica controversy has highlighted data ethics issues of particular concern to early-career data scientists

This article can also be found in the Premium Editorial Download: Computer Weekly: London plans to be a smarter city

The recent controversy over Cambridge Analytica’s use of Facebook data, allegedly to influence the 2016 US presidential election and possibly the UK’s EU referendum, raises an ethical issue, especially for data scientists newly graduated from UK universities.

Should you take a job with any company that seems to be capitalising on personal data in areas that regulation does not yet reach? “Just because you can does not mean you should” is a commonly expressed idea in the data science community. But what does it mean in practical terms?

The data science community in Scotland seems to be flourishing, in part due to the efforts of the Scottish government-supported innovation agency The Data Lab, which runs the annual Data Fest event.

MBN Solutions, a Glasgow-based IT recruitment firm that works with The Data Lab, was approached by a representative of Cambridge Analytica itself in March 2017 in search of data science talent, according to Paul Forrest, non-executive chairman of MBN and chairman of management consultancy Purple.

Forrest attended the recent Data Summit in Edinburgh, part of Data Fest. There he told Computer Weekly that Cambridge Analytica had been trumpeting its work on behalf of Donald Trump’s election campaign when discussing possible work with MBN that did not come to fruition. “There was a glow, an aura around their ability to participate in the influence of democratic processes,” he said.

Forrest said MBN runs a data science meet-up in Scotland that draws on a membership of 2,500, many of whom are “in transition from academia, and are asking the question: just because I can do something [technically], should I do it?”

“There is a greater propensity among millennials to want to do the right thing,” he said. “People now ask more questions about corporate social responsibility and giving back to the community than they used to.”

Forrest told Computer Weekly he had a personal view on the work that Aleksandr Kogan, a senior research associate in psychology at the University of Cambridge, undertook privately through his company, Global Science Research (GSR), and made available to Cambridge Analytica.

This work involved, as is now well known, gathering “psychographic” data from Facebook, using a GSR-developed survey app called thisismydigitallife. The app pulled in data from friends of respondents, as well as the respondents themselves. Cambridge Analytica then used that information for its own business, according to a contractor formerly employed by the data analytics firm, as reported by The Observer.

Forrest said: “There was something in the app that would give you a profile that produces a relatively balanced [political inclinations] steer – based on likes, shares and participation [in social media]. As long as that process is signposted to the data subject – as long as you know you are entering the lion’s den, so to speak – it is not so much of an issue. But if it is covert, it could be a clear privacy violation, although not necessarily illegal.”

He said MBN would have struggled to find people meeting Cambridge Analytica’s precise requirements, but it does have access to “a candidate pool who could be the ‘Aleksandr Kogans’ of the future”. He added: “We are placing some 100 MSc students with that sort of skills base. And we’ve got a lot of PhD students who would all be able to do that sort of work.

“Last year, for example, we put an MSc student on a placement in the NHS who did an outstanding piece of work on predicting requirements for post-operative care. That’s where we should be going with predictive analytics.”

Tip of the iceberg

In relation to the mining of social media data more generally, Forrest said: “I think this is the tip of the iceberg. The reality of the delay in [Mark] Zuckerberg saying anything shows that this is a material crisis. No one was rushing to apologise. They [Facebook] are trying to guard against any consequential regulatory sanction. It is not a breach, but it is a privacy violation.”

Forrest also speculated that, had an EU citizen’s data been dragged into Cambridge Analytica’s use of Kogan’s work through that citizen’s links to friends who are US voters, “you could be looking at a big fine” under the General Data Protection Regulation (GDPR).

Meanwhile, Forrest said Scotland is an outstanding region of the UK for data science. “Some of that is the Scottish government’s backing of The Data Lab as an innovation hub,” he said. “But part of it is that the universities and students are really engaged, and there is a real community feel. You can feel the buzz in the air. There is an ecosystem here of fast-moving startups, of work around blockchain, of work in the banks and so on that is unmatched elsewhere in the UK. That bodes well for Scotland post-Brexit.”
