
ICO launches guidance on AI and data protection

The Information Commissioner’s Office (ICO) has published guidance aimed at helping organisations apply machine learning to data in compliance with data protection principles

The Information Commissioner’s Office (ICO) has published an 80-page guidance document for companies and other organisations about using artificial intelligence (AI) in line with data protection principles.

The guidance is the culmination of two years’ research and consultation by Reuben Binns, an associate professor in the Department of Computer Science at the University of Oxford, and the ICO’s AI team.

The guidance covers what the ICO thinks is “best practice for data protection-compliant AI, as well as how we interpret data protection law as it applies to AI systems that process personal data. The guidance is not a statutory code. It contains advice on how to interpret relevant law as it applies to AI, and recommendations on good practice for organisational and technical measures to mitigate the risks to individuals that AI may cause or exacerbate”.

It seeks to provide a framework for “auditing AI, focusing on best practices for data protection compliance – whether you design your own AI system, or implement one from a third party”.

It includes, it says, “auditing tools and procedures that we will use in audits and investigations; detailed guidance on AI and data protection; and a toolkit designed to provide further practical support to organisations auditing the compliance of their own AI systems”.

It is also an interactive document which invites further communication with the ICO.

This guidance is said to be aimed at two audiences: “those with a compliance focus, such as data protection officers (DPOs), general counsel, risk managers, senior management, and the ICO's own auditors; and technology specialists, including machine learning experts, data scientists, software developers and engineers, and cyber security and IT risk managers”.


It points out two security risks that can be exacerbated by AI, namely the “loss or misuse of the large amounts of personal data often required to train AI systems; and software vulnerabilities introduced as a result of new AI-related code and infrastructure”.

As the guidance document points out, the standard practices for developing and deploying AI involve, by necessity, processing large amounts of data. There is therefore an inherent risk that this fails to comply with the data minimisation principle.

This, according to the GDPR [the EU General Data Protection Regulation] as glossed by former Computer Weekly journalist Warwick Ashford, “requires organisations not to hold data for any longer than absolutely necessary, and not to change the use of the data from the purpose for which it was originally collected, while – at the same time – they must delete any data at the request of the data subject”.

While the guidance document notes that data protection and “AI ethics” overlap, it does not seek to “provide generic ethical or design principles for your use of AI”.

AI for the ICO

What is AI, in the eyes of the ICO? “We use the umbrella term ‘AI’ because it has become a standard industry term for a range of technologies. One prominent area of AI is machine learning, which is the use of computational techniques to create (often complex) statistical models using (typically) large quantities of data. Those models can be used to make classifications or predictions about new data points. While not all AI involves ML, most of the recent interest in AI is driven by ML in some way, whether in image recognition, speech-to-text, or classifying credit risk.

“This guidance therefore focuses on the data protection challenges that ML-based AI may present, while acknowledging that other kinds of AI may give rise to other data protection challenges.”
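To make that pattern concrete, the sketch below shows a statistical model being fitted to existing data and then used to classify new data points, in the spirit of the credit-risk example the ICO mentions. It is illustrative only: the figures are invented, and the ICO guidance does not endorse any particular library or model.

```python
# Hypothetical illustration of the ML pattern the ICO describes:
# fit a statistical model on existing data, then use it to make
# classifications about new data points.
from sklearn.linear_model import LogisticRegression

# Invented training data: [income in £k, existing debt in £k] per
# applicant, labelled 1 for higher credit risk and 0 for lower.
X_train = [[30, 5], [60, 2], [25, 10], [80, 1], [40, 8], [75, 3]]
y_train = [1, 0, 1, 0, 1, 0]

model = LogisticRegression().fit(X_train, y_train)

# Classify a previously unseen applicant.
print(model.predict([[50, 4]]))        # predicted class
print(model.predict_proba([[50, 4]]))  # class probabilities
```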

Of particular interest to the ICO is the concept of “explainability” in AI. The guidance goes on: “in collaboration with the Alan Turing Institute we have produced guidance on how organisations can best explain their use of AI to individuals. This resulted in the Explaining decisions made with AI guidance, which was published in May 2020”.

The guidance contains commentary about the distinction between a “controller” and a “processor”. It says “organisations that determine the purposes and means of processing will be controllers regardless of how they are described in any contract about processing services”.

This could be relevant to the controversy surrounding the involvement of US data analytics company Palantir in the NHS Data Store project, where Palantir has repeatedly stressed that it is merely a processor and not a controller – the controller in that contractual relationship being the NHS.

Biased data

The guidance also discusses such matters as bias in data sets leading to AIs making biased decisions, and offers this advice, among other pointers: “In cases of imbalanced training data, it may be possible to balance it out by adding or removing data about under/overrepresented subsets of the population (eg adding more data points on loan applications from women).

“In cases where the training data reflects past discrimination, you could either modify the data, change the learning process, or modify the model after training”.
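As an illustration of the first of those pointers, the sketch below oversamples an underrepresented group in a toy loan-application dataset until the groups are the same size. The column names and values are hypothetical, and the guidance does not prescribe any particular tooling or rebalancing method.

```python
# Hypothetical sketch of rebalancing imbalanced training data by
# oversampling an underrepresented group (here, female applicants).
import pandas as pd
from sklearn.utils import resample

# Invented loan-application training data with a protected attribute.
df = pd.DataFrame({
    "income":   [30, 45, 52, 61, 38, 70, 44, 55],
    "approved": [0, 1, 1, 1, 0, 1, 0, 1],
    "sex":      ["F", "M", "M", "M", "F", "M", "M", "M"],
})

majority = df[df["sex"] == "M"]
minority = df[df["sex"] == "F"]

# Sample the minority group with replacement until it matches the
# majority group's size, then recombine and shuffle the rows.
minority_upsampled = resample(minority, replace=True,
                              n_samples=len(majority), random_state=42)
balanced = pd.concat([majority, minority_upsampled]).sample(frac=1, random_state=42)

print(balanced["sex"].value_counts())  # now 6 "M" and 6 "F" rows
```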

Simon McDougall, deputy commissioner of regulatory innovation and technology at the ICO, said of the guidance: “Understanding how to assess compliance with data protection principles can be challenging in the context of AI. From the exacerbated, and sometimes novel, security risks that come from the use of AI systems, to the potential for discrimination and bias in the data. It is hard for technology specialists and compliance experts to navigate their way to compliant and workable AI systems.  

“The guidance contains recommendations on best practice and technical measures that organisations can use to mitigate those risks caused or exacerbated by the use of this technology. It is reflective of current AI practices and is practically applicable.”
