CMA focuses on data for AI foundation models

Foundational AI models require vast amounts of data. Without access to diverse datasets, the market for foundational models risks being stifled.

Cliff Saran, Managing Editor

Published: 18 Sep 2023 14:15

Accountability, access, diversity, choice, flexibility and transparency are among the key principles the Competition and Markets Authority (CMA) has set out in its preliminary report on foundational AI models.

Discussing the report, Sarah Cardell, CEO of the CMA, said: “The speed at which AI is becoming part of everyday life for people and businesses is dramatic. There is real potential for this technology to turbo charge productivity and make millions of everyday tasks easier – but we can’t take a positive future for granted.

“There remains a real risk that the use of AI develops in a way that undermines consumer trust or is dominated by a few players who exert market power that prevents the full benefits being felt across the economy.”

The CMA said the proposed principles aim to guide the ongoing development and use of AI foundational models to help people, businesses and the economy fully benefit from the innovation and growth they can offer.

In the report, the CMA discusses the uncertainties over how businesses developing foundational models can gain access to the increasingly large volumes of training data they require. One approach is for the developer of the foundational model to obtain or identify high-quality data from web crawls.

However, the report’s authors note that access to large volumes of proprietary data sources may become necessary to develop the most competitive foundational models. As such, this could become a key factor that influences competition, but it may stimulate a dynamic market of data providers who supply data on fair and equal terms to a range of foundational model developers.

The CMA warned, however, that competition could potentially be stifled if the most useful sources of proprietary data for training are only accessible to a small range of existing developers. Copyright issues will also have an impact on foundational models that rely on proprietary data.

The CMA reported that large technology companies’ access to vast amounts of data and resources could provide them with an insurmountable advantage over smaller organisations, making it hard for those smaller organisations to compete. But the report’s authors noted that the extent of this advantage is uncertain, as it depends on a number of factors, including economies of scale, economies of scope, and feedback effects.

The CMA also looked at closed-source and open-source foundational models. The authors of the report noted that closed source is likely to lead to the highest-performing foundational models.

But if AI-based applications are able to utilise foundational models with different levels of performance, this would lower barriers to entry. Such lower-performing models would require less compute, less expertise and potentially different data than cutting-edge models.

According to the CMA, this could be an area where open-source models have the potential to provide a competitive constraint to closed-source models, even if in the future open source is unable to achieve cutting-edge performance.
