Making machine learning operational
As artificial intelligence matures, IT departments will need to take control of change management and governance of data models
Recent research from McKinsey found that the companies seeing significant value from artificial intelligence (AI) are continuing to invest in it during the pandemic.
Most respondents at businesses that McKinsey deemed “high performers” said their organisations had increased investment in AI in each major business function in response to the pandemic, while fewer than 30% of other respondents said the same.
According to McKinsey, respondents in automotive and assembly, as well as in healthcare services and pharmaceuticals and medical products, are the most likely to say their companies have increased investment.
These high-performing businesses were in a better position to meet the challenges of the global pandemic, it said. “Self-adapting, reinforced learning can navigate greater complexity,” said Jacomo Corbo, co-founder and chief scientist of QuantumBlack, an AI consultancy that is part of McKinsey’s advanced analytics business.
In his experience, businesses need to adapt how they build and retrain AI models, and how they collect data, to enable greater agility. “We have to collect data in a much more agile way and retrain models with a high cadence,” he said.
But according to Corbo, AI appears to have fallen through the gaps in IT governance. “A lot of CIOs tried to shirk responsibility for the maintenance of machine learning models,” he said.
Corbo said IT leadership teams need to bring the rigour of software development to machine learning, where code is managed under version control, providing an audit trail of the changes that have been made. Without such IT governance and oversight, machine learning models are difficult to manage, he said, and the machine learning code base cannot be maintained to the same service levels as other IT assets.
MLOps treats machine learning systems development and machine learning models as a form of software development.
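As a rough illustration of that audit trail (an assumed sketch, not a process described in the article), a team might record the exact code revision, data snapshot and metrics behind every trained model, so a model change can be traced like any other versioned software asset. The file names and fields below are hypothetical.

```python
# Minimal sketch of an audit trail for trained models (illustrative assumptions only).
import hashlib
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def data_fingerprint(path: str) -> str:
    """Hash the training-data file so the exact snapshot can be audited later."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def record_model_version(model_name: str, data_path: str, metrics: dict) -> dict:
    entry = {
        "model": model_name,
        "trained_at": datetime.now(timezone.utc).isoformat(),
        # Tie the model to the exact code revision under version control.
        "git_commit": subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip(),
        "data_sha256": data_fingerprint(data_path),
        "metrics": metrics,
    }
    # An append-only log acts as the audit trail of model changes.
    with open("model_audit_log.jsonl", "a") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```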
While IT teams have generally shifted to agile methodologies for software development, Corbo said: “MLOps will require an evolution. Think of a waterfall model for machine learning, developed by an AI centre of excellence where heavy refactoring of the machine learning model is required. It is not a pattern suitable for fast iterations.”
The general idea is that real-world data is gathered to validate the machine learning model. If its performance no longer matches real-world metrics, the software development team responsible for the model optimises it.
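In its simplest form, that monitoring loop can be reduced to comparing live performance against the level achieved at validation time and flagging the model for retraining once the gap exceeds an agreed tolerance. The sketch below is illustrative only, with assumed metric names and threshold, not QuantumBlack's implementation.

```python
# Illustrative drift check: flag a model for retraining when its live
# performance drops below the validated baseline by more than a tolerance.
def needs_retraining(validation_accuracy: float,
                     live_accuracy: float,
                     tolerance: float = 0.05) -> bool:
    """Return True when real-world performance has fallen more than
    `tolerance` below the accuracy measured at validation time."""
    return (validation_accuracy - live_accuracy) > tolerance

# Example: a model validated at 92% accuracy now scores 84% on live data.
if needs_retraining(0.92, 0.84):
    print("Performance drift detected - schedule model retraining")
```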
This is important because many external factors can influence a machine learning model. McKinsey’s research found that, in general, respondents from companies that adopted more AI capabilities were more likely to report seeing AI models misperform amid the Covid-19 pandemic.
Read more about MLOps
- Taking notes from DevOps lifecycle management, machine learning operations tools and platforms seek to improve accuracy, ease integration problems and keep models trained.
- Demand for MLOps and AutoML tools is on the upswing, and the machine learning market will undergo an increase in consolidation.
The study indicated that high-performing organisations, which tend to have adopted more AI capabilities than others, witnessed more misperformance than companies seeing less value from AI. McKinsey found that high-performing organisations’ models were particularly vulnerable within marketing and sales, product development, and service operations – the areas where its study found AI adoption was most commonly reported.
For instance, Corbo said that during the pandemic, models relying on long time series data, such as consumer demand patterns, often broke down. “We are seeing a shift to more self-adaptive models tailored to what is going on right now and less reliance on long time series data,” he said.
This requires both real-time and time-series data. According to Corbo, many deep learning models have the flexibility to take in data collected over a long time scale combined with data that changes at a high cadence.
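A hedged sketch of that pattern is a model with two inputs: a long historical series alongside fast-changing signals, combined before the final prediction. This is a generic Keras example, not a model from the article, and the shapes and layer sizes are arbitrary assumptions.

```python
import tensorflow as tf

# Slow-moving history, e.g. 104 weeks of demand observations.
history_in = tf.keras.layers.Input(shape=(104, 1), name="long_time_series")
history_features = tf.keras.layers.LSTM(16)(history_in)

# Fast-changing signals refreshed at a high cadence, e.g. 8 recent indicators.
recent_in = tf.keras.layers.Input(shape=(8,), name="recent_signals")
recent_features = tf.keras.layers.Dense(16, activation="relu")(recent_in)

# Combine both views of the data before the final prediction.
combined = tf.keras.layers.Concatenate()([history_features, recent_features])
output = tf.keras.layers.Dense(1, name="demand_forecast")(combined)

model = tf.keras.Model(inputs=[history_in, recent_in], outputs=output)
model.compile(optimizer="adam", loss="mse")
```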
Previously, MLOps required highly advanced skills in development teams. Corbo said that, unlike a few years ago, tooling to support MLOps has been maturing. Software tooling such as Spotify’s Luigi and Netflix’s Metaflow had to be developed internally because, until recently, workflow and dependency management tools for data scientists did not exist, he said.
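To give a flavour of what such workflow tools do, below is a minimal Metaflow flow with placeholder step contents: each step is a dependency-tracked unit of work, and the framework handles ordering and stores the artefacts each step produces. It is a generic sketch, not code from QuantumBlack, Spotify or Netflix.

```python
from metaflow import FlowSpec, step

class TrainingFlow(FlowSpec):

    @step
    def start(self):
        # Load or reference the training data here (placeholder data below).
        self.raw_rows = list(range(100))
        self.next(self.train)

    @step
    def train(self):
        # Train the model; attributes assigned to self are versioned by Metaflow.
        self.model_summary = f"trained on {len(self.raw_rows)} rows"
        self.next(self.end)

    @step
    def end(self):
        print(self.model_summary)

if __name__ == "__main__":
    TrainingFlow()
```

Such a flow would typically be launched with Metaflow’s command-line runner, for example `python training_flow.py run`.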
“There’s now a huge variation in MLOps capabilities and there are more choices in how these environments can be provided,” said Corbo. “The whole idea is to lower the tech requirements massively.”
Many of the MLOps tools now available are open source. Organisations clearly still need people who not only understand what tools are available, but also how they fit together to provide MLOps that aligns with what the business needs to do with AI.
In this respect, Corbo believes an AI centre of excellence (CoE) has an important role to play. Rather than being a large, monolithic organisation, a CoE should comprise a few opinionated people, he said. “The CoE takes a role in technology choices. What are the relevant components?”
The CoE also chooses the machine learning models that best fit how the business plans to use them. Corbo urged IT leaders to encourage close partnerships between the AI CoE and ITOps.
MLOps also requires IT chiefs to put in place tools that provide software abstraction and create code pipelines for low-code environments. Corbo said data scientists who are not strong in software development need to be able to access data through a self-service model. When their machine learning models are ready for deployment, they are passed through a pipeline to operations, which stands up the required IT infrastructure.
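One assumed sketch of that hand-off point is for the data scientist to package the trained model together with a small deployment manifest, which operations then consumes to stand up the serving infrastructure. The file names, fields and resource figures below are hypothetical conventions, not a process described in the article.

```python
import json
import pickle

def package_for_deployment(model, name: str, version: str) -> None:
    # Serialise the trained model artefact for the deployment pipeline.
    with open(f"{name}-{version}.pkl", "wb") as f:
        pickle.dump(model, f)

    # The manifest tells operations what infrastructure the model needs.
    manifest = {
        "model": name,
        "version": version,
        "artifact": f"{name}-{version}.pkl",
        "runtime": "python3.11",
        "resources": {"cpu": "1", "memory": "2Gi"},
    }
    with open(f"{name}-{version}.deploy.json", "w") as f:
        json.dump(manifest, f, indent=2)
```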