
Assessing the sustainability of cloud AI services

Using AI-optimised hardware and minimising data movements are some of the ways to mitigate the environmental impact of using cloud AI platforms

The Australian government recently introduced legislation to establish the Net Zero Economy Authority to support the country’s economy-wide net zero transformation. In response, Australian organisations must think about their current and future environmental obligations.

Those obligations will come to the fore as organisations develop and consume AI services in partnership with cloud providers, who currently host most AI technology solutions.

While cloud AI can accelerate the delivery of artificial intelligence (AI) capabilities, its demands, particularly those of generative AI (GenAI), can consume vast amounts of energy and other resources. So, in addition to assessing cloud providers on their AI capabilities, organisations should treat each provider’s sustainability posture as a top priority when making a selection.

The baseline is to partner only with cloud providers that have a demonstrated commitment to sustainability – but that is just the start. In deploying GenAI applications, there is a range of sustainability and optimisation best practices that can mitigate many of the environmental impacts of using cloud AI platforms.

Use renewable energy

Where possible, use only cloud-sourced GenAI services that are powered by renewable energy. Sustainable cloud providers share renewable energy statistics for each cloud region and for specific cloud datacentres. Beware of greenwashing that obfuscates the sources of energy powering cloud datacentres.

Some cloud providers achieve their “100% renewable energy” goals through the use of renewable energy certificates. Only use these cloud datacentres as a fallback option, not as a primary solution to accessing renewable energy.

Minimise energy consumption

Energy-aware workload placement and job scheduling can ensure that cloud AI workloads are running in datacentres most likely to operate sustainably. Also select cloud datacentres that use energy efficiently.

Check the power usage effectiveness (PUE) rating for the cloud services generally and the cloud datacentres specifically. PUE is the ratio of a facility’s total energy use to the energy used by its IT equipment, so ratings should be as close to 1.0 as possible. Most cloud datacentres have PUE ratings between 1.1 and 1.5.
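As a rough illustration of how PUE can feed into energy-aware workload placement, the Python sketch below computes PUE as total facility energy divided by IT equipment energy, then picks the datacentre with the lowest estimated emissions for a job. The `Datacentre` type, the region names and the carbon-intensity figures are hypothetical; real values would have to come from a provider’s published sustainability data.

```python
from dataclasses import dataclass


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return power usage effectiveness: total facility energy / IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


@dataclass(frozen=True)
class Datacentre:
    # All fields are illustrative assumptions, not real provider data.
    name: str
    pue_rating: float         # typically between 1.1 and 1.5
    grid_gco2_per_kwh: float  # carbon intensity of the energy supply


def estimated_emissions_g(dc: Datacentre, it_energy_kwh: float) -> float:
    """Estimate grams of CO2 for a job needing it_energy_kwh of IT energy."""
    # PUE scales IT energy up to total facility energy (cooling, power losses).
    return it_energy_kwh * dc.pue_rating * dc.grid_gco2_per_kwh


def choose_datacentre(candidates: list[Datacentre], it_energy_kwh: float) -> Datacentre:
    """Pick the candidate with the lowest estimated emissions for the job."""
    if not candidates:
        raise ValueError("at least one candidate datacentre is required")
    return min(candidates, key=lambda dc: estimated_emissions_g(dc, it_energy_kwh))


if __name__ == "__main__":
    # Hypothetical regions: one efficient but carbon-heavy, one the reverse.
    region_a = Datacentre("region-a", pue_rating=1.1, grid_gco2_per_kwh=500.0)
    region_b = Datacentre("region-b", pue_rating=1.4, grid_gco2_per_kwh=100.0)
    best = choose_datacentre([region_a, region_b], it_energy_kwh=100.0)
    print(best.name)
```

In this sketch a low PUE alone does not win: the datacentre with the cleaner energy supply can come out ahead even with a worse PUE, which is why both figures are worth checking.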

Optimise cloud resource consumption

Drive better cloud optimisation by monitoring and reporting service usage; controlling unauthorised or unintended use; rightsizing cloud resources; and scaling resources up and down as required. Using GenAI techniques such as API (application programming interface)-based access to large language models (LLMs) is also an effective optimisation measure.

However, be aware that the improved accessibility and affordable cost of AI, and more specifically GenAI, could lead to overuse. Optimisation of cloud resources must include governance policies to manage the prudent and responsible use of GenAI technologies.

Use AI-optimised hardware

Cloud providers increasingly use specialised hardware for AI workloads that is often energy-optimised for AI as well. Examples include Nvidia’s DGX systems, AWS’s Trainium and Inferentia processors, Google’s tensor processing units and the recently announced Microsoft Azure Maia chipsets.

Leading cloud providers offer this special-purpose hardware as designated AI instance types, which when selected can provide many benefits, including improved price and performance, and lower energy consumption.

Optimise data storage and management

Data storage is inexpensive and easy to use, which has led to a broad proliferation and replication of data. Deleting unneeded data has some sustainability benefits, but the active use and management of the data that remains can still consume energy unnecessarily.

A sustainable strategy includes eliminating data that holds no value for the organisation and selecting the most efficient types of storage technology for different forms of data. It also includes using offline storage when possible; implementing data governance policies; storing data close to the applications and processes accessing it; and minimising data replication.

Minimise data movement

Moving large sets of data between datacentres increases network usage and, with it, energy consumption. The best way to minimise this is to co-locate AI-supporting data with the AI modelling processes and applications.

Cross-cloud networking capabilities, including the use of cloud provider networking services, can minimise data movement. When data has to be transported across networks, the most sustainable approach is to ensure only the necessary data is sent.
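One simple way to send only the necessary data is to strip records down to the fields the receiving AI process actually uses before they leave the source. The Python sketch below shows the idea and uses serialised size as a rough proxy for data moved. The field names and records are hypothetical, and JSON is assumed purely for illustration.

```python
import json


def keep_needed_fields(records: list[dict], needed_fields: list[str]) -> list[dict]:
    """Return copies of the records holding only the fields the consumer needs."""
    return [{k: r[k] for k in needed_fields if k in r} for r in records]


def payload_bytes(records: list[dict]) -> int:
    """Size of the records serialised as UTF-8 JSON, a proxy for data moved."""
    return len(json.dumps(records).encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical records: only "id" and "text" are needed downstream.
    records = [
        {"id": 1, "text": "order delayed", "audit_log": "x" * 1000},
        {"id": 2, "text": "refund issued", "audit_log": "y" * 1000},
    ]
    slim = keep_needed_fields(records, ["id", "text"])
    print(payload_bytes(records), "bytes before,", payload_bytes(slim), "bytes after")
```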

Establish sustainable application architectures

Applications will increasingly incorporate GenAI capabilities, but not without the potential risk of introducing inefficiencies. When designing applications, be aware that AI inference activities will use more energy than AI model training.

To optimise AI inference, use smaller inference models to reduce the memory footprint, leverage inference-optimised hardware and accelerators, and use distributed access points to place AI models closer to the point of application consumption.
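To give a sense of why smaller inference models reduce the memory footprint, the Python sketch below estimates the memory needed just to hold a model’s weights as parameter count multiplied by bytes per parameter. The parameter counts and the 2-byte precision are assumptions chosen for illustration, and the estimate ignores activations, caches and runtime overhead.

```python
def weights_memory_gb(parameter_count: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory (in GB, 10^9 bytes) needed to hold model weights only."""
    if parameter_count < 0 or bytes_per_parameter <= 0:
        raise ValueError("parameter_count must be >= 0 and bytes_per_parameter > 0")
    return parameter_count * bytes_per_parameter / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes, both stored at 2 bytes per parameter.
    for label, params in [("smaller model", 7_000_000_000),
                          ("larger model", 70_000_000_000)]:
        print(label, weights_memory_gb(params), "GB")
```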

Run processes out of hours

GenAI processes are energy-intensive, so in addition to clean energy sources, consider overall energy availability. Some geographic regions are already energy-constrained, and adding energy-intensive operations may overburden those systems.

Cloud datacentres may have greater access to less-expensive, green energy during off-peak times. Employ AI workload placement strategies to take advantage of energy availability in different locations.
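A minimal sketch of out-of-hours scheduling, assuming an hourly forecast of grid carbon intensity is available for the chosen location: the Python function below slides a window over the forecast and returns the start hour with the lowest total intensity for a job of a given length. The forecast values are hypothetical, and a real scheduler would also need to respect deadlines and capacity.

```python
def greenest_start_hour(forecast: list[float], duration_hours: int) -> int:
    """Return the start index of the lowest-intensity window of the given length."""
    if duration_hours <= 0 or duration_hours > len(forecast):
        raise ValueError("duration must be between 1 and the forecast length")
    # Total for the first window, then slide one hour at a time.
    window_total = sum(forecast[:duration_hours])
    best_start, best_total = 0, window_total
    for start in range(1, len(forecast) - duration_hours + 1):
        # Add the hour entering the window and drop the hour leaving it.
        window_total += forecast[start + duration_hours - 1] - forecast[start - 1]
        if window_total < best_total:
            best_start, best_total = start, window_total
    return best_start


if __name__ == "__main__":
    # Hypothetical gCO2/kWh forecast for the next six hours.
    forecast = [500.0, 450.0, 200.0, 150.0, 180.0, 400.0]
    print("start a two-hour job at hour", greenest_start_hour(forecast, 2))
```

The same function could be run against forecasts for several locations to compare the best available window in each.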

Optimise AI model tuning

Cloud providers are currently investing billions in foundational LLMs to support general-purpose GenAI solutions. While some LLMs may need to be built from scratch, aim first to use these foundational LLMs.

Techniques such as fine-tuning can help achieve higher accuracy using existing LLMs. Other techniques, such as prompt engineering, may also produce improved resource utilisation while requiring less energy.

Ed Anderson is distinguished vice-president analyst at Gartner
