Green coding - PagerDuty: Mindfulness mechanics, an efficient code system is an 'off' system
This is a guest post for the Computer Weekly Developer Network written by Mandi Walls in her role as DevOps advocate at PagerDuty, a company known for its AI-powered operations platform designed to automate critical work.
PagerDuty’s flagship Real-Time Operations (RTO) platform integrates machine data and human intelligence to improve visibility and agility across organisations.
Walls writes in full as follows…
Modern cloud computing practices have allowed our technology teams to forget about, or simply ignore, some of the more physical aspects of software development and, with those, the potential impacts of poor resource management.
If you’ve never had to physically be in a datacentre, hearing the fans, feeling the chill and seeing those weird suction-cup grippers that pull up the floor tiles, it’s perhaps harder to imagine the limitless appetite computing has for electricity. It goes beyond the server: running IT equipment harder directly increases the load on a second, larger user of power, namely cooling, effectively doubling up the demand for energy.
Resource-intensive processes
The good news is that a computational cycle can be thought of as consuming a unit of electricity: do fewer computations, use less energy.
Unfortunately, increasingly popular products offering AI to consumers use a lot of energy. Powerful AI systems are helping to combat climate change in various ways, such as weather forecasting and tracking deforestation, but consumer solutions are putting power-hungry AI into the hands of folks looking to clean up their resumés or summarise meetings and email threads. We should be asking whether these tasks are worth the amount of energy their computations will consume.
One estimate (referenced on NBC News) places the cost of an AI-generated search result at five times the cost of a traditional search. The carbon footprint and energy consumption of various blockchain technologies have also come under scrutiny. When faced with design options that use high-consumption methods, teams really should be considering the carbon footprint of their choices, especially when other options are available. Labelling may help end users make a call on whether the cost is worth it, but it’s likely that other incentives, like pricing to support carbon offsets, may need to be brought in.
But the role platform engineering improvements can play in monitoring and right-sizing infrastructure deployments across multiple platforms and environments is also worth thinking about.
Cloud convenience costs
Cloud computing is convenient, but it hides the externalities of running systems, like power consumption. This is especially difficult to manage in larger organisations with strict separation of access or duties where cloud assets are concerned. For security or compliance reasons, an individual contributor (IC) software developer may not have any direct or easy access to the accounts where their services will ultimately run, so they won’t necessarily know if something that is no longer needed has been left running.
New solutions around the practice of platform engineering should help teams manage leftover, deprecated, unneeded, or obsolete components in their cloud infrastructure more easily by presenting simplified views of deployed assets to the teams best placed to make decisions about them.
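As a minimal sketch of what that kind of visibility might look like in practice (not a description of any specific vendor’s tool), the following Python script uses the AWS SDK to flag running EC2 instances whose average CPU utilisation has been near idle for a week, attributing each one to an owning team via a hypothetical "owner" tag:

```python
# A minimal sketch (not PagerDuty tooling) of platform-level visibility:
# flag EC2 instances whose average CPU has been near-idle for a week so
# the owning team can decide their fate. Assumes configured AWS
# credentials; the "owner" tag convention is hypothetical.
from datetime import datetime, timedelta, timezone

import boto3

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

IDLE_THRESHOLD_PERCENT = 2.0  # average CPU below this counts as idle
LOOKBACK = timedelta(days=7)

def find_idle_instances():
    now = datetime.now(timezone.utc)
    idle = []
    pages = ec2.get_paginator("describe_instances").paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    for page in pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                stats = cloudwatch.get_metric_statistics(
                    Namespace="AWS/EC2",
                    MetricName="CPUUtilization",
                    Dimensions=[{"Name": "InstanceId",
                                 "Value": instance["InstanceId"]}],
                    StartTime=now - LOOKBACK,
                    EndTime=now,
                    Period=3600,  # hourly datapoints
                    Statistics=["Average"],
                )
                points = stats["Datapoints"]
                if not points:
                    continue
                avg_cpu = sum(p["Average"] for p in points) / len(points)
                if avg_cpu < IDLE_THRESHOLD_PERCENT:
                    tags = {t["Key"]: t["Value"]
                            for t in instance.get("Tags", [])}
                    idle.append((instance["InstanceId"],
                                 tags.get("owner", "unknown"), avg_cpu))
    return idle

if __name__ == "__main__":
    for instance_id, owner, avg_cpu in find_idle_instances():
        print(f"{instance_id} (owner: {owner}) averaged "
              f"{avg_cpu:.1f}% CPU - candidate for shutdown")
```

A platform team could run a report like this on a schedule and surface the results in exactly the kind of simplified, per-team view described above.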
AI-generated runbooks
Perhaps ironically, some AI solutions will support the engineering team here. AI-generated runbooks, for example, help developers quickly build automation to fix operational challenges. The faster engineers can turn a tap off, the more energy is saved.
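To illustrate the kind of remediation step such a runbook might capture (whether hand-written or AI-generated), here is a sketch in Python using the Kubernetes client that scales an idle preview deployment down to zero replicas; the deployment and namespace names are hypothetical:

```python
# A minimal sketch of a runbook remediation step: scale an idle preview
# deployment to zero replicas so it stops consuming compute. Requires a
# configured kubeconfig and the kubernetes Python client; the names
# "preview-api" and "previews" are hypothetical.
from kubernetes import client, config

def scale_to_zero(deployment: str, namespace: str) -> None:
    config.load_kube_config()
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        name=deployment,
        namespace=namespace,
        body={"spec": {"replicas": 0}},  # turn the tap off
    )
    print(f"Scaled {namespace}/{deployment} to 0 replicas")

if __name__ == "__main__":
    scale_to_zero("preview-api", "previews")
```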
Expansion of developer-centric tools can also help teams manage localised and cloud-based testing with an eye towards increasing resource efficiency.
An efficient system is an off system
Developer workflow tools are also improving how teams use resources for local development and testing. Ephemeral environments that exist only as long as tests are running are easier to manage than ever, so fewer teams need to depend on long-lived, mostly idle testing environments when building their services. The most energy-efficient systems are those that aren’t running at all. Streamlining the process of launching and shutting down temporary environments makes development and testing workflows faster and cheaper, as well as more energy efficient.
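As a sketch of how lightweight this has become, the following pytest fixture (assuming the open source testcontainers-python library, Docker and SQLAlchemy, none of which the article names) spins up a real Postgres database that lives only for the duration of the test session:

```python
# A minimal sketch of an ephemeral test environment: a Postgres container
# that exists only while the test session runs, then is removed. Assumes
# Docker, testcontainers-python, SQLAlchemy and a Postgres driver are
# available - tooling choices not named in the article.
import pytest
from testcontainers.postgres import PostgresContainer

@pytest.fixture(scope="session")
def database_url():
    # The container starts when the first test needs it and is torn down
    # automatically at the end of the session - no long-lived, idle database.
    with PostgresContainer("postgres:16") as postgres:
        yield postgres.get_connection_url()

def test_can_connect(database_url):
    import sqlalchemy
    engine = sqlalchemy.create_engine(database_url)
    with engine.connect() as conn:
        assert conn.execute(sqlalchemy.text("SELECT 1")).scalar() == 1
```

Nothing is left running after the test run ends; the environment’s entire energy cost is bounded by the length of the test session.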
Do what your parents told you
For teams that don’t have granular control over how the energy their services consume is procured, the best plan is what your parents always told you: if you’re not in the room, turn off the lights.
Shut off resources that aren’t in use.
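In practice, that advice can be as simple as a scheduled job. Here is a minimal sketch, assuming AWS, boto3 and a hypothetical "environment" tagging convention, that stops dev and test instances outside working hours:

```python
# A minimal sketch of "turning off the lights": stop any instance tagged
# as a dev/test resource outside working hours. Could run on a schedule
# (cron, a scheduled Lambda). The tag names and hours are hypothetical
# conventions, not something prescribed in the article.
from datetime import datetime

import boto3

ec2 = boto3.client("ec2")
WORKING_HOURS = range(8, 19)  # 08:00-18:59 local time

def stop_idle_dev_instances():
    if datetime.now().hour in WORKING_HOURS:
        return  # lights stay on while people are in the room
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:environment", "Values": ["dev", "test"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        i["InstanceId"]
        for r in response["Reservations"]
        for i in r["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        print(f"Stopped {len(instance_ids)} instances: {instance_ids}")

if __name__ == "__main__":
    stop_idle_dev_instances()
```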
Modern tooling will help teams gain the visibility necessary to make good decisions about infrastructure deployment, but product teams need to think critically about whether their solutions really require resource-intensive components like AI.