Carbon copies: How to stop data retention from killing the planet

Our society has a serious digital hoarding problem, which is coming at a cost to the environment. But what can be done to address it?

There’s a hidden environmental cost to the age of data that few of us ever think about – and it is time that changed. 

While the average content management system may be full of business intelligence data with the potential to revolutionise how an enterprise operates, it accounts for just a fraction of the petabytes of data being stored every day.

A report from IDC cited by The Conversation estimates that by 2025, society will be storing 175ZB (zettabytes) of data – nearly triple the 59ZB stored in 2020. To put it in perspective, that’s enough to fill nearly 1.5 trillion mobile phones.
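
As a rough sanity check on that comparison – assuming an average handset capacity of 128GB, which is an illustrative figure rather than one taken from the IDC report – the arithmetic looks something like this:

```python
ZB = 10**21   # bytes in one zettabyte (decimal definition)
GB = 10**9    # bytes in one gigabyte

data_2020 = 59 * ZB
data_2025 = 175 * ZB
phone_capacity = 128 * GB   # assumed average handset capacity

print(f"Growth 2020 to 2025: {data_2025 / data_2020:.1f}x")
print(f"128GB phones needed for 175ZB: {data_2025 / phone_capacity / 1e12:.2f} trillion")
# Roughly a threefold increase, and about 1.37 trillion phones - "nearly 1.5 trillion"
```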

And the truth is the way data is stored has turned businesses and consumers into a race of hoarders, logging everything just in case it is needed again one day – even though most of it will never be accessed again.

But for better or worse, it all has to be stored somewhere, and however green your stakeholders’ credentials, that storage comes at a cost. A single plain text email produces around 4g of CO2. Add pictures and it’s more like 50g. That’s not insignificant – and yet it’s an environmental impact that is rarely talked about.
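
To get a feel for how those per-email figures compound, here is a small, purely illustrative calculation – the 4g and 50g values are the ones cited above, while the volume of email is an assumed example workload:

```python
CO2_PLAIN_G = 4    # grams of CO2 per plain-text email (figure cited above)
CO2_IMAGE_G = 50   # grams of CO2 per email with pictures (figure cited above)

# Assumed example: one employee's outgoing mail over a working year
plain_per_day = 30
image_per_day = 10
working_days = 230

annual_kg = (plain_per_day * CO2_PLAIN_G + image_per_day * CO2_IMAGE_G) * working_days / 1000
print(f"Approximate CO2 per employee per year: {annual_kg:.0f} kg")
# ~143kg for a single mailbox - before any of it is retained in long-term storage
```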

A heated debate

Computer hardware, like most electronic equipment, creates heat. Lots of it. The great irony is that heat is extremely detrimental to the very chips and diodes that produce it, so it has to be removed. As a result, even the smallest of server rooms requires cooling equipment to bring the temperature below the ambient temperature of an empty room. In effect, we’re using twice the electricity – and emitting twice the carbon – just to maintain the status quo.
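
The industry’s shorthand for this overhead is power usage effectiveness (PUE) – total facility energy divided by the energy that actually reaches the IT equipment. A minimal sketch, using assumed example figures rather than measurements from any real facility:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    return total_facility_kw / it_equipment_kw

# Assumed example: a small server room drawing 40kW of IT load,
# plus another 40kW for cooling and other overheads
print(pue(total_facility_kw=80.0, it_equipment_kw=40.0))   # 2.0
# A PUE of 2.0 means every watt of useful computing needs a second watt
# just to keep the room cool - the "twice the electricity" described above
```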

The way data is stored has turned us into a race of hoarders, logging everything, just in case it is needed again one day. This comes at a cost – a single plain text email produces around 4g of CO2; add pictures and it’s more like 50g

So what can be done about it? It is a question that has been plaguing the IT industry for years, and the lack of a definitive answer often makes it easier to just turn on another air-conditioning unit and look the other way. But that’s causing even more harm. So what are the alternatives?

Storing less data appears to be an obvious answer, but it would be almost impossible to implement, because who decides what is worth keeping and what is not? The BBC learned this the hard way when it trashed much of its TV archive during the 1970s and 1980s, assuming it would be of no use. Then came the VCR, the DVD player and, of course, streaming. Ask any Doctor Who fan and they will grimace at the number of early episodes of the long-running sci-fi series that have been lost, perhaps forever, because of a lack of foresight.

So, that’s the case to justify digital hoarding. But it all has to be stored somewhere, and those facilities have to be environmentally controlled. 

Deep freeze data

Not all information necessarily has to be instantly accessible. Offline storage still has a valid place in the online world.

Take CERN, for example, the home of the Large Hadron Collider. Much of the data it has generated over the past 50 years through myriad experiments is still kept on spools of tape, and is only made available if requested by, say, a university. It can take between 30 minutes and two hours to bring that cold storage data online – but it is there. Of course, in another great irony, this shining beacon of science would be better served if all that data were online and could be cross-referenced – but at least, as it is, it has a much lower carbon footprint.

There’s an even colder type of storage – under 250m of permafrost in Svalbard, Norway, well inside the Arctic Circle, lies the GitHub Arctic Code Vault. On 2 February 2020, a snapshot of every active public repository on GitHub was taken and buried in this Fortress of Solitude near the North Pole. It’s not designed to be accessed, but rather to act as a disaster recovery plan for the end of the world. But it raises the question – does our data really need to be online?

Another increasingly popular way of keeping computer equipment cool is with water – gallons of recirculated liquid are chilled and pumped through tubes that pass over heat-generating hardware, reducing the need to cool the whole environment. The downside is that such systems are extremely difficult to retrofit – they usually involve removing walls and floors to create a complete plumbing system for the sole purpose of cooling.

Enterprises that do not want to tear their building apart could look to replace their on-premises systems with a colocation facility. Many offer water cooling as standard, but more importantly, by sharing the physical location of their data, customers also share the cost of cooling and there’s less waste – after all, it costs the same to cool an on-premises datacentre whether it’s full to bursting or three-quarters empty.

Many companies are already investigating more radical solutions that are both greener and, ultimately, cheaper. In 2018, Microsoft’s Project Natick experiment saw a sealed datacentre submerged 117ft below the sea off Scotland’s Orkney Islands. The servers were kept in a dry nitrogen atmosphere, while the surrounding water acted as a natural coolant. Future iterations of Project Natick could see offshore wind farms added to create completely self-sustained, carbon-neutral data storage.

Speaking about the project after its initial pilot, Project Natick’s Ben Cutler, a member of Microsoft’s Special Projects research group, explained that the underwater datacentres also had a failure rate much lower than land-based data storage facilities. “Our failure rate in the water is one-eighth of what we see on land,” said Cutler. “I have an economic model that says if I lose so many servers per unit of time, I’m at least at parity with land. We are considerably better than that.”

Acres of space

Nasa has just begun to send its first unmanned test missions back to the moon as part of its Artemis programme, which will eventually see manned moon missions for the first time in over half a century. With temperatures in the permanently shadowed craters near the lunar poles capable of plunging to a rather chilly -230°C, and Nokia already contracted to provide a lunar 4G service, it raises the question of whether datacentres on the moon could be a future possibility.

The equipment would have to be underground, perhaps in the kinds of polar craters and lava tubes that have kept water frozen as huge ice deposits for thousands of years – otherwise it risks the moon’s other extreme: a blistering 120°C daytime temperature.

With temperatures on the moon capable of plunging to -230°C, it raises the question of whether datacentres on the moon could be a future possibility. The downside to storing data in space is latency – and maintenance

In moving spacecraft, the temperature is regulated by rotating the ship like a rotisserie to even out exposure to the sun. For a fixed environment on the moon, overcoming these extremes remains a challenge that Artemis will be looking to explore during its missions.

The downside to storing data in space is latency – the transmission delay between Earth and the moon, already noticeable in voice comms, would be crippling for interactive data transfer. As such, cold storage in space would need to be exactly that – an archive for all that information that may come in handy one day, but may never be accessed again.
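
The delay itself is fixed by physics. A back-of-the-envelope calculation – taking the average Earth-moon distance as roughly 384,400km – shows why a lunar facility could only ever be cold storage:

```python
SPEED_OF_LIGHT_KM_S = 299_792   # speed of light in a vacuum, km per second
EARTH_MOON_KM = 384_400         # average Earth-moon distance, km (approximate)

one_way_s = EARTH_MOON_KM / SPEED_OF_LIGHT_KM_S
round_trip_s = 2 * one_way_s

print(f"One-way delay:    {one_way_s:.2f} s")     # about 1.28 seconds
print(f"Round-trip delay: {round_trip_s:.2f} s")  # about 2.56 seconds
# Terrestrial datacentres respond in milliseconds; a multi-second round trip
# is fine for an archive that is rarely touched, hopeless for interactive use
```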

There’s another problem with storage in the sea and the sky alike – maintenance. It’s not easy for a human being to pop down to the seabed to swap out a malfunctioning drive. As such, infrastructure would need to include enough redundancy to keep everything running in between scheduled visits, which would probably be once or twice per decade. 

And so, for many of these innovations, an immediate revolution is not likely. Project Natick has just begun its second phase of experiments, while any lunar infrastructure is probably decades away. So back on Earth, what else can we do to lessen the environmental impact of our digital hoarding?

One idea involves carbon offsetting, and could see data scientists grow increasingly green fingers. In these huge cathedrals made up of endless racks of hardware, there are vast spaces that could be filled with nature’s solution – plants. Datacentres require arid environments, but then, so do cacti which, like most plants, are hungry consumers of carbon dioxide, absorbing it from the air and locking it away in their tissue and the surrounding soil. Cacti have evolved to survive in the dry conditions of the desert, so they would feel right at home surrounded by computer equipment. Who knows – maybe one day the finest tequila will be distilled using the agave grown by a billion data points.

However, it is important to remember that carbon offsetting only works if it creates a carbon reduction that would not otherwise have happened. Paying someone not to chop down trees that were never going to be chopped down anyway, as is common with many offset schemes, simply does not add up – it is greenwashing, pure and simple.

Solutions in training

It’s also worth considering that, in the right circumstances, all this heat could be a good thing. Rather than being wasted, the excess heat could be captured and used as a cheap source of heating for homes and businesses.

At a disused London Underground station in Islington, the Bunhill 2 Energy Centre captures waste heat from the Northern line and converts it into heating for local office buildings and an entire housing estate. Plans for further similar schemes are in the pipeline. If it can be done for waste transport heat, there’s no reason why the same couldn’t be done for heat-spewing datacentres – all it would take is the will.

Speaking about this project, Andy Lord, London Underground’s managing director, says: “Capturing waste heat from Tube tunnels and using it to supply heating and hot water to thousands of local homes hasn’t been done anywhere in the world before so this ground-breaking partnership with Islington Council is a really important step. Heat from the London Underground has the potential to be a significant low-carbon energy source and we are carrying out further research, as part of our energy and carbon strategy, to identify opportunities for similar projects across the Tube network.”

As much as it would be wonderful to be able to conclude by saying that we’ve solved the waste and environmental impact caused by our obsession with data, all of these ideas have drawbacks. Some are practical, such as how to swap out a hard drive at the bottom of a crater. Some are financial – installing water cooling is not something any company would invest in on a whim.

What’s important is that there are possibilities – lots of them – that would make our dirty data a bit cleaner, and maybe even have a positive effect on the environment. What’s required is a collective will by the industry to take these ideas forward and to make the revolution part of the solution, rather than part of the problem. 
