DNA storage promises 10 million times storage capacity boost

A datacentre that fits in the palm of your hand? Right now, DNA storage is an expensive chemical process that researchers are still trying to turn into a practical proposition

Stephen Pritchard

Published: 22 Jul 2021

Suggest to a chief information officer that they could soon store 10 million times as much data as the capacity of a single hard drive and, at the very least, they are likely to be sceptical.

But such advances could be possible – and within the next few years. The reason is DNA storage. Instead of using hard drives, magnetic tape or flash memory, DNA storage holds data using the code of life itself.

With today’s science, a DNA storage system can hold 10ZB (zettabytes) of data in a device the size of a shoebox, according to John Monroe, vice-president and analyst at industry researcher Gartner. “These beautiful four-letter codes could be the ideal way to store digital data,” he says. “It is huge in terms of capacity – it holds more promise than any other archival storage format.”

Researchers estimate that data stored in DNA could last between 700,000 and a million years, far beyond the lifespan of any current storage technology. Monroe sees DNA storage replacing tape or optical drives for nearline or offline storage.

DNA itself is extremely robust, able to withstand heat and cold. And once the information has been encoded and synthesised into DNA – the “write” phase – it needs no power to keep it in that state. DNA sequencing and decoding, the “read” phase, converts the DNA’s four-letter nucleotide code back into a form that a computer can process.
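The write and read phases described above can be illustrated with a toy mapping. The sketch below is an assumption for illustration, not the scheme used by any system mentioned in this article: it packs two bits into each nucleotide (A, C, G, T), and the names `encode`, `decode` and the bit-to-base table are hypothetical. Practical schemes are more involved, adding error correction and avoiding troublesome sequences such as long runs of a single base.

```python
# Toy illustration only: a hypothetical 2-bits-per-base mapping.
# Real DNA storage schemes add error correction and sequence constraints.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """'Write' phase: turn bytes into a string of nucleotide letters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """'Read' phase: turn nucleotide letters back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

With this mapping each byte becomes four bases, so the round trip from bytes to letters and back is lossless as long as the strand is read without errors.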

But despite this promise, the idea is still some way from being a practical technology. The IT industry has yet to come up with workable, production-scale DNA storage devices. “People are still struggling with how that looks,” Monroe admits.

He believes the equipment will be the size of a kitchen appliance; others predict it could be the size of a school bus. Microsoft has already developed a more practically sized DNA encoding and retrieval machine in collaboration with the University of Washington. It is very much still a prototype, however, and not something an IT department could simply drop into an existing 19-inch IT rack.

Chemical romance

Current DNA encoding and sequencing is still largely a chemical process. That is why the Microsoft and University of Washington prototype looks closer to something you might find in a school science lab than a datacentre. And the process is currently expensive.

Sequencing 1MB of data costs about $3,500 (£2,500). And although costs are falling, this is vastly more than the cost of writing the same volume of data to flash or disk. Gartner believes the technology will not become mainstream until the cost falls to about $0.01 per gigabyte.
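To give a sense of the gap between those two figures, the arithmetic can be sketched as follows. This assumes decimal units (1GB = 1,000MB) and that the per-megabyte cost and Gartner's per-gigabyte threshold are directly comparable, which the figures quoted here do not make explicit. The variable names are illustrative.

```python
import math

# Figures quoted in the article.
COST_PER_MB_USD = 3_500      # current cost per megabyte
TARGET_PER_GB_USD = 0.01     # Gartner's mainstream threshold per gigabyte

# Assumption: decimal units, 1GB = 1,000MB.
MB_PER_GB = 1_000

current_per_gb = COST_PER_MB_USD * MB_PER_GB
reduction_factor = current_per_gb / TARGET_PER_GB_USD
orders_of_magnitude = math.log10(reduction_factor)

print(f"Current cost per GB: ${current_per_gb:,}")
print(f"Required reduction: about {reduction_factor:,.0f}x")
print(f"Orders of magnitude: {orders_of_magnitude:.1f}")
```

On these assumptions, the cost would have to fall by a factor of roughly 350 million, or more than eight orders of magnitude, to reach the threshold.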

Alternative technologies include enzymatic DNA synthesis (EDS), which is being developed by the Wyss Institute, part of Harvard University. Researchers believe this will reduce the cost of DNA synthesis by many orders of magnitude. The team at Wyss is developing an electronic device that can synthesise data into DNA. They believe this will scale up the process by allowing the synthesis to be parallelised.

Researchers are confident that the cost and practical barriers will be overcome, if only because few, if any, other technologies offer the potential to store data in the vast quantities that DNA can hold.

Tech alliance

Now that academic researchers have proved DNA storage is possible, the focus is turning to practicalities.

In 2020, a group of computing industry heavyweights, including Microsoft and Western Digital, formed the DNA Data Storage Alliance along with biotech companies Twist Bioscience and Illumina, and academic researchers.

The goal is to create a viable ecosystem around DNA storage, with Microsoft and others pointing out that the field is moving from academic and scientific research towards practical data storage applications for IT. The most attractive application, at least at first, is cold storage: data that is written once and read rarely.

Other applications include media. Last year, Twist encoded – appropriately enough – an episode of the Netflix series Biohackers to DNA. Being able to record effectively limitless quantities of data, store them indefinitely and replicate them quickly could suit the movie and other creative industries.

Other potential applications include medical data storage, and legal and compliance archiving.

This poses a few other problems, however, and these are as much around standards as technology. “For data such as WORM – write once, read many – or WORN – write once, read never – it is important that the data is immutable,” cautions Gartner’s Monroe. “You need to know that what you write, say an image of a brain today, will be exactly the same in 10 years’ time.”

If researchers can ensure that is the case, then the double helix of life could yet emerge as the best way to store our data into the distant future.
