Key storage choices: Cloud vs tape for archive storage?
Tape still has benefits, such as an ‘air gap’ that can insulate archives from threats to data integrity. But what are the opportunities for cloud in tape’s traditional use cases?
Tape is the Mark Twain of data storage: reports of its death are certainly exaggerated.
Tape continues to play a role in the enterprise, and not just because it is a tried and tested technology in backup and archive. Tape has attributes that set it apart from other media. This continues to apply, even as more enterprises move data to the cloud.
Magnetic tape has been around since the 1950s, yet it’s still a key component of data backup and recovery, and archiving. These are applications where offline storage is an advantage, rather than a disadvantage.
Tape is uniquely suitable for offsite storage, as the media itself is lightweight and more robust in transit than the hard drive. The way tape operates, with the data separated from the read/write mechanism, creates a natural “air gap”.
This air gap is one reason tape continues to find favour with storage and disaster recovery experts. A completely separate data store is resilient against problems caused by code errors or other problems with production applications.
Increasingly, organisations are also turning to tape because it offers a fairly high degree of protection against ransomware.
Online or nearline systems are vulnerable to the same malware that targets core production systems. If tape storage is managed well, it should avoid cross-infection by ransomware payloads.
Cost advantage
Cost, too, is an advantage for tape. Hard drives and solid-state storage continue to fall in price, and capacities have now reached 16TB per disk. But total tape capacity has no theoretical limit, because it grows simply by adding cartridges, as long as the user has a robust system for managing and storing them. Most businesses today use LTO-based tape.
And, while tape systems remain relatively expensive to buy – an issue especially for smaller businesses – the incremental cost of adding capacity is far lower than on a disk-based array. A latest-generation LTO-8 tape stores 12TB, which rises to 30TB with compression. This, according to storage consultant Barnaby Mote, CEO of 4sl, works out at about 0.4 pence per gigabyte.
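The per-gigabyte figure can be sanity-checked with simple arithmetic. The sketch below assumes decimal units (1TB = 1,000GB) and a hypothetical cartridge price of £48, chosen only because it reproduces 0.4p/GB at native capacity. The article gives no cartridge price, and does not say whether the quoted figure refers to native or compressed capacity.

```python
def pence_per_gb(cartridge_price_gbp: float, capacity_tb: float) -> float:
    """Media cost in pence per gigabyte, using decimal units (1TB = 1,000GB)."""
    capacity_gb = capacity_tb * 1000
    return cartridge_price_gbp * 100 / capacity_gb


# Hypothetical cartridge price, not a figure from the article.
ASSUMED_PRICE_GBP = 48.0

native = pence_per_gb(ASSUMED_PRICE_GBP, 12)      # LTO-8 native capacity
compressed = pence_per_gb(ASSUMED_PRICE_GBP, 30)  # LTO-8 with compression

print(f"native: {native:.2f}p/GB, compressed: {compressed:.2f}p/GB")
```

Under that assumed price, the same cartridge costs less per gigabyte when the data compresses well, which is why the basis of any quoted figure matters when comparing media.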
Media costs, once the business has bought tape drive hardware, are low. Running costs are a further factor in tape’s favour. Disk- and flash-based systems need power and cooling. Tape will last longer in a climate- and moisture-controlled environment, but it can be stored in an office.
These low lifetime costs make tape suitable for long-term data storage and archiving in a range of industries, including financial services, oil and gas, research and media. Even so, more industries are looking to the cloud for archive storage.
The problems with tape
Tape, however, has disadvantages, and these are prompting CIOs to look at alternatives. As IDC research director Phil Goodwin points out, tape backup is a largely manual process. “It depends on human labour to load and unload, and tapes can be lost, broken or wear out,” he cautions. Although suppliers have created automated tape libraries – sometimes known as tape jukeboxes – these are expensive and also mechanical.
Automation changes tape into a nearline storage system. This suits applications where organisations need to keep data for long periods, and access it occasionally, or need to store large quantities of contiguous information, such as video files in the broadcast industry, or scientific research data. CERN’s Large Hadron Collider data, for example, is stored on tape. But even tape library systems need manual management.
Tape, too, is slow. Read speed on LTO-8 tape is a respectable 360MBps for uncompressed data. This is comparable to a 7,200rpm hard drive, but far slower than flash-based storage. However, these speeds do not include the time penalties that come with changing the cartridges themselves.
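To put the 360MBps figure in context, the sketch below estimates how long it takes to stream one full 12TB LTO-8 cartridge. It assumes decimal units and a sustained transfer at the quoted speed, and it ignores load, seek and cartridge-change time, so it is a best case rather than a measured result.

```python
def read_time_hours(capacity_tb: float, speed_mb_per_s: float) -> float:
    """Hours needed to stream capacity_tb at a sustained speed (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000
    seconds = capacity_mb / speed_mb_per_s
    return seconds / 3600


# One full LTO-8 cartridge at the quoted uncompressed read speed.
hours = read_time_hours(12, 360)
print(f"Full cartridge read: about {hours:.1f} hours")
```

On these assumptions, a single drive needs more than nine hours per cartridge, which helps explain why large restores from tape can run up against tight recovery time objectives.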
Although LTO-8 tape can be configured as write once, read many (Worm) for archiving, a key attribute for compliance and data protection, businesses may find that tape is too slow for backup and recovery.
Tape-based systems, for all their other advantages, can struggle to meet increasingly strict recovery time objectives (RTOs). Systems such as tape libraries allow for an effective retrieval time for archived data, but restoring a whole enterprise system from tape is likely to mean the business is offline for too long.
“Use cases for tape, such as disaster recovery with restore from tape-based backups, simply don’t provide the performance levels required by many businesses in 2019,” cautions 4sl’s Mote. “What was good enough 10 years ago just isn’t acceptable now when recovery from disk-based data – be it on premise or in the cloud – is readily achievable.”
To close the gap, businesses need to use tiered systems – combinations of flash, online and nearline disk and tape – or consider entirely different approaches to disaster recovery. These include mirroring data or synchronous replication between primary and secondary storage systems and, increasingly, backup and recovery in the cloud.
Is cloud the middle way?
Cloud-based storage, at first glance, seems the obvious alternative to tape. And in some ways it is.
As IDC’s Goodwin points out, mainstream backup and recovery tool providers, as well as archiving vendors, including companies such as Acronis and Commvault, now have either a cloud-based offering or provide for use of the cloud as a target. The goal there is to make offsite backup to the cloud transparent to the user.
The large cloud-based storage providers also have their own archiving services, and increasingly pitch these as an alternative to tape.
Amazon Web Services has Glacier, and Microsoft offers Azure Archive. Google Cloud has Nearline and Coldline storage, priced as low as US¢1/GB per month for nearline, and ¢0.4/GB per month for coldline deep archive. IBM has its Cloud Object Storage Archive, which, at ¢0.2/GB, is among the cheapest in the industry.
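As a rough illustration of what these list prices mean in practice, the sketch below works out the monthly storage bill for a hypothetical 50TB archive. It assumes decimal units, treats the IBM figure as a per-month price (the text above does not state the period), and leaves out retrieval and network charges.

```python
# Prices in US cents per GB per month, as quoted in the article.
PRICES_CENTS_PER_GB_MONTH = {
    "Google Nearline": 1.0,
    "Google Coldline": 0.4,
    "IBM Cloud Object Storage Archive": 0.2,  # assumed to be per month
}


def monthly_cost_usd(capacity_tb: float, cents_per_gb_month: float) -> float:
    """Monthly storage cost in US dollars (1TB = 1,000GB, storage only)."""
    capacity_gb = capacity_tb * 1000
    return capacity_gb * cents_per_gb_month / 100


ARCHIVE_TB = 50  # hypothetical archive size

costs = {
    name: monthly_cost_usd(ARCHIVE_TB, price)
    for name, price in PRICES_CENTS_PER_GB_MONTH.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f} per month")
```

Because the fee recurs every month, the total over a multi-year retention period is the figure to compare against a one-off purchase of tape hardware and media.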
But not all backup products can integrate directly with cloud storage, and not all work the same way, which can cause application compatibility and operational issues. Data managers need to ensure their business applications, backup products or services and cloud providers are compatible, and offer the right service-level guarantees.
Cloud services have other disadvantages too. The first, surprisingly, is cost.
Although the cost per gigabyte is low, cloud-based storage incurs monthly or annual fees. These can quickly add up. Then there are additional “bandwidth” charges for accessing or restoring data.
Although it is hard to generalise across all businesses, storage consultancy ProStorage calculates that cloud storage is generally viable for organisations that store up to 50TB for the long term. Above that, scale economies favour tape. This reflects the upfront cost of tape storage hardware, as well as the ongoing cost for large volumes of data in the cloud.
Another factor, and one which is as much about practicalities as cost, is the time it takes to move significant data volumes to the cloud.
Data upload speeds over the public network remain a barrier. Smaller or newer businesses, or businesses that already use cloud-based applications, will find it easier to move backups or archives to the cloud. For businesses with substantial data volumes on-premise, phasing in cloud storage, or running it alongside nearline disk and tape, is more practical.
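The scale of the upload problem is easy to estimate. The sketch below assumes a hypothetical 50TB data set and two illustrative link speeds, neither of which comes from the article, and it treats the link as fully available to the transfer, which real networks rarely are.

```python
def upload_days(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to upload data_tb over a link of link_mbps (decimal units).

    efficiency is the fraction of the link the transfer actually achieves;
    the default of 1.0 is an idealised best case.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86_400


DATA_TB = 50  # hypothetical data volume

for mbps in (100, 1000):  # illustrative link speeds
    print(f"{DATA_TB}TB over {mbps}Mbps: about {upload_days(DATA_TB, mbps):.1f} days")
```

Even in this best case the initial transfer takes days on a fast link and weeks on a slower one, which is consistent with the case for phasing in cloud storage rather than moving everything at once.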
As IDC’s Dawson says: “We don’t say go cloud or go tape, it is more and/or, with technology being driven by the business.”
Storage tiers in the cloud era
Storage tiering can be broken down into four categories:
Tier 1: Flash or solid-state storage, used for the most mission-critical applications.
Tier 2 and Tier 3: The next two tiers generally refer to fast disk and slow disk; before flash emerged, fast-spinning disk offered the quickest access times and was the preferred option for critical use cases. Both of these disk-based tiers are online, meaning that data can be accessed without any significant delay.
Tier 4: The lowest tier would typically refer to a medium such as tape, although there are others. As tapes store data on a linear magnetic strip, there is an inherent delay in storing and retrieving information.
Leading cloud providers offer all these options, although they are coy about what hardware underpins the lowest cost and slowest tier, which – alongside the marketed performance characteristics – has led to speculation that it is tape-based.
Source: 4sl