Hybrid cloud file and object pushes the frontiers of storage
A single environment that spans on-premise and public cloud deployments is possible with a new class of product: file systems and object stores built for hybrid cloud operations
Public cloud services have been widely adopted by IT departments around the world. But it has become clear that hybrid solutions, which span on- and off-premise deployment, are often superior, and they appear to be on the rise.
However, getting data in and out of the public cloud can be tricky from a performance and consistency point of view. So, could a new wave of distributed file systems and object stores hold the answer?
Hybrid cloud operations require the ability to move data between private and public datacentres. Without data mobility, public and private cloud are nothing more than two separate environments that can’t exploit the benefits of data and application portability.
Looking at the storage that underpins public and private cloud, there are potentially three options available.
Block storage, traditionally used for high-performance input/output (I/O), doesn’t offer practical mobility features. The technology is great on-premise, or across locations operated by the same organisation.
That’s because block access storage depends on the use of a file system above the block level to organise data and provide functionality. For example, snapshots and replication depend on the maintenance of strict consistency between data instances.
Meanwhile, object storage provides high scalability and ubiquitous access, but can lag in terms of data integrity and performance capabilities required by modern applications.
Last writer wins
There’s also no concept of object locking – it’s simply a case of last writer wins. This is great for relatively static content, but not practical for database applications or analytics that need to do partial content reads and updates.
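To make that concrete, here is a minimal sketch of last-writer-wins behaviour using the S3 API via Python's boto3 library. The bucket and key names are hypothetical, and it assumes credentials are already configured; the same behaviour applies to any S3-compatible store:

```python
"""Minimal sketch of S3's last-writer-wins behaviour.

Assumes an S3-compatible endpoint and configured credentials;
the bucket name 'demo-bucket' is hypothetical.
"""
import boto3

s3 = boto3.client("s3")  # add endpoint_url=... for an on-premise S3-compatible store

# Two writers update the same key with no locking or coordination.
s3.put_object(Bucket="demo-bucket", Key="report.csv", Body=b"writer A's version")
s3.put_object(Bucket="demo-bucket", Key="report.csv", Body=b"writer B's version")

# Whichever PUT the store processed last wins; the other is silently lost.
print(s3.get_object(Bucket="demo-bucket", Key="report.csv")["Body"].read())
# -> b"writer B's version"
```

Neither writer receives an error – the overwrite is silent – which is why read-modify-write workloads sit so uneasily on object storage.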
Even so, object storage is a method of choice for some distributed hybrid cloud storage environments. It can provide a single object/file environment across locations, with S3 almost a de facto standard for access between sites.
File storage sits between the two extremes. It offers high scalability, data integrity and security, and file systems have locking that protects against concurrent updates, either locally or globally, depending on how lock management is implemented. Often, file system security permissions integrate with existing credential management systems, such as Active Directory.
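As a rough illustration of what locking buys you, the sketch below uses POSIX advisory locks (Python's fcntl module) to serialise concurrent appends to a shared file. On a distributed file system, the lock manager extends the same guarantee across nodes, depending on how locking is implemented:

```python
"""Sketch of file locking serialising concurrent updates.

Uses POSIX advisory locks (Linux/Unix only); the path and record
contents are illustrative.
"""
import fcntl

def append_record(path: str, record: str) -> None:
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)    # block until no other writer holds the lock
        try:
            f.write(record + "\n")       # the update happens under the exclusive lock
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

append_record("/tmp/ledger.log", "order-42: filled")
```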
File systems, like object storage, implement a single global name space that abstracts the underlying hardware and provides consistent access to content, wherever it is located. Some object storage-based systems also provide file access via the network file system (NFS) and server message block (SMB) protocols.
In some ways, what we’re looking at here is a development of the parallel file system, or its key functionality, for hybrid cloud operations.
Distributed and parallel file systems have been on the market for years. Dell EMC is a market leader with its Isilon hardware platform, DDN offers a hardware solution called GRIDScaler, and there is a range of software solutions such as Lustre, Ceph and IBM’s Spectrum Scale (GPFS).
But these are not built for hybrid cloud operations. So, what do new solutions offer over the traditional suppliers?
Distributed file systems 2.0
The new wave of distributed file systems and object stores are built to operate in hybrid cloud environments. In other words, they are designed to work across private and public environments.
Key to this is native support for public cloud: the capability to deploy a scale-out file/object cluster in the public cloud and to span on- and off-premise operations as a single hybrid solution.
Native support for public cloud means much more than simply running a software instance in a cloud VM. Solutions need to be deployable with automation, understand the performance characteristics of storage in cloud instances and be lightweight and efficient to reduce costs as much as possible.
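What “deployable with automation” can look like in practice is sketched below: programmatically launching cloud instances that bootstrap themselves into a storage cluster. This is a generic illustration using boto3; the AMI ID, instance type, cluster name and install script are placeholders, not any particular supplier’s tooling:

```python
"""Generic sketch of automated cluster deployment in public cloud.

All identifiers (AMI, instance type, install URL, cluster name)
are hypothetical placeholders.
"""
import boto3

USER_DATA = """#!/bin/bash
# Hypothetical bootstrap: install the storage software and join the cluster.
curl -sSL https://example.com/install.sh | bash -s -- --join-cluster fs-cluster-1
"""

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="i3.2xlarge",         # an instance class with local NVMe SSD
    MinCount=6, MaxCount=6,            # a six-node scale-out cluster
    UserData=USER_DATA,
)
```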
New distributed file systems in particular are designed to cover applications that require very low latency to operate efficiently. These include traditional databases, high-performance analytics, financial trading and general high-performance computing applications, such as life sciences and media/entertainment.
By providing data mobility, these new distributed file systems allow end users and IT organisations to take advantage of cheap compute in public cloud, while maintaining data consistency across geographic boundaries.
Supplier roundup
WekaIO was founded in 2013 and has spent almost five years developing a scale-out parallel file system solution called Matrix. Matrix is a POSIX-compliant file system that was specifically designed for NVMe storage.
As a scale-out storage offering, Matrix runs across a cluster of commodity storage servers, or it can be deployed in the public cloud on standard compute instances using local SSD block storage. WekaIO also claims hybrid operations are possible, with the ability to tier to public cloud services. The company publishes latency figures as low as 200µs and I/O throughput of 20,000 to 50,000 IOPS per CPU core.
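A quick back-of-envelope calculation shows what those figures imply: at 200µs per operation, a single synchronous I/O stream cannot exceed 5,000 IOPS, so reaching 20,000 to 50,000 IOPS per core means keeping several requests in flight at once:

```python
# Back-of-envelope arithmetic behind the quoted figures.
latency_s = 200e-6                     # 200µs per I/O

serial_iops = 1 / latency_s            # one synchronous request at a time
print(f"serial ceiling: {serial_iops:,.0f} IOPS")        # 5,000

for target in (20_000, 50_000):
    # Little's law: concurrency = throughput x latency
    in_flight = target * latency_s
    print(f"{target:,} IOPS needs ~{in_flight:.0f} requests in flight")
# -> roughly 4 and 10 concurrent requests respectively
```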
Elastifile was founded in 2014 and has a team with a range of successful storage product developments behind it, including XtremIO and XIV. The Elastifile Cloud File System (ECFS) is a software solution built to scale across thousands of compute nodes, offering file, block and object storage.
ECFS is designed to support heterogeneous environments, including public and private cloud environments under a single global name space. Today, this is achieved using a feature called CloudConnect, which bridges the gap between on-premise and cloud deployments.
Qumulo was founded in 2012 by a team that previously worked on developing the Isilon scale-out NAS platform. The Qumulo File Fabric (QF2) is a scale-out software solution that can be deployed on commodity hardware or in the public cloud.
Cross-platform capabilities are provided through the ability to replicate file shares between physical locations using a feature called Continuous Replication. Although primarily a software solution, QF2 is available as an appliance with a throughput of 4GBps per node (minimum four nodes), though no latency figures are quoted.
Object storage maker Cloudian announced an upgrade to its HyperStore product in January 2018 that brings true hybrid cloud operations across Microsoft, Amazon and Google cloud environments, with data portability between them. HyperStore is based on the Apache Cassandra open source distributed database.
It can come as storage software that customers deploy on commodity hardware, in cloud software format or as a hardware appliance. HyperFile file access – which is POSIX/Windows compliant – can also be deployed on-premise and in the cloud.
Multi-cloud data controller
Another object storage specialist, Scality, will release a commercially supported version of its “multi-cloud data controller”, Zenko, at the end of March. The product promises customers hybrid cloud functionality: the ability to move, replicate, tier, migrate and search data across on-premise and private cloud locations and the public cloud, although it’s not yet clear how seamless those operations will be.
Zenko builds on Scality’s S3 Server, launched in 2016, which provided S3 access to Scality RING object storage. The key concept behind Zenko is to allow customers to mix and match Scality on-site storage with storage from different cloud providers, initially Amazon Web Services, Google Cloud Platform and Microsoft Azure.
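The idea can be pictured as one S3-compatible API sitting in front of several backing stores. The sketch below is illustrative only – the endpoint URL and bucket names are hypothetical – but it captures the model of directing objects to different backends while the application code stays the same:

```python
"""Illustrative sketch of the multi-cloud controller model: one
S3-compatible endpoint, several backing stores. The endpoint URL
and bucket names are hypothetical placeholders.
"""
import boto3

# A single S3-compatible client pointed at the controller's endpoint.
zenko = boto3.client("s3", endpoint_url="https://zenko.example.internal")

data = b"same object, same API call"
# Each hypothetical bucket is assumed to map to a different backend
# (on-premise RING, AWS, GCP); the application code does not change.
for bucket in ("onprem-ring-bucket", "aws-backed-bucket", "gcp-backed-bucket"):
    zenko.put_object(Bucket=bucket, Key="asset.bin", Body=data)
```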