Software-defined storage: Startups bring innovation to SDS

We survey the startup suppliers pioneering new approaches to software-defined storage, including block, file and object, and VM and container-oriented approaches

This article can also be found in the Premium Editorial Download: Computer Weekly: Visa reveals causes of payment chaos

Datacentre infrastructure has standardised on the Intel x86 architecture and ushered in a transition to commodity hardware for storage.

That has allowed a whole ecosystem of storage software to develop that enables IT organisations to build to their requirements without being reliant on supplier hardware choices.

In this article we take a look at where software-defined storage has reached and some of the more interesting product developments.

Software-defined storage 101

Let’s start with a quick recap on what software-defined storage actually is. Although there’s no official definition, generally, software-defined storage meets the following criteria:

  • Commodity hardware: Solutions are built from commodity components, such as standard servers, controllers and drives.
  • Hardware abstraction: Refreshing or replacing the hardware should have no impact on the delivery of services. For example, performance should be delivered by Quality of Service, not the speed of the storage media.
  • Software-based features: Data services like compression and data deduplication are implemented in software, rather than bespoke hardware components.
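
The hardware abstraction point can be made concrete with a small sketch of a token-bucket limiter, one common way to enforce a Quality of Service ceiling in software. This is illustrative only and not taken from any product covered here; the class name `TokenBucket` and its parameters `iops_limit` and `burst` are assumptions made for the example.

```python
class TokenBucket:
    """Hypothetical QoS limiter: caps a volume at a fixed I/O rate,
    independent of how fast the underlying media is."""

    def __init__(self, iops_limit, burst):
        self.rate = float(iops_limit)   # tokens (I/Os) added per second
        self.capacity = float(burst)    # most tokens that can accumulate
        self.tokens = float(burst)      # start with a full bucket
        self.last = 0.0                 # time of the last refill, in seconds

    def allow(self, now):
        """Return True if one I/O may proceed at time `now` (seconds)."""
        elapsed = now - self.last
        # Refill in proportion to elapsed time, never beyond the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admitted(bucket, arrival_times):
    """Count how many of the requested I/Os the limiter lets through."""
    return sum(1 for t in arrival_times if bucket.allow(t))
```

Because the limit lives in software, the same policy would apply whether the volume sits on flash or on spinning disk, which is the sense in which service levels are decoupled from the media.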

Although suppliers class their products as software-defined, the actual delivery model may be software-only or bound to an appliance. Many vendors choose to offer hardware, as it reduces the work needed by the customer in selecting and testing components that will work together.

The software-defined model can work out cheaper and be more flexible than deploying traditional storage hardware. IT organisations gain more freedom to innovate in the design of solutions, or can use the features of software-defined storage on existing hyper-converged or public cloud platforms.

Innovations

Since we last discussed software-defined storage, the marketplace has continued to evolve. We now see products that address the three main workload types in the datacentre – block, file and object.

But, as the idea of hybrid cloud has evolved, many solutions can now be deployed as readily in the public cloud as in the datacentre.

The idea that public cloud needs to be supplemented with software-defined storage solutions, when storage offerings already exist, may seem somewhat at odds with the idea of cloud services.

But, while cloud service providers offer solutions that fit most use cases, enterprises may expect more. In addition, cloud storage operates in a relatively isolated fashion, so efficient integration between public and private clouds doesn’t really exist – unless a software-defined storage layer is added over the top.

Building for developers

Many new software-defined storage solutions are built with a focus on developers and open source. Here the requirements are based around ease of use and agility. Performance is less of a consideration, but the ability to integrate with orchestration platforms like Kubernetes is critical.

The interesting development here is that storage is rapidly becoming another service to consume anywhere, regardless of the platform, instead of being a major structural component. Storage is becoming a shopping list item, just like virtual machines or containers.

Take your pick

One issue that will continue to be a problem for IT organisations is how to validate the practicality of solutions for their needs. With lots of products to choose from, picking the right one could be hard.

However, this is where the benefit of software-defined storage, and in particular open source, comes in.

With most of the solutions we will discuss in the supplier roundup, installation is pretty simple and can be done on a few VMs or cloud instances. This makes it incredibly easy to try before you buy.

This is an area that software-defined storage suppliers are likely to address by making their products readily downloadable and installable.

Software-defined storage supplier roundup

In our product roundup, we focus on smaller, newer startups that are bringing software-defined storage products to market. There’s a mix of block, file and object, with solutions that focus on one of these use cases, rather than trying to meet them all.

StorPool

StorPool is a startup focused on block storage. The company was founded in Bulgaria in 2011, although it now has a presence in the UK and US. The StorPool solution is a distributed scale-out block storage platform using commodity components that runs on Linux.

Nodes can be either storage providers or consumers, enabling deployment either as a dedicated storage platform or as hyper-converged. Non-Linux platforms are supported through exporting iSCSI volumes to, for example, Windows and VMware vSphere.

Typically, the company has aimed at managed service providers and solution providers looking to develop solutions for their customers. StorPool is perhaps unique in that the company installs and directly supports the platform. This includes upgrading and patching software.

MooseFS

MooseFS is another eastern European storage startup, this time focusing on a distributed file system. The platform is provided either as an open-source release (under GPLv2), or as an enterprise product (MooseFS Pro) that includes enterprise-level features supporting high availability.

In common with most software-defined storage solutions, MooseFS is deployed on Linux, with servers taking on the roles of centralised management, metadata backup and storage capacity (or a single server taking on all of them). Client support includes Linux, Windows and macOS. MooseFS can also run on ARM, using Raspberry Pi 3 hardware and Raspbian Jessie.

MooseFS is fully featured with tiered storage support, erasure coding, POSIX compliance and deployment either as purely storage or in a hyper-converged mode. The software can easily be deployed into a VM, instance or on dedicated hardware.
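
Erasure coding comes up repeatedly in this roundup, so a minimal sketch of the underlying idea may help. The example uses single XOR parity, the simplest possible scheme, and is not the algorithm used by MooseFS or any other product mentioned here; the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index from all surviving blocks.

    Works for any single lost block, data or parity, because XOR-ing
    every other block in the stripe reproduces the missing one.
    """
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data blocks spread across three nodes, plus parity on a fourth.
stripe = encode([b"soft", b"ware", b"defi"])
rebuilt = recover(stripe, 1)  # pretend the second node has failed
```

Production schemes tolerate more than one simultaneous failure, but the trade-off is the same: a modest amount of extra capacity buys protection without keeping full copies of the data.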

Minio

Minio is an object storage platform that is built from containers. The company was founded in 2014 and is the main developer behind Minio software. It raised $20 million towards the end of 2017.

Minio can be run either natively in Docker, with Kubernetes, or in a range of public cloud environments, including Microsoft Azure and Google Cloud Platform.

A Minio deployment uses persistent storage provided through the Docker volume plugin. That storage can come from traditional platforms from vendors such as NetApp, or simply from local drives on a virtual machine or cloud instance. A Minio cluster is easy enough to deploy that it is highly practical to automate and build on demand for DevOps environments.

OpenIO

OpenIO is a storage startup based just outside Lille, France. The company has developed an open-source object storage solution called SDS. In common with most of the solutions we’re discussing, SDS is a scale-out, node-based architecture that runs on Linux and is deployable on x86 or ARM architectures. SDS can be deployed in mixed configurations, including on virtual and physical hardware.

SDS extends the classic model of an object store by providing a serverless framework that interoperates with content stored in the object store itself. As content is added to SDS, triggers can call actions that perform a range of tasks, such as validating, indexing or transcoding content.
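
The trigger pattern described here can be sketched in a few lines. This is a toy in-memory model written for illustration, not OpenIO's API; `TriggeringStore`, `on_put` and the two handlers are hypothetical names.

```python
from collections import defaultdict


class TriggeringStore:
    """Toy object store that runs registered actions on every write."""

    def __init__(self):
        self.objects = {}                   # key -> object bytes
        self.metadata = defaultdict(dict)   # key -> facts derived by triggers
        self.triggers = []                  # callables run after each put

    def on_put(self, func):
        """Register func(store, key, data) to run whenever an object is added."""
        self.triggers.append(func)
        return func

    def put(self, key, data):
        self.objects[key] = data
        for trigger in self.triggers:
            trigger(self, key, data)


store = TriggeringStore()


@store.on_put
def index_size(store, key, data):
    # Indexing step: record the object's size alongside it.
    store.metadata[key]["size"] = len(data)


@store.on_put
def validate_not_empty(store, key, data):
    # Validation step: flag zero-length uploads.
    store.metadata[key]["valid"] = len(data) > 0


store.put("report.txt", b"hello")
```

The point of the design is that processing runs next to the data as it arrives, rather than in a separate system that has to poll the store for new content.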

The OpenIO open source model provides access to the software with basic features that can be extended with web GUI support, erasure coding and additional deployment tools under a paid plan.

Rozo Systems

Rozo Systems is a startup, originally founded in Nantes, France, in 2010. RozoFS software is a POSIX-compliant scale-out file system that uses a patented erasure coding technique called the Mojette Transform, which was developed at the University of Nantes.

RozoFS is deployed on Linux, either on dedicated hardware, as a VM or as an instance in public cloud. Performance and capacity are scalable from four to 1,024 nodes, with support for up to 1,024 data volumes. The solution is aimed at high-performance and parallel workloads, such as analytics or content distribution.
