
How to build SME storage using public cloud storage

Small companies can often take advantage of public cloud storage offerings to supplement or replace on-premise infrastructure. We look at key options in file, block and object storage


The technology needs of small and medium-sized enterprises (SMEs) are less demanding than those of large enterprises, but important nonetheless. And key among these is storage.

In the past, this often meant scaled-down versions of the storage used by enterprises, but now we live in an era when cloud storage is an increasingly viable option. This is particularly relevant to SMEs, which lack the IT department resources of larger organisations.

There are a variety of ways in which SMEs can adopt cloud storage. Using the cloud shifts budget to operational expenditure and can significantly reduce the capital cost of buying and maintaining storage hardware. The question for SMEs is where those benefits can best be applied.

Public cloud works well with secondary data, like backups. Also, file storage for home directories and shared data can be a good cloud storage use case, although security needs some thought and planning.

Traditional core applications, such as databases and enterprise resource planning, are probably the most difficult to fit to cloud storage.

On-premise hyper-converged infrastructure can be a good fit, and this removes the need for storage area networks (SANs) and the associated skills. For SMEs that want to modernise their storage infrastructure, however, there are numerous ways to benefit from cloud storage offerings while combining these with on-premise capacity.

Primary and secondary storage: structured and unstructured

Typically, storage requirements divide into two main areas. Primary storage holds the production workloads that run the business.

Secondary storage is everything else: data not in current production use, such as backups and archives. The next key differentiation is between structured and unstructured data.

Structured data is held in a formal data model, such as the databases that drive ERP, transaction processing or websites. Transactional systems are typically deployed on block-based storage, such as SANs, because of the latency-sensitive nature of the data.

Unstructured data is anything that falls outside a database-type model, which covers almost everything from office documents to images and streaming video. Often, though, these files contain metadata headers that can be interrogated, which strictly makes them semi-structured data.

Unstructured data, whether primary or secondary, is often held in NAS/file access or object storage.

This means we will see requirements for block, file and object protocols among SMEs. And it’s not surprising that the public cloud suppliers have aligned themselves to these requirements to offer each protocol in their storage portfolios. 

Block storage and the cloud

In the public cloud, block storage is usually only accessible by local virtual compute instances. There are two main reasons for this.

First, virtual instances need block storage for boot and local data drives. These are generally implemented in virtual environments within the hypervisor that runs the virtual instances.

The second issue is one of performance. Block-based storage and the applications that use it are latency sensitive, specifically to the response time of each individual input/output (I/O) operation, and every round trip across a wide-area network adds milliseconds to that response time.

Meanwhile, on-premise shared storage – such as in a SAN – can offer sub-10 millisecond response times from hybrid arrays, with sub-millisecond typical for all-flash systems.

Should an SME want to use block storage and the public cloud, how can it be done? One solution is to use a storage gateway. These are hardware and software appliances that live in the on-premise datacentre and present block storage locally over protocols such as iSCSI.

Data is periodically archived to public cloud to offer a form of data protection, or the ability to burst or scale on-premise capacity to the cloud. Solutions exist from Microsoft (StorSimple) and Amazon (Storage Gateway).
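As an illustrative sketch only, the snippet below shows how such a gateway might be driven programmatically, using boto3 (the AWS SDK for Python) to provision a cached iSCSI volume on an existing AWS Storage Gateway appliance. The gateway ARN, IP address and volume size are placeholder assumptions.

```python
# Illustrative only: provision a cached iSCSI volume on a storage gateway.
import uuid

import boto3

sgw = boto3.client("storagegateway", region_name="eu-west-1")

resp = sgw.create_cached_iscsi_volume(
    # Placeholder ARN of an already-activated gateway appliance.
    GatewayARN="arn:aws:storagegateway:eu-west-1:123456789012:gateway/sgw-EXAMPLE",
    VolumeSizeInBytes=500 * 1024**3,           # 500 GiB presented locally
    TargetName="sme-block-volume",             # forms part of the iSCSI target IQN
    NetworkInterfaceId="10.0.0.10",            # the appliance's local IP address
    ClientToken=str(uuid.uuid4()),             # idempotency token
)

# On-premise hosts attach to this iSCSI target as ordinary block storage.
print(resp["TargetARN"])
```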

Another alternative is to move applications to the public cloud and use block-based cloud storage there. This probably needs to be part of a larger strategy to make use of public cloud in general.
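For that second route, block volumes are provisioned alongside cloud compute. A minimal sketch, assuming AWS: boto3 creates an EBS volume and attaches it to an existing EC2 instance, where it appears as a local disk. The instance ID, availability zone and size are placeholders.

```python
# Illustrative only: create a block volume and attach it to a cloud instance.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

vol = ec2.create_volume(
    AvailabilityZone="eu-west-1a",             # must match the instance's zone
    Size=100,                                  # GiB
    VolumeType="gp2",                          # general-purpose SSD
)

# Wait until the volume is ready, then attach it; the instance sees it
# as a local disk with low, consistent latency.
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
ec2.attach_volume(
    VolumeId=vol["VolumeId"],
    InstanceId="i-0123456789abcdef0",          # placeholder instance ID
    Device="/dev/sdf",
)
```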

Unstructured data

Two options exist to manage unstructured data. The first is to use file-based storage in the public cloud, and the second is to use object storage.

In both instances, the protocols involved (NFS/SMB for files, HTTP for object) will work over a wide-area network, although file performance can be latency sensitive.

File storage offers similar functionality to on-premise NAS appliances. Using a cloud-based solution removes all the infrastructure management issues typically seen when deploying hardware in the datacentre. New file systems can be created and scaled dynamically, subject to the limits of the cloud provider’s offering.
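A minimal sketch of that dynamic provisioning, assuming Amazon EFS as the cloud file service: a single boto3 call creates a new NFS file system that grows on demand, with no array sizing or LUN carving. The creation token is a caller-chosen placeholder, and mount targets would still need to be added before clients can mount the file system.

```python
# Illustrative only: create an elastic NFS file system with one API call.
import boto3

efs = boto3.client("efs", region_name="eu-west-1")

fs = efs.create_file_system(
    CreationToken="sme-home-directories",      # caller-chosen idempotency token
    PerformanceMode="generalPurpose",
    Encrypted=True,                            # encrypt data at rest
)

# The file system grows and shrinks with the data stored in it; mount
# targets must still be created before clients can mount it over NFS.
print(fs["FileSystemId"], fs["LifeCycleState"])
```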


The maturity of cloud-based file services has increased over the past 12 months as suppliers such as NetApp have begun to offer their mature storage platforms as native cloud services.

A cloud-based file storage solution can offer cost savings, as well as operational benefits.

Platforms such as Nasuni offer global file access, wherever the customer is located. This makes it easier to implement disaster recovery without expensive array-based replication or backups.

Global access also reduces duplication of data, where data is copied to multiple locations for performance or operational reasons. With data single-instanced, there is also a much lower risk of accidentally using out-of-date content.

File storage security and data protection

Use of cloud-based file storage also brings new challenges. The most obvious is security.

Although data that passes across the public internet can be encrypted, this isn't an ideal solution on its own. SMEs will probably want to invest in point-to-point VPN connectivity with the cloud provider, although this adds some cost and complexity.

Data should also be encrypted at rest within the public cloud using customer-generated keys for extra safety.
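As one concrete, hedged illustration of the customer-key model, Amazon S3's SSE-C option lets the customer supply the encryption key on every request, so the provider never stores it; file services expose customer keys differently from one provider to the next. The bucket and object names below are placeholders, and the key must be kept safe by the customer.

```python
# Illustrative only: encrypt at rest with a customer-supplied key (S3 SSE-C).
import os
from pathlib import Path

import boto3

s3 = boto3.client("s3")

# A 256-bit key generated and held by the customer; S3 never stores it,
# so it must be kept safe - losing the key means losing the data.
customer_key = os.urandom(32)

s3.put_object(
    Bucket="sme-files",                        # placeholder bucket
    Key="finance/budget-2019.xlsx",
    Body=Path("budget-2019.xlsx").read_bytes(),
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,               # boto3 encodes the key and adds its MD5
)

# Reads must present the same key, or S3 rejects the request.
obj = s3.get_object(
    Bucket="sme-files",
    Key="finance/budget-2019.xlsx",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,
)
```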

Cloud-based file services still need the same level of operational management as on-premise systems. Security is one obvious area, which includes in-flight and at-rest encryption.

Credentials management is another area of focus, and some suppliers offer integration with Microsoft Active Directory and Lightweight Directory Access Protocol (LDAP).

Finally, remember that data in public cloud isn’t backed up by default. Public cloud service providers will commit to service level agreements on uptime, but any backup will be in place only to bring the service back online.

The cloud provider will not recover accidentally or maliciously deleted data, so you should also look at cloud-to-cloud backup.

Object storage

As a store for unstructured data, object storage represents a great way to hold large volumes of data cost-effectively.

Objects are simply files that can range from a few kilobytes to many gigabytes in size and are usually stored in large logical containers, such as buckets in AWS S3.

Object storage uses HTTP as its underlying protocol, with requests issued through REST-based application programming interfaces (APIs). As a result, each object store request is effectively an independent event, so features such as file locking are not offered.
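A minimal sketch of that request model, assuming Amazon S3 and boto3: the PUT and the GET below are two self-contained HTTPS requests, with no file handle or lock held between them. Bucket and object names are illustrative.

```python
# Illustrative only: each object operation is an independent HTTPS request.
from pathlib import Path

import boto3

s3 = boto3.client("s3")

# Write: the whole object is sent in one PUT request.
s3.put_object(
    Bucket="sme-archive",                      # placeholder bucket
    Key="docs/handbook.pdf",
    Body=Path("handbook.pdf").read_bytes(),
)

# Read: a separate, self-contained GET request; no lock or handle is
# held between the two calls.
obj = s3.get_object(Bucket="sme-archive", Key="docs/handbook.pdf")
print(obj["ContentLength"], obj["LastModified"])
```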

Object storage is good for streaming-type access or large-scale processing of large numbers of files (in analytics, for example).

SMEs could use object stores for content that rarely changes – document repositories, video and audio media training libraries – or where the entire object is replaced each time it is refreshed or changed. 

Cost efficiency

Cloud service providers offer features to optimise the placement of data based on pre-defined policies. The customer can, for example, put in place a process to move less frequently accessed content to cold storage such as AWS Glacier.

The cost savings can be significant, although there are restrictions on accessing cold data, and retrievals from archive tiers can take hours. Backups and archives are particularly good candidates for object stores with policy-based tiering.
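As a hedged example of such a policy, again assuming an S3 bucket: the lifecycle rule below moves objects under a backups/ prefix to Glacier after 30 days and expires them after a year. The bucket name and timings are illustrative only.

```python
# Illustrative only: tier objects to cold storage by lifecycle policy.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="sme-archive",                      # placeholder bucket
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-backups-to-glacier",
            "Status": "Enabled",
            "Filter": {"Prefix": "backups/"},  # only objects under backups/
            "Transitions": [
                {"Days": 30, "StorageClass": "GLACIER"},  # cold after 30 days
            ],
            "Expiration": {"Days": 365},       # delete after a year
        }]
    },
)
```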

One word of warning, though, when looking at object stores: data volumes can increase significantly, for two main reasons.

First, if previous versions of files need to be kept, each version is charged at full price, although older versions can be tiered by policy to cheaper storage.

Second, cloud providers pass on none of the benefits of internal storage features such as data deduplication. In the worst case, two versions of a 10GB file that differ by only a single byte would incur a 20GB charge.
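A sketch of how the versioning cost can be contained, assuming S3: versioning is switched on, so overwrites keep the old copy, and a lifecycle rule pushes non-current versions to cheaper storage rather than paying full price indefinitely. Names and timings are placeholders.

```python
# Illustrative only: keep old versions, but tier them to cheaper storage.
import boto3

s3 = boto3.client("s3")

# Every overwrite now keeps the previous version, each billed at full price.
s3.put_bucket_versioning(
    Bucket="sme-files",                        # placeholder bucket
    VersioningConfiguration={"Status": "Enabled"},
)

# Push non-current versions to cold storage a week after they are replaced.
s3.put_bucket_lifecycle_configuration(
    Bucket="sme-files",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-old-versions",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},          # whole bucket
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 7, "StorageClass": "GLACIER"},
            ],
        }]
    },
)
```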

Data protection

A great use of public cloud storage is for data protection.

Object stores, as we’ve discussed, offer low-cost, long-term storage with capacity that is effectively unlimited. That can work well as a backup target. Object protocols are also well-suited to the streaming nature of backup data.

With data in a central location, recovery can be performed from multiple offices, with cloud providers offering the ability to replicate data between their datacentres and geographic locations.

Use of object storage for backups will not, however, be able to take advantage of native data deduplication. The backup software itself needs to provide deduplication, otherwise keeping many similar backups could become very expensive.

Marketplace

Finally, we shouldn’t forget that many storage suppliers offer cloud-based versions of their existing hardware and software solutions.

Rather than refreshing to new hardware, this offers an opportunity to move to an operational expenditure model and reduce the on-premise hardware footprint.
