Cloud storage 101: Specifying for cloud storage
We run through key questions to ask when specifying cloud storage, such as disk type, performance, availability and the possible hidden cost of getting data out of the cloud
Public cloud has changed the way many enterprises think about delivering their application infrastructure. Software-as-a-service (SaaS) platforms, such as Microsoft Office 365, have become the norm, allowing businesses to deploy the infrastructure they need quickly and efficiently.
But other areas of business infrastructure are also well suited to the cloud. A key one is storage. Cloud storage offers ease of scale, effectively limitless capacity and flexible cost models that are difficult to achieve on-premise.
But how do we decide whether cloud storage is right for our workloads? Here, we provide a few pointers on what to consider to ensure public cloud storage is right for your organisation.
Cloud storage is a big subject, so we will limit the scope of this article to presenting dedicated disks to servers and applications, as we would in a traditional on-premise environment.
So, where do we need to start when it comes to designing our cloud storage infrastructure?
It’s not just click-and-go
Just because it’s cloud, we can’t assume that clicking “enterprise storage” on a provider’s marketplace means “job done”. It’s important we treat the design of cloud-based storage with the same care and attention we would if it were on-premise.
Our first step must be to understand the demands of our enterprise applications so we can profile the dataset and understand its requirements before trying to choose from the plethora of options available.
To be more precise, during this process it’s important to define the performance, capacity and availability profile of our data.
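As a rough illustration of what such a profile might contain, the Python sketch below condenses a set of IOPS measurements into peak and 95th-percentile figures, alongside capacity, latency and resilience needs. The class, its field names and the sample values are our own assumptions for illustration, not part of any provider's tooling.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical summary of what an application needs from storage."""
    name: str
    capacity_tb: float      # usable capacity required
    peak_iops: int          # highest IOPS observed
    p95_iops: int           # 95th-percentile IOPS, one possible sizing target
    max_latency_ms: float   # slowest response the application tolerates
    resilience: str         # e.g. "local", "zone" or "geo"


def profile_from_samples(name, iops_samples, capacity_tb, max_latency_ms, resilience):
    """Build a profile from IOPS measurements taken on the existing system."""
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    # method="inclusive" keeps the result within the observed range.
    p95 = quantiles(iops_samples, n=20, method="inclusive")[-1]
    return WorkloadProfile(
        name=name,
        capacity_tb=capacity_tb,
        peak_iops=max(iops_samples),
        p95_iops=round(p95),
        max_latency_ms=max_latency_ms,
        resilience=resilience,
    )


# Illustrative samples only, not real measurements
samples = [800, 950, 1200, 1100, 4000, 900, 1000, 1300, 1050, 980]
profile = profile_from_samples("erp-db", samples, capacity_tb=10,
                               max_latency_ms=2.0, resilience="zone")
print(profile)
```

Recording both the peak and a percentile helps to show whether a single spike or sustained demand is driving the requirement, which matters when performance is priced per tier.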
Key cloud choices
• Media types: flash storage and spinning disk
Flash disks are the de facto standard in public cloud. The big providers offer most of their disk-based solutions on solid-state drives (SSD). They are usually tiered, so choosing the right tier of SSD is important.
If we look at Microsoft, its flash tiers range from Ultra SSD to Standard, with choices possible in capacity, input/output operations per second (IOPS) and latency terms.
If we don’t want to spend on SSD, there is still the option of standard hard disks. They are often presented as not providing “production” levels of performance, but can certainly be cost-effective for test and development environments.
• Performance
Performance is at the heart of cloud storage choices and there are two key criteria to consider: input/output (I/O) performance in IOPS and latency.
Storage tiers offered all have their own I/O and latency profiles.
For example, Microsoft Ultra SSD provides up to 160,000 IOPS at sub-millisecond latency, which should meet the requirements of even the most demanding applications. It comes at a price, however, so understanding workload and budgetary constraints is key to picking the right storage performance tier.
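One way to reason about tier choice is to filter a catalogue by the workload's IOPS and latency needs and then take the cheapest match. The sketch below assumes a made-up catalogue: only the Ultra SSD figure of 160,000 IOPS at sub-millisecond latency and the 500 IOPS quoted for Standard SSD come from this article, and every other number, including the relative costs, is a placeholder to be replaced with the provider's published figures.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    max_iops: int
    typical_latency_ms: float
    relative_cost: float  # placeholder weighting, not a real price


# Hypothetical catalogue for illustration only
TIERS = [
    Tier("hdd", 500, 10.0, 1.0),
    Tier("standard-ssd", 500, 5.0, 2.0),
    Tier("premium-ssd", 20_000, 2.0, 5.0),
    Tier("ultra-ssd", 160_000, 0.9, 12.0),
]


def cheapest_tier(tiers: Sequence[Tier], required_iops: int,
                  max_latency_ms: float) -> Optional[Tier]:
    """Return the lowest-cost tier meeting both requirements, or None."""
    candidates = [
        t for t in tiers
        if t.max_iops >= required_iops and t.typical_latency_ms <= max_latency_ms
    ]
    return min(candidates, key=lambda t: t.relative_cost, default=None)


choice = cheapest_tier(TIERS, required_iops=5_000, max_latency_ms=2.0)
print(choice.name if choice else "no single tier meets the requirement")
```

A result of None is a useful signal in itself: it suggests the workload may need to be split across disks, redesigned, or kept where it is.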
• Availability
When it comes to availability, cloud can deliver real value.
It is possible to replicate across datacentres in-house, but this can quickly become costly and complex. Public cloud allows businesses to take advantage of the provider’s geographic locations via the use of “resilience zones”.
Again, using Microsoft by way of example, Azure offers three levels of redundancy:
- Local redundancy provides resilience within a single datacentre.
- Zone redundancy extends this with multiple copies of data held in separate datacentres (availability zones) within a single region, such as UK South.
- Geographic redundancy replicates data to a second region hundreds of miles away, from UK South to UK West, for example.
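The three levels above can be treated as an ordered scale, so that the failures we need to survive determine the minimum level to specify. The sketch below is a simplification under our own assumptions: the failure names and the mapping are illustrative, not a provider's definitions.

```python
from enum import IntEnum


class Redundancy(IntEnum):
    LOCAL = 1  # copies kept within one datacentre
    ZONE = 2   # copies spread across datacentres in one region
    GEO = 3    # copies replicated to a separate, distant location


# Assumed mapping from failure scenario to the lowest level that survives it
SURVIVES = {
    "hardware_fault": Redundancy.LOCAL,
    "datacentre_outage": Redundancy.ZONE,
    "regional_outage": Redundancy.GEO,
}


def minimum_redundancy(failures_to_survive):
    """Return the lowest redundancy level covering every listed failure."""
    levels = (SURVIVES[failure] for failure in failures_to_survive)
    return max(levels, default=Redundancy.LOCAL)


needed = minimum_redundancy(["hardware_fault", "datacentre_outage"])
print(needed.name)
```

Working from failure scenarios rather than product names keeps the decision tied to the business requirement, and makes it easier to compare equivalent options across providers.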
• Data protection
Even once we have specified levels of availability to cope with the loss of infrastructure components, protecting data in the cloud is still our organisation’s responsibility.
Cloud providers offer some native tools to protect workloads in the form of snapshots or more traditional backup in the cloud platform.
But it’s important to remember that data protection has an impact: it consumes capacity and therefore adds cost. It may also be the case that native backup is limited, for example with workloads only protected inside the cloud provider’s infrastructure, which may not meet the demands of our organisation’s data protection policies.
To address these kinds of shortcomings, there is a rapidly growing ecosystem of third-party cloud-to-cloud backup providers that can extend protection of business data beyond standard cloud provision.
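To get a feel for how much capacity protection might consume, the sketch below estimates the extra space taken by incremental snapshots. It assumes each daily snapshot stores only changed data and that the change rate is constant. Both are simplifications, and the 2% change rate and 30-day retention are illustrative values rather than recommendations.

```python
def snapshot_capacity_tb(primary_tb, daily_change_rate, retention_days):
    """Estimate extra capacity used by daily incremental snapshots.

    Assumes every snapshot holds only that day's changed data and that
    the same fraction of the dataset changes each day.
    """
    return primary_tb * daily_change_rate * retention_days


primary_tb = 10
extra_tb = snapshot_capacity_tb(primary_tb, daily_change_rate=0.02, retention_days=30)
print(f"Primary: {primary_tb}TB, snapshots: about {extra_tb:.1f}TB extra")
```

On these assumptions, 30 days of snapshots add roughly 6TB to a 10TB dataset, which is capacity that needs to be budgeted for alongside the primary disks.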
• Data egress
When moving data out of a cloud datacentre, most cloud providers will apply an egress charge.
Costs will vary depending on where the data is moving to, but it’s important to know these charges exist so businesses can plan to get their data out, should the need arise.
This plan has to go beyond the costs of egress to embrace the potential complexity of moving data around.
Natively this can be difficult as there is no standard for data, virtual machine or application formats that can guarantee a transformation-free movement between destinations.
It is also important to remember that data has “gravity”. In other words, it takes time to move it, and moving many terabytes can potentially take many days.
There is, as with data protection, a growing partner ecosystem that provides tools aimed at easing migration complexity. But even with these, the cost and time involved in moving from one cloud to another are likely to remain, so they should be factored into planning if such a move is a realistic possibility.
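Both the charge and the time involved can be estimated with simple arithmetic before committing to a provider. The sketch below assumes a flat per-gigabyte egress rate and a link that sustains 80% of its rated speed. The rate of $0.05 per GB, the 1Gbps link and the efficiency factor are placeholders, and real pricing is usually tiered and varies by destination.

```python
def egress_cost(data_tb, rate_per_gb):
    """Egress charge for moving data_tb terabytes at a flat per-GB rate."""
    return data_tb * 1000 * rate_per_gb  # decimal units: 1TB = 1,000GB


def transfer_days(data_tb, link_mbps, efficiency=0.8):
    """Days needed to move data_tb terabytes over a link of link_mbps.

    efficiency is the assumed fraction of rated bandwidth actually achieved.
    """
    bits = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 86_400  # 86,400 seconds in a day


data_tb = 50
print(f"Egress cost: ${egress_cost(data_tb, rate_per_gb=0.05):,.2f}")
print(f"Transfer time: {transfer_days(data_tb, link_mbps=1000):.1f} days")
```

With these assumed inputs, moving 50TB takes a little under six days of continuous transfer, before allowing for any format conversion or validation at the destination.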
• Cost
All the options we have discussed have a wide-ranging impact on cost and performance, and understanding their impact is critical when designing and specifying cloud storage.
As an example, 10TB of Ultra SSD from Azure with 10,000 IOPS will cost $4,500 per month. Meanwhile, 10TB of Standard SSD will cost $1,400 per month but only offers 500 IOPS. That’s quite a difference. The cheaper option may be attractive, but may prove very costly if it doesn’t meet the requirements of your workload.
It is important to spend time and take advice to ensure you fully understand the cost impact of your decisions.
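A simple way to compare options like these is to look at cost per terabyte and cost per IOPS side by side. The sketch below uses only the two example figures quoted above. The function and variable names are our own, and current prices should always be checked with the provider.

```python
def cost_per_tb(monthly_cost, capacity_tb):
    """Monthly cost divided by capacity."""
    return monthly_cost / capacity_tb


def cost_per_iops(monthly_cost, iops):
    """Monthly cost divided by the IOPS on offer."""
    return monthly_cost / iops


# Figures taken from the example in this article
quotes = {
    "Ultra SSD": {"monthly_cost": 4500, "capacity_tb": 10, "iops": 10_000},
    "Standard SSD": {"monthly_cost": 1400, "capacity_tb": 10, "iops": 500},
}

for name, q in quotes.items():
    per_tb = cost_per_tb(q["monthly_cost"], q["capacity_tb"])
    per_iops = cost_per_iops(q["monthly_cost"], q["iops"])
    print(f"{name}: ${per_tb:,.0f} per TB, ${per_iops:.2f} per IOPS")
```

On these figures, Standard SSD is cheaper per terabyte ($140 against $450) but far more expensive per IOPS ($2.80 against $0.45), so which option offers better value depends on whether the workload is bound by capacity or by performance.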
Essential questions
Cloud storage is hugely flexible and most of our workloads could be candidates for cloud migration. But that doesn’t mean we should put them there.
The choices are numerous and we must treat our cloud design with the same care and consideration we would on-premise.
It’s easy to assume all cloud services are like a SaaS platform, but they are not. We need to understand our requirements and the impact they have on our cloud storage decisions.
Before making a cloud storage decision, be sure to answer the following questions:
- What are our workload performance characteristics?
- What type of cloud media is appropriate to meet those performance demands?
- What levels of resilience are needed?
- What would be the impact of moving data out of the cloud?
- Do we fully understand the public cloud cost model?
The points covered here offer a high-level introduction. There are other considerations, such as data privacy, security, and even different storage and compute models. Hopefully this article will help you start exploring cloud storage and provide the basis of a successful deployment.