Storage class memory set to upset hardware architectures
Storage class memory and persistent memory products are set to add a layer of superfast storage media between bulk drives and memory, pushing performance to five million IOPS
There was a time when the hierarchy of persistent storage products consisted of just disk and NAND flash storage. But today a new layer has become possible, between disk/SSD and memory.
It is often called storage class memory, or persistent memory, and comprises hardware products built from flash and other persistent media that offer a range of trade-offs between cost, performance, capacity and endurance.
They are being used to extend the performance of existing architectures, pushing storage performance towards 200µs latency and five million IOPS.
So how will storage class memory/persistent memory affect the enterprise, and which suppliers are starting to build them into products?
NAND flash has long ruled the solid-state media market because of the decreasing cost profile of products. With each successive generation of flash architecture, manufacturers have been able to reduce cost by increasing bit density, implementing multiple bits per cell and layering cells on top of each other.
But the drive for better performance is as relentless as efforts to reduce cost, and this has provided an opportunity for new products to come to market.
Samsung recently announced Z-NAND and Z-SSD, a new range of products based on modified NAND technology. Performance from the first released product, the SZ985 (800GB), shows 3.2GBps read/write throughput, excellent read IOPS capability (750K) and good write IOPS (170K). Read latency is between 12µs and 20µs, with 16µs typical for writes.
Much of the performance of Z-SSD could be the result of a high ratio of DRAM to NAND on the hardware (1.5GB of DDR4 memory), plus a high-performance controller. But Samsung has not been forthcoming with architecture details. It is rumoured that SLC NAND is being used, so the actual implementation could be some kind of hybrid device of various NAND formats.
Meanwhile, 3D XPoint is a technology developed jointly by Intel and Micron. Again, although no specifics have been released, 3D XPoint is believed to be based on resistive RAM (ReRAM) technology.
ReRAM uses electrical resistance to store the state of bits of data, whereas NAND technology stores bits as an electrical charge. In contrast to NAND flash, which has to be written and read in blocks, 3D XPoint is byte-addressable, making it more like DRAM but, of course, non-volatile.
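In practice, byte-addressability means persistent memory can be mapped into an application’s address space and updated with ordinary loads and stores, rather than through block-sized read/write calls. The sketch below illustrates the programming model only, assuming a device exposed as a file on a DAX-mounted filesystem; the path is hypothetical and no specific product API is implied.

```python
import mmap
import os

# Minimal sketch of byte-addressable access to persistent memory exposed as a
# file on a DAX-mounted filesystem. The path is a placeholder, not a real device.
PMEM_PATH = "/mnt/pmem/example"
SIZE = 4096

fd = os.open(PMEM_PATH, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, SIZE)

# Map the region and update a single byte in place -- no block-sized
# read/modify/write cycle as with NAND flash.
with mmap.mmap(fd, SIZE) as pm:
    pm[42] = 0xFF      # store to a single byte
    value = pm[42]     # load it back
    pm.flush()         # flush the mapping so the store reaches persistent media

os.close(fd)
print(hex(value))
```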
To date, only Intel has released 3D-Xpoint, under the Optane brand name. Micron uses the brand name QuantX, but has yet to release any products.
Datacentre products such as the Intel DC P4800X deliver 2.5GBps write and 2.2GBps read performance, with 550,000 IOPS (read and write) and latencies of 10µs (read/write). Endurance is around 30 DWPD (drive writes per day), much higher than traditional NAND flash products and comparable to Z-NAND from Samsung.
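To put a 30 DWPD rating in context, a quick back-of-envelope calculation helps; the 375GB capacity point and five-year warranty period below are illustrative assumptions rather than quoted specifications.

```python
# Back-of-envelope endurance calculation for a DWPD (drive writes per day) rating.
# Capacity and warranty period are illustrative assumptions, not quoted specs.
capacity_gb = 375        # e.g. a 375GB Optane drive
dwpd = 30                # rated drive writes per day
warranty_years = 5

writes_per_day_gb = capacity_gb * dwpd
lifetime_writes_pb = writes_per_day_gb * 365 * warranty_years / 1_000_000

print(f"{writes_per_day_gb} GB written per day, "
      f"~{lifetime_writes_pb:.1f} PB over the warranty period")
# 11250 GB written per day, ~20.5 PB over the warranty period
```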
Between the speeds of DRAM and Z-SSD/Optane lies magneto-resistive RAM or MRAM, specifically STT-MRAM (Spin-transfer Torque MRAM) from Everspin.
Devices such as the Everspin nvNITRO NVMe accelerator card offer effectively unlimited endurance and 6GBps read/write performance, with 1.5 million IOPS at average latencies of 6µs (read) and 7µs (write). Unfortunately, capacity is the compromise here, at only 1GB per device.
Other companies, including Crossbar and Spin Transfer Technologies, are also working on ReRAM and STT-MRAM products, although neither has made product details public.
Storage class memory applications
With a plethora of media to choose from, users and appliance suppliers now have a range of deployment options.
The more expensive, lower-capacity devices can be used as a host or array cache, adding the benefit of persistence compared with simply using DRAM. Their extended endurance compared with NAND flash also makes them better suited to write caching or use as an active data tier.
Hyper-converged infrastructure (HCI) solutions can take advantage of low-latency persistent memory deployed in each host. Placing the persistent storage on the PCIe bus, or even the memory bus, will significantly reduce I/O latency. But this also risks exposing inefficiencies in the storage stack, so suppliers will want to be quick to identify and fix any issues.
Disaggregated HCI solutions, such as those from Datrium and NetApp, should see large performance improvements. In both cases, the architecture is built on shared storage with a local cache in each host. Performance is already high with NAND flash, and persistent caching with products such as Optane adds resiliency and reduces cache warm-up time. Other disaggregated storage solutions are also starting to use faster media such as Optane; we’ll look at these in the product roundup below.
We’re unlikely to see widespread adoption of these faster storage products in traditional all-flash arrays. Storage networks introduce too much overhead, and the shared controller model means the media’s throughput and latency can’t be fully exploited. Price is also a factor here.
We can, however, expect to see suppliers use these fast storage class memory/persistent memory systems as another tier of storage that adds an extra bump to performance. It becomes a cost/benefit game in which a small amount of extra high-speed storage could deliver significant performance improvements.
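A rough illustration of that cost/benefit trade-off: if a small fast tier serves most of the I/O, average latency falls sharply even though most of the capacity stays on cheaper flash. The hit rate and latency figures below are assumptions for illustration, not measurements of any particular product.

```python
# Illustrative average-latency calculation for a small fast tier in front of
# NAND flash. Hit rate and latencies are assumptions, not product figures.
fast_tier_latency_us = 20    # storage class memory tier
flash_latency_us = 200       # NAND flash tier
hit_rate = 0.8               # fraction of I/O served from the fast tier

avg_latency_us = (hit_rate * fast_tier_latency_us
                  + (1 - hit_rate) * flash_latency_us)
print(f"Average latency: {avg_latency_us:.0f}µs")  # 56µs, versus 200µs with flash alone
```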
Storage class memory supplier roundup
Who’s using these new systems? In December 2016, HPE demonstrated a 3PAR array using Intel Optane as a cache (a feature called 3D Cache). At the time, HPE was claiming latency figures of around 250µs, half those seen without 3D Cache, and an increase in IOPS of around 80%.
Tegile, now a Western Digital company, is believed to use Intel Optane NVDIMM products in its NVMe IntelliFlash arrays. This allows the company to offer products such as the N-Series with latencies as low as 200µs.
Storage startup Vexata has developed a new architecture that uses either all-flash or Intel Optane technology. Optane-based systems claim a typical latency level of around 40µs with seven million IOPS and minimum latencies of 25µs.
In August 2017, E8 Storage, a startup supplier of disaggregated solutions, announced the E8-X24, a system based on Intel Optane drives. Although no performance figures have yet been released, latencies are expected to be well under the 100µs (read) and 40µs (write) figures quoted for E8’s existing NVMe flash system.
Apeiron Data Systems, another disaggregated all-flash storage startup, has a system based on 24 NVMe drives in a shared storage chassis. Performance figures are quoted as low as 12µs when using Optane drives, with more than 18 million IOPS.
Scale Computing has announced the results of testing its HC3 platform with Intel Optane, seeing response times of around 20µs in guest virtual machines. The capability is made possible by the SCRIBE file system, which is integrated into the platform’s Linux kernel.
VMware announced the results of testing with Virtual SAN and Intel Optane in March 2017. These showed a 2.5x improvement in IOPS and a reduction in latency by a factor of 2.5. In the Virtual SAN architecture, it’s expected that Optane would act as the cache tier, with traditional NAND flash for capacity in an all-flash configuration.
NetApp has demonstrated the use of Samsung Z-SSD technology in ONTAP storage arrays, claiming three times as many IOPS when using Z-SSD as a non-volatile read-only cache. NetApp acquired Plexistor in May 2017 and used the technology to demonstrate a disaggregated architecture at the 2017 Insight event in Berlin, delivering approximately 3.4 million IOPS at around 4µs of host latency.
We can see from these suppliers that storage class memory/persistent memory is being used to extend the performance of existing architectures.
A few short years ago, we were looking at 1ms latency and one million IOPS as the benchmark for storage arrays. Today those numbers are starting to creep towards 200µs and perhaps five million IOPS.
New architectures (HCI and disaggregated) are being used to minimise SAN overheads even further. Faster storage is available, but at the top levels of performance users may well have to rethink their application architectures to get the most from it, as the media itself ceases to be the main overhead in the I/O path.
Looking at the future of media, 3D XPoint appears to have only just got started, with performance and capacity improvements expected over the coming years. Other technologies (Z-NAND and STT-MRAM) will need to reduce costs and improve capacities to compete. Expect to see companies such as Huawei develop persistent memory products to complement the SSDs they already produce.