For as long as most can remember, applied computer storage architectures have been based on allocating volatile, expensive RAM and persistent, cheap disk. Recent trends in flash memory technology have provided more options for architectures seeking to optimize performance with an eye on cost. In the current marketplace, however, software solutions that capitalize on flash-based storage are lagging in development. Thus, software solutions that exploit flash storage offer a competitive edge.
A Bit of Storage History
The split of storage into fast/expensive and slow/cheap(er) is rooted in history. From the mid-1950s, RAM/core memory held programs and working-set data, whereas tape or drum served as the source and sink for persistent/secondary data. Disk drives arrived in the 1960s, replacing drums and relegating tape to archival storage. Since the 1960s, the disk drive has dominated secondary storage, driven by advances in controller technology and density.
Over time, the cost/GB of disk drives has plummeted while device bandwidth has increased; however, latency (or random bandwidth) is ultimately constrained by the need to rotate and seek. Unfortunately, seek time and rotational speed have improved slowly relative to interface speed and storage density.
The Arrival of Flash and What the Future Holds
Flash memory appeared in the 1980s and became commercially successful in small devices requiring removable storage (e.g., digital cameras). NAND flash memory—as a block-oriented, random-access, persistent, solid-state alternative to hard drives—was initially too expensive, and capacities were too small, to compete broadly with disk drives. Since then, flash capacities have expanded and prices have dropped, making flash solid-state drives (SSDs) affordable in laptops, where their speed and low energy consumption are attractive.
As price/capacity measures continue to fall, flash is displacing disk storage wherever high performance, durability, and energy efficiency are valued more than bulk storage capacity. Some projections have SSD storage competitive with enterprise-class disk storage as early as 2017.
Other factors are beginning to tip the scales toward flash. For applications that value increased performance as much as raw storage capacity per dollar, flash wins. Applications such as real-time analytics, market data, mobile, IoT, and real-time e-commerce already value flash over disk. It is conceivable that flash will soon trump disk as the cost/GB differential continues to shrink.
The Current Reality—Hybrid Solutions
Despite the closing gap, we will be living in a hybrid storage world for some time. Flash memory does not replace disk storage; it provides another “tier” of storage as part of a palette of options storage architects can use, optimizing for factors like latency, throughput, device cost, energy/cooling cost, capacity and durability.
The addition of another storage tier has increased system complexity. To minimize that complexity, flash SSDs have typically imitated disk drives, which lets a standard file system interface be presented to applications, including databases. This imitation, however, costs between one and two orders of magnitude in sustained I/O operations per second (IOPS). In order to maximize performance, a native API must be used instead, which adds significant complexity compared to the standard file system interface.
To minimize complexity and maximize performance, application platforms and APIs must step in and provide seamless, high performance access to flash devices. At the API layer, cross-vendor APIs are being developed. At the software platform layer, data already abstracted as objects (and typically mapped to tables) can be mapped by a higher-level API to both database tables and flash storage.
The marriage of these two concepts—a portable flash API and a universal object mapping layer—makes the conceptual boundary between storage tiers disappear. The allocation of data becomes a deployment decision, not a programming decision. This simplifies data locality architecture and programming, seamlessly placing data that needs the immediacy of RAM, the speed and persistence of flash, or the bulk storage of disk in its proper place.
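To make the idea of "tier as a deployment decision" concrete, here is a minimal sketch in Python. All names (the tier labels, the `ObjectStore` class, the JSON config shape) are illustrative assumptions, not any vendor's actual API; real platforms express the same mapping through deployment descriptors rather than application code.

```python
import json

# Hypothetical deployment config mapping object types to storage tiers.
# Changing this file re-tiers data without touching application code.
TIER_CONFIG = json.loads("""
{
  "Quote": "ram",
  "Order": "flash",
  "Audit": "disk"
}
""")

class ObjectStore:
    """Routes objects to a storage tier chosen by configuration."""
    def __init__(self, tier_config):
        self.tier_config = tier_config
        # Plain dicts stand in for real RAM/flash/disk backends.
        self.tiers = {"ram": {}, "flash": {}, "disk": {}}

    def put(self, obj_type, key, value):
        tier = self.tier_config.get(obj_type, "disk")  # default to bulk storage
        self.tiers[tier][key] = value
        return tier

    def get(self, obj_type, key):
        tier = self.tier_config.get(obj_type, "disk")
        return self.tiers[tier].get(key)

store = ObjectStore(TIER_CONFIG)
print(store.put("Quote", "AAPL", 187.5))   # routed to "ram"
print(store.put("Audit", "evt-1", "ok"))   # routed to "disk"
```

The application calls `put`/`get` with no knowledge of where data lands; an operator re-tiers a type by editing the config, which is exactly the programming-versus-deployment split described above.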
Software Solutions Lag
So the good news is that flash storage capacity and affordability are improving. The bad news is that software systems are playing catch-up.
The basics are in place at the file-system level. The simplest approach to employing flash is to substitute flash equivalents for conventional disk drives. The Flash Translation Layer (FTL) provided by the drive presents a seamless integration point at the OS level. Another strategy is automated tiered storage, which uses flash as a persistent, least recently used (LRU) cache for conventional disk storage. Asynchronous processes periodically move cold data to disk and hot data to flash.
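The LRU tiering strategy above can be sketched in a few lines of Python. This is a toy model under stated assumptions: two dicts stand in for the flash and disk tiers, and the demotion sweep runs synchronously here rather than as the asynchronous background process a real system would use.

```python
from collections import OrderedDict

class TieredStore:
    """Flash acts as an LRU cache in front of disk; a sweep demotes
    cold entries once flash is over capacity. Names are illustrative."""
    def __init__(self, flash_capacity=3):
        self.flash = OrderedDict()   # hot tier: most recently used last
        self.disk = {}               # cold tier: unbounded stand-in
        self.flash_capacity = flash_capacity

    def read(self, key):
        if key in self.flash:                  # hot hit: refresh recency
            self.flash.move_to_end(key)
            return self.flash[key]
        value = self.disk[key]                 # cold hit: promote to flash
        self.write(key, value)
        return value

    def write(self, key, value):
        self.flash[key] = value
        self.flash.move_to_end(key)
        self.sweep()

    def sweep(self):
        # Demote least recently used entries while flash is over capacity.
        while len(self.flash) > self.flash_capacity:
            cold_key, cold_value = self.flash.popitem(last=False)
            self.disk[cold_key] = cold_value

store = TieredStore(flash_capacity=2)
for k in ("a", "b", "c"):
    store.write(k, k.upper())
print(list(store.flash))  # "a" was demoted to disk; "b", "c" remain hot
print(store.read("a"))    # promoted back from disk, demoting "b"
```

The same access pattern drives both promotion and demotion, which is why this style of tiering can sit below the file system with no application changes.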
At the data-tier level, we see some consistency as well. All major vendors integrate flash drives/modules as an option and have native interfaces to flash devices. Another strategy is using high-bandwidth flash as a cache on the server node, while retaining disk storage in its traditional role. All these strategies give users access to flash technology using standard SQL methods.
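The server-node caching strategy mentioned above amounts to a read-through cache in front of the disk-backed database. As a hedged sketch: here a plain dict stands in for the node-local flash cache and an in-memory SQLite database stands in for the disk-resident store; the table and function names are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the disk-backed database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "lin")])

flash_cache = {}  # stand-in for the node-local flash cache

def get_user(user_id):
    """Read-through: serve from flash if cached, else query disk and cache."""
    if user_id in flash_cache:            # cache hit: no disk I/O
        return flash_cache[user_id]
    row = db.execute("SELECT name FROM users WHERE id = ?",
                     (user_id,)).fetchone()
    if row:
        flash_cache[user_id] = row[0]     # populate the cache on a miss
        return row[0]
    return None

print(get_user(1))  # miss: reads the database, caches the result
print(get_user(1))  # hit: served from the flash cache
```

Because the cache sits behind an ordinary SQL query path, applications keep using standard SQL methods, as the data-tier vendors described above intend.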
Unfortunately, at the middle/processing tier, platform support for flash is inconsistent. The few middle-tier caches that can already persist to disk can transition to flash easily. If you need a data grid rather than a middle-tier cache, however, only a few data grid vendors offer programmable flash integrations.
Flash is rapidly becoming a competitor to spinning disk, but software systems are behind. We have seen adoption at the file system and database layers, where easy payoffs were available, but uptake at the middle tier has been slower. Flash will continue eating away at the disk storage market, driven by the insatiable demand for ever-higher data velocity and lower latency. There are competitive advantages for enterprises that capitalize on flash, but the vendors delivering those capabilities are few.
About the Author
DeWayne Filppi is director of solution architecture at GigaSpaces. He is a software technologist with broad and deep industry experience, ranging from pre-sales engineering and post-sales consulting to product design, development, architecture and management. His work is focused on high-performance server platforms and cloud computing.