By Martin Fink
Data growth has been on a rocket-ship trajectory, and IDC expects 103 zettabytes of data to be generated worldwide by 2023.[1] With the proliferation of IoT devices, 5G-enabled technologies, and the massive growth of video, we’re just scratching the surface of how companies will store and extract value from data. Yet if one thing is clear about entering the zettabyte age, it is that we have to reconsider how we architect data centers to meet these demands.
Rethinking Storage Architectures for the Zettabyte Age
The innovation, products, and requirements for this coming architectural shift will depend on several key things:
The first is that we need to disaggregate compute, storage, and network so that each component can be used as efficiently as possible. Disaggregation is the only way to deal with the volume, velocity, and variety of the data coming down the pike in this zettabyte age.
The second is that data infrastructure will need to be purpose-built. We can no longer depend on general-purpose solutions; no single solution can be “good enough” for every need. We need to maximize efficiencies and focus on our purpose: delivering the perfect balance of performance, density, and cost in the zettabyte world.
The third is that there must be collaboration and intelligence among elements in the pipeline. Hardware and software need to interact, and we need to understand the full stack in order to design both to holistically maximize performance and functionality.
The First Step — SMR Technology
We’ve been doing a lot of work with the open source and Linux communities to contribute to the core technologies of SMR (Shingled Magnetic Recording). By overlapping tracks on a disk, much like shingles on a roof, we can achieve roughly a 20% increase in capacity. Because the tracks overlap, data must be written sequentially so that a new write does not alter a previously written track beneath it.
For many hyperscalers, sequential writing is a good fit due to the write-once/read-many nature of large-scale workloads like video streaming. But the ramp-up for customers to deploy SMR requires rearchitecting the host side: modifying the operating system to stage writes sequentially, or even making the application aware of the sequential write model.
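To make the sequential write model concrete, here is a minimal Python sketch of what "staging writes sequentially" can look like on the host. It is a toy in-memory model, not a real device API: the names SequentialZone and HostLog, the block-granular addressing, and the zone size are all assumptions made for illustration.

```python
class SequentialZone:
    """Toy model of a zone that only accepts writes at its write pointer."""

    def __init__(self, start_lba, size):
        self.start_lba = start_lba
        self.size = size
        self.write_pointer = start_lba  # next block address that may be written
        self.blocks = {}                # block address -> data

    def write(self, lba, data):
        if self.write_pointer >= self.start_lba + self.size:
            raise ValueError("zone is full")
        if lba != self.write_pointer:
            raise ValueError(
                f"non-sequential write at {lba}; expected {self.write_pointer}"
            )
        self.blocks[lba] = data
        self.write_pointer += 1


class HostLog:
    """Host-side layer that turns updates into appends and tracks the latest copy."""

    def __init__(self, zone):
        self.zone = zone
        self.latest = {}  # key -> block address of the newest version

    def put(self, key, data):
        lba = self.zone.write_pointer  # always write where the zone expects it
        self.zone.write(lba, data)
        self.latest[key] = lba
        return lba

    def get(self, key):
        return self.zone.blocks[self.latest[key]]


zone = SequentialZone(start_lba=0, size=4)
log = HostLog(zone)
log.put("frame-1", b"v1")
log.put("frame-2", b"v1")
log.put("frame-1", b"v2")  # an update is appended, never written in place
```

In this sketch the application never overwrites a block. An update becomes a new append, and the host keeps a small index pointing at the newest copy, which is one simple way for host software to be aware of the sequential write model.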
Rearchitecting can require some effort initially, but the density and cost benefits are substantial and demonstrate all the advantages of purpose-built hardware and software-aware constructs. Today, our customers are already deploying SMR technology, and we expect that 50% of the HDD exabytes we ship will be on SMR by 2023.
The Next Step — Zoned Namespaces
It may sound strange to compare SMR HDDs and SSDs because, in many ways, these technologies are a world apart. However, as we look at SSDs and NAND as part of this disaggregated future, we’re seeing a companion technology to the SMR/HDD space called Zoned Namespaces (ZNS), based on the NVMe™ standard and the initial work done on Open-Channel SSDs.
NAND-based media can handle only a certain number of writes and, as a result, has to be managed. The Flash Translation Layer (FTL) handles everything from caching and performance to wear leveling and garbage collection. However, at the zettabyte scale, device-level management introduces indirection between the host and the actual media, which impacts throughput, latency, and cost.
In an era where we want to control these elements and maximize efficiencies, we have to look at moving this management from the device level to the host, exactly as we do with SMR.
ZNS divides the flash media into isolated zones. Cloud providers can, for example, assign workloads or data types to different zones so that usage patterns are predictable among multiple users. More importantly, as with SMR, data is written sequentially within a zone. As a result, much of that device-level media management is no longer needed. This has some astounding implications:
• Reduced TCO due to minimal DRAM requirement per SSD
• Additional savings due to decreased need for overprovisioning of NAND media
• Better drive endurance by reducing write amplification
• Dramatically reduced latency
• Significantly improved throughput
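The link between workload separation and write amplification can be shown with a small Python sketch. It is a toy model under stated assumptions, not a measurement of any real drive: the Zone class, the placement functions, the zone capacity of four blocks, and the "video"/"logs" workloads are all hypothetical. Write amplification is taken in its usual sense, as total media writes divided by host writes.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    capacity: int
    blocks: list = field(default_factory=list)  # workload tag of each written block

    def append(self, tag):
        if len(self.blocks) >= self.capacity:
            raise ValueError("zone is full")
        self.blocks.append(tag)


def place_mixed(writes, capacity):
    """Fill zones in arrival order, ignoring which workload wrote each block."""
    zones = [Zone(capacity)]
    for tag in writes:
        if len(zones[-1].blocks) == capacity:
            zones.append(Zone(capacity))
        zones[-1].append(tag)
    return zones


def place_by_workload(writes, capacity):
    """Give each workload its own open zone."""
    open_zone, zones = {}, []
    for tag in writes:
        zone = open_zone.get(tag)
        if zone is None or len(zone.blocks) == capacity:
            zone = Zone(capacity)
            open_zone[tag] = zone
            zones.append(zone)
        zone.append(tag)
    return zones


def reclaim_cost(zones, expired_tag):
    """Blocks that must be copied out before zones holding expired data can be reset."""
    return sum(
        sum(1 for tag in zone.blocks if tag != expired_tag)
        for zone in zones
        if expired_tag in zone.blocks
    )


def write_amplification(host_writes, relocated_writes):
    return (host_writes + relocated_writes) / host_writes


writes = ["video", "logs"] * 8  # 16 interleaved host writes
mixed = place_mixed(writes, capacity=4)
separated = place_by_workload(writes, capacity=4)

# Suppose all "logs" data expires and its space must be reclaimed.
waf_mixed = write_amplification(len(writes), reclaim_cost(mixed, "logs"))
waf_separated = write_amplification(len(writes), reclaim_cost(separated, "logs"))
```

In this toy run, the mixed layout has to copy live "video" blocks out of every zone before it can reset them, while the per-workload layout resets the "logs" zones without copying anything. Real devices and workloads are far more complex, but the sketch shows why predictable, per-zone usage patterns reduce the extra writes.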
Zoned Storage — A Unified Platform Supporting SMR and ZNS Technologies
As part of our Zoned Storage Initiative, we’ve been working with the community to establish ZNS as an open standard that can use the same interface and API as SMR. This important step allows customers to adopt a single interface that can communicate with the entire storage layer. Data center architects can now make the transition to zettascale architectures more easily as applications don’t have to change regardless of the storage environment they choose.
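As a rough illustration of why a shared interface matters, the Python sketch below runs the same application function against two toy device models. Everything here is hypothetical: ZonedDevice, SmrHdd, ZnsSsd, and their methods are invented for this example and are not the actual Zoned Storage API, which lives in the Linux kernel and in the libraries listed on ZonedStorage.io.

```python
class ZonedDevice:
    """Hypothetical common interface: report zones, write at the pointer, reset."""

    media = "generic"

    def __init__(self, zone_count, zone_blocks):
        self.zone_blocks = zone_blocks
        self.write_pointers = [0] * zone_count  # next writable offset per zone
        self.data = [[] for _ in range(zone_count)]

    def report_zones(self):
        """Return (zone index, write pointer, capacity) for every zone."""
        return [(i, wp, self.zone_blocks) for i, wp in enumerate(self.write_pointers)]

    def write(self, zone, offset, block):
        if offset >= self.zone_blocks:
            raise ValueError("zone is full")
        if offset != self.write_pointers[zone]:
            raise ValueError("writes must land on the zone's write pointer")
        self.data[zone].append(block)
        self.write_pointers[zone] += 1

    def reset_zone(self, zone):
        self.data[zone].clear()
        self.write_pointers[zone] = 0


class SmrHdd(ZonedDevice):
    media = "SMR HDD"


class ZnsSsd(ZonedDevice):
    media = "ZNS SSD"


def store_records(device, records):
    """Append each record to the first zone with free space, on any ZonedDevice."""
    placed = []
    for record in records:
        for zone, wp, capacity in device.report_zones():
            if wp < capacity:
                device.write(zone, wp, record)
                placed.append((zone, wp))
                break
        else:
            raise RuntimeError("device is full")
    return placed


# The same application code runs unchanged against either device model.
for device in (SmrHdd(zone_count=2, zone_blocks=2), ZnsSsd(zone_count=2, zone_blocks=2)):
    print(device.media, store_records(device, ["a", "b", "c"]))
```

The point of the sketch is the last loop: store_records only knows about zones, write pointers, and resets, so it does not need to change when the media underneath does.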
Zoned Storage affords us an exciting opportunity to reach a new balance between performance, latency, and cost using disaggregated, purpose-built, and intelligent architectures. Western Digital is in a unique position to support this approach on both sides of the SMR/HDD and ZNS/SSD technology equation, and we have taken it upon ourselves to maintain and track the Linux® kernel work with the open source community to facilitate the adoption of Zoned Storage for the zettabyte age.
We are very excited about this next architectural shift and look forward to working with an ecosystem of hardware and software developers as we progress into the zettabyte age and drive the innovation that’s needed for the next-generation data infrastructure.
Finally, I want to thank the Western Digital team for their tremendous work in the open source community and on Western Digital’s Zoned Storage Initiative.
You can get started with Zoned Storage by going to ZonedStorage.io, our one-stop shop for details about SMR and ZNS adoption, where you can find:
• “Getting Started with SMR” – the first steps necessary to set up and verify a system.
• “Linux Distributions” – information about the availability of the zoned storage interface on various Linux distributions.
• “Applications and Libraries” – information about other zoned storage open source projects, such as SCSI generic utilities, Linux system utilities, the libzbc user library, and the tcmu-runner ZBC disk emulation.
We hope this work will promote the successful adoption of open source technologies and open standards.
[1] IDC Worldwide Global DataSphere Forecast, 2019-2023: Consumer Dependence on the Enterprise Widening, January 2019, DOC #US44615319