Google Cloud Platform
Google seeks new disks for data centers
Today, during my keynote at the 2016 USENIX Conference on File and Storage Technologies (FAST 2016), I’ll be talking about our goal to work with industry and academia to develop new lines of disks that are a better fit for data centers supporting cloud-based storage services. We're also releasing a white paper on the evolution of disk drives that we hope will help continue the decades of remarkable innovation achieved by the industry to date.
But why now? It's a fun but apocryphal story that the width of Roman chariots drove the spacing of modern train tracks. However, it is true that the modern disk drive owes its dimensions to the 3½” floppy disk used in PCs. That is very unlikely to be the optimal design, and now that we're firmly in the era of cloud-based storage, it's time to broadly reevaluate the design of modern disk drives.
The rise of cloud-based storage means that most (spinning) hard disks will be deployed primarily as part of large storage services housed in data centers. Such services are already the fastest-growing market for disks and will be the majority market in the near future. For example, for YouTube alone, users upload over 400 hours of video every minute, which at one gigabyte per hour requires more than one petabyte (1M GB) of new storage every day, or about 100x the Library of Congress. As shown in the graph, this continues to grow exponentially, with a 10x increase every five years.
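To make the scale concrete, here is a rough back-of-the-envelope sketch of the numbers above. It takes the 400 hours per minute and one gigabyte per hour at face value and treats a petabyte as 1M GB, as in the text. The `copies` parameter is a hypothetical knob, not a figure from this post: a single copy at exactly 400 hours per minute works out to roughly 0.58 PB per day, so any total above that depends on factors not broken down here, such as uploads beyond 400 hours or storing more than one copy. The function names are illustrative only.

```python
# Back-of-the-envelope arithmetic for the upload figures quoted in the post.
# All function names and the `copies` parameter are illustrative assumptions.

MINUTES_PER_DAY = 24 * 60
GB_PER_PB = 1_000_000  # the post's "1M GB" per petabyte


def daily_upload_pb(hours_per_minute=400.0, gb_per_hour=1.0, copies=1):
    """Petabytes of new storage per day for a given upload rate.

    `copies` is a hypothetical multiplier (for example, redundant copies);
    the post does not state a value for it.
    """
    hours_per_day = hours_per_minute * MINUTES_PER_DAY
    return hours_per_day * gb_per_hour * copies / GB_PER_PB


def annual_growth_factor(multiple=10.0, years=5.0):
    """Constant yearly growth factor implied by `multiple`x growth over `years`."""
    return multiple ** (1.0 / years)


if __name__ == "__main__":
    print(f"single copy:  {daily_upload_pb():.3f} PB/day")
    print(f"two copies:   {daily_upload_pb(copies=2):.3f} PB/day")
    factor = annual_growth_factor()
    print(f"10x per five years is about {100 * (factor - 1):.0f}% growth per year")
```

Under these assumptions, a 10x increase every five years corresponds to a yearly growth factor of 10^(1/5), or a little under 60% per year.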
At the heart of the paper is the idea that we need to optimize the collection of disks, rather than a single disk in a server. This shift has a range of interesting consequences, including the counter-intuitive goal of having disks that are actually a little more likely to lose data, since that data already has to be stored somewhere else anyway. It’s not that we want the disk to lose data, but rather that we can redirect the cost and effort now spent on avoiding data loss toward other gains, such as capacity or system performance.
We explore physical changes, such as taller drives and grouping of disks, as well as a range of shorter-term firmware-only changes. Our goals include higher capacity and more I/O operations per second, in addition to a better overall total cost of ownership. We hope this is the beginning of both a new chapter for disks and a broad and healthy discussion, including vendors, academia and other customers, about what “data center” disks should be in the era of cloud.