At the 2009 SOSP, David Andersen and co-authors from CMU presented
FAWN, the Fast Array of Wimpy Nodes. It inspired me to suggest, in my 2010
JCDL keynote, that the cost savings FAWN realized, without a performance penalty, by distributing computation across a very large number of very low-power nodes might also apply to storage.
The following year, Ian Adams and Ethan Miller of UC Santa Cruz's
Storage Systems Research Center and I examined this possibility more closely in a Technical Report entitled
Using Storage Class Memory for Archives with DAWN, a Durable Array of Wimpy Nodes. We showed that it was indeed plausible that, even at then-current flash prices, the long-term total cost of ownership of a storage system built from very
low-power system-on-chip technology and flash memory would be competitive with disk, while providing high performance and enabling self-healing.
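The intuition behind that claim can be made concrete with a toy cost model. The sketch below is mine, not the report's (whose model was considerably more detailed), and every number in it is an illustrative assumption: flash pays a higher purchase price per terabyte, but lower power draw, longer media life and lower running costs (space, cooling, admin) claw much of that back over a long archival horizon:

```python
# Toy long-term total-cost-of-ownership model, flash vs. disk.
# Every number here is an illustrative assumption, not a figure
# from the DAWN technical report.
import math

def tco_per_tb(capex, watts_per_tb, opex_per_tb_year,
               media_life_years, horizon_years, kwh_cost=0.20):
    """Cost per TB over the horizon: media purchases (replaced at
    end of life), electricity, and other running costs."""
    media = math.ceil(horizon_years / media_life_years) * capex
    energy = watts_per_tb / 1000 * 8760 * horizon_years * kwh_cost
    opex = opex_per_tb_year * horizon_years
    return media + energy + opex

H = 20  # a long archival horizon, in years
disk = tco_per_tb(capex=40, watts_per_tb=1.5, opex_per_tb_year=20,
                  media_life_years=5, horizon_years=H)
flash = tco_per_tb(capex=240, watts_per_tb=0.2, opex_per_tb_year=5,
                   media_life_years=10, horizon_years=H)
print(f"disk:  ${disk:6.0f}/TB over {H} years")
print(f"flash: ${flash:6.0f}/TB over {H} years")
```

With these particular inputs the two technologies land within about 5% of each other. The exact numbers don't matter; the point is that over a long enough horizon the comparison is dominated by running costs rather than purchase price, which is where flash wins.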
Although flash remains more expensive than hard disk, since 2011 the gap has narrowed from a factor of about 12 to about 6. Pure Storage recently announced FlashBlade, an object storage fabric composed of large numbers of blades,
each equipped with:
- Compute – an 8-core Xeon system-on-a-chip, plus an Elastic Fabric Connector for external, off-blade, 40GbitE networking,
- Storage – NAND flash with 8TB or 52TB of raw capacity, plus on-board NV-RAM with a super-capacitor-backed write buffer, a pair of ARM CPU cores, and an FPGA,
- On-blade networking – a PCIe card linking the compute and storage cards via a proprietary protocol.
Chris Mellor at The Register has
details and
two commentaries.
FlashBlade clearly isn't DAWN. Each blade is much bigger, much more powerful, and much more expensive than a DAWN node. No-one could call a node with an 8-core Xeon, two ARM cores, and 52TB of flash "wimpy", and it'll clearly be too expensive for long-term bulk storage. But it is a big step in the direction of the DAWN architecture.
DAWN exploits two separate sets of synergies:
- Like FlashBlade, it moves the computation to where the data is, rather than moving the data to where the computation is, reducing both latency and power consumption. The further data moves along wires from the storage medium, the more power and time it takes. This is why Berkeley's ASPIRE project's architecture is based on optical interconnect technology, which, when it becomes mainstream, will be both faster and lower-power than wires. In the meantime, we have to use wires.
- Unlike FlashBlade, it divides the object storage fabric into a much larger number of much smaller nodes, implemented using the very low-power ARM chips found in cellphones. Because the power a CPU needs tends to grow faster than linearly with its performance, this additional parallelism delivers comparable aggregate performance at lower power, as the sketch below illustrates.
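A back-of-the-envelope model shows why. Assume per-chip power grows as performance raised to an exponent alpha greater than one (alpha = 3 is a common rule of thumb from dynamic CMOS power scaling, where P ∝ V²f and voltage tracks frequency; the real exponent varies with workload and process). Splitting a fixed aggregate workload across N chips then scales total power as N^(1-alpha):

```python
# Why many slow nodes can beat one fast node on power.
# Assumption: per-chip power ~ k * perf**alpha with alpha > 1.
# alpha = 3 is an illustrative rule of thumb, not a measured value.

def total_power(nodes, aggregate_perf, alpha=3.0, k=1.0):
    """Aggregate power when `nodes` chips share `aggregate_perf` equally."""
    per_node_perf = aggregate_perf / nodes
    return nodes * k * per_node_perf ** alpha

# Hold aggregate performance fixed at 100 (arbitrary units) and
# vary the number of nodes it is spread across.
for nodes in (1, 10, 100, 1000):
    print(f"{nodes:5d} nodes: {total_power(nodes, 100.0):14.3f} power units")
```

Under this assumption each ten-fold increase in node count cuts total power a hundred-fold at constant aggregate throughput. In practice fixed per-node overheads such as DRAM, networking and packaging flatten the curve, which is what limits how wimpy a node it pays to build.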
So FlashBlade currently exploits only one of the two sets of synergies. But once Pure Storage has deployed this architecture in its current relatively high-cost and high-power technology, re-implementing it in lower-cost, lower-power technology should be easy and non-disruptive. They have done the harder of the two parts.