
SDF: software-defined flash for web-scale internet storage systems

Published: 24 February 2014 Publication History

Abstract

In the last several years hundreds of thousands of SSDs have been deployed in the data centers of Baidu, China's largest Internet search company. Currently only 40% or less of the raw bandwidth of the flash memory in the SSDs is delivered by the storage system to the applications. Moreover, because of space over-provisioning in the SSD to accommodate non-sequential or random writes, and additionally parity coding across flash channels, typically only 50-70% of the raw capacity of a commodity SSD can be used for user data. Given the large scale of Baidu's data center, making the most effective use of its SSDs is of great importance. Specifically, we seek to maximize both bandwidth and usable capacity.
To achieve this goal we propose software-defined flash (SDF), a hardware/software co-designed storage system to maximally exploit the performance characteristics of flash memory in the context of our workloads. SDF exposes individual flash channels to the host software and eliminates space over-provisioning. The host software, given direct access to the raw flash channels of the SSD, can effectively organize its data and schedule its data access to better realize the SSD's raw performance potential.
Currently more than 3000 SDFs have been deployed in Baidu's storage system that supports its web page and image repository services. Our measurements show that SDF can deliver approximately 95% of the raw flash bandwidth and provide 99% of the flash capacity for user data. SDF increases I/O bandwidth by 300% and reduces per-GB hardware cost by 50% on average compared with the commodity-SSD-based system used at Baidu.
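
As a rough illustration of what the capacity and bandwidth fractions above imply, the following sketch works through the arithmetic. It is not taken from the paper: the function names are hypothetical, and the cost calculation assumes the raw cost per GB of flash is the same for both devices, which the abstract does not state.

```python
def delivered_ratio(before: float, after: float) -> float:
    """How many times larger the delivered fraction is after the change."""
    return after / before


def cost_per_usable_gb_reduction(before: float, after: float) -> float:
    """Fractional drop in cost per usable GB.

    Assumption (not from the abstract): raw cost per GB is unchanged,
    so cost per usable GB scales as 1 / usable_fraction.
    """
    return 1.0 - before / after


if __name__ == "__main__":
    # Usable capacity: 50-70% on a commodity SSD versus 99% on SDF.
    for commodity in (0.50, 0.70):
        saving = cost_per_usable_gb_reduction(commodity, 0.99)
        print(f"usable {commodity:.0%} -> 99%: "
              f"cost per usable GB falls by about {saving:.1%}")

    # Fraction of raw bandwidth delivered: 40% (or less) versus about 95%.
    print(f"delivered bandwidth fraction ratio: {delivered_ratio(0.40, 0.95):.3f}x")
```

Under that assumption, the capacity gain alone would account for a reduction of roughly 29% to 49% in cost per usable GB. The ratio of delivered bandwidth fractions is a different quantity from the 300% system-level bandwidth increase reported above, which is measured against the production commodity-SSD system.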

Published In

ACM SIGARCH Computer Architecture News  Volume 42, Issue 1
ASPLOS '14
March 2014
729 pages
ISSN:0163-5964
DOI:10.1145/2654822
  • ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
    February 2014
    780 pages
    ISBN:9781450323055
    DOI:10.1145/2541940
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 February 2014
Published in SIGARCH Volume 42, Issue 1

Author Tags

  1. data center
  2. flash memory
  3. solid-state drive (SSD)

Qualifiers

  • Research-article

Cited By

  • (2022) RAIL: Predictable, Low Tail Latency for NVMe Flash. ACM Transactions on Storage 18:1 (1-21). DOI:10.1145/3465406. Online publication date: 29-Jan-2022
  • (2022) MDev-NVMe: Mediated Pass-Through NVMe Virtualization Solution With Adaptive Polling. IEEE Transactions on Computers 71:2 (251-265). DOI:10.1109/TC.2020.3045785. Online publication date: 1-Feb-2022
  • (2021) Performance Modeling and Practical Use Cases for Black-Box SSDs. ACM Transactions on Storage 17:2 (1-38). DOI:10.1145/3440022. Online publication date: 8-Jun-2021
  • (2021) A Throughput-Oriented NVMe Storage Virtualization With Workload-Aware Management. IEEE Transactions on Computers 70:12 (2112-2124). DOI:10.1109/TC.2020.3037817. Online publication date: 1-Dec-2021
  • (2020) Artificial intelligence in cyber physical systems. AI & SOCIETY. DOI:10.1007/s00146-020-01049-0. Online publication date: 27-Aug-2020
  • (2019) Who's afraid of uncorrectable bit errors? online recovery of flash errors with distributed redundancy. Proceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference (977-991). DOI:10.5555/3358807.3358891. Online publication date: 10-Jul-2019
  • (2019) Cognitive SSD. Proceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference (395-410). DOI:10.5555/3358807.3358841. Online publication date: 10-Jul-2019
  • (2019) Flashield. Proceedings of the 16th USENIX Conference on Networked Systems Design and Implementation (65-78). DOI:10.5555/3323234.3323241. Online publication date: 26-Feb-2019
  • (2019) Efficient User-Level Storage Disaggregation for Deep Learning. 2019 IEEE International Conference on Cluster Computing (CLUSTER) (1-12). DOI:10.1109/CLUSTER.2019.8891023. Online publication date: Sep-2019
  • (2017) Freewrite. Proceedings of the 10th ACM International Systems and Storage Conference (1-6). DOI:10.1145/3078468.3078471. Online publication date: 22-May-2017
