KVRAID: high performance, write efficient, update friendly erasure coding scheme for KV-SSDs

Published: 14 June 2021
DOI: 10.1145/3456727.3463781

Abstract

Key-value (KV) stores are widely deployed in scale-out enterprise applications such as online retail, big data analytics, and social networks. Key-value SSDs (KV-SSDs) expose a key-value interface directly from the device, aiming to lower software overhead and reduce I/O amplification for such applications.
In this paper, we present KVRAID, a high-performance, write-efficient erasure coding management scheme for emerging KV-SSDs. The core innovation of KVRAID is to use logical-to-physical key conversion to efficiently pack similar-size KV objects and dynamically manage the membership of erasure coding groups. This design packs multiple user objects into a single physical object, reducing object amplification compared to prior work. By applying an out-of-place update technique, KVRAID significantly reduces I/O amplification compared to state-of-the-art designs. Our experiments show that KVRAID outperforms a state-of-the-art software KV store with block RAID by 28x in insert throughput and significantly reduces CPU utilization, tail latency, and write amplification. Compared to state-of-the-art erasure coding management for KV devices, KVRAID reduces object amplification by ~2.6x relative to StripeFinder and reduces I/O amplification by ~9.6x relative to KVMD and StripeFinder for update-intensive workloads.
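
The packing and out-of-place update ideas described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not KVRAID's actual implementation: the class `PackedStore`, its fields, and the fixed `pack_size` are hypothetical names introduced here, a Python dict stands in for the KV-SSD, and size classes, erasure coding groups, parity, and garbage collection are all omitted.

```python
from dataclasses import dataclass, field


@dataclass
class PackedStore:
    """Toy model (hypothetical): pack several user objects into one physical object."""
    pack_size: int = 4                             # assumed: user objects per physical object
    device: dict = field(default_factory=dict)     # physical key -> list of values (stands in for a KV-SSD)
    mapping: dict = field(default_factory=dict)    # logical key -> (physical key, slot index)
    buffer: list = field(default_factory=list)     # pending (logical key, value) pairs
    invalid: set = field(default_factory=set)      # stale (physical key, slot) locations
    next_pkey: int = 0

    def put(self, key, value):
        # Out-of-place update: the old slot is never rewritten, only marked stale.
        if key in self.mapping:
            self.invalid.add(self.mapping.pop(key))
        # Drop any pending copy of the same key that has not been flushed yet.
        self.buffer = [(k, v) for k, v in self.buffer if k != key]
        self.buffer.append((key, value))
        if len(self.buffer) == self.pack_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        pkey = self.next_pkey
        self.next_pkey += 1
        # One device write stores the whole pack under a single physical key.
        self.device[pkey] = [v for _, v in self.buffer]
        for slot, (k, _) in enumerate(self.buffer):
            self.mapping[k] = (pkey, slot)
        self.buffer = []

    def get(self, key):
        # Serve from the write buffer first, then via the logical-to-physical mapping.
        for k, v in self.buffer:
            if k == key:
                return v
        pkey, slot = self.mapping[key]
        return self.device[pkey][slot]


store = PackedStore(pack_size=2)
store.put("a", b"1")
store.put("b", b"2")   # buffer is full: "a" and "b" go into physical object 0
store.put("a", b"9")   # update: slot (0, 0) is marked stale, the new value is buffered
store.put("c", b"3")   # buffer is full: "a" and "c" go into physical object 1
```

In this sketch four user writes result in two device writes, and the update to "a" touches no existing physical object; a real design would additionally need to reclaim the stale slots recorded in `invalid`.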

Published In

SYSTOR '21: Proceedings of the 14th ACM International Conference on Systems and Storage
June 2021
226 pages
ISBN: 9781450383981
DOI: 10.1145/3456727
Sponsors

In-Cooperation

  • Technion: Israel Institute of Technology
  • USENIX Association

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RAID
  2. key-value SSD
  3. key-value store
  4. redundancy

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation
  • Samsung Corporation

Conference

SYSTOR '21
Acceptance Rates

SYSTOR '21 Paper Acceptance Rate 18 of 63 submissions, 29%;
Overall Acceptance Rate 108 of 323 submissions, 33%

Cited By

  • (2025) ProckStore: An NDP-empowered key-value store with asynchronous and multi-threaded compaction scheme for optimized performance. Journal of Systems Architecture 160 (103342). DOI: 10.1016/j.sysarc.2025.103342. Online publication date: Mar-2025
  • (2024) Storage Abstractions for SSDs: The Past, Present, and Future. ACM Transactions on Storage 21:1 (1-44). DOI: 10.1145/3708992. Online publication date: 30-Dec-2024
  • (2024) Advanced Elastic Reed–Solomon Codes for Erasure-Coded Key–Value Stores. IEEE Internet of Things Journal 11:3 (4747-4762). DOI: 10.1109/JIOT.2023.3299574. Online publication date: 1-Feb-2024
  • (2023) KVRangeDB: Range Queries for a Hash-based Key–Value Device. ACM Transactions on Storage 19:3 (1-21). DOI: 10.1145/3582013. Online publication date: 19-Jun-2023
