Analysis of and Optimization for Write-dominated Hybrid Storage Nodes in Cloud

Published: 20 November 2019
DOI: 10.1145/3357223.3362705

Abstract

Cloud providers such as Alibaba Cloud routinely deploy hybrid storage nodes composed of solid-state drives (SSDs) and hard disk drives (HDDs), reaping their respective benefits: performance from the SSDs and capacity from the HDDs. These hybrid storage nodes generally write incoming data to their SSDs first and later flush it to the HDDs, a scheme referred to as the SSD Write Back (SWB) mode, thereby ensuring low write latency. Comprehensively analyzing real production workloads from Pangu, the large-scale storage platform underlying Alibaba Cloud, we find that (1) many nodes are write-dominated storage nodes (WSNs), and (2) under the SWB mode, the SSDs of these WSNs suffer from severely high write intensity and long tail latency. To address these problems unique to WSNs, we present SSD Write Redirect (SWR), a runtime IO scheduling mechanism for WSNs. SWR judiciously and selectively forwards some or all SSD-writes to HDDs, adapting to runtime conditions. By offloading the right amount of write IO from overburdened SSDs to underutilized HDDs, SWR adequately alleviates the aforementioned problems, significantly improving overall system performance and SSD endurance. Our trace-driven evaluation of SWR, replaying production workload traces collected from the Alibaba cloud on our cloud testbed, shows that SWR decreases the average and 99th-percentile latencies of SSD-writes by up to 13% and 47%, respectively, notably improving system performance. Meanwhile, the amount of data written to the SSDs is reduced by up to 70%, significantly extending SSD lifetime.
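To make the scheduling idea concrete, the sketch below illustrates one plausible form of an SWR-style redirect decision: default to the SSD write-back (SWB) path, and divert a write to the HDD only when the SSD queue is backed up and the HDD has headroom. This is a minimal sketch under stated assumptions, not the paper's implementation; the names (DeviceState, choose_write_target) and the queue-depth thresholds are hypothetical.

```python
# Minimal sketch (hypothetical, not the paper's implementation) of an
# SWR-style runtime redirect decision for a hybrid SSD/HDD storage node.
from dataclasses import dataclass

@dataclass
class DeviceState:
    """Runtime state sampled for one device; fields assumed for illustration."""
    queue_depth: int             # outstanding IOs queued at the device
    avg_write_latency_ms: float  # recent average write latency

def choose_write_target(ssd: DeviceState, hdd: DeviceState,
                        ssd_queue_limit: int = 32,
                        hdd_queue_limit: int = 8) -> str:
    """Pick the device for an incoming write.

    Default to the SSD (the SWB path). Redirect to the HDD (the SWR path)
    only when the SSD looks overburdened and the HDD still has headroom.
    Threshold values are made up for this example.
    """
    ssd_overloaded = ssd.queue_depth >= ssd_queue_limit
    hdd_has_headroom = hdd.queue_depth < hdd_queue_limit
    if ssd_overloaded and hdd_has_headroom:
        return "hdd"  # offload this write to the underutilized HDD
    return "ssd"      # normal low-latency write-back path

if __name__ == "__main__":
    busy_ssd = DeviceState(queue_depth=64, avg_write_latency_ms=9.5)
    idle_hdd = DeviceState(queue_depth=1, avg_write_latency_ms=4.2)
    print(choose_write_target(busy_ssd, idle_hdd))  # -> "hdd"
```

A per-IO decision of this shape is what would let a scheduler forward "some or all" SSD-writes adaptively: under light load every write takes the SSD path, and as the SSD queue grows an increasing share of writes flows to the HDD.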



Information

Published In

SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
November 2019
503 pages
ISBN: 9781450369732
DOI: 10.1145/3357223

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 November 2019


Author Tags

  1. Hybrid Storage
  2. SSD Queue Management
  3. SSD Write Redirect
  4. Write-dominated Workload

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • NSFC
  • Creative Research Group Project of NSFC
  • National key research and development program of China
  • Fundamental Research Funds for the Central Universities
  • US NSF under Grant

Conference

SoCC '19: ACM Symposium on Cloud Computing
November 20-23, 2019
Santa Cruz, CA, USA

Acceptance Rates

SoCC '19 paper acceptance rate: 39 of 157 submissions (25%).
Overall acceptance rate: 169 of 722 submissions (23%).


Article Metrics

  • Downloads (last 12 months): 20
  • Downloads (last 6 weeks): 2

Reflects downloads up to 13 Jan 2025.

Cited By

  • (2024) RomeFS: A CXL-SSD Aware File System Exploiting Synergy of Memory-Block Dual Paths. Proceedings of the 2024 ACM Symposium on Cloud Computing, 720-736. https://doi.org/10.1145/3698038.3698539
  • (2024) Performance Characterization of SmartNIC NVMe-over-Fabrics Target Offloading. Proceedings of the 17th ACM International Systems and Storage Conference, 14-24. https://doi.org/10.1145/3688351.3689154
  • (2024) SIndex: An SSD-based Large-scale Indexing with Deterministic Latency for Cloud Block Storage. Proceedings of the 53rd International Conference on Parallel Processing, 1237-1246. https://doi.org/10.1145/3673038.3673041
  • (2024) Explorations and Exploitation for Parity-based RAIDs with Ultra-fast SSDs. ACM Transactions on Storage 20, 1, 1-32. https://doi.org/10.1145/3627992
  • (2024) Highly VM-Scalable SSD in Cloud Storage Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43, 1, 113-126. https://doi.org/10.1109/TCAD.2023.3305573
  • (2024) An Efficient Deep Reinforcement Learning-Based Automatic Cache Replacement Policy in Cloud Block Storage Systems. IEEE Transactions on Computers 73, 1, 164-177. https://doi.org/10.1109/TC.2023.3325625
  • (2023) DiffForward: On Balancing Forwarding Traffic for Modern Cloud Block Services via Differentiated Forwarding. Proceedings of the ACM on Measurement and Analysis of Computing Systems 7, 1, 1-26. https://doi.org/10.1145/3579444
  • (2023) An In-depth Comparative Analysis of Cloud Block Storage Workloads: Findings and Implications. ACM Transactions on Storage 19, 2, 1-32. https://doi.org/10.1145/3572779
  • (2023) Characterization of I/O Behaviors in Cloud Storage Workloads. IEEE Transactions on Computers 72, 10, 2726-2739. https://doi.org/10.1109/TC.2023.3263726
  • (2023) Fair Will Go On: A Collaboration-Aware Fairness Scheme for NVMe SSD in Cloud Storage System. 2023 60th ACM/IEEE Design Automation Conference (DAC), 1-6. https://doi.org/10.1109/DAC56929.2023.10247718
