Lazy Means Smart: Reducing Repair Bandwidth Costs in Erasure-coded Distributed Storage

Published: 30 June 2014
DOI: 10.1145/2611354.2611370

Abstract

Erasure coding schemes provide higher durability at lower storage cost, and thus constitute an attractive alternative to replication in distributed storage systems, in particular for storing rarely accessed "cold" data. These schemes, however, require an order of magnitude more recovery bandwidth to maintain a constant level of durability in the face of node failures. In this paper we propose lazy recovery, a technique that reduces recovery bandwidth demands to the level of replicated storage. The key insight is that a careful adjustment of the recovery rate substantially reduces recovery bandwidth, while keeping the impact on read performance and data durability low. We demonstrate the benefits of lazy recovery via extensive simulation using a realistic distributed storage configuration and published component failure parameters. For example, when applied to the commonly used RS(14, 10) code, lazy recovery reduces repair bandwidth by up to 76%, bringing it below even that of replication, while increasing the fraction of degraded stripes by only 0.1 percentage points. Lazy recovery works well with a variety of erasure coding schemes, including the recently introduced bandwidth-efficient codes, achieving up to a factor of two in additional bandwidth savings.
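
To make the bandwidth argument concrete, below is a minimal sketch of a lazy repair policy for an RS(n, k) code, assuming the RS(14, 10) parameters mentioned above. The `Stripe` structure, the `REPAIR_THRESHOLD` knob, and the block-read accounting are illustrative assumptions, not the paper's actual policy or simulator; the sketch only shows where the savings come from: one k-block read can reconstruct a whole batch of lost blocks, instead of paying k reads for every single lost block.

```python
# Sketch of lazy repair for an RS(n, k) erasure code (illustrative only).
# With RS(14, 10), each stripe has k = 10 data blocks and n - k = 4 parity
# blocks; any lost block can be rebuilt by reading k surviving blocks.
# Eager repair rebuilds after every failure, paying k reads per lost block.
# Lazy repair waits until a stripe has lost REPAIR_THRESHOLD blocks, then
# rebuilds the whole batch from a single k-block read.

from dataclasses import dataclass, field

N, K = 14, 10            # RS(14, 10), as in the abstract
REPAIR_THRESHOLD = 3     # hypothetical laziness knob: repair at 3 lost blocks

@dataclass
class Stripe:
    stripe_id: int
    lost_blocks: set = field(default_factory=set)

    def durable(self) -> bool:
        # Data survives while at least k of the n blocks remain readable.
        return len(self.lost_blocks) <= N - K

def on_block_failure(stripe: Stripe, block: int) -> int:
    """Record a failed block; return repair bandwidth spent, in block reads."""
    stripe.lost_blocks.add(block)
    if len(stripe.lost_blocks) < REPAIR_THRESHOLD and stripe.durable():
        return 0  # stay lazy: the stripe is degraded but still safe
    # Repair the batch: one k-block read reconstructs every lost block.
    reads, rebuilt = K, len(stripe.lost_blocks)
    stripe.lost_blocks.clear()
    print(f"stripe {stripe.stripe_id}: rebuilt {rebuilt} blocks with {reads} reads")
    return reads
```

Under these assumptions, eagerly repairing three failures costs 3 × 10 = 30 block reads, while the lazy batch costs 10, a roughly threefold reduction of the same flavor as the savings reported above. The price is that stripes spend more time degraded, which is the trade-off the abstract quantifies as a 0.1 percentage point increase in degraded stripes.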

Published In

SYSTOR 2014: Proceedings of International Conference on Systems and Storage
June 2014, 168 pages
ISBN: 9781450329200
DOI: 10.1145/2611354

In-Cooperation

• Technion: Israel Institute of Technology
• USENIX Association

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Distributed storage systems
2. Erasure codes
3. Repair bandwidth

Qualifiers

• Tutorial
• Research
• Refereed limited

Conference

SYSTOR 2014

Acceptance Rates

Overall Acceptance Rate: 94 of 285 submissions, 33%

