DOI:10.1145/2592798.2592816
research-article

Archiving cold data in warehouses with clustered network coding

Published: 14 April 2014

    Abstract

    Modern storage systems now typically combine plain replication and erasure codes to reliably store large amounts of data in datacenters. Plain replication allows fast access to popular data, while erasure codes, e.g., Reed-Solomon codes, provide a storage-efficient alternative for archiving less popular data. Although erasure codes are increasingly employed in real systems, they incur high overhead during maintenance: upon a failure, files typically must be decoded and then encoded again to repair the encoded blocks stored at the faulty node.
    In this paper, we propose a novel erasure code system tailored for networked archival systems. The efficiency of our approach relies on the joint use of random codes and a clustered placement strategy. Our repair protocol leverages network coding techniques to reduce the amount of data transferred during maintenance by 50%, by repairing several cluster files simultaneously. We demonstrate, through both an analysis and an extensive experimental study conducted on a public testbed, that our approach significantly decreases both the bandwidth overhead during the maintenance process and the time to repair lost data. We also show that using a non-systematic code does not impact the throughput and comes only at the price of higher CPU usage. Based on these results, we evaluate the impact of this higher CPU consumption on different configurations of data coldness by determining whether the cluster's network bandwidth dedicated to repair or the CPU dedicated to decoding saturates first.
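
To make the ideas of a random, non-systematic code and of repair without decoding more concrete, the following is a minimal sketch of generic random linear coding. It is not the paper's clustered repair protocol and does not reproduce its 50% bandwidth reduction. The prime field size `P` and the function names `combine`, `encode`, `recode`, and `decode` are assumptions chosen for illustration only.

```python
import random

# Assumption: arithmetic over the prime field GF(65537) for simplicity.
# Practical systems more commonly use binary extension fields such as GF(2^8).
P = 65537

def combine(vectors, weights):
    """Return the element-wise sum of weights[i] * vectors[i] over GF(P)."""
    return [sum(w * v[j] for w, v in zip(weights, vectors)) % P
            for j in range(len(vectors[0]))]

def encode(data, n, rng):
    """Encode k data blocks into n coded blocks (coefficients, payload).

    Every coded block is a random linear combination of all data blocks,
    so the code is non-systematic: no block stores plain data.
    """
    k = len(data)
    coded = []
    for _ in range(n):
        coeffs = [rng.randrange(P) for _ in range(k)]
        coded.append((coeffs, combine(data, coeffs)))
    return coded

def recode(blocks, rng):
    """Build a fresh coded block from existing coded blocks, without decoding."""
    weights = [rng.randrange(1, P) for _ in blocks]
    coeffs = combine([c for c, _ in blocks], weights)
    payload = combine([p for _, p in blocks], weights)
    return coeffs, payload

def decode(blocks):
    """Recover the k data blocks from k coded blocks (Gauss-Jordan over GF(P))."""
    k = len(blocks)
    rows = [list(c) + list(p) for c, p in blocks]  # augmented matrix [C | payload]
    for col in range(k):
        pivot = next((r for r in range(col, k) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular; pick other blocks")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse, Python 3.8+
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [row[k:] for row in rows]
```

With k = 3 data blocks encoded into n = 6 coded blocks, `decode` applied to any three coded blocks should return the original data whenever their coefficient vectors are linearly independent, which holds with high probability for random coefficients over a large field. A block produced by `recode` from surviving blocks can stand in for a lost one in the same way, which is the property that lets a repair avoid the decode-then-encode cycle described above.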

    Published In

    EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems
    April 2014
    388 pages
    ISBN:9781450327046
    DOI:10.1145/2592798
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cold data
    2. distributed storage
    3. erasure codes
    4. maintenance

    Qualifiers

    • Research-article

    Funding Sources

    • ODISEA

    Conference

    EuroSys 2014: Ninth EuroSys Conference 2014
    April 14 - 16, 2014
    Amsterdam, The Netherlands

    Acceptance Rates

    EuroSys '14 Paper Acceptance Rate 27 of 147 submissions, 18%
    Overall Acceptance Rate 241 of 1,308 submissions, 18%

    Article Metrics

    • Downloads (Last 12 months): 12
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 10 Aug 2024

    Cited By

    • (2022) Scalable local reconstruction code design for hot data reads in cloud storage systems. Science China Information Sciences 65:12. DOI: 10.1007/s11432-021-3421-6. Online publication date: 22-Nov-2022
    • (2021) Repair Pipelining for Erasure-coded Storage: Algorithms and Evaluation. ACM Transactions on Storage 17:2 (1-29). DOI: 10.1145/3436890. Online publication date: 28-May-2021
    • (2021) Design and Evaluation of a Risk-Aware Failure Identification Scheme for Improved RAS in Erasure-Coded Data Centers. IEEE Transactions on Parallel and Distributed Systems 32:1 (16-30). DOI: 10.1109/TPDS.2020.3010048. Online publication date: 1-Jan-2021
    • (2021) Unequal Failure Protection Coding Technique for Distributed Cloud Storage Systems. IEEE Transactions on Cloud Computing 9:1 (386-400). DOI: 10.1109/TCC.2017.2785396. Online publication date: 1-Jan-2021
    • (2021) F-Write: Fast RDMA-supported Writes in Erasure-coded In-memory Clusters. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (817-826). DOI: 10.1109/IPDPS49936.2021.00091. Online publication date: May-2021
    • (2019) Popularity-Aware Multi-Failure Resilient and Cost-Effective Replication for High Data Durability in Cloud Storage. IEEE Transactions on Parallel and Distributed Systems 30:10 (2355-2369). DOI: 10.1109/TPDS.2018.2873384. Online publication date: 1-Oct-2019
    • (2017) Which Achieves Lower Latency with Redundant Requests, Replication or Coding? GLOBECOM 2017 - 2017 IEEE Global Communications Conference (1-6). DOI: 10.1109/GLOCOM.2017.8254986. Online publication date: Dec-2017
    • (2017) Latency Analysis of Flexible Redundant Scheme in MDS-Coded Distributed Storage Systems. GLOBECOM 2017 - 2017 IEEE Global Communications Conference (1-6). DOI: 10.1109/GLOCOM.2017.8254038. Online publication date: Dec-2017
    • (2016) A Low-Cost Multi-failure Resilient Replication Scheme for High Data Availability in Cloud Storage. 2016 IEEE 23rd International Conference on High Performance Computing (HiPC) (242-251). DOI: 10.1109/HiPC.2016.036. Online publication date: Dec-2016
    • (2016) Unequal Failure Protection Coding Technology for Cloud Storage Systems. 2016 IEEE International Conference on Cluster Computing (CLUSTER) (231-240). DOI: 10.1109/CLUSTER.2016.16. Online publication date: Sep-2016
