DOI:10.1145/1362622.1362682
research-article

RobuSTore: a distributed storage architecture with robust and high performance

Published: 10 November 2007

Abstract

Emerging large-scale scientific applications require access to large data objects with high and robust performance. We propose RobuSTore, a storage architecture that combines erasure codes and speculative access mechanisms for parallel writes and reads in distributed environments. These mechanisms effectively aggregate the bandwidth of a large number of distributed disks and statistically tolerate per-disk performance variation. Our simulation results confirm that RobuSTore delivers high and robust performance for both write and read operations compared to traditional parallel storage systems. For example, for a 1 GB data access using 64 disks, RobuSTore achieves an average bandwidth of 186 MBps for write and 400 MBps for read, nearly 6x and 15x that achieved by a RAID-0 system. The standard deviation of access latency is only 0.5 seconds, about 9% of the write latency and 20% of the read latency, and a 5-fold improvement over RAID-0. These improvements come at moderate cost: about a 40% increase in I/O operations and a 2x-3x increase in storage capacity utilization.
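The intuition behind tolerating per-disk performance variation can be illustrated with a toy model. The sketch below is not the RobuSTore simulator or its coding scheme; it only contrasts a striped read, which must wait for the slowest disk, with a redundant read that completes once any `needed` blocks have arrived. The function names, the exponential service-time model, and the choice of 32 blocks out of 64 are assumptions made for illustration, and the model ignores block-size differences and decoding cost.

```python
import random
import statistics


def striped_latency(disk_times):
    """Plain striping: every block is unique, so the read finishes
    only when the slowest disk has responded."""
    return max(disk_times)


def speculative_latency(disk_times, needed):
    """Redundant layout: any `needed` blocks are assumed sufficient,
    so the read finishes when the `needed`-th fastest disk responds."""
    return sorted(disk_times)[needed - 1]


def simulate(num_disks=64, needed=32, trials=2000, seed=0):
    """Return per-trial latencies for both strategies on the same draws."""
    rng = random.Random(seed)
    striped, speculative = [], []
    for _ in range(trials):
        # Hypothetical service time: 1 s base plus an exponential tail.
        times = [1.0 + rng.expovariate(1.0) for _ in range(num_disks)]
        striped.append(striped_latency(times))
        speculative.append(speculative_latency(times, needed))
    return striped, speculative


if __name__ == "__main__":
    striped, speculative = simulate()
    for name, sample in (("striped", striped), ("speculative", speculative)):
        print(f"{name:12s} mean={statistics.mean(sample):.2f}s "
              f"stdev={statistics.pstdev(sample):.2f}s")
```

Under these assumptions the redundant read never finishes later than the striped one on the same draw, and its latency spread is narrower because it depends on a middle order statistic instead of the maximum. The absolute numbers carry no meaning for the system described in the abstract.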

Published In

SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
November 2007
723 pages
ISBN:9781595937643
DOI:10.1145/1362622

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

SC '07 Paper Acceptance Rate 54 of 268 submissions, 20%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2022) Performance and time improvement of LT code-based cloud storage. EURASIP Journal on Wireless Communications and Networking 2022:1. DOI:10.1186/s13638-022-02136-0. Online publication date: 21-Jun-2022
  • (2022) Kua. Proceedings of the 9th ACM Conference on Information-Centric Networking (56-66). DOI:10.1145/3517212.3558083. Online publication date: 6-Sep-2022
  • (2019) Securing Cloud Data Under Key Exposure. IEEE Transactions on Cloud Computing 7:3 (838-849). DOI:10.1109/TCC.2017.2670559. Online publication date: 1-Jul-2019
  • (2018) Toward Shared Ownership in the Cloud. IEEE Transactions on Information Forensics and Security 13:12 (3019-3034). DOI:10.1109/TIFS.2018.2837648. Online publication date: 1-Dec-2018
  • (2017) Content Espresso: A Distributed Large File Sharing System for Digital Content Productions. IEICE Transactions on Information and Systems E100.D:9 (2100-2117). DOI:10.1587/transinf.2017EDP7048. Online publication date: 2017
  • (2017) PBSE. Proceedings of the 2017 Symposium on Cloud Computing (295-308). DOI:10.1145/3127479.3131622. Online publication date: 24-Sep-2017
  • (2017) Delay-Optimized File Retrieval under LT-Based Cloud Storage. IEEE Transactions on Cloud Computing 5:4 (656-666). DOI:10.1109/TCC.2015.2430347. Online publication date: 1-Oct-2017
  • (2017) Binary random systematic erasure code for RAID system (090007). DOI:10.1063/1.4977391. Online publication date: 2017
  • (2016) Multi-code Distributed Storage. 2016 IEEE 9th International Conference on Cloud Computing (CLOUD) (839-842). DOI:10.1109/CLOUD.2016.0119. Online publication date: Jun-2016
  • (2016) Toward high-performance key-value stores through GPU encoding and locality-aware encoding. Journal of Parallel and Distributed Computing 96:C (27-37). DOI:10.1016/j.jpdc.2016.04.015. Online publication date: 1-Oct-2016
