
Failure correction techniques for large disk arrays

Published: 01 April 1989

Abstract

    The ever-increasing need for I/O bandwidth will be met with ever larger arrays of disks. These arrays require redundancy to protect against data loss. This paper examines alternative choices for encodings, or codes, that reliably store information in disk arrays. Codes are selected to maximize mean time to data loss or to minimize the number of disks containing redundant data, but all are constrained to minimize the performance penalties associated with updating information or recovering from catastrophic disk failures. We also exhibit codes that give highly reliable data storage with low redundant data overhead for arrays of 1000 information disks.
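
    As a rough illustration of what such a code does, the sketch below shows the simplest familiar case: a single parity disk holding the bytewise XOR of the information disks, which allows the contents of any one failed disk to be rebuilt. This is a minimal sketch for orientation only, not one of the codes evaluated in the paper; the function names, block contents, and three-disk layout are assumptions made for the example.

```python
from functools import reduce

def parity_block(blocks):
    """XOR equal-length blocks byte by byte to form one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def recover_block(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    # XOR-ing the parity with every surviving block cancels their
    # contributions and leaves exactly the missing block.
    return parity_block(list(surviving_blocks) + [parity])

# Three hypothetical information disks, each holding one 4-byte block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = parity_block(data)

# Simulate losing the second disk and reconstruct its contents.
survivors = [data[0], data[2]]
rebuilt = recover_block(survivors, parity)
```

    In this sketch every update to an information block also requires an update to the parity block, which is the kind of update penalty the abstract refers to, and a second simultaneous failure would lose data.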


    Published In

    ACM SIGARCH Computer Architecture News, Volume 17, Issue 2
    Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems
    April 1989
    291 pages
    ISSN: 0163-5964
    DOI: 10.1145/68182

    ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
    April 1989
    303 pages
    ISBN: 0897913000
    DOI: 10.1145/70082

    Publisher

    Association for Computing Machinery

    New York, NY, United States
