ROS: A Rack-based Optical Storage System with Inline Accessibility for Long-Term Data Preservation

Published: 09 November 2018

    Abstract

    The explosive growth of digital data, combined with the demand to preserve much of it for the long term, has made it imperative to find a way to store massive amounts of data that is more cost-effective than HDD arrays and more easily accessible than tape libraries. While modern optical discs can guarantee more than 50 years of data preservation without media replacement, the lower performance and capacity of individual optical discs relative to HDDs or tapes have significantly limited their use in datacenters. This article presents a Rack-scale Optical disc library System, or ROS for short, which provides PB-level total capacity and inline accessibility on thousands of optical discs housed within a 42U rack. A rotatable roller and a robotic arm that separates and fetches discs are designed to improve disc placement density and simplify the mechanical structure. A hierarchical storage system based on SSDs, hard disks, and optical discs is proposed to effectively hide the delay of mechanical operations. In addition, an optical library file system (OLFS) based on FUSE is proposed to schedule mechanical operations and organize data on the tiered storage behind a POSIX user interface, providing an illusion of inline data accessibility. We further optimize OLFS by reducing unnecessary user/kernel context switches inherited from the legacy FUSE framework. We evaluate ROS on a few key performance metrics, including operation delays of the mechanical structure and software overhead, in a prototype PB-level ROS system. The results show that ROS, stacked on Samba and FUSE in network-attached storage (NAS) mode, almost saturates the throughput provided by the underlying Samba over a 10GbE network for external users, and in this scenario provides about 53 ms file write latency and 15 ms file read latency, demonstrating its inline accessibility. Moreover, ROS effectively hides and virtualizes its complex internal operational behaviors and is easily deployable in datacenters.
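To make the idea of hiding mechanical delay behind faster tiers concrete, the following is a minimal sketch of a three-tier read path. It is not the OLFS implementation: the class name `TieredStore`, its methods, and the per-tier cost figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical per-tier access costs in milliseconds; illustrative only,
# not measurements from the article.
TIER_COST_MS = {"ssd": 0.1, "hdd": 10.0, "optical": 60_000.0}
TIER_ORDER = ("ssd", "hdd", "optical")  # fastest to slowest


@dataclass
class TieredStore:
    """Toy three-tier store: reads try the fastest tier first."""

    tiers: dict = field(
        default_factory=lambda: {name: {} for name in TIER_ORDER}
    )

    def write(self, path: str, data: bytes) -> str:
        # New data lands on the fastest tier; background migration to
        # slower tiers is not modelled in this sketch.
        self.tiers["ssd"][path] = data
        return "ssd"

    def demote(self, path: str, src: str, dst: str) -> None:
        # Move a file from tier `src` to tier `dst` (assumed policy hook).
        self.tiers[dst][path] = self.tiers[src].pop(path)

    def read(self, path: str):
        # Walk the tiers from fastest to slowest and return the data,
        # the tier that served it, and the assumed access cost.
        for name in TIER_ORDER:
            if path in self.tiers[name]:
                data = self.tiers[name][path]
                if name != "ssd":
                    # Cache a copy on the fast tier so later reads of the
                    # same file avoid the slow (mechanical) path.
                    self.tiers["ssd"][path] = data
                return data, name, TIER_COST_MS[name]
        raise FileNotFoundError(path)
```

In this toy model, the first read of a file that lives only on the optical tier pays the slow-path cost, while any repeat read is served from the SSD tier. A real system would also need eviction, write-back scheduling, and disc-level placement, none of which are shown here.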

      Published In

      ACM Transactions on Storage, Volume 14, Issue 3
      Special Issue on FAST 2018 and Regular Papers
      August 2018
      210 pages
      ISSN: 1553-3077
      EISSN: 1553-3093
      DOI: 10.1145/3282875
      Editor: Sam H. Noh
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 09 November 2018
      Accepted: 01 June 2018
      Revised: 01 April 2018
      Received: 01 October 2017
      Published in TOS Volume 14, Issue 3

      Author Tags

      1. Archive storage
      2. file system
      3. hierarchical storage
      4. optical disc
      5. storage management

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Basic Research Program of China
      • Wuhan National Laboratory for Optoelectronics Fund
      • US NSF

      Cited By

      • (2023) Towards Migration-Free "Just-in-Case" Data Archival for Future Cloud Data Lakes Using Synthetic DNA. Proceedings of the VLDB Endowment 16:8 (1923-1929). DOI: 10.14778/3594512.3594522. Online publication date: 1-Apr-2023.
      • (2021) Coupling Right-Provisioned Cold Storage Data Centers with Deduplication. Proceedings of the 50th International Conference on Parallel Processing (1-11). DOI: 10.1145/3472456.3472485. Online publication date: 9-Aug-2021.
      • (2019) LT-TCO: A TCO Calculation Model of Data Centers for Long-Term Data Preservation. 2019 IEEE International Conference on Networking, Architecture and Storage (NAS) (1-8). DOI: 10.1109/NAS.2019.8834714. Online publication date: Aug-2019.
      • (2019) Defuse: Decoupling Metadata and Data Processing in FUSE Framework for Performance Improvement. IEEE Access 7 (138473-138484). DOI: 10.1109/ACCESS.2019.2942954. Online publication date: 2019.