DOI: 10.1145/2335755.2335805
research-article

The data supercell

Published: 16 July 2012

Abstract

The Data SuperCell (DSC) is a new, disk-based data archive deployed and in production at the Pittsburgh Supercomputing Center (PSC). It addresses the archival demands of large-scale data processing in an economical way. DSC combines PSC's SLASH2 layered file system technology with commodity hardware and open software to provide superior functionality, flexibility, manageability, reliability, performance and cost. Below, we describe DSC functionality goals; the SLASH2 architecture, its capabilities and its suitability for archival applications; ZFS as an underlying file system; and the DSC architecture, structure and capabilities. We then discuss our experience with DSC, present some performance measurements and outline plans for further development.

Published In

XSEDE '12: Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
July 2012
423 pages
ISBN:9781450316026
DOI:10.1145/2335755
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FUSE
  2. NARA
  3. SLASH2
  4. XSEDE
  5. multi-resident file system
  6. overlay file system
  7. replicated file system
  8. storage cloud

Qualifiers

  • Research-article

Conference

XSEDE12

Acceptance Rates

Overall Acceptance Rate 129 of 190 submissions, 68%

Article Metrics

  • Downloads (Last 12 months): 2
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 12 Nov 2024

Cited By

  • (2017) Demonstrating Distributed Workflow Computing with a Federating Wide-Area File System. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact (1-7). DOI: 10.1145/3093338.3093389. Online publication date: 9-Jul-2017
  • (2017) Optimizing High Performance Big Data Cancer Workflows. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact (1-4). DOI: 10.1145/3093338.3093372. Online publication date: 9-Jul-2017
  • (2016) TCGA Expedition: A Data Acquisition and Management System for TCGA Data. PLOS ONE 11:10 (e0165395). DOI: 10.1371/journal.pone.0165395. Online publication date: 27-Oct-2016
  • (2015) HPC in Weather Forecast. International Journal of Cloud Applications and Computing 5:1 (14-31). DOI: 10.4018/ijcac.2015010102. Online publication date: 1-Jan-2015
  • (2015) Needs Assessment for Research Use of High-Throughput Sequencing at a Large Academic Medical Center. PLOS ONE 10:6 (e0131166). DOI: 10.1371/journal.pone.0131166. Online publication date: 26-Jun-2015
  • (2015) Bridges. Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure (1-8). DOI: 10.1145/2792745.2792775. Online publication date: 26-Jul-2015
  • (2013) A sustainable national gateway for biological computation. Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery (1-3). DOI: 10.1145/2484762.2484817. Online publication date: 22-Jul-2013
