DOI: 10.1145/2484762.2484803

Making campus bridging work for researchers: a case study with mlRho

Published: 22 July 2013

Abstract

The computational demands of a growing number of biologists have outgrown the capacity of desktop workstations, and these researchers are turning to supercomputers to run their simulations and calculations. Many of today's computational problems, however, require larger resource commitments than even individual universities can provide. XSEDE is one of the first places researchers turn to when they outgrow their campus resources. XSEDE machines are at least an order of magnitude larger than what most universities offer. Transitioning from a campus resource to an XSEDE resource is seldom trivial. XSEDE has taken many steps to make this easier, including the Campus Bridging initiative, the Campus Champions program, the Extended Collaborative Support Service (ECSS) [1] program, and education and outreach.
In this paper, our team of biologists and application support analysts (including a Campus Champion) dissects a computationally intensive biology project and shares the insights gained, to help strengthen the programs mentioned above. We worked on a project to calculate the population mutation and recombination rates of tens of genome profiles using mlRho [2], a serial, open-source genome analysis code. For the initial investigation, we estimated that we would need 6.3 million service units (SUs) on the Ranger system. Three of the most important areas in which the biologists needed help in transitioning to XSEDE were (i) preparing the proposal for 6.3 million SUs on XSEDE, (ii) scaling up the existing workflow to hundreds of cores, and (iii) performance optimization. The Campus Bridging initiative makes all of these tasks easier by providing tools and a consistent software stack across centers.
Ideally, Campus Champions can provide support on (i), (ii), and (iii), while ECSS staff can assist with (ii) and (iii). However, these tasks are often not part of a Campus Champion's regular job description. For someone writing an XSEDE proposal for the first time, a link to the guidelines and a few pointers may not be enough for a successful application. In this paper, we describe a new role that a campus bridging expert can play in closing the gaps between existing programs, and we present mlRho as a case study.
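
Task (ii), scaling a serial code to many cores, is commonly handled as a task farm: each genome profile becomes an independent serial run, and a pool of workers executes the runs concurrently. The following is a minimal single-node sketch of that pattern in Python, not the workflow used in the paper (the author tags and references point to the BigJob pilot-job framework for that). The helper names, the profile names, and the placeholder command are all hypothetical; the real mlRho command line is not given in the abstract and is not reproduced here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_task(command):
    """Run one independent serial task; return (exit code, stripped stdout)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()


def run_task_farm(commands, max_workers=4):
    """Run independent serial commands concurrently, preserving input order."""
    # Threads are sufficient here: each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, commands))


def placeholder_command(profile_name):
    """Hypothetical stand-in for the real per-profile analysis command."""
    return [sys.executable, "-c", f"print('analysed {profile_name}')"]


if __name__ == "__main__":
    # Hypothetical profile names, one serial task per profile.
    profiles = ["profile_a", "profile_b", "profile_c"]
    commands = [placeholder_command(name) for name in profiles]
    for exit_code, output in run_task_farm(commands):
        print(exit_code, output)
```

On a production system, a pilot-job or batch-scheduler mechanism would take the place of the local thread pool, but the structure (a list of independent commands mapped onto a fixed number of workers) stays the same.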

References

[1]
XSEDE Extended Collaborative Support Service (ECSS). https://www.xsede.org/ecss, 2012.
[2]
Bernhard Haubold, Peter Pfaffelhuber, and Michael Lynch. mlRho - a program for estimating the population mutation and recombination rates from shotgun-sequenced diploid genomes. Molecular Ecology, 19:277--284, 2010.
[3]
NSF Advisory Committee for Cyberinfrastructure Task Force on Campus Bridging. Final Report. Technical report, March 2011.
[4]
Craig A. Stewart, Richard Knepper, James Ferguson, Felix Bachmann, Ian Foster, Andrew Grimshaw, Victor Hazlewood, and David Lifka. What is campus bridging and what is XSEDE doing about it? In Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond, XSEDE '12, pages 47:1--47:8, Chicago, Illinois, 2012. ACM.
[5]
SAGA BigJob. https://github.com/saga-project/BigJob, 2012.
[6]
Sarah P. Otto and Thomas Lenormand. Resolving the paradox of sex and recombination. Nat Rev Genet, 3(4):252--261, 04 2002.
[7]
Michael P. H. Stumpf and Gilean A. T. McVean. Estimating recombination rates from population-genetic data. Nat Rev Genet, 4(12):959--968, 12 2003.
[8]
Montgomery Slatkin. Linkage disequilibrium -- understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics, 9:477--485, 2008.
[9]
Michael Lynch. Estimation of nucleotide diversity, disequilibrium coefficients, and mutation rates from high-coverage genome-sequencing projects. Molecular Biology and Evolution, 25(11):2409--2419, 2008.
[10]
Michael Lynch, Louis-Marie Bobay, Francesco Catania, Jean-François Gout, and Mina Rho. The Repatterning of Eukaryotic Genomes by Random Genetic Drift. Annual Review of Genomics and Human Genetics, 12(1):347--366, 2011.
[11]
A. Luckow, L. Lacinski, and S. Jha. SAGA BigJob: An Extensible and Interoperable Pilot-Job Abstraction for Distributed Applications and Systems. In The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 135--144, 2010.
[12]
A. Luckow, S. Jha, J. Kim, A. Merzky, and B. Schnor. Adaptive Replica-Exchange Simulations. Royal Society Philosophical Transactions A, 2009.
[13]
Abhinav Thota, Andre Luckow, and Shantenu Jha. Efficient large-scale replica-exchange simulations on production infrastructure. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 369(1949):3318--3335, 2011.
[14]
H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics, 25:2078--2079, 2009.
[15]
Intel Xeon Phi Coprocessor System Software Developers Guide. http://software.intel.com/sites/default/files/article/334766/intel-xeon-phi-systemsoftwaredevelopersguide.pdf, 2012.

Cited By

  • (2018) Making campus bridging work for researchers. Concurrency and Computation: Practice & Experience, 26:13 (2141-2148). DOI: 10.1002/cpe.3266. Online publication date: 29-Dec-2018

Published In

XSEDE '13: Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery
July 2013
433 pages
ISBN:9781450321709
DOI:10.1145/2484762
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BigJob
  2. XSEDE
  3. genetics
  4. high-throughput
  5. mlRho
  6. optimization
  7. performance tuning
  8. pilot-job

Qualifiers

  • Research-article

Conference

XSEDE '13

Acceptance Rates

Overall Acceptance Rate 129 of 190 submissions, 68%
