DOI: 10.1145/2792745.2792749
research-article

NCBI-BLAST programs optimization on XSEDE resources for sustainable aquaculture

Published: 26 July 2015

Abstract

The development of genomic resources for non-model organisms is becoming commonplace as the cost of sequencing continues to decrease. The Genome Informatics Facility, in collaboration with the Southwest Fisheries Science Center (SWFSC), NOAA, is creating these resources for sustainable aquaculture in Seriola lalandi. Gene prediction and annotation, common steps in the pipeline for generating genomic resources, are computationally intensive and time-consuming. While creating genomic resources for Seriola lalandi, we found BLAST to be one of our most rate-limiting steps. Therefore, we took advantage of our XSEDE Extended Collaborative Support Services (ECSS) to reduce the amount of time required to process our transcriptome data by 300 percent. In this paper, we describe an optimized method for running the BLAST tool on the Stampede cluster, which works with any existing dataset or database without modification. At modest core counts, our results are similar to those of the MPI-enabled BLAST algorithm (mpiBLAST), and our method also allows the much-needed, improved flexibility of output formats that the latest versions of BLAST provide. Reducing this time-consuming bottleneck in BLAST will be broadly applicable to the annotation of large sequencing datasets for any organism.


Cited By

  • (2018) Massively Parallel Implementation of Sequence Alignment with Basic Local Alignment Search Tool Using Parallel Computing in Java Library. Journal of Computational Biology 25:8 (871-881). DOI: 10.1089/cmb.2018.0079. Online publication date: Aug-2018
  • (2016) The XSEDE BLAST Gateway. Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale (1-8). DOI: 10.1145/2949550.2949653. Online publication date: 17-Jul-2016

    Published In

    XSEDE '15: Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure
    July 2015
    296 pages
    ISBN:9781450337205
    DOI:10.1145/2792745
    © 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Sponsors

    • San Diego Super Computing Ctr
    • HPCWire
    • Omnibond: Omnibond Systems, LLC
    • SGI
    • Internet2
    • Indiana University
    • CASC: The Coalition for Academic Scientific Computation
    • NICS: National Institute for Computational Sciences
    • Intel
    • DDN: DataDirect Networks, Inc
    • DELL
    • CORSA: CORSA Technology
    • ALLINEA: Allinea Software
    • Cray
    • RENCI: Renaissance Computing Institute

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. NCBI-BLAST
    2. mpi-BLAST
    3. optimization
    4. stampede

    Acceptance Rates

    XSEDE '15 Paper Acceptance Rate 49 of 70 submissions, 70%;
    Overall Acceptance Rate 129 of 190 submissions, 68%
