DOI: 10.1145/369133.369190

The greedy path-merging algorithm for sequence assembly

Published: 22 April 2001

    Abstract

    Two different approaches to determining the human genome are currently being pursued: one is the “clone-by-clone” approach, employed by the publicly funded Human Genome Project, and the other is the “whole genome shotgun” approach, favored by researchers at Celera Genomics. An interim strategy employed at Celera, called hierarchical assembly, makes use of preliminary data produced by both approaches. This paper introduces the Bactig Ordering Problem, a key problem that arises in this context, and presents an efficient heuristic, the greedy path-merging algorithm, that performs well on real data.


    Published In

    RECOMB '01: Proceedings of the fifth annual international conference on Computational biology
    April 2001
    316 pages
    ISBN:1581133537
    DOI:10.1145/369133
    Chairman: Thomas Lengauer

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    RECOMB01

    Acceptance Rates

    RECOMB '01 Paper Acceptance Rate 35 of 128 submissions, 27%;
    Overall Acceptance Rate 148 of 538 submissions, 28%

    Cited By

    • (2019) MetaCarvel: linking assembly graph motifs to biological variants. Genome Biology, 20:1. DOI: 10.1186/s13059-019-1791-3. Online publication date: 26-Aug-2019
    • (2015) The Theory and Practice of Genome Sequence Assembly. Annual Review of Genomics and Human Genetics, 16:1 (153-172). DOI: 10.1146/annurev-genom-090314-050032. Online publication date: 24-Aug-2015
    • (2011) Bambus 2. Bioinformatics, 27:21 (2964-2971). DOI: 10.1093/bioinformatics/btr520. Online publication date: 1-Nov-2011
    • (2009) Parametric Complexity of Sequence Assembly: Theory and Applications to Next Generation Sequencing. Journal of Computational Biology, 16:7 (897-908). DOI: 10.1089/cmb.2009.0005. Online publication date: Jul-2009
    • (2005) BACCardI - a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison. Bioinformatics, 21:7 (853-859). DOI: 10.1093/bioinformatics/bti091. Online publication date: 1-Apr-2005
    • (2004) Hierarchical Scaffolding With Bambus. Genome Research, 14:1 (149-159). DOI: 10.1101/gr.1536204. Online publication date: 5-Jan-2004
    • (2003) A Protein Sequence Prediction Method by Mining Sequence Data. The KIPS Transactions: Part D, 10D:2 (261-266). DOI: 10.3745/KIPSTD.2003.10D.2.261. Online publication date: 1-Apr-2003
    • (2003) The Restriction Scaffold Problem. Journal of Computational Biology, 10:3-4 (385-398). DOI: 10.1089/10665270360688084. Online publication date: Jun-2003
    • (2002) Genome Sequence Assembly. Computer, 35:7 (47-54). DOI: 10.1109/MC.2002.1016901. Online publication date: 1-Jul-2002
    • (2001) Visualization challenges for a new cyberpharmaceutical computing paradigm. Proceedings of the IEEE 2001 symposium on parallel and large-data visualization and graphics (7-18). DOI: 10.5555/502125.502127. Online publication date: 22-Oct-2001
