Abstract
Computationally inferring the identities of haplotypes and their relative frequencies from pooled samples, whether genotyped or sequenced across the whole genome or in segments (e.g., using next-generation sequencing), is useful for population genetic analysis. To carry out such analyses, one needs to understand the basics of using high-performance computing (HPC) facilities and the specifics of the corresponding computational tools. Here, we describe the basic knowledge and step-by-step usage of a number of tools for haplotype inference from genotyping or next-generation sequencing data.
Acknowledgment
We are grateful for communications with Dr. Yaning Yang on PoooL and with Dr. Darren Kessner on HARP. This work was partially supported by a start-up grant from the University of Calgary and by NIH grants (HG008451 and AG046170).
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Long, Q. (2017). Computational Haplotype Inference from Pooled Samples. In: Tiemann-Boege, I., Betancourt, A. (eds) Haplotyping. Methods in Molecular Biology, vol 1551. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6750-6_15
DOI: https://doi.org/10.1007/978-1-4939-6750-6_15
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6748-3
Online ISBN: 978-1-4939-6750-6
eBook Packages: Springer Protocols