Abstract
With the recent flood of genome sequence data, there has been increasing interest in rare variants and in methods to detect their association with disease. Many of these methods are collapsing strategies that bin rare variants based on allele frequency and functional predictions, but to date most have been limited to candidate gene studies with a small number of genes. We propose a novel method that collapses rare variants by incorporating biological information from the public domain. This paper introduces the functionality of BioBin, a biologically informed method to collapse rare variants and detect associations with a particular phenotype. We tested BioBin using low coverage data from the 1000 Genomes Project and observed binning characteristics consistent with what one would expect given the size of the gene. We also tested BioBin using the pilot targeted exome data from the 1000 Genomes Project, where we used biologically informed binning and differences in minor allele frequencies to distinguish between two ancestral populations. Although BioBin is still in development, it will be a useful tool for analyzing sequence data and uncovering novel associations with complex disease.
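The collapsing idea summarized above can be illustrated with a minimal sketch. This is not BioBin's implementation: the function names, the dictionary keys (`gene`, `genotypes`), the use of a gene as the bin, and the frequency threshold are all assumptions chosen for illustration. Each variant whose minor allele frequency falls below the threshold is added to the bin of its annotated gene, and each bin records the number of rare minor alleles carried by every individual.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one variant.

    genotypes: minor-allele counts (0, 1, or 2) per diploid individual.
    """
    return sum(genotypes) / (2 * len(genotypes))


def collapse_rare_variants(variants, maf_threshold=0.05):
    """Collapse rare variants into per-gene bins (illustrative only).

    variants: list of dicts with hypothetical keys 'gene' and 'genotypes'.
    Returns {gene: [rare minor-allele count per individual]}.
    """
    bins = {}
    for variant in variants:
        maf = minor_allele_frequency(variant["genotypes"])
        # Skip monomorphic sites and variants that are not rare.
        if maf == 0 or maf >= maf_threshold:
            continue
        totals = bins.setdefault(variant["gene"], [0] * len(variant["genotypes"]))
        for i, count in enumerate(variant["genotypes"]):
            totals[i] += count
    return bins


# Toy data: 10 individuals, hypothetical genes "A" and "B".
example_variants = [
    {"gene": "A", "genotypes": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]},  # MAF 0.05
    {"gene": "A", "genotypes": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]},  # MAF 0.10
    {"gene": "A", "genotypes": [1, 1, 1, 1, 2, 0, 0, 0, 1, 1]},  # MAF 0.40, common
    {"gene": "B", "genotypes": [0, 0, 0, 2, 0, 0, 0, 0, 0, 0]},  # MAF 0.10
]
example_bins = collapse_rare_variants(example_variants, maf_threshold=0.2)
```

In this toy example the common variant in gene "A" is excluded, so the bin for "A" reflects only the two rare variants. Per-bin counts of this kind could then be compared between two groups, for instance cases and controls or two ancestral populations.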
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buchanan, C.C., Wallace, J.R., Frase, A.T., Torstenson, E.S., Pendergrass, S.A., Ritchie, M.D. (2012). A Biologically Informed Method for Detecting Associations with Rare Variants. In: Giacobini, M., Vanneschi, L., Bush, W.S. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2012. Lecture Notes in Computer Science, vol 7246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29066-4_18
DOI: https://doi.org/10.1007/978-3-642-29066-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29065-7
Online ISBN: 978-3-642-29066-4
eBook Packages: Computer Science, Computer Science (R0)