Abstract
Genotype imputation is widely used in the analysis of genome-wide association studies (GWAS) for two reasons. The first is to increase the power of association studies when causal SNPs are not collected in the GWAS. The second is to aid the interpretation of a GWAS result by predicting the association statistics at untyped variants. In this paper, we show that predicting association statistics at untyped variants that influence the trait produces overly conservative results. Current imputation methods assume that none of the variants in a region (a locus consisting of multiple variants) affects the trait, an assumption that is often inconsistent with the observed data. We propose a new method, CAUSAL-Imp, which imputes the association statistics at untyped variants while taking into account variants in the region that may affect the trait. Our method builds on recent methods that impute marginal GWAS statistics by exploiting the fact that these statistics follow a multivariate normal distribution. We use both simulated and real data sets to assess the performance of our method. We show that traditional imputation approaches underestimate the association statistics for variants involved in the trait, and that our approach provides less biased estimates of these statistics.
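The underestimation described above can be illustrated with a minimal sketch. It is not the CAUSAL-Imp method itself: it only shows the zero-mean conditional-mean imputation that null-based approaches rely on, applied to a toy region in which the untyped variant is causal. The LD matrix, the non-centrality parameter `ncp`, and the function name `impute_z_null` are all hypothetical. The sketch also assumes the standard result that the expected marginal statistic at a typed variant equals its correlation with the causal variant times the causal non-centrality parameter.

```python
import numpy as np


def impute_z_null(sigma_tt, sigma_ut, z_typed, ridge=0.0):
    """Conditional mean of untyped z-scores given typed z-scores,
    assuming a zero-mean multivariate normal (no variant affects the trait).

    sigma_tt : (k, k) LD matrix among typed variants
    sigma_ut : (m, k) LD between untyped and typed variants
    z_typed  : (k,)   observed z-scores at typed variants
    ridge    : optional diagonal regularisation (hypothetical parameter)
    """
    k = sigma_tt.shape[0]
    # weights = sigma_ut @ inv(sigma_tt + ridge * I), computed via a linear solve
    weights = np.linalg.solve(sigma_tt + ridge * np.eye(k), sigma_ut.T).T
    return weights @ z_typed


# Toy LD (correlation) matrix for three variants; values are made up.
ld = np.array([
    [1.0, 0.3, 0.6],
    [0.3, 1.0, 0.5],
    [0.6, 0.5, 1.0],
])
typed = [0, 1]    # variants genotyped in the study
untyped = [2]     # variant to impute; assumed here to be the causal one

ncp = 5.0         # assumed expected z-score at the causal variant

# Expected z-scores at the typed variants: LD with the causal variant times ncp.
z_typed_expected = ld[np.ix_(typed, untyped)][:, 0] * ncp

imputed = impute_z_null(
    ld[np.ix_(typed, typed)],
    ld[np.ix_(untyped, typed)],
    z_typed_expected,
)

print("expected z at causal variant:", ncp)
print("null-based imputed z:        ", imputed[0])
```

In this toy setting the imputed value is the true non-centrality parameter scaled by the fraction of the untyped variant's variance explained by the typed variants, which cannot exceed 1. The null-based estimate is therefore shrunk toward zero whenever the untyped variant is not perfectly tagged.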
Y. Wu and F. Hormozdiari contributed equally to this work.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wu, Y., Hormozdiari, F., Joo, J.W.J., Eskin, E. (2017). Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies. In: Sahinalp, S. (eds) Research in Computational Molecular Biology. RECOMB 2017. Lecture Notes in Computer Science, vol 10229. Springer, Cham. https://doi.org/10.1007/978-3-319-56970-3_19
DOI: https://doi.org/10.1007/978-3-319-56970-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56969-7
Online ISBN: 978-3-319-56970-3
eBook Packages: Computer Science, Computer Science (R0)