DOI: 10.1145/2983323.2983685

Efficient Algorithms for the Two Locus Problem in Genome-Wide Association Study

Published: 24 October 2016

Abstract

Advances in sequencing technology have resulted in the sequencing of thousands of genomes. Novel analysis tools are needed to process these data and extract useful information, and such tools could aid in personalized medicine. For example, we could identify the causes of a disease by comparing the genomes of people who have the disease with those of people who do not. Given that human variability arises from single nucleotide polymorphisms (SNPs), we can focus our attention on these SNPs. Investigations that try to understand human variability using SNPs fall under genome-wide association study (GWAS). A crucial step in GWAS is identifying the correlation between genotypes (SNPs) and phenotypes (i.e., characteristics such as the presence of a disease). This step can be modeled as the k-locus problem, where k is a positive integer. A number of algorithms have been proposed in the literature for the case k = 2. In this paper we present an algorithm for solving the 2-locus problem that is up to two orders of magnitude faster than the best previously known algorithms.
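
To make the 2-locus problem concrete, the following is a minimal sketch of the exhaustive baseline that faster algorithms aim to beat. It is not the algorithm presented in the paper. It assumes binary (0/1) genotypes and phenotypes, combines each SNP pair with XOR as a purely illustrative interaction model, and scores a pair by its agreement with the phenotype. The names `agreement` and `best_pair_bruteforce` are hypothetical.

```python
from itertools import combinations


def agreement(a, b):
    """Fraction of positions where two equal-length 0/1 sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_pair_bruteforce(genotypes, phenotype):
    """Score every SNP pair against the phenotype: O(n^2 * m) time.

    genotypes: list of n SNP rows, each a list of m 0/1 values (one per individual).
    phenotype: list of m 0/1 values (assumed: 1 = case, 0 = control).
    Combining a pair with XOR is an assumption made for illustration only.
    """
    best_score, best = -1.0, None
    for i, j in combinations(range(len(genotypes)), 2):
        combined = [x ^ y for x, y in zip(genotypes[i], genotypes[j])]
        score = agreement(combined, phenotype)
        # Treat strong anti-correlation as equally informative.
        score = max(score, 1.0 - score)
        if score > best_score:
            best_score, best = score, (i, j)
    return best, best_score


# Toy data: 4 SNPs, 6 individuals; the phenotype is the XOR of SNPs 1 and 2.
genotypes = [
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
]
phenotype = [1, 1, 0, 1, 0, 0]
pair, score = best_pair_bruteforce(genotypes, phenotype)
```

With n SNPs and m individuals the loop visits n(n-1)/2 pairs, which is why the quadratic baseline becomes impractical at genome scale and why subquadratic methods are of interest.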


Cited By

  • (2016) Efficient Algorithms for the Three Locus Problem in Genome-Wide Association Study. 2016 IEEE 16th International Conference on Data Mining (ICDM), pp. 1155-1160. DOI: 10.1109/ICDM.2016.0152. Online publication date: Dec-2016


      Published In

      CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
      October 2016
      2566 pages
      ISBN:9781450340731
      DOI:10.1145/2983323

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. genome-wide association study
      2. string correlations
      3. two-locus problem

      Qualifiers

      • Research-article

      Conference

      CIKM'16: ACM Conference on Information and Knowledge Management
      October 24 - 28, 2016
      Indianapolis, Indiana, USA

      Acceptance Rates

      CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%



