Genome Wide Epistasis Study of On-Statin Cardiovascular Events with Iterative Feature Reduction and Selection
Abstract
1. Introduction
2. Results
2.1. Demographics
2.2. Feature Selection with RF-IFRS
2.3. Epistasis Screening
3. Discussion
3.1. Risk Variants and Interactions for CVD
3.2. Novelty and Application to Clinical Practice
3.3. Angiogenesis, Endothelial Function, and Vasculogenesis in CVD
3.4. RF-IFRS Replicates Existing Gene Associations with CVD and Incorporates Novel Interactions
3.5. Limitations
4. Materials and Methods
4.1. Clinical Dataset
4.2. Data Pre-Processing
4.3. Random Forest Iterative Feature Reduction and Selection (RF-IFRS)
4.4. Testing for Epistasis
4.5. Poly-Epistatic Risk and Pathway Analysis
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Variable | Control | Case | p |
---|---|---|---|
Female (%) | 38.0% | 31.2% | <0.001 |
White (%) | 99.3% | 99.4% | 0.507 |
BMI (Mean ± SD) | 29.03 ± 7.37 | 28.57 ± 7.035 | 0.253 |
Age First MACE (Median ± IQR) | N/A | 65 ± 16 | 1 |
Variant 1 | Variant 2 | | |
---|---|---|---|
sex | CDCA7 (3’ 242.48 kb) rs6731912 | 0.001 | 0.029 |
sex | NAALADL2 (3’ 441.92 kb) rs1471695 | <0.001 | 0.082 |
sex | HAND2-AS1 (3’ 157.42 kb) rs9312547 | <0.001 | 0.007 |
sex | NNMT (5’ 4.01 kb) rs2244175 | 0.021 | 0.016 |
sex | ANKFN1 (5’ 115.11 kb) rs8082489 | <0.001 | 0.007 |
SZT2 rs2842180 | COL4A2 rs9515203 | <0.001 | 0.004 |
VAV3-AS1 rs3747945 | NPAS3 rs8008403 | 0.001 | 0.011 |
KCNT2 (3’ 1239.56 kb) rs6693848 | PECAM1 rs2812 | <0.001 | 0.004 |
KCNT2 (3’ 1239.56 kb) rs6693848 | PECAM1 rs9303470 | <0.001 | 0.004 |
KCNT2 (3’ 1239.56 kb) rs6693848 | PECAM1 (5’ 1.22 kb) rs6504218 | 0.032 | 0.004 |
ALCAM (5’ 150.37 kb) rs9818420 | STMND1 rs927629 | <0.001 | 0.001 |
NAALADL2 (3’ 441.92 kb) rs1471695 | RFX7 (5’ 10.73 kb) rs2713935 | <0.001 | 0.005 |
PDGFC rs1425486 | FTMT (5’ 478.9 kb) rs246210 | <0.001 | 0.001 |
FTMT (5’ 478.9 kb) rs246210 | DAB2IP rs7025486 | <0.001 | 0.001 |
ZFP2 rs953741 | CDKN2B-AS1 rs1333042 | 0.011 | 0.004 |
STMND1 rs927629 | SMOC2 rs13205533 | <0.001 | 0.004 |
SMOC2 rs13205533 | PECAM1 rs2812 | 0.043 | 0.016 |
TBXAS1 rs6464448 | COL4A2 rs9515203 | 0.014 | 0.009 |
TMEM178B rs7790976 | COL4A2 rs9515203 | 0.043 | 0.004 |
CDKN2B-AS1 rs2383207 | SERPINA13 rs17826595 | 0.001 | 0.016 |
SFMBT2 rs10453997 | CWF19L2 (3’ 106.96 kb) rs4754193 | <0.001 | 0.001 |
CWF19L2 (3’ 106.96 kb) rs4754193 | NNMT (5’ 4.01 kb) rs2244175 | 0.006 | 0.011 |
GATM (3’ 12.69 kb) rs2461700 | ZNF404 rs1978723 | <0.001 | 0.005 |
Diseases or Functions | Genes | FDR |
---|---|---|
Angiogenesis | ALCAM CDKN2B COL4A2 DAB2IP PDGFC PECAM1 SMOC2 VAV3 | 0.0188 |
Carotid artery disease | NNMT VAV3 | 0.034 |
Development of vasculature | ALCAM CDKN2B COL4A2 DAB2IP NPAS3 PDGFC PECAM1 SMOC2 VAV3 | 0.0188 |
Endothelial cell development | COL4A2 PDGFC PECAM1 SMOC2 | 0.0291 |
Formation of blood vessel | CDKN2B COL4A2 PECAM1 | 0.0242 |
Formation of endothelial tube | COL4A2 PECAM1 | 0.0291 |
Function of endothelial tissue | PECAM1 VAV3 | 0.0188 |
Migration of endothelial cells | ALCAM COL4A2 PECAM1 SMOC2 VAV3 | 0.0188 |
Quantity of endothelial cells | ALCAM PDGFC | 0.023 |
Vasculogenesis | ALCAM CDKN2B COL4A2 PDGFC PECAM1 SMOC2 | 0.0242 |
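The FDR column above reports values adjusted for multiple testing. The table does not state which procedure produced them; as a minimal sketch, assuming a Benjamini-Hochberg adjustment, adjusted values can be computed from raw p-values as follows. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is an illustrative adjustment, not necessarily the
    procedure used to produce the FDR column in the table.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four pathway terms.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f} -> adjusted={q:.3f}")
```

A term would then be reported when its adjusted value falls below the chosen threshold (for example 0.05).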
MACE (major adverse cardiovascular events) on statin, defined as either AMI (acute myocardial infarction) or revascularization while on statin |
---|
AMI on statins: Case definition (all three conditions required): |
- At least two ICD9 codes for AMI or other acute and subacute forms of ischemic heart disease within a five-day window |
- Confirmed lab within the same time window |
- Statin prescribed in medical records at least 180 days prior to the AMI event |
Revascularization while on statin: Case definition (both conditions required): |
- At least one revascularization CPT code |
- Statin prescribed in medical records at least 180 days prior to the revascularization event |
Case Exclusion: |
- No diagnosis code for AMI, other acute and subacute forms of ischemic heart disease, or historical AMI assigned previously |
- No revascularization CPT codes assigned previously |
- No MACE (Major Adverse Cardiovascular Events) found in previous problem list by NLP |
Control definition: |
- Statin prescribed |
- No diagnosis code for AMI, other acute and subacute forms of ischemic heart disease, or historical AMI assigned previously |
- No revascularization CPT codes assigned previously |
- No MACE found in previous problem list by NLP |
- Controls match cases by age, gender, statin type (e.g., simvastatin), and statin exposure |
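The case, exclusion, and control rules above can be read as a small decision procedure. The following is a minimal sketch of that logic under stated assumptions: the `PatientRecord` fields and function names are hypothetical, "previously" is interpreted as "before the first statin prescription", the confirming lab must fall within five days after the first AMI code, and control matching (age, gender, statin type, exposure) is omitted. It is not the study's actual phenotyping code.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

WINDOW = timedelta(days=5)          # five-day window for AMI codes and lab
MIN_EXPOSURE = timedelta(days=180)  # minimum statin exposure before event


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's record."""
    statin_start: Optional[date] = None
    ami_code_dates: List[date] = field(default_factory=list)  # ICD9 AMI/acute ischemic codes
    lab_dates: List[date] = field(default_factory=list)       # confirming labs
    revasc_dates: List[date] = field(default_factory=list)    # revascularization CPT codes
    prior_mace_by_nlp: bool = False                            # problem-list NLP flag


def exposed_long_enough(rec: PatientRecord, event: date) -> bool:
    """True if the statin was prescribed at least 180 days before the event."""
    return rec.statin_start is not None and event - rec.statin_start >= MIN_EXPOSURE


def has_ami_on_statin(rec: PatientRecord) -> bool:
    """Two AMI codes within five days, a confirming lab, and enough exposure."""
    codes = sorted(rec.ami_code_dates)
    for first, second in zip(codes, codes[1:]):
        if second - first > WINDOW:
            continue
        lab_ok = any(first <= lab <= first + WINDOW for lab in rec.lab_dates)
        if lab_ok and exposed_long_enough(rec, first):
            return True
    return False


def has_revasc_on_statin(rec: PatientRecord) -> bool:
    """At least one revascularization code with enough prior statin exposure."""
    return any(exposed_long_enough(rec, d) for d in rec.revasc_dates)


def classify(rec: PatientRecord) -> str:
    """Return 'case', 'control', or 'excluded' for one record."""
    if rec.statin_start is None:
        return "excluded"
    # Assumption: any event before the first statin prescription excludes.
    prior = [d for d in rec.ami_code_dates + rec.revasc_dates if d < rec.statin_start]
    if prior or rec.prior_mace_by_nlp:
        return "excluded"
    if has_ami_on_statin(rec) or has_revasc_on_statin(rec):
        return "case"
    if not rec.ami_code_dates and not rec.revasc_dates:
        return "control"
    return "excluded"  # events present but meeting neither case definition
```

For example, a record with a statin start date and two AMI codes two days apart, a lab between them, and more than 180 days of exposure would be classified as a case, while a statin-treated record with no events would be a candidate control.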
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Adams, S.M.; Feroze, H.; Nguyen, T.; Eum, S.; Cornelio, C.; Harralson, A.F. Genome Wide Epistasis Study of On-Statin Cardiovascular Events with Iterative Feature Reduction and Selection. J. Pers. Med. 2020, 10, 212. https://doi.org/10.3390/jpm10040212