Data-driven discovery of causal interactions

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Causal discovery is a primary focus in many fields, and various methods have been developed to mine causal relationships from observational data. Most of these methods can identify only individual causes and do not consider interactions among them. In real life, however, many effects arise from multiple factors that interact with each other, so detecting the interactions between causal factors is essential for understanding the underlying causal mechanisms. So far, there are no efficient data-driven approaches to discovering causal interactions, especially in large data sets. In this paper, we propose a general data-driven framework and develop four algorithms instantiated from it to detect causal interactions directly from data. Extensive experiments on both synthetic and real-world data show that the proposed framework and algorithms achieve high effectiveness and efficiency in causal interaction discovery.

Acknowledgements

This work has been partially supported by the Australian Research Council (ARC) Discovery grants DP140103617 and DP170101306.

Author information

Corresponding author

Correspondence to Saisai Ma.

About this article

Cite this article

Ma, S., Liu, L., Li, J. et al. Data-driven discovery of causal interactions. Int J Data Sci Anal 8, 285–297 (2019). https://doi.org/10.1007/s41060-018-0168-0

  • DOI: https://doi.org/10.1007/s41060-018-0168-0
