research-article

Disentangling causality: assumptions in causal discovery and inference

Published: 27 February 2023

Abstract

Causality has been a burgeoning field of research leading to the point where the literature abounds with different components addressing distinct parts of causality. For researchers, it has been increasingly difficult to discern the assumptions they have to abide by in order to glean sound conclusions from causal concepts or methods. This paper aims to disambiguate the different causal concepts that have emerged in causal inference and causal discovery from observational data by attributing them to different levels of Pearl’s Causal Hierarchy. We will provide the reader with a comprehensive arrangement of assumptions necessary to engage in causal reasoning at the desired level of the hierarchy. Therefore, the assumptions underlying each of these causal concepts will be emphasized and their concomitant graphical components will be examined. We show which assumptions are necessary to bridge the gaps between causal discovery, causal identification and causal inference from a parametric and a non-parametric perspective. Finally, this paper points to further research areas related to the strong assumptions that researchers have glibly adopted to take part in causal discovery, causal identification and causal inference.


Published In

Artificial Intelligence Review  Volume 56, Issue 9
Sep 2023
1648 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 27 February 2023
Accepted: 01 February 2023

Author Tags

  1. Causal discovery
  2. Causal identification
  3. Causal inference
  4. Observational data
  5. Causal assumptions

Qualifiers

  • Research-article
