DOI: 10.1609/aaai.v38i18.30036

Identification of causal structure in the presence of missing data with additive noise model

Published: 07 January 2025

Abstract

Missing data are an unavoidable complication in many causal discovery tasks. When the missingness process depends on the missing values themselves (known as self-masking missingness), the joint distribution cannot be recovered, and even detecting the presence of such self-masking missingness remains a difficult challenge. Because neither the original distribution nor the underlying missingness mechanism can be recovered, naively applying existing causal discovery methods can lead to wrong conclusions. In this work, we find that the additive noise model has the potential for learning causal structure in the presence of self-masking missingness. With this observation, we investigate the identifiability of causal structure learned from missing data under an additive noise model with different missingness mechanisms, where the usual 'no self-masking missingness' assumption can be appropriately relaxed. Specifically, we first extend the identifiability of the causal skeleton to the case of weak self-masking missingness (i.e., no variable other than itself can cause a self-masking indicator). We further provide necessary and sufficient conditions for identifying the causal direction under the additive noise model and show that the causal structure can be identified up to an IN-equivalent pattern. Finally, we propose a practical algorithm for learning the causal skeleton and causal direction based on these theoretical results. Extensive experiments on synthetic and real data demonstrate the efficiency and effectiveness of the proposed algorithm.
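
To make the setting concrete, the sketch below (not the authors' code; the functional forms, the missingness threshold, and the polynomial-plus-HSIC direction check are all illustrative assumptions) simulates a bivariate additive noise model X -> Y, imposes a self-masking missingness mechanism in which whether Y is recorded depends only on Y itself, and runs a rough ANM-style direction check on the test-wise-deleted rows.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500

    # Additive noise model X -> Y: Y = f(X) + N with a nonlinear f and
    # non-Gaussian noise (both choices are illustrative assumptions).
    X = rng.uniform(-2.0, 2.0, size=n)
    Y = np.tanh(2.0 * X) + rng.laplace(scale=0.3, size=n)

    # Self-masking missingness on Y: whether Y is observed depends only on
    # the value of Y itself (the threshold is an arbitrary assumption).
    observed = Y < 1.0
    Y_obs = np.where(observed, Y, np.nan)
    print(f"fraction of Y missing: {np.mean(~observed):.1%}")

    def hsic(a, b, sigma=1.0):
        """Biased HSIC estimate with Gaussian kernels (a generic dependence score)."""
        m = len(a)
        K = np.exp(-((a[:, None] - a[None, :]) ** 2) / (2 * sigma ** 2))
        L = np.exp(-((b[:, None] - b[None, :]) ** 2) / (2 * sigma ** 2))
        H = np.eye(m) - np.ones((m, m)) / m
        return np.trace(H @ K @ H @ L) / (m - 1) ** 2

    # ANM-style direction check on the rows where Y is observed (test-wise
    # deletion): fit each direction with a polynomial regression, then score
    # how dependent the residual is on the putative cause; the causal
    # direction should yield the smaller dependence.
    x, y = X[observed], Y[observed]
    res_y_on_x = y - np.polyval(np.polyfit(x, y, deg=5), x)
    res_x_on_y = x - np.polyval(np.polyfit(y, x, deg=5), y)
    print("dependence for X -> Y:", hsic(res_y_on_x, x))
    print("dependence for Y -> X:", hsic(res_x_on_y, y))

Because the deleted rows are selected by the value of Y itself, the retained sample is biased; this is precisely why naive complete-case or test-wise-deletion causal discovery can go wrong under self-masking missingness, and the paper characterises when the skeleton and directions remain identifiable despite it.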


Published In

AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence
February 2024
23861 pages
ISBN: 978-1-57735-887-9

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Publication History

Published: 07 January 2025

Qualifiers

  • Research-article
  • Research
  • Refereed limited
