Abstract
Probabilistic methods for causal discovery detect patterns of correlation between variables. Grounded in statistical theory, they have revolutionised the study of causality. However, when correlation itself is unreliable, so are probabilistic methods: unusual data can lead to spurious causal links, while nonmonotonic functional relationships between variables can prevent the detection of causal links. We describe a new heuristic method for inferring causality between two continuous variables, based on randomness and unimodality tests and making few assumptions about the data. We evaluate the method against probabilistic and additive noise algorithms on real and artificial datasets, and show that it performs competitively.
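The remark about nonmonotonic relationships can be illustrated with a minimal sketch. It is not the method described in this article: it only shows, on made-up data, that the Pearson correlation can be essentially zero even when one variable is a deterministic function of the other. The helper `pearson` and the variable names are assumptions introduced for this illustration.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative data (an assumption): x on a symmetric grid over [-1, 1].
xs = [i / 100 for i in range(-100, 101)]

# y depends deterministically on x, but the dependence is nonmonotonic.
ys_quadratic = [x * x for x in xs]

# A monotonic (linear) dependence, for comparison.
ys_linear = [2 * x + 1 for x in xs]

r_quadratic = pearson(xs, ys_quadratic)
r_linear = pearson(xs, ys_linear)

print(f"quadratic dependence: r = {r_quadratic:.6f}")
print(f"linear dependence:    r = {r_linear:.6f}")
```

On this symmetric grid the quadratic case gives a correlation of essentially zero while the linear case gives a correlation of essentially one, so a purely correlation-based check would miss the quadratic dependence.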
Additional information
Supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.
About this article
Cite this article
Prestwich, S.D., Tarim, S.A. & Ozkan, I. A new causal discovery heuristic. Ann Math Artif Intell 82, 245–259 (2018). https://doi.org/10.1007/s10472-018-9575-0
DOI: https://doi.org/10.1007/s10472-018-9575-0