Search | arXiv e-print repository

Discovering Cyclic Causal Models with Latent Variables: A General SAT-Based Procedure

Authors: Antti Hyttinen, Patrik O. Hoyer, Frederick Eberhardt, Matti Jarvisalo

Abstract: We present a very general approach to learning the structure of causal models based on d-separation constraints, obtained from any given set of overlapping passive observational or experimental data sets. The procedure allows for both directed cycles (feedback loops) and the presence of latent variables. Our approach is based on a logical representation of causal pathways, which permits the integr… ▽ More We present a very general approach to learning the structure of causal models based on d-separation constraints, obtained from any given set of overlapping passive observational or experimental data sets. The procedure allows for both directed cycles (feedback loops) and the presence of latent variables. Our approach is based on a logical representation of causal pathways, which permits the integration of quite general background knowledge, and inference is performed using a Boolean satisfiability (SAT) solver. The procedure is complete in that it exhausts the available information on whether any given edge can be determined to be present or absent, and returns "unknown" otherwise. Many existing constraint-based causal discovery algorithms can be seen as special cases, tailored to circumstances in which one or more restricting assumptions apply. Simulations illustrate the effect of these assumptions on discovery and how the present algorithm scales. △ Less

Submitted 26 September, 2013; originally announced September 2013.

Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

Report number: UAI-P-2013-PG-301-310

arXiv:1210.4879 [pdf]

Causal Discovery of Linear Cyclic Models from Multiple Experimental Data Sets with Overlapping Variables

Authors: Antti Hyttinen, Frederick Eberhardt, Patrik O. Hoyer

Abstract: Much of scientific data is collected as randomized experiments intervening on some and observing other variables of interest. Quite often, a given phenomenon is investigated in several studies, and different sets of variables are involved in each study. In this article we consider the problem of integrating such knowledge, inferring as much as possible concerning the underlying causal structure wi… ▽ More Much of scientific data is collected as randomized experiments intervening on some and observing other variables of interest. Quite often, a given phenomenon is investigated in several studies, and different sets of variables are involved in each study. In this article we consider the problem of integrating such knowledge, inferring as much as possible concerning the underlying causal structure with respect to the union of observed variables from such experimental or passive observational overlapping data sets. We do not assume acyclicity or joint causal sufficiency of the underlying data generating model, but we do restrict the causal relationships to be linear and use only second order statistics of the data. We derive conditions for full model identifiability in the most generic case, and provide novel techniques for incorporating an assumption of faithfulness to aid in inference. In each case we seek to establish what is and what is not determined by the data at hand. △ Less

Submitted 16 October, 2012; originally announced October 2012.

Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

Report number: UAI-P-2012-PG-387-396

arXiv:1207.1977 [pdf, other]

Estimating a Causal Order among Groups of Variables in Linear Models

Authors: Doris Entner, Patrik O. Hoyer

Abstract: The machine learning community has recently devoted much attention to the problem of inferring causal relationships from statistical data. Most of this work has focused on uncovering connections among scalar random variables. We generalize existing methods to apply to collections of multi-dimensional random vectors, focusing on techniques applicable to linear models. The performance of the resulti… ▽ More The machine learning community has recently devoted much attention to the problem of inferring causal relationships from statistical data. Most of this work has focused on uncovering connections among scalar random variables. We generalize existing methods to apply to collections of multi-dimensional random vectors, focusing on techniques applicable to linear models. The performance of the resulting algorithms is evaluated and compared in simulations, which show that our methods can, in many cases, provide useful information on causal relationships even for relatively small sample sizes. △ Less

Submitted 9 July, 2012; originally announced July 2012.

Comments: To appear at the International Conference on Artificial Neural Networks 2012 (proceedings to be published in LNCS, Springer); To be presented at the UAI Workshop on Causal Structure Learning 2012

arXiv:1207.1413 [pdf]

Discovery of non-gaussian linear causal models using ICA

Authors: Shohei Shimizu, Aapo Hyvarinen, Yutaka Kano, Patrik O. Hoyer

Abstract: In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data (Spirtes et al. 2000; Pearl 2000). Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data,… ▽ More In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data (Spirtes et al. 2000; Pearl 2000). Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-gaussian distributions of non-zero variances. The solution relies on the use of the statistical method known as independent component analysis (ICA), and does not require any pre-specified time-ordering of the variables. We provide a complete Matlab package for performing this LiNGAM analysis (short for Linear Non-Gaussian Acyclic Model), and demonstrate the effectiveness of the method using artificially generated data. △ Less

Submitted 4 July, 2012; originally announced July 2012.

Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

Report number: UAI-P-2005-PG-525-533

arXiv:1206.3273 [pdf]

Discovering Cyclic Causal Models by Independent Components Analysis

Authors: Gustavo Lacerda, Peter L. Spirtes, Joseph Ramsey, Patrik O. Hoyer

Abstract: We generalize Shimizu et al's (2006) ICA-based approach for discovering linear non-Gaussian acyclic (LiNGAM) Structural Equation Models (SEMs) from causally sufficient, continuous-valued observational data. By relaxing the assumption that the generating SEM's graph is acyclic, we solve the more general problem of linear non-Gaussian (LiNG) SEM discovery. LiNG discovery algorithms output the distri… ▽ More We generalize Shimizu et al's (2006) ICA-based approach for discovering linear non-Gaussian acyclic (LiNGAM) Structural Equation Models (SEMs) from causally sufficient, continuous-valued observational data. By relaxing the assumption that the generating SEM's graph is acyclic, we solve the more general problem of linear non-Gaussian (LiNG) SEM discovery. LiNG discovery algorithms output the distribution equivalence class of SEMs which, in the large sample limit, represents the population distribution. We apply a LiNG discovery algorithm to simulated data. Finally, we give sufficient conditions under which only one of the SEMs in the output class is 'stable'. △ Less

Submitted 13 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

Report number: UAI-P-2008-PG-366-374

arXiv:1206.3260 [pdf]

Causal discovery of linear acyclic models with arbitrary distributions

Authors: Patrik O. Hoyer, Aapo Hyvarinen, Richard Scheines, Peter L. Spirtes, Joseph Ramsey, Gustavo Lacerda, Shohei Shimizu

Abstract: An important task in data analysis is the discovery of causal relationships between observed variables. For continuous-valued data, linear acyclic causal models are commonly used to model the data-generating process, and the inference of such models is a well-studied problem. However, existing methods have significant limitations. Methods based on conditional independencies (Spirtes et al. 1993; P… ▽ More An important task in data analysis is the discovery of causal relationships between observed variables. For continuous-valued data, linear acyclic causal models are commonly used to model the data-generating process, and the inference of such models is a well-studied problem. However, existing methods have significant limitations. Methods based on conditional independencies (Spirtes et al. 1993; Pearl 2000) cannot distinguish between independence-equivalent models, whereas approaches purely based on Independent Component Analysis (Shimizu et al. 2006) are inapplicable to data which is partially Gaussian. In this paper, we generalize and combine the two approaches, to yield a method able to learn the model structure in many cases for which the previous methods provide answers that are either incorrect or are not as informative as possible. We give exact graphical conditions for when two distinct models represent the same family of distributions, and empirically demonstrate the power of our method through thorough simulations. △ Less

Submitted 13 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

Report number: UAI-P-2008-PG-282-289

arXiv:1205.2641 [pdf]

Bayesian Discovery of Linear Acyclic Causal Models

Authors: Patrik O. Hoyer, Antti Hyttinen

Abstract: Methods for automated discovery of causal relationships from non-interventional data have received much attention recently. A widely used and well understood model family is given by linear acyclic causal models (recursive structural equation models). For Gaussian data both constraint-based methods (Spirtes et al., 1993; Pearl, 2000) (which output a single equivalence class) and Bayesian score-bas… ▽ More Methods for automated discovery of causal relationships from non-interventional data have received much attention recently. A widely used and well understood model family is given by linear acyclic causal models (recursive structural equation models). For Gaussian data both constraint-based methods (Spirtes et al., 1993; Pearl, 2000) (which output a single equivalence class) and Bayesian score-based methods (Geiger and Heckerman, 1994) (which assign relative scores to the equivalence classes) are available. On the contrary, all current methods able to utilize non-Gaussianity in the data (Shimizu et al., 2006; Hoyer et al., 2008) always return only a single graph or a single equivalence class, and so are fundamentally unable to express the degree of certainty attached to that output. In this paper we develop a Bayesian score-based approach able to take advantage of non-Gaussianity when estimating linear acyclic causal models, and we empirically demonstrate that, at least on very modest size networks, its accuracy is as good as or better than existing methods. We provide a complete code package (in R) which implements all algorithms and performs all of the analysis provided in the paper, and hope that this will further the application of these methods to solving causal inference problems. △ Less

Submitted 9 May, 2012; originally announced May 2012.

Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

Report number: UAI-P-2009-PG-240-248

arXiv:1202.3735 [pdf]

Noisy-OR Models with Latent Confounding

Authors: Antti Hyttinen, Frederick Eberhardt, Patrik O. Hoyer

Abstract: Given a set of experiments in which varying subsets of observed variables are subject to intervention, we consider the problem of identifiability of causal models exhibiting latent confounding. While identifiability is trivial when each experiment intervenes on a large number of variables, the situation is more complicated when only one or a few variables are subject to intervention per experiment… ▽ More Given a set of experiments in which varying subsets of observed variables are subject to intervention, we consider the problem of identifiability of causal models exhibiting latent confounding. While identifiability is trivial when each experiment intervenes on a large number of variables, the situation is more complicated when only one or a few variables are subject to intervention per experiment. For linear causal models with latent variables Hyttinen et al. (2010) gave precise conditions for when such data are sufficient to identify the full model. While their result cannot be extended to discrete-valued variables with arbitrary cause-effect relationships, we show that a similar result can be obtained for the class of causal models whose conditional probability distributions are restricted to a `noisy-OR' parameterization. We further show that identification is preserved under an extension of the model that allows for negative influences, and present learning algorithms that we test for accuracy, scalability and robustness. △ Less

Submitted 14 February, 2012; originally announced February 2012.

Report number: UAI-P-2011-PG-363-372

arXiv:cs/0603038 [pdf, ps, other]

Estimation of linear, non-gaussian causal models in the presence of confounding latent variables

Authors: Patrik O. Hoyer, Shohei Shimizu, Antti J. Kerminen

Abstract: The estimation of linear causal models (also known as structural equation models) from data is a well-known problem which has received much attention in the past. Most previous work has, however, made an explicit or implicit assumption of gaussianity, limiting the identifiability of the models. We have recently shown (Shimizu et al, 2005; Hoyer et al, 2006) that for non-gaussian distributions th… ▽ More The estimation of linear causal models (also known as structural equation models) from data is a well-known problem which has received much attention in the past. Most previous work has, however, made an explicit or implicit assumption of gaussianity, limiting the identifiability of the models. We have recently shown (Shimizu et al, 2005; Hoyer et al, 2006) that for non-gaussian distributions the full causal model can be estimated in the no hidden variables case. In this contribution, we discuss the estimation of the model when confounding latent variables are present. Although in this case uniqueness is no longer guaranteed, there is at most a finite set of models which can fit the data. We develop an algorithm for estimating this set, and describe numerical simulations which confirm the theoretical arguments and demonstrate the practical viability of the approach. Full Matlab code is provided for all simulations. △ Less

Submitted 22 May, 2006; v1 submitted 9 March, 2006; originally announced March 2006.

Comments: 8 pages, 4 figures, pdflatex

arXiv:cs/0408058 [pdf, ps, other]

Non-negative matrix factorization with sparseness constraints

Authors: Patrik O. Hoyer

Abstract: Non-negative matrix factorization (NMF) is a recently developed technique for finding parts-based, linear representations of non-negative data. Although it has successfully been applied in several applications, it does not always result in parts-based representations. In this paper, we show how explicitly incorporating the notion of `sparseness' improves the found decompositions. Additionally, w… ▽ More Non-negative matrix factorization (NMF) is a recently developed technique for finding parts-based, linear representations of non-negative data. Although it has successfully been applied in several applications, it does not always result in parts-based representations. In this paper, we show how explicitly incorporating the notion of `sparseness' improves the found decompositions. Additionally, we provide complete MATLAB code both for standard NMF and for our extension. Our hope is that this will further the application of these methods to solving novel data-analysis problems. △ Less

Submitted 25 August, 2004; originally announced August 2004.

arXiv:cs/0202009 [pdf, ps, other]

Non-negative sparse coding

Authors: Patrik O. Hoyer

Abstract: Non-negative sparse coding is a method for decomposing multivariate data into non-negative sparse components. In this paper we briefly describe the motivation behind this type of data representation and its relation to standard sparse coding and non-negative matrix factorization. We then give a simple yet efficient multiplicative algorithm for finding the optimal values of the hidden components.… ▽ More Non-negative sparse coding is a method for decomposing multivariate data into non-negative sparse components. In this paper we briefly describe the motivation behind this type of data representation and its relation to standard sparse coding and non-negative matrix factorization. We then give a simple yet efficient multiplicative algorithm for finding the optimal values of the hidden components. In addition, we show how the basis vectors can be learned from the observed data. Simulations demonstrate the effectiveness of the proposed method. △ Less

Submitted 11 February, 2002; originally announced February 2002.

ACM Class: I.4.2

Showing 1–11 of 11 results for author: Hoyer, P O