
Showing 1–40 of 40 results for author: Hyvärinen, A

Searching in archive cs.
  1. arXiv:2311.16849  [pdf, other]

    stat.ML cs.LG

    Identifiable Feature Learning for Spatial Data with Nonlinear ICA

    Authors: Hermanni Hälvä, Jonathan So, Richard E. Turner, Aapo Hyvärinen

    Abstract: Recently, nonlinear ICA has surfaced as a popular alternative to the many heuristic models used in deep representation learning and disentanglement. An advantage of nonlinear ICA is that a sophisticated identifiability theory has been developed; in particular, it has been proven that the original components can be recovered under sufficiently strong latent dependencies. Despite this general theory…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: Work under review

  2. arXiv:2310.15709  [pdf, other]

    stat.ML cs.LG

    Causal Representation Learning Made Identifiable by Grouping of Observational Variables

    Authors: Hiroshi Morioka, Aapo Hyvärinen

    Abstract: A topic of great current interest is Causal Representation Learning (CRL), whose goal is to learn a causal model for hidden features in a data-driven manner. Unfortunately, CRL is severely ill-posed since it is a combination of the two notoriously ill-posed problems of representation learning and causal discovery. Yet, finding practical identifiability conditions that guarantee a unique solution i…

    Submitted 7 June, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  3. arXiv:2310.03902  [pdf, other]

    stat.ML cs.LG

    Provable benefits of annealing for estimating normalizing constants: Importance Sampling, Noise-Contrastive Estimation, and beyond

    Authors: Omar Chehab, Aapo Hyvarinen, Andrej Risteski

    Abstract: Recent research has developed several Monte Carlo methods for estimating the normalization constant (partition function) based on the idea of annealing. This means sampling successively from a path of distributions that interpolate between a tractable "proposal" distribution and the unnormalized "target" distribution. Prominent estimators in this family include annealed importance sampling and ann…

    Submitted 9 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

  4. arXiv:2303.16535  [pdf, other]

    cs.LG stat.ML

    Nonlinear Independent Component Analysis for Principled Disentanglement in Unsupervised Deep Learning

    Authors: Aapo Hyvarinen, Ilyes Khemakhem, Hiroshi Morioka

    Abstract: A central problem in unsupervised deep learning is how to find useful representations of high-dimensional data, sometimes called "disentanglement". Most approaches are heuristic and lack a proper theoretical foundation. In linear representation learning, independent component analysis (ICA) has been successful in many application areas, and it is principled, i.e., based on a well-defined probabil…

    Submitted 5 September, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: Revised version, to appear in Patterns

  5. arXiv:2302.02672  [pdf, other]

    stat.ML cs.LG

    Identifiability of latent-variable and structural-equation models: from linear to nonlinear

    Authors: Aapo Hyvärinen, Ilyes Khemakhem, Ricardo Monti

    Abstract: An old problem in multivariate statistics is that linear Gaussian models are often unidentifiable, i.e. some parameters cannot be uniquely estimated. In factor (component) analysis, an orthogonal rotation of the factors is unidentifiable, while in linear regression, the direction of effect cannot be identified. For such linear models, non-Gaussianity of the (latent) variables has been shown to pro…

    Submitted 3 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Revised final version of invited review to be published at Annals of the Institute of Statistical Mathematics

  6. arXiv:2301.09696  [pdf, other]

    stat.ML cs.LG

    Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation

    Authors: Omar Chehab, Alexandre Gramfort, Aapo Hyvarinen

    Abstract: Self-supervised learning is an increasingly popular approach to unsupervised learning, achieving state-of-the-art results. A prevalent approach consists in contrasting data points and noise points within a classification task: this requires a good noise distribution which is notoriously hard to specify. While a comprehensive theory is missing, it is widely assumed that the optimal noise distributi…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:2203.01110

  7. arXiv:2205.15409  [pdf, other]

    cs.LG cs.AI cs.NE

    Painful intelligence: What AI can tell us about human suffering

    Authors: Aapo Hyvärinen

    Abstract: This book uses the modern theory of artificial intelligence (AI) to understand human suffering or mental pain. Both humans and sophisticated AI agents process information about the world in order to achieve goals and obtain rewards, which is why AI can be used as a model of the human brain and mind. This book intends to make the theory accessible to a relatively general audience, requiring only so…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: Book with 231 pages

  8. arXiv:2203.01110  [pdf, other]

    stat.ML cs.LG

    The Optimal Noise in Noise-Contrastive Learning Is Not What You Think

    Authors: Omar Chehab, Alexandre Gramfort, Aapo Hyvarinen

    Abstract: Learning a parametric model of a data distribution is a well-known statistical problem that has seen renewed interest as it is brought to scale in deep learning. Framing the problem as a self-supervised task, where data samples are discriminated from noise samples, is at the core of state-of-the-art methods, beginning with Noise-Contrastive Estimation (NCE). Yet, such contrastive learning requires…

    Submitted 26 July, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

  9. arXiv:2111.15431  [pdf, other]

    cs.LG stat.ML

    Binary Independent Component Analysis: A Non-stationarity-based Approach

    Authors: Antti Hyttinen, Vitória Barin-Pacela, Aapo Hyvärinen

    Abstract: We consider independent component analysis of binary data. While fundamental in practice, this case has been much less developed than ICA for continuous data. We start by assuming a linear mixing model in a continuous-valued latent space, followed by a binary observation model. Importantly, we assume that the sources are non-stationary; this is necessary since any non-Gaussianity would essentially…

    Submitted 2 August, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: This is an updated version (including a slight name change) which was published at UAI2022

  10. arXiv:2110.13502  [pdf, other]

    cs.LG

    Shared Independent Component Analysis for Multi-Subject Neuroimaging

    Authors: Hugo Richard, Pierre Ablin, Bertrand Thirion, Alexandre Gramfort, Aapo Hyvärinen

    Abstract: We consider shared response modeling, a multi-view learning problem where one wants to identify common components from multiple datasets or views. We introduce Shared Independent Component Analysis (ShICA) that models each view as a linear transform of shared independent components contaminated by additive Gaussian noise. We show that this model is identifiable if the components are either non-Gau…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021

  11. arXiv:2106.09620  [pdf, other]

    stat.ML cs.LG

    Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA

    Authors: Hermanni Hälvä, Sylvain Le Corff, Luc Lehéricy, Jonathan So, Yongjie Zhu, Elisabeth Gassiat, Aapo Hyvarinen

    Abstract: We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend thi…

    Submitted 27 October, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: Accepted for publication at NeurIPS 2021

  12. arXiv:2102.10964  [pdf, other]

    stat.ML cs.LG

    Adaptive Multi-View ICA: Estimation of noise levels for optimal inference

    Authors: Hugo Richard, Pierre Ablin, Aapo Hyvärinen, Alexandre Gramfort, Bertrand Thirion

    Abstract: We consider a multi-view learning problem known as group independent component analysis (group ICA), where the goal is to recover shared independent sources from many views. The statistical modeling of this problem requires taking noise into account. When the model includes additive noise on the observations, the likelihood is intractable. By contrast, we propose Adaptive multiView ICA (AVICA), a…

    Submitted 22 February, 2021; originally announced February 2021.

  13. arXiv:2011.02268  [pdf, other]

    stat.ML cs.LG

    Causal Autoregressive Flows

    Authors: Ilyes Khemakhem, Ricardo Pio Monti, Robert Leech, Aapo Hyvärinen

    Abstract: Two apparently unrelated fields -- normalizing flows and causality -- have recently received considerable attention in the machine learning community. In this work, we highlight an intrinsic correspondence between a simple family of autoregressive normalizing flows and identifiable causal models. We exploit the fact that autoregressive flow architectures define an ordering over variables, analogou…

    Submitted 24 February, 2021; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Published at AISTATS2021. Code available at https://github.com/piomonti/carefl

  14. arXiv:2007.16104  [pdf, other]

    stat.ML cs.LG eess.SP q-bio.NC q-bio.QM

    Uncovering the structure of clinical EEG signals with self-supervised learning

    Authors: Hubert Banville, Omar Chehab, Aapo Hyvärinen, Denis-Alexander Engemann, Alexandre Gramfort

    Abstract: Objective. Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relati…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 32 pages, 9 figures

  15. arXiv:2007.09390  [pdf, other]

    stat.ML cs.LG

    Autoregressive flow-based causal discovery and inference

    Authors: Ricardo Pio Monti, Ilyes Khemakhem, Aapo Hyvarinen

    Abstract: We posit that autoregressive flow models are well-suited to performing a range of causal inference tasks - ranging from causal discovery to making interventional and counterfactual predictions. In particular, we exploit the fact that autoregressive architectures define an ordering over variables, analogous to a causal ordering, in order to propose a single flow architecture to perform all three af…

    Submitted 26 July, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: 6 pages, 3 figures. Accepted at the 2nd ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models

  16. arXiv:2006.15090  [pdf, other]

    stat.ML cs.LG

    Relative gradient optimization of the Jacobian term in unsupervised deep learning

    Authors: Luigi Gresele, Giancarlo Fissore, Adrián Javaloy, Bernhard Schölkopf, Aapo Hyvärinen

    Abstract: Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning. A popular approach for solving it is mapping the observations into a representation space with a simple joint distribution, which can typically be written as a product of its marginals -- thus drawing a connection with the field of nonlinear independent component analysis. Deep densi…

    Submitted 26 October, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

  17. arXiv:2006.12107  [pdf, other]

    stat.ML cs.LG

    Hidden Markov Nonlinear ICA: Unsupervised Learning from Nonstationary Time Series

    Authors: Hermanni Hälvä, Aapo Hyvärinen

    Abstract: Recent advances in nonlinear Independent Component Analysis (ICA) provide a principled framework for unsupervised feature learning and disentanglement. The central idea in such works is that the latent components are assumed to be independent conditional on some observed auxiliary variables, such as the time-segment index. This requires manual segmentation of data into non-stationary segments whic…

    Submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted for publication at UAI 2020

  18. arXiv:2006.10944  [pdf, other]

    stat.ML cs.LG

    Independent Innovation Analysis for Nonlinear Vector Autoregressive Process

    Authors: Hiroshi Morioka, Hermanni Hälvä, Aapo Hyvärinen

    Abstract: The nonlinear vector autoregressive (NVAR) model provides an appealing framework to analyze multivariate time series obtained from a nonlinear dynamical system. However, the innovation (or error), which plays a key role by driving the dynamics, is almost always assumed to be additive. Additivity greatly limits the generality of the model, hindering analysis of general NVAR processes which have non…

    Submitted 25 February, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

  19. arXiv:2006.06635  [pdf, other]

    stat.ML cs.LG

    Modeling Shared Responses in Neuroimaging Studies through MultiView ICA

    Authors: Hugo Richard, Luigi Gresele, Aapo Hyvärinen, Bertrand Thirion, Alexandre Gramfort, Pierre Ablin

    Abstract: Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization. However, the aggregation of data coming from multiple subjects is challenging, since it requires accounting for large variability in anatomy, functional topography and stimulus response across individuals. Data modeling is especially hard for ecologically relevant condit…

    Submitted 24 December, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020

  20. arXiv:2002.11537  [pdf, other]

    stat.ML cs.LG

    ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA

    Authors: Ilyes Khemakhem, Ricardo Pio Monti, Diederik P. Kingma, Aapo Hyvärinen

    Abstract: We consider the identifiability theory of probabilistic models and establish sufficient conditions under which the representations learned by a very broad family of conditional energy-based models are unique in function space, up to a simple transformation. In our model family, the energy function is the dot-product between two feature extractors, one for the dependent variable, and one for the co…

    Submitted 26 October, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: Accepted for publication at NeurIPS 2020

  21. arXiv:1911.05419  [pdf, other]

    cs.LG eess.SP stat.ML

    Self-supervised representation learning from electroencephalography signals

    Authors: Hubert Banville, Isabela Albuquerque, Aapo Hyvärinen, Graeme Moffat, Denis-Alexander Engemann, Alexandre Gramfort

    Abstract: The supervised learning paradigm is limited by the cost - and sometimes the impracticality - of data collection and labeling in multiple domains. Self-supervised learning, a paradigm which exploits the structure of unlabeled data to create learning problems that can be solved with standard supervised approaches, has shown great promise as a pretraining or feature learning approach in fields like c…

    Submitted 13 November, 2019; originally announced November 2019.

  22. arXiv:1911.00265  [pdf, other]

    cs.LG stat.ML

    Robust contrastive learning and nonlinear ICA in the presence of outliers

    Authors: Hiroaki Sasaki, Takashi Takenouchi, Ricardo Monti, Aapo Hyvärinen

    Abstract: Nonlinear independent component analysis (ICA) is a general framework for unsupervised representation learning, and is aimed at recovering the latent variables in data. Recent practical methods perform nonlinear ICA by solving a series of classification problems based on logistic regression. However, it is well-known that logistic regression is vulnerable to outliers, and thus the performance can be…

    Submitted 1 November, 2019; originally announced November 2019.

  23. arXiv:1907.09588  [pdf, other]

    stat.ML cs.LG

    Direction Matters: On Influence-Preserving Graph Summarization and Max-cut Principle for Directed Graphs

    Authors: Wenkai Xu, Gang Niu, Aapo Hyvärinen, Masashi Sugiyama

    Abstract: Summarizing large-scale directed graphs into small-scale representations is a useful but less studied problem setting. Conventional clustering approaches, which are based on "Min-Cut"-style criteria, compress both the vertices and edges of the graph into communities, which leads to a loss of directed edge information. On the other hand, compressing the vertices while preserving the directed edge in…

    Submitted 22 July, 2019; originally announced July 2019.

  24. arXiv:1907.04809  [pdf, other]

    stat.ML cs.LG

    Variational Autoencoders and Nonlinear ICA: A Unifying Framework

    Authors: Ilyes Khemakhem, Diederik P. Kingma, Ricardo Pio Monti, Aapo Hyvärinen

    Abstract: The framework of variational autoencoders allows us to efficiently learn deep latent-variable models, such that the model's marginal distribution over observed variables fits the data. Often, we're interested in going a step further, and want to approximate the true joint distribution over observed and latent variables, including the true prior and posterior distributions over latent variables. Th…

    Submitted 21 December, 2020; v1 submitted 10 July, 2019; originally announced July 2019.

    Comments: Accepted for publication at AISTATS 2020. This is a slightly updated version of the published manuscript; see Corrigendum at the end of the paper

    Journal ref: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, pages 2207-2217, year 2020

  25. arXiv:1905.05976  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Information criteria for non-normalized models

    Authors: Takeru Matsuda, Masatoshi Uehara, Aapo Hyvarinen

    Abstract: Many statistical models are given in the form of non-normalized densities with an intractable normalization constant. Since maximum likelihood estimation is computationally intensive for these models, several estimation methods have been developed which do not require explicit computation of the normalization constant, such as noise contrastive estimation (NCE) and score matching. However, model s…

    Submitted 27 July, 2021; v1 submitted 15 May, 2019; originally announced May 2019.

    Journal ref: Journal of Machine Learning Research, 22(158):1--33, 2021

  26. arXiv:1904.09096  [pdf, other]

    stat.ML cs.LG

    Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA

    Authors: Ricardo Pio Monti, Kun Zhang, Aapo Hyvarinen

    Abstract: We consider the problem of inferring causal relationships between two or more passively observed variables. While the problem of such causal discovery has been extensively studied especially in the bivariate setting, the majority of current methods assume a linear causal relationship, and the few methods which consider non-linear dependencies usually make the assumption of additive noise. Here, we…

    Submitted 19 April, 2019; originally announced April 2019.

  27. arXiv:1903.02334  [pdf, other]

    stat.ML cs.LG

    Neural Empirical Bayes

    Authors: Saeed Saremi, Aapo Hyvarinen

    Abstract: We unify $\textit{kernel density estimation}$ and $\textit{empirical Bayes}$ and address a set of problems in unsupervised learning with a geometric interpretation of those methods, rooted in the $\textit{concentration of measure}$ phenomenon. Kernel density is viewed symbolically as $X\rightharpoonup Y$ where the random variable $X$ is smoothed to $Y= X+N(0,\sigma^2 I_d)$, and empirical Bayes is the m…

    Submitted 21 April, 2020; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: 23 pages, 10 figures

    Journal ref: Journal of Machine Learning Research 20(181), 1-23, 2019

  28. arXiv:1806.01754  [pdf, ps, other]

    stat.ML cs.LG

    Neural-Kernelized Conditional Density Estimation

    Authors: Hiroaki Sasaki, Aapo Hyvärinen

    Abstract: Conditional density estimation is a general framework for solving various problems in machine learning. Among existing methods, non-parametric and/or kernel-based methods are often difficult to use on large datasets, while methods based on neural networks usually make restrictive parametric assumptions on the probability densities. Here, we propose a novel method for estimating the conditional den…

    Submitted 5 June, 2018; originally announced June 2018.

  29. arXiv:1805.09567  [pdf, other]

    stat.ML cs.LG

    A Unified Probabilistic Model for Learning Latent Factors and Their Connectivities from High-Dimensional Data

    Authors: Ricardo Pio Monti, Aapo Hyvärinen

    Abstract: Connectivity estimation is challenging in the context of high-dimensional data. A useful preprocessing step is to group variables into clusters; however, it is not always clear how to do so from the perspective of connectivity estimation. Another practical challenge is that we may have data from multiple related classes (e.g., multiple subjects or conditions) and wish to incorporate constraints on…

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: 13 pages, 6 figures. To appear in UAI 2018

  30. arXiv:1805.08651  [pdf, other]

    stat.ML cs.LG

    Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning

    Authors: Aapo Hyvarinen, Hiroaki Sasaki, Richard E. Turner

    Abstract: Nonlinear ICA is a fundamental problem for unsupervised representation learning, emphasizing the capacity to recover the underlying latent variables generating the data (i.e., identifiability). Recently, the very first identifiability proofs for nonlinear ICA have been proposed, leveraging the temporal structure of the independent components. Here, we propose a general framework for nonlinear ICA,…

    Submitted 4 February, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: Camera-ready version of article accepted for AISTATS2019

  31. arXiv:1805.08306  [pdf, other]

    stat.ML cs.LG

    Deep Energy Estimator Networks

    Authors: Saeed Saremi, Arash Mehrjou, Bernhard Schölkopf, Aapo Hyvärinen

    Abstract: Density estimation is a fundamental problem in statistical learning. This problem is especially challenging for complex high-dimensional data due to the curse of dimensionality. A promising solution to this problem is given here in an inference-free hierarchical framework that is built on score matching. We revisit the Bayesian interpretation of the score function and the Parzen score matching, an…

    Submitted 21 May, 2018; originally announced May 2018.

  32. arXiv:1805.07516  [pdf, other]

    stat.ML cs.LG

    Estimation of Non-Normalized Mixture Models and Clustering Using Deep Representation

    Authors: Takeru Matsuda, Aapo Hyvarinen

    Abstract: We develop a general method for estimating a finite mixture of non-normalized models. Here, a non-normalized model is defined to be a parametric distribution with an intractable normalization constant. Existing methods for estimating non-normalized models without computing the normalization constant are not applicable to mixture models because they contain more than one intractable normalization c…

    Submitted 19 May, 2018; originally announced May 2018.

    Journal ref: Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2555-2563, 2019

  33. arXiv:1605.06336  [pdf, other]

    stat.ML cs.LG

    Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA

    Authors: Aapo Hyvarinen, Hiroshi Morioka

    Abstract: Nonlinear independent component analysis (ICA) provides an appealing framework for unsupervised feature learning, but the models proposed so far are not identifiable. Here, we first propose a new intuitive principle of unsupervised deep learning from time series which uses the nonstationary structure of the data. Our learning principle, time-contrastive learning (TCL), finds a representation which…

    Submitted 20 May, 2016; originally announced May 2016.

  34. arXiv:1408.2038  [pdf]

    cs.LG stat.ML

    A direct method for estimating a causal ordering in a linear non-Gaussian acyclic model

    Authors: Shohei Shimizu, Aapo Hyvarinen, Yoshinobu Kawahara

    Abstract: Structural equation models and Bayesian networks have been widely used to analyze causal relations between continuous variables. In such frameworks, linear acyclic models are typically used to model the data-generating process of variables. Recently, it was shown that use of non-Gaussianity identifies a causal ordering of variables in a linear acyclic model without using any prior knowledge on the…

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-506-513

  35. arXiv:1307.2307  [pdf, ps, other]

    stat.ML cs.LG

    Bridging Information Criteria and Parameter Shrinkage for Model Selection

    Authors: Kun Zhang, Heng Peng, Laiwan Chan, Aapo Hyvarinen

    Abstract: Model selection based on classical information criteria, such as BIC, is generally computationally demanding, but its properties are well studied. On the other hand, model selection based on parameter shrinkage by $\ell_1$-type penalties is computationally efficient. In this paper we make an attempt to combine their strengths, and propose a simple approach that penalizes the likelihood with data-d…

    Submitted 8 July, 2013; originally announced July 2013.

    Comments: 16 pages, 3 figures

  36. arXiv:1207.1413  [pdf]

    cs.LG cs.MS stat.ML

    Discovery of non-gaussian linear causal models using ICA

    Authors: Shohei Shimizu, Aapo Hyvarinen, Yutaka Kano, Patrik O. Hoyer

    Abstract: In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data (Spirtes et al. 2000; Pearl 2000). Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data,…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-525-533

  37. arXiv:1206.3260  [pdf]

    stat.ML cs.AI cs.LG

    Causal discovery of linear acyclic models with arbitrary distributions

    Authors: Patrik O. Hoyer, Aapo Hyvarinen, Richard Scheines, Peter L. Spirtes, Joseph Ramsey, Gustavo Lacerda, Shohei Shimizu

    Abstract: An important task in data analysis is the discovery of causal relationships between observed variables. For continuous-valued data, linear acyclic causal models are commonly used to model the data-generating process, and the inference of such models is a well-studied problem. However, existing methods have significant limitations. Methods based on conditional independencies (Spirtes et al. 1993; P…

    Submitted 13 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-282-289

  38. arXiv:1205.2599  [pdf]

    stat.ML cs.LG

    On the Identifiability of the Post-Nonlinear Causal Model

    Authors: Kun Zhang, Aapo Hyvarinen

    Abstract: By taking into account the nonlinear effect of the cause, the inner noise effect, and the measurement distortion effect in the observed variables, the post-nonlinear (PNL) causal model has demonstrated its excellent performance in distinguishing the cause from effect. However, its identifiability has not been properly addressed, and how to apply it in the case of more than two variables is also a…

    Submitted 9 May, 2012; originally announced May 2012.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-647-655

  39. arXiv:1203.3533  [pdf]

    cs.LG stat.ML

    Source Separation and Higher-Order Causal Analysis of MEG and EEG

    Authors: Kun Zhang, Aapo Hyvarinen

    Abstract: Separation of the sources and analysis of their connectivity have been an important topic in EEG/MEG analysis. To solve this problem in an automatic manner, we propose a two-layer model, in which the sources are conditionally uncorrelated from each other, but not independent; the dependence is caused by the causality in their time-varying variances (envelopes). The model is identified in two steps…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-709-716

  40. arXiv:1203.3506  [pdf]

    cs.LG stat.ML

    A Family of Computationally Efficient and Simple Estimators for Unnormalized Statistical Models

    Authors: Miika Pihlaja, Michael Gutmann, Aapo Hyvarinen

    Abstract: We introduce a new family of estimators for unnormalized statistical models. Our family of estimators is parameterized by two nonlinear functions and uses a single sample from an auxiliary distribution, generalizing Maximum Likelihood Monte Carlo estimation of Geyer and Thompson (1992). The family is such that we can estimate the partition function like any other parameter in the model. The estima…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-442-449