Jonas Peters (Project leader)
Alumni
Dominik Janzing
Senior Research Scientist
Biwei Huang
Alumni
Philipp Geiger
Ph.D. Student
31 results

2015


Semi-Supervised Interpolation in an Anticausal Learning Scenario

Janzing, D., Schölkopf, B.

Journal of Machine Learning Research, 16, pages: 1923-1948, September 2015 (article)

link (url) Project Page [BibTex]



Removing systematic errors for exoplanet search via latent causes

Schölkopf, B., Hogg, D., Wang, D., Foreman-Mackey, D., Janzing, D., Simon-Gabriel, C., Peters, J.

In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 2218–2226, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)

Extended version on Arxiv Web Project Page [BibTex]



Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

Geiger, P., Zhang, K., Schölkopf, B., Gong, M., Janzing, D.

In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 1917–1925, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)

PDF link (url) Project Page [BibTex]



Telling cause from effect in deterministic linear dynamical systems

Shajarisales, N., Janzing, D., Schölkopf, B., Besserve, M.

In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 285–294, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)

PDF Project Page [BibTex]



Inference of Cause and Effect with Unsupervised Inverse Regression

Sgouritsa, E., Janzing, D., Hennig, P., Schölkopf, B.

In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, 38, pages: 847-855, JMLR Workshop and Conference Proceedings, (Editors: Lebanon, G. and Vishwanathan, S.V.N.), JMLR.org, AISTATS, 2015 (inproceedings)

Web PDF Project Page [BibTex]


2014


Estimating Causal Effects by Bounding Confounding

Geiger, P., Janzing, D., Schölkopf, B.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 240-249, (Editors: Nevin L. Zhang and Jin Tian), AUAI Press, Corvallis, Oregon, UAI, 2014 (inproceedings)

PDF Project Page [BibTex]



Causal Discovery with Continuous Additive Noise Models

Peters, J., Mooij, J., Janzing, D., Schölkopf, B.

Journal of Machine Learning Research, 15, pages: 2009-2053, 2014 (article)

PDF Web Project Page [BibTex]


2013


From Ordinary Differential Equations to Structural Causal Models: the deterministic case

Mooij, J., Janzing, D., Schölkopf, B.

In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, pages: 440-448, (Editors: A Nicholson and P Smyth), AUAI Press, Corvallis, Oregon, UAI, 2013 (inproceedings)

PDF Project Page [BibTex]



Identifying Finite Mixtures of Nonparametric Product Distributions and Causal Inference of Confounders

Sgouritsa, E., Janzing, D., Peters, J., Schölkopf, B.

In Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI), pages: 556-565, (Editors: A Nicholson and P Smyth), AUAI Press, Corvallis, Oregon, USA, UAI, 2013 (inproceedings)

PDF Project Page [BibTex]



Causal Inference on Time Series using Restricted Structural Equation Models

Peters, J., Janzing, D., Schölkopf, B.

In Advances in Neural Information Processing Systems 26, pages: 154-162, (Editors: C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger), 27th Annual Conference on Neural Information Processing Systems (NIPS), 2013 (inproceedings)

PDF Project Page [BibTex]


2012


Information-geometric approach to inferring causal directions

Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., Steudel, B., Schölkopf, B.

Artificial Intelligence, 182-183, pages: 1-31, May 2012 (article)

Abstract
While conventional approaches to causal inference are mainly based on conditional (in)dependences, recent methods also account for the shape of (conditional) distributions. The idea is that the causal hypothesis “X causes Y” imposes that the marginal distribution P(X) and the conditional distribution P(Y|X) represent independent mechanisms of nature. Recently it has been postulated that the shortest description of the joint distribution P(X,Y) should therefore be given by separate descriptions of P(X) and P(Y|X). Since description length in the sense of Kolmogorov complexity is uncomputable, practical implementations rely on other notions of independence. Here we define independence via orthogonality in information space. This way, we can explicitly describe the kind of dependence that occurs between P(Y) and P(X|Y), making the causal hypothesis “Y causes X” implausible. Remarkably, this asymmetry between cause and effect becomes particularly simple if X and Y are deterministically related. We present an inference method that works in this case. We also discuss some theoretical results for the non-deterministic case although it is not clear how to employ them for a more general inference method.

Web DOI Project Page [BibTex]
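For illustration, the deterministic inference method mentioned in the abstract admits a compact sketch: after rescaling both variables to [0, 1], the slope-based score estimates E[log |f'(X)|], and the direction with the smaller score is taken as causal. This is a minimal sketch of the information-geometric idea, with function names and toy data of our own, not the authors' reference implementation.

```python
import numpy as np

def igci_score(x, y):
    """Slope-based estimate of C_{X->Y} = E[log |f'(X)|], after
    rescaling both variables to [0, 1] (uniform reference measure).
    Illustrative sketch, not the authors' reference code."""
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    idx = np.argsort(x)
    dx, dy = np.diff(x[idx]), np.diff(y[idx])
    keep = (dx != 0) & (dy != 0)          # skip ties to avoid log(0)
    return np.mean(np.log(np.abs(dy[keep]) / dx[keep]))

def infer_direction(x, y):
    # The hypothesis with the smaller score is taken as causal.
    return "X->Y" if igci_score(x, y) < igci_score(y, x) else "Y->X"

# toy check: y = x^3 on [0, 1]
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
print(infer_direction(x, x**3))           # typically "X->Y"
```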



On Causal and Anticausal Learning

Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J.

In Proceedings of the 29th International Conference on Machine Learning, pages: 1255-1262, (Editors: J Langford and J Pineau), Omnipress, New York, NY, USA, ICML, 2012 (inproceedings)

PDF Project Page [BibTex]


2011


Causal Inference on Discrete Data using Additive Noise Models

Peters, J., Janzing, D., Schölkopf, B.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2436-2450, December 2011 (article)

Abstract
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. The case of two random variables is particularly challenging since no (conditional) independences can be exploited. Recent methods that are based on additive noise models suggest the following principle: Whenever the joint distribution P(X,Y) admits such a model in one direction, e.g., Y = f(X) + N with N ⊥ X, but does not admit the reversed model X = g(Y) + Ñ with Ñ ⊥ Y, one infers the former direction to be causal (i.e., X → Y). Up to now, these approaches only dealt with continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work, we extend the notion of additive noise models to these cases. We prove that it almost never occurs that additive noise models can be fit in both directions. We further propose an efficient algorithm that is able to perform this way of causal inference on finite samples of discrete variables. We show that the algorithm works on both synthetic and real data sets.

PDF Web DOI Project Page [BibTex]
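To make the fit-and-test principle concrete, here is a heavily simplified sketch for the discrete case: fit f as the conditional mode, then check whether the residuals look independent of the putative cause via a chi-squared test. It assumes integer-coded variables; the paper's actual algorithm searches over functions more carefully, so treat this only as an illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

def anm_pvalue(x, y):
    """Fit y ~ f(x) + n with f(x) the conditional mode of y given x,
    then test independence of residuals n and x (chi-squared).
    Assumes integer-coded discrete variables; illustrative only."""
    f = {v: np.bincount(y[x == v] - y.min()).argmax() + y.min()
         for v in np.unique(x)}
    resid = y - np.array([f[v] for v in x])
    xs, rs = np.unique(x), np.unique(resid)
    table = np.array([[np.sum((x == a) & (resid == b)) for b in rs]
                      for a in xs])
    return chi2_contingency(table)[1]     # high p-value: model fits

def infer_direction(x, y, alpha=0.05):
    p_xy, p_yx = anm_pvalue(x, y), anm_pvalue(y, x)
    if p_xy > alpha >= p_yx: return "X->Y"
    if p_yx > alpha >= p_xy: return "Y->X"
    return "undecided"
```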



Detecting low-complexity unobserved causes

Janzing, D., Sgouritsa, E., Stegle, O., Peters, J., Schölkopf, B.

In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pages: 383-391, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, UAI, July 2011 (inproceedings)

Abstract
We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or whether it only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.

PDF Web Project Page [BibTex]
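One way to operationalize "where the conditionals sit in the simplex": if X influences Y only via a binary latent variable, every P(Y|x) is a mixture of just two fixed distributions, so the matrix of empirical conditionals should have numerical rank at most 2. The sketch below checks this via singular values; it is our illustration of the geometric idea, not the authors' test statistic.

```python
import numpy as np

def conditional_rank(x, y, tol=0.05):
    """Numerical rank of the matrix whose rows are the empirical
    conditionals P(Y | X = a).  Rank <= 2 is consistent with a
    binary unobserved variable mediating X and Y (sketch)."""
    xs, ys = np.unique(x), np.unique(y)
    M = np.array([[np.mean(y[x == a] == b) for b in ys] for a in xs])
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol * s[0]))
```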



Identifiability of causal graphs using functional models

Peters, J., Mooij, J., Janzing, D., Schölkopf, B.

In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pages: 589-598, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, UAI, July 2011 (inproceedings)

Abstract
This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional-independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative based on Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption relates to the Markov and faithfulness assumptions and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.

PDF Web Project Page [BibTex]



Kernel-based Conditional Independence Test and Application in Causal Discovery

Zhang, K., Peters, J., Janzing, D., Schölkopf, B.

In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pages: 804-813, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, UAI, July 2011 (inproceedings)

Abstract
Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

PDF Web Project Page [BibTex]
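The full KCI statistic and its asymptotic null distribution are beyond a short snippet, but its unconditional building block, a kernel (HSIC-style) independence statistic with a permutation null, fits in a few lines. This is a simplified sketch with an assumed median-heuristic bandwidth, not the paper's conditional test.

```python
import numpy as np

def rbf(a, sigma=None):
    """RBF kernel matrix; sigma defaults to one common median heuristic."""
    sq = np.sum(a * a, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * a @ a.T
    if sigma is None:
        sigma = np.sqrt(0.5 * np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y):
    """Biased HSIC estimator of dependence between x and y."""
    n = len(x)
    H = np.eye(n) - 1.0 / n
    Kx, Ky = rbf(x.reshape(n, -1)), rbf(y.reshape(n, -1))
    return np.trace(H @ Kx @ H @ Ky) / n ** 2

def independence_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value for H0: X independent of Y (sketch)."""
    rng = np.random.default_rng(seed)
    t = hsic(x, y)
    null = [hsic(x, y[rng.permutation(len(y))]) for _ in range(n_perm)]
    return float(np.mean([s >= t for s in null]))
```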



Testing whether linear equations are causal: A free probability theory approach

Zscheischler, J., Janzing, D., Zhang, K.

In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pages: 839-847, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, UAI, July 2011 (inproceedings)

Abstract
We propose a method that infers whether linear relations between two high-dimensional variables X and Y are due to a causal influence from X to Y or from Y to X. The earlier proposed so-called Trace Method is extended to the regime where the dimension of the observed variables exceeds the sample size. Based on previous work, we postulate conditions that characterize a causal relation between X and Y. Moreover, we describe a statistical test and argue that both causal directions are typically rejected if there is a common cause. A full theoretical analysis is presented for the deterministic case, but our approach seems to be valid for the noisy case too, for which we additionally present an approach based on a sparsity constraint. The discussed method yields promising results for both simulated and real world data.

PDF Web Project Page [BibTex]



k-NN Regression Adapts to Local Intrinsic Dimension

Kpotufe, S.

In Advances in Neural Information Processing Systems 24, pages: 729-737, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
Many nonparametric regressors were recently shown to converge at rates that depend only on the intrinsic dimension of data. These regressors thus escape the curse of dimension when high-dimensional data has low intrinsic dimension (e.g. a manifold). We show that k-NN regression is also adaptive to intrinsic dimension. In particular our rates are local to a query x and depend only on the way masses of balls centered at x vary with radius. Furthermore, we show a simple way to choose k = k(x) locally at any x so as to nearly achieve the minimax rate at x in terms of the unknown intrinsic dimension in the vicinity of x. We also establish that the minimax rate does not depend on a particular choice of metric space or distribution, but rather that this minimax rate holds for any metric space and doubling measure.

PDF Web Project Page [BibTex]
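The abstract's local rule, letting k depend on how the mass of balls around the query grows with radius, can be sketched as follows. The doubling-dimension estimate and the k ~ n^(2/(2+d)) rule are illustrative assumptions on our part, not the exact choice analyzed in the paper.

```python
import numpy as np

def knn_regress_local(Xtr, ytr, xq, c=1.0):
    """k-NN regression at query xq with k chosen from a local
    doubling-dimension estimate (assumed rule k ~ n^(2/(2+d)))."""
    n = len(Xtr)
    dist = np.linalg.norm(Xtr - xq, axis=1)
    order = np.argsort(dist)
    r = dist[order[min(int(np.sqrt(n)), n - 1)]]   # pilot radius
    m_r = np.sum(dist <= r)
    m_half = max(np.sum(dist <= r / 2), 1)
    d_hat = max(np.log2(m_r / m_half), 1.0)        # local doubling dim.
    k = int(np.clip(c * n ** (2.0 / (2.0 + d_hat)), 1, n))
    return ytr[order[:k]].mean()
```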


2010


Causal Inference Using the Algorithmic Markov Condition

Janzing, D., Schölkopf, B.

IEEE Transactions on Information Theory, 56(10):5168-5194, October 2010 (article)

Abstract
Inferring the causal structure that links n observables is usually based upon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when the sample size is one. We develop a theory of how to generate causal graphs explaining similarities between single objects. To this end, we replace the notion of conditional stochastic independence in the causal Markov condition with the vanishing of conditional algorithmic mutual information and describe the corresponding causal inference rules. We explain why a consistent reformulation of causal inference in terms of algorithmic complexity implies a new inference principle that also takes into account the complexity of conditional probability densities, making it possible to select among Markov equivalent causal graphs. This insight provides a theoretical foundation of a heuristic principle proposed in earlier work. We also sketch some ideas on how to replace Kolmogorov complexity with decidable complexity criteria. This can be seen as an algorithmic analog of replacing the empirically undecidable question of statistical independence with practical independence tests that are based on implicit or explicit assumptions on the underlying distribution.

PDF Web DOI Project Page [BibTex]



Justifying Additive Noise Model-Based Causal Discovery via Algorithmic Information Theory

Janzing, D., Steudel, B.

Open Systems and Information Dynamics, 17(2):189-212, June 2010 (article)

Abstract
A recent method for causal discovery is in many cases able to infer whether X causes Y or Y causes X for just two observed variables X and Y. It is based on the observation that there exist (non-Gaussian) joint distributions P(X,Y) for which Y may be written as a function of X up to an additive noise term that is independent of X and no such model exists from Y to X. Whenever this is the case, one prefers the causal model X → Y. Here we justify this method by showing that the causal hypothesis Y → X is unlikely because it requires a specific tuning between P(Y) and P(X|Y) to generate a distribution that admits an additive noise model from X to Y. To quantify the amount of tuning needed, we derive lower bounds on the algorithmic information shared by P(Y) and P(X|Y). This way, our justification is consistent with recent approaches for using algorithmic information theory for causal reasoning. We extend this principle to the case where P(X,Y) almost admits an additive noise model. Our results suggest that the above conclusion is more reliable if the complexity of P(Y) is high.

PDF Web DOI Project Page [BibTex]


Inferring deterministic causal relations

Daniusis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.

In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pages: 143-150, (Editors: P Grünwald and P Spirtes), AUAI Press, Corvallis, OR, USA, UAI, July 2010 (inproceedings)

Abstract
We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.

PDF Web Project Page [BibTex]



Telling cause from effect based on high-dimensional observations

Janzing, D., Hoyer, P., Schölkopf, B.

In Proceedings of the 27th International Conference on Machine Learning, pages: 479-486, (Editors: J Fürnkranz and T Joachims), International Machine Learning Society, Madison, WI, USA, ICML, June 2010 (inproceedings)

Abstract
We describe a method for inferring linear causal relations among multi-dimensional variables. The idea is to use an asymmetry between the distributions of cause and effect that occurs if the covariance matrix of the cause and the structure matrix mapping the cause to the effect are independently chosen. The method applies to both stochastic and deterministic causal relations, provided that the dimensionality is sufficiently high (in some experiments, 5 was enough). It is applicable to Gaussian as well as non-Gaussian data.

PDF Web Project Page [BibTex]
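The independence of the structure matrix and the input covariance can be checked via a trace condition: under X → Y with Y ≈ AX, the normalized trace tr(AΣ_X Aᵀ)/d should approximately factorize as (tr(AAᵀ)/d)(tr(Σ_X)/d). Below is a minimal sketch, assuming equal dimensions and A fit by least squares; the names and the decision rule are our simplifications.

```python
import numpy as np

def trace_delta(X, Y):
    """Log-deviation from the trace condition for the hypothesis
    X -> Y; values near 0 support the hypothesis (sketch, with the
    structure matrix A estimated by least squares)."""
    d = X.shape[1]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # Y ~ X @ B
    A = B.T                                     # so y ~ A x
    S = np.cov(X.T)
    t = lambda M: np.trace(M) / d               # normalized trace
    return np.log(t(A @ S @ A.T)) - np.log(t(A @ A.T)) - np.log(t(S))

def infer_direction(X, Y):
    # Prefer the direction whose trace condition is violated less.
    return "X->Y" if abs(trace_delta(X, Y)) < abs(trace_delta(Y, X)) else "Y->X"
```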



Probabilistic latent variable models for distinguishing between cause and effect

Mooij, J., Stegle, O., Janzing, D., Zhang, K., Schölkopf, B.

In Advances in Neural Information Processing Systems 23, pages: 1687-1695, (Editors: J Lafferty and CKI Williams and J Shawe-Taylor and RS Zemel and A Culotta), Curran, Red Hook, NY, USA, 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
We propose a novel method for inferring whether X causes Y or vice versa from joint observations of X and Y. The basic idea is to model the observed data using probabilistic latent variable models, which incorporate the effects of unobserved noise. To this end, we consider the hypothetical effect variable to be a function of the hypothetical cause variable and an independent noise term (not necessarily additive). An important novel aspect of our work is that we do not restrict the model class, but instead put general non-parametric priors on this function and on the distribution of the cause. The causal direction can then be inferred by using standard Bayesian model selection. We evaluate our approach on synthetic data and real-world data and report encouraging results.

PDF Web Project Page [BibTex]



Identifying Cause and Effect on Discrete Data using Additive Noise Models

Peters, J., Janzing, D., Schölkopf, B.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 597-604, (Editors: YW Teh and M Titterington), JMLR, Cambridge, MA, USA, 13th International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. Whenever the joint distribution P(X,Y) admits such a model in one direction, e.g. Y = f(X) + N with N ⊥ X, it does not admit the reversed model X = g(Y) + Ñ with Ñ ⊥ Y as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. We show that this algorithm works both on synthetic and real data sets.

PDF Web Project Page [BibTex]



Causal Markov condition for submodular information measures

Steudel, B., Janzing, D., Schölkopf, B.

In Proceedings of the 23rd Annual Conference on Learning Theory, pages: 464-476, (Editors: AT Kalai and M Mohri), OmniPress, Madison, WI, USA, COLT, June 2010 (inproceedings)

Abstract
The causal Markov condition (CMC) is a postulate that links observations to causality. It describes the conditional independences among the observations that are entailed by a causal hypothesis in terms of a directed acyclic graph. In the conventional setting, the observations are random variables and the independence is a statistical one, i.e., the information content of observations is measured in terms of Shannon entropy. We formulate a generalized CMC for any kind of observations on which independence is defined via an arbitrary submodular information measure. Recently, this has been discussed for observations in terms of binary strings where information is understood in the sense of Kolmogorov complexity. Our approach enables us to find computable alternatives to Kolmogorov complexity, e.g., the length of a text after applying existing data compression schemes. We show that our CMC is justified if one restricts the attention to a class of causal mechanisms that is adapted to the respective information measure. Our justification is similar to deriving the statistical CMC from functional models of causality, where every variable is a deterministic function of its observed causes and an unobserved noise term. Our experiments on real data demonstrate the performance of compression based causal inference.

PDF Web Project Page [BibTex]
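The abstract's suggestion, replacing Kolmogorov complexity with the length of compressed text, can be toyed with directly. A crude sketch using zlib follows; any compressor could be substituted, and this is not the paper's experimental setup.

```python
import zlib

def C(s: bytes) -> int:
    """Compressed length: a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(s, 9))

def compression_mutual_info(x: bytes, y: bytes) -> int:
    """Compression analogue of algorithmic mutual information:
    I(x:y) ~ C(x) + C(y) - C(xy)."""
    return C(x) + C(y) - C(x + y)

text = b"the quick brown fox jumps over the lazy dog " * 20
print(compression_mutual_info(text, text))        # large: strong dependence
print(compression_mutual_info(text, bytes(900)))  # near zero: no dependence
```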


2009


Robot Learning

Peters, J., Morimoto, J., Tedrake, R., Roy, N.

IEEE Robotics and Automation Magazine, 16(3):19-20, September 2009 (article)

Abstract
Creating autonomous robots that can learn to act in unpredictable environments has been a long-standing goal of robotics, artificial intelligence, and the cognitive sciences. In contrast, current commercially available industrial and service robots mostly execute fixed tasks and exhibit little adaptability. To bridge this gap, machine learning offers a myriad set of methods, some of which have already been applied with great success to robotics problems. As a result, there is an increasing interest in machine learning and statistics within the robotics community. At the same time, there has been a growth in the learning community in using robots as motivating applications for new algorithms and formalisms. Considerable evidence of this exists in the use of learning in high-profile competitions such as RoboCup and the Defense Advanced Research Projects Agency (DARPA) challenges, and the growing number of research programs funded by governments around the world.

PDF Web DOI Project Page [BibTex]



Nonlinear causal discovery with additive noise models

Hoyer, P., Janzing, D., Mooij, J., Peters, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 21, pages: 689-696, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that in fact the basic linear framework can be generalized to nonlinear models with additive noise. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities.

PDF Web Project Page [BibTex]
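The basic bivariate procedure this framework supports can be sketched in a few lines: regress each way with a nonparametric regressor, then test whether the residuals are independent of the input, e.g., with the permutation test sketched under the KCI entry above. The GP regressor and the alpha value here are our illustrative choices, not the paper's exact setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def anm_residuals(x, y):
    """Nonparametric regression of y on x; returns the residuals."""
    gp = GaussianProcessRegressor(alpha=1e-2).fit(x.reshape(-1, 1), y)
    return y - gp.predict(x.reshape(-1, 1))

def infer_direction(x, y, indep_pvalue, alpha=0.05):
    # indep_pvalue: any test returning a p-value for independence,
    # e.g. the HSIC permutation test sketched earlier on this page.
    p_xy = indep_pvalue(x, anm_residuals(x, y))
    p_yx = indep_pvalue(y, anm_residuals(y, x))
    if p_xy > alpha >= p_yx: return "X->Y"
    if p_yx > alpha >= p_xy: return "Y->X"
    return "undecided"
```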



Identifying confounders using additive noise models

Janzing, D., Peters, J., Mooij, J., Schölkopf, B.

In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pages: 249-257, (Editors: J Bilmes and AY Ng), AUAI Press, Corvallis, OR, USA, UAI, June 2009 (inproceedings)

Abstract
We propose a method for inferring the existence of a latent common cause ("confounder") of two observed random variables. The method assumes that the two effects of the confounder are (possibly nonlinear) functions of the confounder plus independent, additive noise. We discuss under which conditions the model is identifiable (up to an arbitrary reparameterization of the confounder) from the joint distribution of the effects. We state and prove a theoretical result that provides evidence for the conjecture that the model is generically identifiable under suitable technical conditions. In addition, we propose a practical method to estimate the confounder from a finite i.i.d. sample of the effects and illustrate that the method works well on both simulated and real-world data.

PDF Web Project Page [BibTex]



Regression by dependence minimization and its application to causal inference in additive noise models

Mooij, J., Janzing, D., Peters, J., Schölkopf, B.

In Proceedings of the 26th International Conference on Machine Learning, pages: 745-752, (Editors: A Danyluk and L Bottou and M Littman), ACM Press, New York, NY, USA, ICML, June 2009 (inproceedings)

Abstract
Motivated by causal inference problems, we propose a novel method for regression that minimizes the statistical dependence between regressors and residuals. The key advantage of this approach to regression is that it does not assume a particular distribution of the noise, i.e., it is non-parametric with respect to the noise distribution. We argue that the proposed regression method is well suited to the task of causal inference in additive noise models. A practical disadvantage is that the resulting optimization problem is generally non-convex and can be difficult to solve. Nevertheless, we report good results on one of the tasks of the NIPS 2008 Causality Challenge, where the goal is to distinguish causes from effects in pairs of statistically dependent variables. In addition, we propose an algorithm for efficiently inferring causal models from observational data for more than two variables. The required number of regressions and independence tests is quadratic in the number of variables, which is a significant improvement over the simple method that tests all possible DAGs.

PDF Web DOI Project Page [BibTex]
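A toy version of the proposal, fit a function so that the residuals are as independent of the regressor as possible, can be written with a polynomial family and an HSIC objective. The optimizer, function family, and kernel widths are our simplifications, and (as the abstract notes) the problem is non-convex, so the result depends on the least-squares warm start.

```python
import numpy as np
from scipy.optimize import minimize

def hsic_stat(a, b):
    """Biased HSIC estimator, RBF kernels with median-heuristic widths."""
    a, b = a.reshape(-1, 1), b.reshape(-1, 1)
    def K(v):
        d2 = (v - v.T) ** 2
        return np.exp(-d2 / np.median(d2[d2 > 0]))
    n = len(a)
    H = np.eye(n) - 1.0 / n
    return np.trace(H @ K(a) @ H @ K(b)) / n ** 2

def fit_dependence_minimizing(x, y, degree=3):
    """Choose polynomial coefficients theta so the residuals y - f(x)
    are as independent of x as possible (illustrative sketch)."""
    P = np.vander(x, degree + 1)
    theta0, *_ = np.linalg.lstsq(P, y, rcond=None)   # LS warm start
    res = minimize(lambda th: hsic_stat(x, y - P @ th),
                   theta0, method="Nelder-Mead")
    return res.x
```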



Nonlinear directed acyclic structure learning with weakly additive noise models

Tillman, R., Gretton, A., Spirtes, P.

In Advances in Neural Information Processing Systems 22, pages: 1847-1855, (Editors: Y Bengio and D Schuurmans and J Lafferty and C Williams and A Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
The recently proposed additive noise model has advantages over previous structure learning algorithms, when attempting to recover some true data generating mechanism, since it (i) does not assume linearity or Gaussianity and (ii) can recover a unique DAG rather than an equivalence class. However, its original extension to the multivariate case required enumerating all possible DAGs, and for some special distributions, e.g. linear Gaussian, the model is invertible and thus cannot be used for structure learning. We present a new approach which combines a PC style search using recent advances in kernel measures of conditional dependence with local searches for additive noise models in substructures of the equivalence class. This results in a more computationally efficient approach that is useful for arbitrary distributions even when additive noise models are invertible. Experiments with synthetic and real data show that this method is more accurate than previous methods when data are nonlinear and/or non-Gaussian.

PDF Web Project Page [BibTex]
