Department Talks

Metrics Matter: Examples from Binary and Multilabel Classification

IS Colloquium
  • 21 August 2017 • 11:15 - 12:15
  • Sanmi Koyejo
  • Empirical Inference meeting room (MPI-IS building, 4th floor)

Performance metrics are a key component of machine learning systems, and are ideally constructed to reflect real-world tradeoffs. In contrast, much of the literature simply focuses on algorithms for maximizing accuracy. With the increasing integration of machine learning into real systems, it is clear that accuracy is an insufficient measure of performance for many problems of interest. Unfortunately, unlike accuracy, many real-world performance metrics are non-decomposable, i.e., they cannot be computed as a sum of losses over individual instances. Thus, known algorithms and their associated analyses do not extend trivially, and direct approaches require expensive combinatorial optimization. I will outline recent results characterizing population-optimal classifiers for large families of binary and multilabel classification metrics, including nonlinear metrics such as the F-measure and the Jaccard measure. Perhaps surprisingly, the prediction that maximizes the utility for a range of such metrics takes a simple form. This results in simple and scalable procedures for optimizing complex metrics in practice. I will also outline how the same analysis gives optimal procedures for selecting point estimates from complex posterior distributions over structured objects such as graphs. Joint work with Nagarajan Natarajan, Bowei Yan, Kai Zhong, Pradeep Ravikumar and Inderjit Dhillon.
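The "simple form" mentioned above, for metrics such as the F-measure, is a threshold on the posterior probability P(y = 1 | x), with the threshold tuned to the metric rather than fixed at 0.5. The following is a minimal sketch of this plug-in idea, assuming calibrated probabilities; the toy dataset and the simple grid search are illustrative assumptions, not the speaker's actual algorithm.

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F-measure: harmonic mean of precision and recall (non-decomposable)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if 2 * tp + fp + fn == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def best_threshold(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Exploit the thresholding result: search over a 1-D grid of
    thresholds instead of over all 2^n label vectors."""
    scores = [(f1_score(y_true, (probs >= t).astype(int)), t) for t in grid]
    return max(scores)  # (best F1, threshold achieving it)

# Toy data: calibrated probabilities for an imbalanced problem.
rng = np.random.default_rng(0)
probs = rng.uniform(0, 1, size=2000) ** 2      # skewed toward the negative class
y = (rng.uniform(0, 1, size=2000) < probs).astype(int)

f1, t = best_threshold(probs, y)
```

Searching over a handful of thresholds replaces a combinatorial search over label vectors, which is the scalability point of the result; the metric-optimal threshold typically differs from the accuracy-optimal 0.5 under class imbalance.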

Organizers: Mijung Park

Dominik Bach - TBA

IS Colloquium
  • 02 October 2017 • 11:15 - 12:15
  • Dominik Bach

  • Hannes Nickisch, Philips Research, Hamburg
  • MRZ seminar room

Coronary artery disease (CAD) is the single leading cause of death worldwide, and Cardiac Computed Tomography Angiography (CCTA) is a non-invasive test to rule out CAD via anatomical characterization of coronary lesions. Recent studies suggest that the hemodynamic significance of coronary lesions can be assessed by Fractional Flow Reserve (FFR), which is usually measured invasively in the CathLab but can also be simulated from a patient-specific biophysical model based on CCTA data. We learn a parametric lumped model (LM) enabling fast computational fluid dynamics simulations of blood flow in elongated vessel networks, alleviating the computational burden of 3D finite element (FE) simulations. We adapt the coefficients balancing the local nonlinear hydraulic effects from a training set of precomputed FE simulations. Our LM yields accurate pressure predictions, suggesting that costly FE simulations can be replaced by our fast LM, paving the way for a personalised, interactive biophysical model with real-time feedback in clinical practice.
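As a rough illustration of the lumped-modelling idea (not the Philips model itself), one can posit a per-segment pressure drop with a linear viscous term and a nonlinear term, then fit the coefficients by least squares against precomputed simulations. The coefficients, flow range, and the synthetic stand-in for the FE training data below are all invented for the sketch.

```python
import numpy as np

# Lumped vessel segment: dP = a*q + b*q*|q|, where the linear term models
# viscous (Poiseuille-like) losses and the nonlinear term the local
# hydraulic effects the abstract mentions. All numbers are illustrative.
def pressure_drop(q, a, b):
    return a * q + b * q * np.abs(q)

# Synthetic stand-in for a training set of precomputed 3D FE simulations.
rng = np.random.default_rng(1)
q_train = np.linspace(0.5, 5.0, 40)                      # flow rates
dp_fe = pressure_drop(q_train, 8.0, 1.5) + rng.normal(0, 0.2, q_train.size)

# Least-squares fit of (a, b): dp ~ [q, q*|q|] @ [a, b].
X = np.column_stack([q_train, q_train * np.abs(q_train)])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, dp_fe, rcond=None)
dp_pred = pressure_drop(q_train, a_hat, b_hat)
```

Evaluating the fitted lumped model is a closed-form expression per segment, which is what makes real-time feedback plausible compared with a full 3D FE solve.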

  • Catrin Misselhorn
  • Max Planck Haus Lecture Hall

The development of increasingly intelligent and autonomous technologies will inevitably lead to these systems having to face morally problematic situations. This is particularly true of artificial systems used in geriatric care environments. It will, therefore, be necessary in the long run to develop machines with the capacity for a certain amount of autonomous moral decision-making. The goal of this talk is to provide the theoretical foundations for artificial morality, i.e., for implementing moral capacities in artificial systems in general, and a roadmap for developing an assistive system for geriatric care that is capable of moral learning.

Organizers: Ludovic Righetti, Philipp Hennig

Images of planets orbiting other stars

  • 01 March 2016 • 11:00 - 12:00
  • Sascha Quantz
  • AGBS Seminar Room

The detection and characterization of planets orbiting stars other than the Sun, i.e., so-called extrasolar planets, is one of the fastest growing and most vibrant research fields in modern astrophysics. In the last 25 years, more than 5400 extrasolar planets and planet candidates have been revealed, but the vast majority of these objects were detected with indirect techniques, where the existence of the planet is inferred from periodic changes in the light coming from the central star; no photons from the planets themselves are detected. In this talk, however, I will focus on the direct detection of extrasolar planets. On the one hand, I will describe the main challenges that have to be overcome in order to image planets around other stars. In addition to using the world's largest telescopes and optimized cameras, it was realized in the last few years that significant sensitivity gains can be achieved by applying advanced image-processing techniques. On the other hand, I will demonstrate what can be learned if one succeeds in “taking a picture” of an extrasolar planet. After all, there must be good scientific reasons and a strong motivation why the direct detection of extrasolar planets is one of the key science drivers for current and future projects on major ground- and space-based telescopes.
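One well-known family of the advanced image-processing techniques alluded to is PCA-based subtraction of the stellar point spread function (KLIP-style processing). Below is a minimal sketch on a synthetic one-dimensional frame stack; the PSF shape, brightness fluctuations, noise level, and the faint "planet" are all invented for illustration and are not the speaker's pipeline.

```python
import numpy as np

def pca_psf_subtract(frames, n_modes=1):
    """Project each frame onto the leading principal components of the
    stack and subtract that projection (a KLIP-style PSF model).
    frames: (n_frames, n_pixels) array of flattened images."""
    centered = frames - frames.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]                      # leading PSF modes
    psf_model = centered @ modes.T @ modes    # projection onto those modes
    return centered - psf_model               # residual: companion + noise

# Synthetic stack: a bright stellar PSF whose brightness fluctuates frame
# to frame, plus a faint "planet" present in only the first 10 frames
# (mimicking field rotation moving it through the pixel).
rng = np.random.default_rng(2)
n, npix = 50, 64
profile = np.exp(-0.5 * (np.arange(npix) - 32.0) ** 2 / 9.0)
amps = 100.0 + rng.normal(0, 10.0, n)
frames = amps[:, None] * profile + rng.normal(0, 1.0, (n, npix))
frames[:10, 50] += 4.0                        # faint companion signal

residuals = pca_psf_subtract(frames, n_modes=1)
```

The leading mode absorbs the fluctuating stellar halo, so the residual frames retain the faint companion while the star-dominated pixels are flattened; this is the sensitivity gain the abstract refers to.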

Organizers: Diana Rebmann

  • Aldo Faisal
  • MPH Lecture Hall

Our research questions centre on a basic characteristic of human brains: variability in their behaviour and its underlying meaning for cognitive mechanisms. Such variability is emerging as a key ingredient in understanding biological principles (Faisal, Selen & Wolpert, 2008, Nature Rev Neurosci), and yet adequate quantitative and computational methods for its description and analysis are still lacking. Crucially, we find that biological and behavioural variability contains important information that our brain and our technology can make use of (instead of just averaging it away): using advanced body sensor networks, we measured eye movements and full-body and hand kinematics of humans living in a studio flat, and we will present some insightful results on motor control and visual attention suggesting that the control of behaviour "in-the-wild" differs in predictable ways from what we measure "in-the-lab". The results have implications for robotics, prosthetics and neuroscience.

Organizers: Matthias Hohmann

Probabilistic Numerics for Differential Equations

IS Colloquium
  • 11 January 2016 • 11:15 - 12:15
  • Tim Sullivan

Beginning with a seminal paper of Diaconis (1988), the aim of so-called "probabilistic numerics" is to compute probabilistic solutions to deterministic problems arising in numerical analysis by casting them as statistical inference problems. For example, numerical integration of a deterministic function can be seen as the integration of an unknown/random function, with evaluations of the integrand at the integration nodes providing partial information about the integrand. Advantages offered by this viewpoint include: access to the Bayesian representation of prior and posterior uncertainties; better propagation of uncertainty through hierarchical systems than simple worst-case error bounds; and appropriate accounting for numerical truncation and round-off error in inverse problems, so that the replicability of deterministic simulations is not confused with their accuracy, which would otherwise yield an inappropriately concentrated Bayesian posterior. This talk will describe recent work on probabilistic numerical solvers for ordinary and partial differential equations, including their theoretical construction, convergence rates, and applications to forward and inverse problems. Joint work with Andrew Stuart (Warwick).
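The integration example can be made concrete with a small Bayesian quadrature sketch: place a Gaussian-process prior on the unknown integrand, condition on evaluations at the nodes, and read off the posterior mean of the integral, which is a weighted sum of the evaluations with weights derived from the kernel. The squared-exponential kernel, the length-scale, and the node placement below are arbitrary illustrative choices, not from the talk.

```python
import numpy as np
from math import erf, sqrt, pi

def bayes_quadrature(f, nodes, ell=0.3, jitter=1e-9):
    """Bayesian quadrature on [0, 1] with a squared-exponential GP prior.
    Posterior mean of the integral is z @ K^{-1} @ f(nodes), where
    z_i = integral_0^1 k(x, x_i) dx has a closed form via erf."""
    x = np.asarray(nodes)
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)
    K += jitter * np.eye(len(x))              # numerical stabilization
    c = ell * sqrt(pi / 2)
    z = np.array([c * (erf((1 - xi) / (sqrt(2) * ell))
                       + erf(xi / (sqrt(2) * ell))) for xi in x])
    weights = np.linalg.solve(K, z)           # quadrature weights, derived
    return weights @ f(x)

f = lambda x: np.sin(3 * x)
nodes = np.linspace(0.05, 0.95, 8)
estimate = bayes_quadrature(f, nodes)
truth = (1 - np.cos(3)) / 3                   # exact integral of sin(3x) on [0,1]
```

Unlike a classical quadrature rule, the same posterior also supplies a variance quantifying the uncertainty left by using only finitely many evaluations (omitted here for brevity); that is the "partial information" viewpoint of the abstract.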

Organizers: Philipp Hennig

  • Gernot Müller-Putz
  • MPH Lecture Hall

More than half of all persons with spinal cord injuries (SCI) suffer from impairments of both hands, which results in a tremendous decrease in quality of life and represents a major barrier to inclusion in society. Functional restoration is possible with neuroprostheses (NPs) based on functional electrical stimulation (FES). A brain-computer interface (BCI) provides a means of control for such neuroprostheses, since users have limited abilities to operate traditional assistive devices. This talk presents our early research on motor-imagery-based BCI control of NPs, discusses hybrid BCI solutions, and shows our work on movement trajectory decoding. An outlook on future BCI applications will conclude the talk.
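As an illustration of the kind of decoder underlying many motor-imagery BCIs (emphatically not the speaker's pipeline), one can classify log band-power features with Fisher's linear discriminant. The synthetic "trials" and the event-related desynchronization effect below are invented for the sketch.

```python
import numpy as np

def bandpower_features(trials):
    """Log-variance per channel, a standard proxy for mu/beta band power
    (assumes trials are already band-pass filtered)."""
    return np.log(np.var(trials, axis=2))

def fit_lda(X, y):
    """Fisher's linear discriminant: w = Sigma_pooled^-1 (mu1 - mu0)."""
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    Xc = np.concatenate([X[y == 0] - m0, X[y == 1] - m1])
    cov = Xc.T @ Xc / (len(X) - 2)            # pooled covariance
    w = np.linalg.solve(cov, m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

# Synthetic two-class data: class 1 shows event-related desynchronization
# (reduced variance, i.e. band power) on channel 0.
rng = np.random.default_rng(3)
n, ch, t = 100, 4, 256
trials0 = rng.normal(0, 1.0, (n, ch, t))
trials1 = rng.normal(0, 1.0, (n, ch, t))
trials1[:, 0, :] *= 0.6

X = bandpower_features(np.concatenate([trials0, trials1]))
y = np.concatenate([np.zeros(n), np.ones(n)])
w, b = fit_lda(X, y)
accuracy = np.mean((X @ w + b > 0) == y)      # training accuracy, illustrative
```

In a real NP control loop, the decoder output would gate or modulate the FES stimulation patterns rather than merely label trials.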

Organizers: Moritz Grosse-Wentrup

Imaging genomics of functional brain networks

IS Colloquium
  • 19 October 2015 • 11:15 - 12:15
  • Jonas Richiardi
  • Max Planck House, Lecture Hall

During rest, brain activity is intrinsically synchronized between different brain regions, forming networks of coherent activity. These functional networks (FNs), consisting of multiple regions widely distributed across lobes and hemispheres, appear to be a fundamental theme of neural organization in mammalian brains. Despite hundreds of studies detailing this phenomenon, the genetic and molecular mechanisms supporting these functional networks remain undefined. Previous work has mostly focused on polymorphisms in candidate genes, or used a twin study approach to demonstrate heritability of aspects of resting-state connectivity. The recent availability of high spatial resolution post-mortem brain gene expression datasets, together with several large-scale imaging genetics datasets, which contain joint in-vivo functional brain imaging data and genotype data for several hundred subjects, opens intriguing data analysis avenues. Using novel cross-modal graph-based statistics, we show that functional brain networks defined with resting-state fMRI can be recapitulated using measures of correlated gene expression, and that the relationship is not driven by gross tissue types. The set of genes we identify is significantly enriched for certain types of ion channels and synapse-related genes. We validate results by showing that polymorphisms in this set significantly correlate with alterations of in-vivo resting-state functional connectivity in a group of 259 adolescents. We further validate results on another species by showing that our list of genes is significantly associated with neuronal connectivity in the mouse brain. These results provide convergent, multimodal evidence that resting-state functional networks emerge from the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. 
Functional brain networks are also known to be perturbed in a variety of neurological and neuropsychological disorders, including Alzheimer's and schizophrenia. Given this link between disease and networks, and the fact that many brain disorders have genetic contributions, it seems that functional brain networks may be an interesting endophenotype for clinical use. We discuss the translational potential of the imaging genomics techniques we developed.
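A cross-modal statistic of the general kind described, relating a gene-expression similarity matrix to a functional-connectivity matrix over the same regions, can be sketched as a Mantel-style permutation test. The matrices, community structure, and noise levels below are synthetic stand-ins, not the authors' data or their exact graph-based statistic.

```python
import numpy as np

rng = np.random.default_rng(4)

def mantel_test(A, B, n_perm=1000):
    """Correlate the upper triangles of two symmetric region-by-region
    matrices; assess significance by permuting the region labels of B."""
    iu = np.triu_indices_from(A, k=1)
    r_obs = np.corrcoef(A[iu], B[iu])[0, 1]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(len(B))
        exceed += np.corrcoef(A[iu], B[p][:, p][iu])[0, 1] >= r_obs
    return r_obs, (exceed + 1) / (n_perm + 1)

# Synthetic matrices sharing a latent community structure: regions in the
# same "community" have both correlated expression and strong connectivity.
n_regions = 30
labels = rng.integers(0, 3, n_regions)
shared = (labels[:, None] == labels[None, :]).astype(float)
expr_sim = shared + rng.normal(0, 0.3, (n_regions, n_regions))
func_conn = shared + rng.normal(0, 0.3, (n_regions, n_regions))
expr_sim = (expr_sim + expr_sim.T) / 2       # symmetrize
func_conn = (func_conn + func_conn.T) / 2

r, pval = mantel_test(expr_sim, func_conn)
```

Permuting region labels preserves each matrix's internal structure while breaking the cross-modal alignment, which is what lets the test ask whether the two modalities agree beyond chance.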

Organizers: Moritz Grosse-Wentrup, Michel Besserve

  • Gael Varoquaux
  • Max Planck House Lecture Hall

Organizers: Moritz Grosse-Wentrup

High-dimensional statistical approaches for personalized medicine

IS Colloquium
  • 28 September 2015 • 12:00 - 13:00
  • Sach Mukherjee
  • Max Planck House Lecture Hall

Human diseases show considerable heterogeneity at the molecular level. Such heterogeneity is central to personalized medicine efforts that seek to exploit molecular data to better understand disease biology and inform clinical decision making. An emerging notion is that diseases and disease subgroups may differ not only at the level of mean molecular abundance, but also with respect to patterns of molecular interplay. I will discuss our ongoing efforts to develop methods to investigate such heterogeneity, with an emphasis on some high-dimensional aspects.
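The notion that subgroups can differ in patterns of molecular interplay while matching in mean abundance can be illustrated with partial correlations estimated per subgroup. The two synthetic groups and the coupled gene pair below are invented for the sketch and are not the speaker's methods, which target the much harder high-dimensional regime.

```python
import numpy as np

rng = np.random.default_rng(5)

def partial_corr(x):
    """Partial correlations from the inverse sample covariance
    (off-diagonal entries; the diagonal is -1 by this convention)."""
    prec = np.linalg.inv(np.cov(x, rowvar=False))
    d = np.sqrt(np.diag(prec))
    return -prec / np.outer(d, d)

# Two subgroups with identical means: in group A, genes 0 and 1 are
# strongly coupled; in group B they are independent.
p, n = 5, 500
cov_a = np.eye(p)
cov_a[0, 1] = cov_a[1, 0] = 0.8
cov_b = np.eye(p)
xa = rng.multivariate_normal(np.zeros(p), cov_a, size=n)
xb = rng.multivariate_normal(np.zeros(p), cov_b, size=n)

pc_a, pc_b = partial_corr(xa), partial_corr(xb)
# Mean abundances match across groups, yet the interplay between genes
# 0 and 1 differs sharply.
```

A mean-level comparison would find nothing here; only the interaction structure distinguishes the groups, which is the heterogeneity the abstract emphasizes.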

Organizers: Michel Besserve, Jonas Peters

  • Kevin T. Kelly
  • Max Planck House Lecture Hall

In machine learning, the standard explanation of Ockham's razor is to minimize predictive risk. But prediction is interpreted passively---one may not rely on predictions to change the probability distribution used for training. That limitation may be overcome by studying alternatively manipulated systems in randomized experimental trials, but experiments on multivariate systems or on human subjects are often infeasible or immoral. Happily, the past three decades have witnessed the development of a range of statistical techniques for discovering causal relations from non-experimental data. One characteristic of such methods is a strong Ockham bias toward simpler causal theories---i.e., theories with fewer causal connections among the variables of interest. Our question is what Ockham's razor has to do with finding true (rather than merely plausible) causal theories from non-experimental data. The traditional story of minimizing predictive risk does not apply, because uniform consistency is often infeasible in non-experimental causal discovery: without strong and implausible assumptions, the probability of erroneous causal orientation may be arbitrarily high at any sample size. The standard justification for causal discovery methods is point-wise consistency, or convergence in probability to the true causes. But Ockham's razor is not necessary for point-wise convergence: a Bayesian with a strong prior bias toward a complex model would also be point-wise consistent. Either way, the crucial Ockham bias remains disconnected from learning performance.

A method reverses its opinion in probability when it probably says A at some sample size and probably says B incompatible with A at a higher sample size. A method cycles in probability when it probably says A, then probably says B incompatible with A, and then probably says A again. Uniform consistency allows for no reversals or cycles in probability. Point-wise consistency allows for arbitrarily many.
Lying plausibly between those two extremes is straightest possible convergence to the truth, which allows for only as many cycles and reversals in probability as are necessary to solve the learning problem at hand. We show that Ockham's razor is necessary for cycle-minimal convergence and that patience, or waiting for nature to choose among simplest theories, is necessary for reversal-minimal convergence. The idea yields very tight constraints on inductive statistical methods, both classical and Bayesian, with causal discovery methods as an important special case. It also provides a valid interpretation of significance and power when tests are used to fish inductively for models. The talk is self-contained for a general scientific audience. Novel concepts are illustrated amply with figures and simulations.

Organizers: Michel Besserve, Kun Zhang