Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

PDF link (url) [BibTex]

Domain adaptation-can quantity compensate for quality?

Ben-David, S., Urner, R.

Annals of Mathematics and Artificial Intelligence, 70(3):185-202, 2014 (article)

link (url) DOI [BibTex]

Voxel level [18]F-FDG PET/MRI unsupervised segmentation of the tumor microenvironment

Katiyar, P., Divine, M. R., Pichler, B. J., Disselhorst, J. A.

World Molecular Imaging Conference, 2014 (poster)

[BibTex]

Sérsic galaxy models in weak lensing shape measurement: model bias, noise bias and their interaction

Kacprzak, T., Bridle, S., Rowe, B., Voigt, L., Zuntz, J., Hirsch, M., MacCrann, N.

Monthly Notices of the Royal Astronomical Society, 441(3):2528-2538, Oxford University Press, 2014 (article)

DOI [BibTex]

Learning to Deblur

Schuler, C. J., Hirsch, M., Harmeling, S., Schölkopf, B.

In NIPS 2014 Deep Learning and Representation Learning Workshop, 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

link (url) [BibTex]

Analysis of Distance Functions in Graphs

Alamgir, M.

University of Hamburg, Germany, 2014 (phdthesis)

[BibTex]

Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244 - 2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant control and force control in many modern complex robots, like humanoid or disaster robots. Due to many unknown and hard-to-model nonlinearities, analytical models of such robots are often only very rough approximations. However, modern optimization controllers frequently depend on reasonably accurate models, and degrade greatly in robustness and performance if model errors are too large. For a long time, machine learning has been expected to provide automatic empirical model synthesis, yet so far, research has only generated feasibility studies but no learning algorithms that run reliably on complex robots. In this paper, we combine two promising worlds of regression techniques to generate a more powerful regression learning system. On the one hand, locally weighted regression techniques are computationally efficient, but hard to tune due to a variety of data dependent meta-parameters. On the other hand, Bayesian regression has rather automatic and robust methods to set learning parameters, but becomes quickly computationally infeasible for big and high-dimensional data sets. By reducing the complexity of Bayesian regression in the spirit of local model learning through variational approximations, we arrive at a novel algorithm that is computationally efficient and easy to initialize for robust learning. Evaluations on several datasets demonstrate very good learning performance and the potential for a general regression learning tool for robotics.

PDF link (url) DOI [BibTex]

The sample complexity of agnostic learning under deterministic labels

Ben-David, S., Urner, R.

In Proceedings of the 27th Conference on Learning Theory, 35, pages: 527-542, (Editors: Balcan, M.-F. and Feldman, V. and Szepesvári, C.), JMLR, COLT, 2014 (inproceedings)

link (url) [BibTex]

Towards an optimal stochastic alternating direction method of multipliers

Azadi, S., Sra, S.

Proceedings of the 31st International Conference on Machine Learning, 32, pages: 620-628, (Editors: Xing, E. P. and Jebara, T.), JMLR, ICML, 2014 (conference)

link (url) [BibTex]

Diminished White Matter Integrity in Patients with Systemic Lupus Erythematosus

Schmidt-Wilcke, T., Cagnoli, P., Wang, P., Schultz, T., Lotz, A., Mccune, W. J., Sundgren, P. C.

NeuroImage: Clinical, 5, pages: 291-297, 2014 (article)

DOI [BibTex]

Open Problem: Finding Good Cascade Sampling Processes for the Network Inference Problem

Gomez Rodriguez, M., Song, L., Schölkopf, B.

Proceedings of the 27th Conference on Learning Theory, 35, pages: 1276-1279, (Editors: Balcan, M.-F. and Szepesvári, C.), JMLR.org, COLT, 2014 (conference)

PDF [BibTex]

Information-Theoretic Bounded Rationality and ϵ-Optimality

Braun, DA, Ortega, PA

Entropy, 16(8):4662-4676, August 2014 (article)

Abstract
Bounded rationality concerns the study of decision makers with limited information processing resources. Previously, the free energy difference functional has been suggested to model bounded rational decision making, as it provides a natural trade-off between an energy or utility function that is to be optimized and information processing costs that are measured by entropic search costs. The main question of this article is how the information-theoretic free energy model relates to simple \(\epsilon\)-optimality models of bounded rational decision making, where the decision maker is satisfied with any action in an \(\epsilon\)-neighborhood of the optimal utility. We find that the stochastic policies that optimize the free energy trade-off comply with the notion of \(\epsilon\)-optimality. Moreover, this optimality criterion even holds when the environment is adversarial. We conclude that the study of bounded rationality based on \(\epsilon\)-optimality criteria that abstract away from the particulars of the information processing constraints is compatible with the information-theoretic free energy model of bounded rationality.

DOI [BibTex]

Occam’s Razor in sensorimotor learning

Genewein, T, Braun, D

Proceedings of the Royal Society of London B, 281(1783):1-7, May 2014 (article)

Abstract
A large number of recent studies suggest that the sensorimotor system uses probabilistic models to predict its environment and makes inferences about unobserved variables in line with Bayesian statistics. One of the important features of Bayesian statistics is Occam's Razor—an inbuilt preference for simpler models when comparing competing models that explain some observed data equally well. Here, we test directly for Occam's Razor in sensorimotor control. We designed a sensorimotor task in which participants had to draw lines through clouds of noisy samples of an unobserved curve generated by one of two possible probabilistic models—a simple model with a large length scale, leading to smooth curves, and a complex model with a short length scale, leading to more wiggly curves. In training trials, participants were informed about the model that generated the stimulus so that they could learn the statistics of each model. In probe trials, participants were then exposed to ambiguous stimuli. In probe trials where the ambiguous stimulus could be fitted equally well by both models, we found that participants showed a clear preference for the simpler model. Moreover, we found that participants’ choice behaviour was quantitatively consistent with Bayesian Occam's Razor. We also show that participants’ drawn trajectories were similar to samples from the Bayesian predictive distribution over trajectories and significantly different from two non-probabilistic heuristics. In two control experiments, we show that the preference of the simpler model cannot be simply explained by a difference in physical effort or by a preference for curve smoothness. Our results suggest that Occam's Razor is a general behavioural principle already present during sensorimotor processing.

DOI [BibTex]

Generalized Thompson sampling for sequential decision-making and causal inference

Ortega, PA, Braun, DA

Complex Adaptive Systems Modeling, 2(2):1-23, March 2014 (article)

Abstract
Purpose: Sampling an action according to the probability that the action is believed to be the optimal one is sometimes called Thompson sampling. Methods: Although mostly applied to bandit problems, Thompson sampling can also be used to solve sequential adaptive control problems, when the optimal policy is known for each possible environment. The predictive distribution over actions can then be constructed by a Bayesian superposition of the policies weighted by their posterior probability of being optimal. Results: Here we discuss two important features of this approach. First, we show to what extent such generalized Thompson sampling can be regarded as an optimal strategy under limited information processing capabilities that constrain the sampling complexity of the decision-making process. Second, we show how such Thompson sampling can be extended to solve causal inference problems when interacting with an environment in a sequential fashion. Conclusion: In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.

DOI [BibTex]

Assessing randomness and complexity in human motion trajectories through analysis of symbolic sequences

Peng, Z, Genewein, T, Braun, DA

Frontiers in Human Neuroscience, 8(168):1-13, March 2014 (article)

Abstract
Complexity is a hallmark of intelligent behavior consisting both of regular patterns and random variation. To quantitatively assess the complexity and randomness of human motion, we designed a motor task in which we translated subjects' motion trajectories into strings of symbol sequences. In the first part of the experiment participants were asked to perform self-paced movements to create repetitive patterns, copy pre-specified letter sequences, and generate random movements. To investigate whether the degree of randomness can be manipulated, in the second part of the experiment participants were asked to perform unpredictable movements in the context of a pursuit game, where they received feedback from an online Bayesian predictor guessing their next move. We analyzed symbol sequences representing subjects' motion trajectories with five common complexity measures: predictability, compressibility, approximate entropy, Lempel-Ziv complexity, as well as effective measure complexity. We found that subjects’ self-created patterns were the most complex, followed by drawing movements of letters and self-paced random motion. We also found that participants could change the randomness of their behavior depending on context and feedback. Our results suggest that humans can adjust both complexity and regularity in different movement types and contexts and that this can be assessed with information-theoretic measures of the symbolic sequences generated from movement trajectories.

DOI [BibTex]

Curiosity-driven learning with Context Tree Weighting

Peng, Z, Braun, DA

pages: 366-367, IEEE, Piscataway, NJ, USA, 4th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (IEEE ICDL-EPIROB), October 2014 (conference)

Abstract
In the first simulation, the intrinsic motivation of the agent was given by measuring learning progress through reduction in informational surprise (Figure 1 A-C). This way the agent should first learn the action that is easiest to learn (a1), and then switch to other actions that still allow for learning (a2) and ignore actions that cannot be learned at all (a3). This is exactly what we found in our simple environment. Compared to the original developmental learning algorithm based on learning progress proposed by Oudeyer [2], our Context Tree Weighting approach does not require local experts to do prediction, rather it learns the conditional probability distribution over observations given action in one structure. In the second simulation, the intrinsic motivation of the agent was given by measuring compression progress through improvement in compressibility (Figure 1 D-F). The agent behaves similarly: the agent first concentrates on the action with the most predictable consequence and then switches over to the regular action where the consequence is more difficult to predict, but still learnable. Unlike the previous simulation, random actions are also interesting to some extent because the compressed symbol strings use 8-bit representations, while only 2 bits are required for our observation space. Our preliminary results suggest that Context Tree Weighting might provide a useful representation to study problems of development.

DOI [BibTex]

Monte Carlo methods for exact & efficient solution of the generalized optimality equations

Ortega, PA, Braun, DA, Tishby, N

pages: 4322-4327, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), June 2014 (conference)

Abstract
Previous work has shown that classical sequential decision making rules, including expectimax and minimax, are limit cases of a more general class of bounded rational planning problems that trade off the value and the complexity of the solution, as measured by its information divergence from a given reference. This allows modeling a range of novel planning problems having varying degrees of control due to resource constraints, risk-sensitivity, trust and model uncertainty. However, so far it has been unclear in what sense information constraints relate to the complexity of planning. In this paper, we introduce Monte Carlo methods to solve the generalized optimality equations in an efficient & exact way when the inverse temperatures in a generalized decision tree are of the same sign. These methods highlight a fundamental relation between inverse temperatures and the number of Monte Carlo proposals. In particular, it is seen that the number of proposals is essentially independent of the size of the decision tree.

link (url) DOI [BibTex]

2012


Support Vector Machines, Support Measure Machines, and Quasar Target Selection

Muandet, K.

Center for Cosmology and Particle Physics (CCPP), New York University, December 2012 (talk)

[BibTex]

Hilbert Space Embedding for Dirichlet Process Mixtures

Muandet, K.

NIPS Workshop on Confluence between Kernel Methods and Graphical Models, December 2012 (talk)

[BibTex]

Jensen-Bregman LogDet Divergence with Application to Efficient Similarity Search for Covariance Matrices

Cherian, A., Sra, S., Banerjee, A., Papanikolopoulos, N.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(9):2161-2174, December 2012 (article)

DOI [BibTex]

Hippocampal-Cortical Interaction during Periods of Subcortical Silence

Logothetis, N., Eschenko, O., Murayama, Y., Augath, M., Steudel, T., Evrard, H., Besserve, M., Oeltermann, A.

Nature, 491, pages: 547-553, November 2012 (article)

Abstract
Hippocampal ripples, episodic high-frequency field-potential oscillations primarily occurring during sleep and calmness, have been described in mice, rats, rabbits, monkeys and humans, and so far they have been associated with retention of previously acquired awake experience. Although hippocampal ripples have been studied in detail using neurophysiological methods, the global effects of ripples on the entire brain remain elusive, primarily owing to a lack of methodologies permitting concurrent hippocampal recordings and whole-brain activity mapping. By combining electrophysiological recordings in hippocampus with ripple-triggered functional magnetic resonance imaging, here we show that most of the cerebral cortex is selectively activated during the ripples, whereas most diencephalic, midbrain and brainstem regions are strongly and consistently inhibited. Analysis of regional temporal response patterns indicates that thalamic activity suppression precedes the hippocampal population burst, which itself is temporally bounded by massive activations of association and primary cortical areas. These findings suggest that during off-line memory consolidation, synergistic thalamocortical activity may be orchestrating a privileged interaction state between hippocampus and cortex by silencing the output of subcortical centres involved in sensory processing or potentially mediating procedural learning. Such a mechanism would cause minimal interference, enabling consolidation of hippocampus-dependent memory.

Web DOI [BibTex]

Scalable graph kernels

Shervashidze, N.

Eberhard Karls Universität Tübingen, Germany, October 2012 (phdthesis)

Web [BibTex]

Probabilistic Modelling of Expression Variation in Modern eQTL Studies

Zwießele, M.

Eberhard Karls Universität Tübingen, Germany, October 2012 (mastersthesis)

[BibTex]

Thermodynamic limits of dynamic cooling

Allahverdyan, A., Hovhannisyan, K., Janzing, D., Mahler, G.

Physical Review E, 84(4):16, October 2012 (article)

Abstract
We study dynamic cooling, where an externally driven two-level system is cooled via reservoir, a quantum system with initial canonical equilibrium state. We obtain explicitly the minimal possible temperature Tmin>0 reachable for the two-level system. The minimization goes over all unitary dynamic processes operating on the system and reservoir and over the reservoir energy spectrum. The minimal work needed to reach Tmin grows as 1/Tmin. This work cost can be significantly reduced, though, if one is satisfied by temperatures slightly above Tmin. Our results on Tmin>0 prove unattainability of the absolute zero temperature without ambiguities that surround its derivation from the entropic version of the third law. We also study cooling via a reservoir consisting of N≫1 identical spins. Here we show that Tmin∝1/N and find the maximal cooling compatible with the minimal work determined by the free energy. Finally we discuss cooling by reservoir with an initially microcanonic state and show that although a purely microcanonic state can yield the zero temperature, the unattainability is recovered when taking into account imperfections in preparing the microcanonic state.

Web DOI [BibTex]

GLIDE: GPU-Based Linear Regression for Detection of Epistasis

Kam-Thong, T., Azencott, C., Cayton, L., Pütz, B., Altmann, A., Karbalai, N., Sämann, P., Schölkopf, B., Müller-Myhsok, B., Borgwardt, K.

Human Heredity, 73(4):220-236, September 2012 (article)

Abstract
Due to recent advances in genotyping technologies, mapping phenotypes to single loci in the genome has become a standard technique in statistical genetics. However, one-locus mapping fails to explain much of the phenotypic variance in complex traits. Here, we present GLIDE, which maps phenotypes to pairs of genetic loci and systematically searches for the epistatic interactions expected to reveal part of this missing heritability. GLIDE makes use of the computational power of consumer-grade graphics cards to detect such interactions via linear regression. This enabled us to conduct a systematic two-locus mapping study on seven disease data sets from the Wellcome Trust Case Control Consortium and on in-house hippocampal volume data in 6 h per data set, while current single CPU-based approaches require more than a year’s time to complete the same task.

Web [BibTex]

Fast projection onto mixed-norm balls with applications

Sra, S.

Data Mining and Knowledge Discovery (DMKD), 25(2):358-377, September 2012 (article)

DOI [BibTex]

Bayesian estimation of free energies from equilibrium simulations

Habeck, M.

Physical Review Letters, 109(10):5, September 2012 (article)

Abstract
Free energy calculations are an important tool in statistical physics and biomolecular simulation. This Letter outlines a Bayesian method to estimate free energies from equilibrium Monte Carlo simulations. A Gibbs sampler is developed that allows efficient sampling of free energies and the density of states. The Gibbs sampling output can be used to estimate expected free energy differences and their uncertainties. The probabilistic formulation offers a unifying framework for existing methods such as the weighted histogram analysis method and the multistate Bennett acceptance ratio; both are shown to be approximate versions of the full probabilistic treatment.

Web DOI [BibTex]

Influence Maximization in Continuous Time Diffusion Networks

Gomez Rodriguez, M., Schölkopf, B.

In Proceedings of the 29th International Conference on Machine Learning, pages: 313-320, (Editors: J. Langford and J. Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Web [BibTex]

Submodular Inference of Diffusion Networks from Multiple Trees

Gomez Rodriguez, M., Schölkopf, B.

In Proceedings of the 29th International Conference on Machine Learning, pages: 489-496, (Editors: J. Langford and J. Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Web [BibTex]

Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

In Proceedings of the 29th International Conference on Machine Learning, pages: 25-32, ICML ’12, (Editors: John Langford and Joelle Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Abstract
Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

website+code pdf link (url) [BibTex]

Image denoising: Can plain Neural Networks compete with BM3D?

Burger, H., Schuler, C., Harmeling, S.

In pages: 2392 - 2399, 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2012 (inproceedings)

Abstract
Image denoising can be described as the problem of mapping from a noisy image to a noise-free image. The best currently available denoising methods approximate this mapping with cleverly engineered algorithms. In this work we attempt to learn this mapping directly with a plain multi layer perceptron (MLP) applied to image patches. While this has been done before, we will show that by training on large image databases we are able to compete with the current state-of-the-art image denoising methods. Furthermore, our approach is easily adapted to less extensively studied types of noise (by merely exchanging the training data), for which we achieve excellent results as well.

PDF Web DOI [BibTex]

PAC-Bayesian Inequalities for Martingales

Seldin, Y., Laviolette, F., Cesa-Bianchi, N., Shawe-Taylor, J., Auer, P.

IEEE Transactions on Information Theory, 58(12):7086-7093, June 2012 (article)

Abstract
We present a set of high-probability inequalities that control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. We also present a comparison inequality that bounds expectation of a convex function of martingale difference type variables by expectation of the same function of independent Bernoulli variables. This inequality is applied to derive a tighter analog of Hoeffding-Azuma inequality.

PDF Web DOI [BibTex]

Climate classifications: the value of unsupervised clustering

Zscheischler, J., Mahecha, M., Harmeling, S.

In Proceedings of the International Conference on Computational Science, 9, pages: 897-906, Procedia Computer Science, (Editors: H. Ali, Y. Shi, D. Khazanchi, M. Lees, G.D. van Albada, J. Dongarra, P.M.A. Sloot), Elsevier, Amsterdam, Netherlands, ICCS, June 2012 (inproceedings)

Abstract
Classifying the land surface according to different climate zones is often a prerequisite for global diagnostic or predictive modelling studies. Classical classifications such as the prominent Köppen–Geiger (KG) approach rely on heuristic decision rules. Although these heuristics may transport some process understanding, such a discretization may appear “arbitrary” from a data oriented perspective. In this contribution we compare the precision of a KG classification to an unsupervised classification (k-means clustering). Generally speaking, we revisit the problem of “climate classification” by investigating the inherent patterns in multiple data streams in a purely data driven way. One question is whether we can reproduce the KG boundaries by exploring different combinations of climate and remotely sensed vegetation variables. In this context we also investigate whether climate and vegetation variables build similar clusters. In terms of statistical performances, k-means clearly outperforms classical climate classifications. However, a subsequent stability analysis only reveals a meaningful number of clusters if both climate and vegetation data are considered in the analysis. This is a setback for the hope to explain vegetation by means of climate alone. Clearly, classification schemes like Köppen-Geiger will play an important role in the future. However, future developments in this area need to be assessed based on data driven approaches.

Web DOI [BibTex]

Entropy Search for Information-Efficient Global Optimization

Hennig, P., Schuler, C.

Journal of Machine Learning Research, 13, pages: 1809-1837, June 2012 (article)

Abstract
Contemporary global optimization algorithms are based on local measures of utility, rather than a probability measure over location and value of the optimum. They thus attempt to collect low function values, not to learn about the optimum. The reason for the absence of probabilistic global optimizers is that the corresponding inference problem is intractable in several ways. This paper develops desiderata for probabilistic optimization algorithms, then presents a concrete algorithm which addresses each of the computational intractabilities with a sequence of approximations and explicitly addresses the decision problem of maximizing information gain from each evaluation.

PDF Web Project Page [BibTex]

Kernels for identifying patterns in datasets containing noise or transformation invariances

Schölkopf, B., Chapelle, C.

United States Patent, No. 8209269, June 2012 (patent)

[BibTex]


A Neuromorphic Architecture for Object Recognition and Motion Anticipation Using Burst-STDP

Nere, A., Olcese, U., Balduzzi, D., Tononi, G.

PLoS ONE, 7(5):17, May 2012 (article)

Abstract
In this work we investigate the possibilities offered by a minimal framework of artificial spiking neurons to be deployed in silico. Here we introduce a hierarchical network architecture of spiking neurons which learns to recognize moving objects in a visual environment and determine the correct motor output for each object. These tasks are learned through both supervised and unsupervised spike timing dependent plasticity (STDP). STDP is responsible for the strengthening (or weakening) of synapses in relation to pre- and post-synaptic spike times and has been described as a Hebbian paradigm taking place both in vitro and in vivo. We utilize a variation of STDP learning, called burst-STDP, which is based on the notion that, since spikes are expensive in terms of energy consumption, then strong bursting activity carries more information than single (sparse) spikes. Furthermore, this learning algorithm takes advantage of homeostatic renormalization, which has been hypothesized to promote memory consolidation during NREM sleep. Using this learning rule, we design a spiking neural network architecture capable of object recognition, motion detection, attention towards important objects, and motor control outputs. We demonstrate the abilities of our design in a simple environment with distractor objects, multiple objects moving concurrently, and in the presence of noise. Most importantly, we show how this neural network is capable of performing these tasks using a simple leaky-integrate-and-fire (LIF) neuron model with binary synapses, making it fully compatible with state-of-the-art digital neuromorphic hardware designs. As such, the building blocks and learning rules presented in this paper appear promising for scalable fully neuromorphic systems to be implemented in hardware chips.

PDF Web DOI [BibTex]


Simultaneous small animal PET/MR in activated and resting state reveals multiple brain networks

Wehrl, H., Lankes, K., Hossain, M., Bezrukov, I., Liu, C., Martirosian, P., Schick, F., Pichler, B.

20th Annual Meeting and Exhibition of the International Society for Magnetic Resonance in Medicine (ISMRM), May 2012 (talk)

Web [BibTex]

Online Kernel-based Learning for Task-Space Tracking Robot Control

Nguyen-Tuong, D., Peters, J.

IEEE Transactions on Neural Networks and Learning Systems, 23(9):1417-1425, May 2012 (article)

Abstract
Task-space control of redundant robot systems based on analytical models is known to be susceptible to modeling errors. Here, data driven model learning methods may present an interesting alternative approach. However, learning models for task-space tracking control from sampled data is an ill-posed problem. In particular, the same input data point can yield many different output values, which can form a non-convex solution space. Because the problem is ill-posed, models cannot be learned from such data using common regression methods. While learning of task-space control mappings is globally ill-posed, it has been shown in recent work that it is locally a well-defined problem. In this paper, we use this insight to formulate a local, kernel-based learning approach for online model learning for task-space tracking control. We propose a parametrization for the local model which makes an application in task-space tracking control of redundant robots possible. The model parametrization further allows us to apply the kernel trick and, therefore, enables a formulation within the kernel learning framework. For evaluations, we show the ability of the method for online model learning for task-space tracking control of redundant robots.

PDF DOI [BibTex]

Blind Retrospective Motion Correction of MR Images

Loktyushin, A., Nickisch, H., Pohmann, R., Schölkopf, B.

20th Annual Scientific Meeting ISMRM, May 2012 (poster)

Abstract
Patient motion in the scanner is one of the most challenging problems in MRI. We propose a new retrospective motion correction method for which no tracking devices or specialized sequences are required. We seek the motion parameters such that the image gradients in the spatial domain become sparse. We then use these parameters to invert the motion and recover the sharp image. In our experiments we acquired 2D TSE images and 3D FLASH/MPRAGE volumes of the human head. Major quality improvements are possible in the 2D case and substantial improvements in the 3D case.

Web [BibTex]

Information-geometric approach to inferring causal directions

Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., Steudel, B., Schölkopf, B.

Artificial Intelligence, 182-183, pages: 1-31, May 2012 (article)

Abstract
While conventional approaches to causal inference are mainly based on conditional (in)dependences, recent methods also account for the shape of (conditional) distributions. The idea is that the causal hypothesis “X causes Y” imposes that the marginal distribution P_X and the conditional distribution P_{Y|X} represent independent mechanisms of nature. Recently it has been postulated that the shortest description of the joint distribution P_{X,Y} should therefore be given by separate descriptions of P_X and P_{Y|X}. Since description length in the sense of Kolmogorov complexity is uncomputable, practical implementations rely on other notions of independence. Here we define independence via orthogonality in information space. This way, we can explicitly describe the kind of dependence that occurs between P_Y and P_{X|Y}, making the causal hypothesis “Y causes X” implausible. Remarkably, this asymmetry between cause and effect becomes particularly simple if X and Y are deterministically related. We present an inference method that works in this case. We also discuss some theoretical results for the non-deterministic case, although it is not clear how to employ them for a more general inference method.

Web DOI [BibTex]

A new PET insert for simultaneous PET/MR small animal imaging

Wehrl, H., Lankes, K., Hossain, M., Bezrukov, I., Liu, C., Martirosian, P., Reischl, G., Schick, F., Pichler, B.

20th Annual Meeting and Exhibition of the International Society for Magnetic Resonance in Medicine (ISMRM), May 2012 (talk)

Web [BibTex]

Sparse regularized regression identifies behaviorally-relevant stimulus features from psychophysical data

Schönfelder, V., Wichmann, F.

Journal of the Acoustical Society of America, 131(5):3953-3969, May 2012 (article)

Abstract
As a prerequisite to quantitative psychophysical models of sensory processing it is necessary to learn to what extent decisions in behavioral tasks depend on specific stimulus features, the perceptual cues. Based on relative linear combination weights, this study demonstrates how stimulus-response data can be analyzed in this regard relying on an L1-regularized multiple logistic regression, a modern statistical procedure developed in machine learning. This method prevents complex models from over-fitting to noisy data. In addition, it enforces “sparse” solutions, a computational approximation to the postulate that a good model should contain the minimal set of predictors necessary to explain the data. In simulations, behavioral data from a classical auditory tone-in-noise detection task were generated. The proposed method is shown to precisely identify observer cues from a large set of covarying, interdependent stimulus features—a setting where standard correlational and regression methods fail. The proposed method succeeds for a wide range of signal-to-noise ratios and for deterministic as well as probabilistic observers. Furthermore, the detailed decision rules of the simulated observers were reconstructed from the estimated linear model weights allowing predictions of responses on the basis of individual stimuli.
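The sparsity effect of the L1 penalty described above can be shown on a toy problem. This is a minimal sketch under stated assumptions, not the study's code: it fits a two-feature logistic regression without intercept by proximal gradient descent (soft-thresholding after each gradient step). The data, the penalty `lam`, the step size `lr`, and the function names are all illustrative choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink towards zero, clip at zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, steps=500):
    """L1-regularized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Residuals p_i - y_i of the current model.
        resid = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the mean logistic loss.
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        # Gradient step followed by soft-thresholding (the L1 part).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w


# Toy data: feature 0 drives the response, feature 1 is an irrelevant cue.
X = [[-2.0, 1], [-1.5, -1], [-1.0, 1], [-0.5, -1],
     [0.5, -1], [1.0, 1], [1.5, -1], [2.0, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
weights = fit_l1_logistic(X, y)
```

In this toy setting the weight of the irrelevant feature is driven to exactly zero, while the informative feature keeps a positive weight.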

Web DOI [BibTex]

glm-ie: The Generalised Linear Models Inference and Estimation Toolbox

Nickisch, H.

Journal of Machine Learning Research, 13, pages: 1699-1703, May 2012 (article)

Abstract
The glm-ie toolbox contains scalable estimation routines for GLMs (generalised linear models) and SLMs (sparse linear models) as well as an implementation of a scalable convex variational Bayesian inference relaxation. We designed the glm-ie package to be simple, generic and easily expansible. Most of the code is written in Matlab, including some [...]. The code is fully compatible with both Matlab 7.x and GNU Octave 3.3.x. Probabilistic classification, sparse linear modelling and logistic regression are covered in a common algorithmic framework.

PDF PDF [BibTex]

Learning Tracking Control with Forward Models

Bócsi, B., Hennig, P., Csató, L., Peters, J.

In IEEE International Conference on Robotics and Automation (ICRA), pages: 259-264, May 2012 (inproceedings)

Abstract
Performing task-space tracking control on redundant robot manipulators is a difficult problem. When the physical model of the robot is too complex or not available, standard methods fail and machine learning algorithms can have advantages. We propose an adaptive learning algorithm for tracking control of underactuated or non-rigid robots where the physical model of the robot is unavailable. The control method is based on the fact that forward models are relatively straightforward to learn and local inversions can be obtained via local optimization. We use sparse online Gaussian process inference to obtain a flexible probabilistic forward model and second order optimization to find the inverse mapping. Physical experiments indicate that this approach can outperform state-of-the-art tracking control algorithms in this context.

PDF Web DOI [BibTex]

A Kernel-based Approach to Direct Action Perception

Kroemer, O., Ugur, E., Oztop, E., Peters, J.

In International Conference on Robotics and Automation (ICRA 2012), pages: 2605-2610, IEEE, IEEE International Conference on Robotics and Automation (ICRA), May 2012 (inproceedings)

Abstract
The direct perception of actions allows a robot to predict the afforded actions of observed novel objects. In addition to learning which actions are afforded, the robot must also learn to adapt its actions according to the object being manipulated. In this paper, we present a non-parametric approach to representing the affordance-bearing subparts of objects. This representation forms the basis of a kernel function for computing the similarity between different subparts. Using this kernel function, the robot can learn the required mappings to perform direct action perception. The proposed approach was successfully implemented on a real robot, which could then quickly learn to generalize grasping and pouring actions to novel objects.

PDF Web DOI [BibTex]

Feature Selection via Dependence Maximization

Song, L., Smola, A., Gretton, A., Bedo, J., Borgwardt, K.

Journal of Machine Learning Research, 13, pages: 1393-1434, May 2012 (article)

Abstract
We introduce a framework of feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion. The key idea is that good features should be highly dependent on the labels. Our approach leads to a greedy procedure for feature selection. We show that a number of existing feature selectors are special cases of this framework. Experiments on both artificial and real-world data show that our feature selector works well in practice.
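The idea of selecting features by their dependence on the labels can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: it uses the biased HSIC estimate tr(KHLH)/(n-1)^2 with linear kernels and a forward greedy search, both of which are assumptions made here for brevity. The function names and the toy data are hypothetical.

```python
def linear_kernel(rows):
    """Gram matrix of inner products between rows."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def center(K):
    """Return H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(r) / n for r in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total for j in range(n)]
        for i in range(n)
    ]


def hsic(K, L):
    """Biased HSIC estimate tr(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc = center(K)
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


def greedy_select(X, y, k):
    """Forward selection: repeatedly add the feature that maximizes HSIC."""
    L = linear_kernel([[v] for v in y])
    selected, remaining = [], list(range(len(X[0])))
    for _ in range(k):
        def score(j):
            cols = selected + [j]
            return hsic(linear_kernel([[row[c] for c in cols] for row in X]), L)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 equals the label, feature 2 is weakly related,
# feature 0 is unrelated.
X = [[1, 1, 1], [-1, 1, 1], [0, 1, -1], [1, -1, -1], [-1, -1, -1], [0, -1, 1]]
y = [1, 1, 1, -1, -1, -1]
chosen = greedy_select(X, y, 2)
```

With linear kernels the score reduces to squared covariances with the labels; nonlinear kernels would be needed to capture more general dependence.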

PDF [BibTex]

Accelerating Nearest Neighbor Search on Manycore Systems

Cayton, L.

In Parallel Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International, pages: 402-413, IPDPS, May 2012 (inproceedings)

Abstract
We develop methods for accelerating metric similarity search that are effective on modern hardware. Our algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of our algorithms, their search performance is provably sublinear in the size of the database, with a factor dependent only on its intrinsic dimensionality. We demonstrate that our methods provide substantial speedups on a range of datasets and hardware platforms. In particular, we present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.

Web DOI [BibTex]

High gamma-power predicts performance in sensorimotor-rhythm brain-computer interfaces

Grosse-Wentrup, M., Schölkopf, B.

Journal of Neural Engineering, 9(4):046001, May 2012 (article)

Abstract
Subjects operating a brain–computer interface (BCI) based on sensorimotor rhythms exhibit large variations in performance over the course of an experimental session. Here, we show that high-frequency γ-oscillations, originating in fronto-parietal networks, predict such variations on a trial-to-trial basis. We interpret this finding as empirical support for an influence of attentional networks on BCI performance via modulation of the sensorimotor rhythm.

Web DOI [BibTex]