2016


Analysis of multiparametric MRI using a semi-supervised random forest framework allows the detection of therapy response in ischemic stroke

Castaneda, S., Katiyar, P., Russo, F., Calaminus, C., Disselhorst, J. A., Ziemann, U., Kohlhofer, U., Quintanilla-Martinez, L., Poli, S., Pichler, B. J.

World Molecular Imaging Conference, 2016 (talk)

link (url) [BibTex]

Screening Rules for Convex Problems

Raj, A., Olbrich, J., Gärtner, B., Schölkopf, B., Jaggi, M.

2016 (unpublished) Submitted

[BibTex]

Multi-view learning on multiparametric PET/MRI quantifies intratumoral heterogeneity and determines therapy efficacy

Katiyar, P., Divine, M. R., Kohlhofer, U., Quintanilla-Martinez, L., Siegemund, M., Pfizenmaier, K., Kontermann, R., Pichler, B. J., Disselhorst, J. A.

World Molecular Imaging Conference, 2016 (talk)

link (url) [BibTex]


2013


Studying large-scale brain networks: electrical stimulation and neural-event-triggered fMRI

Logothetis, N., Eschenko, O., Murayama, Y., Augath, M., Steudel, T., Evrard, H., Besserve, M., Oeltermann, A.

Twenty-Second Annual Computational Neuroscience Meeting (CNS*2013), BMC Neuroscience, 14(Supplement 1):A1, July 2013 (talk)

Web [BibTex]

Domain Generalization via Invariant Feature Representation

Muandet, K.

30th International Conference on Machine Learning (ICML2013), 2013 (talk)

PDF [BibTex]


2010


Comparative Quantitative Evaluation of MR-Based Attenuation Correction Methods in Combined Brain PET/MR

Mantlik, F., Hofmann, M., Bezrukov, I., Kolb, A., Beyer, T., Reimold, M., Pichler, B., Schölkopf, B.

2010 Nuclear Science Symposium and Medical Imaging Conference (NSS-MIC), M08-4, November 2010 (talk)

Abstract
Combined PET/MR simultaneously provides molecular and functional imaging as well as excellent soft tissue contrast. It does not allow one to directly measure the attenuation properties of scanned tissues, despite the fact that accurate attenuation maps are necessary for quantitative PET imaging. Several methods have therefore been proposed for MR-based attenuation correction (MR-AC). So far, they have only been evaluated on data acquired from separate MR and PET scanners. We evaluated several MR-AC methods on data from 10 patients acquired on a combined BrainPET/MR scanner. This allowed the consideration of specific PET/MR issues, such as the RF coil that attenuates and scatters 511 keV gammas. We evaluated simple MR thresholding methods as well as atlas and machine learning-based MR-AC. CT-based AC served as the gold standard reference. To comprehensively evaluate the MR-AC accuracy, we used RoIs from 2 anatomic brain atlases with different levels of detail. Visual inspection of the PET images indicated that even the basic FLASH threshold MR-AC may be sufficient for several applications. Using a UTE sequence for bone prediction in MR-based thresholding occasionally led to false prediction of bone tissue inside the brain, causing a significant overestimation of PET activity. Although it yielded a lower mean underestimation of activity, it exhibited the highest variance of all methods. The atlas averaging approach had a smaller mean error, but showed high maximum overestimation on the RoIs of the more detailed atlas. The Naive Bayes and Atlas-Patch MR-AC yielded the smallest variance, and the Atlas-Patch also showed the smallest mean error. In conclusion, Atlas-based AC using only MR information on the BrainPET/MR yields a high level of accuracy that is sufficient for clinical quantitative imaging requirements. The Atlas-Patch approach was superior to alternative atlas-based methods, yielding a quantification error below 10% for all RoIs except very small ones.

[BibTex]

Statistical image analysis and percolation theory

Davies, P., Langovoy, M., Wittich, O.

73rd Annual Meeting of the Institute of Mathematical Statistics (IMS), August 2010 (talk)

Abstract
We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of objects of unknown shapes in the case of nonparametric noise. The noise density is unknown and can be heavy-tailed. We view the object detection problem as hypothesis testing for discrete statistical inverse problems. We present an algorithm that detects objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures.

Web [BibTex]

Statistical image analysis and percolation theory

Langovoy, M., Wittich, O.

28th European Meeting of Statisticians (EMS), August 2010 (talk)

PDF Web [BibTex]

Cooperative Cuts: Graph Cuts with Submodular Edge Weights

Jegelka, S., Bilmes, J.

24th European Conference on Operational Research (EURO XXIV), July 2010 (talk)

Abstract
We introduce cooperative cut, a minimum cut problem whose cost is a submodular function on sets of edges: the cost of an edge that is added to a cut set depends on the edges already in the set. Applications arise, e.g., in probabilistic graphical models and image processing. We prove NP-hardness and a polynomial lower bound on the approximation factor, and upper bounds via four approximation algorithms based on different techniques. Our additional heuristics have attractive practical properties, e.g., they rely only on standard min-cut. Both our algorithms and heuristics appear to do well in practice.

PDF Web [BibTex]

Solving Large-Scale Nonnegative Least Squares

Sra, S.

16th Conference of the International Linear Algebra Society (ILAS), June 2010 (talk)

Abstract
We study the fundamental problem of nonnegative least squares. This problem was apparently introduced by Lawson and Hanson [1] under the name NNLS. As is evident from its name, NNLS seeks least-squares solutions that are also nonnegative. Owing to its wide applicability, numerous algorithms have been derived for NNLS, beginning with the active-set approach of Lawson and Hanson [1] and leading up to the sophisticated interior-point method of Bellavia et al. [2]. We present a new algorithm for NNLS that combines projected subgradients with the non-monotonic gradient descent idea of Barzilai and Borwein [3]. Our resulting algorithm is called BBSG, and we guarantee its convergence by exploiting properties of NNLS in conjunction with projected subgradients. BBSG is surprisingly simple and scales well to large problems. We substantiate our claims by empirically evaluating BBSG and comparing it with established convex solvers and specialized NNLS algorithms. The numerical results suggest that BBSG is a practical method for solving large-scale NNLS problems.

PDF PDF [BibTex]
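
The NNLS problem named in the abstract above can be illustrated with a minimal sketch. This is not the BBSG algorithm of the talk: it is a plain projected-gradient iteration that tries a Barzilai-Borwein step length and falls back to a conservative step whenever the objective fails to decrease. The function name `nnls_projected_bb`, the fallback rule, and the iteration count are assumptions made for illustration.

```python
import numpy as np

def nnls_projected_bb(A, b, iters=500):
    """Approximately solve min ||Ax - b||^2 subject to x >= 0 (illustrative sketch)."""
    AtA, Atb = A.T @ A, A.T @ b

    def objective(x):
        # Equals 0.5 * ||Ax - b||^2 up to an additive constant.
        return 0.5 * x @ AtA @ x - Atb @ x

    # Conservative step: 1 / Lipschitz constant of the gradient (assumes A is nonzero).
    safe_step = 1.0 / np.linalg.norm(AtA, 2)
    x = np.zeros(A.shape[1])
    g = AtA @ x - Atb
    step = safe_step
    for _ in range(iters):
        # Gradient step followed by projection onto the nonnegative orthant.
        x_new = np.maximum(x - step * g, 0.0)
        if objective(x_new) >= objective(x):
            # Assumed safeguard: retry with the conservative step.
            x_new = np.maximum(x - safe_step * g, 0.0)
        g_new = AtA @ x_new - Atb
        s, y = x_new - x, g_new - g
        sy = s @ y
        # Barzilai-Borwein step length for the next iteration.
        step = (s @ s) / sy if sy > 1e-12 else safe_step
        x, g = x_new, g_new
    return x
```

The safeguard makes this variant monotone, which differs from the non-monotonic behaviour the abstract describes; it is chosen here only to keep the sketch simple.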

Matrix Approximation Problems

Sra, S.

EU Regional School: Rheinisch-Westfälische Technische Hochschule Aachen, May 2010 (talk)

PDF AVI [BibTex]

BCI2000 and Python

Hill, NJ.

Invited lecture at the 7th International BCI2000 Workshop, Pacific Grove, CA, USA, May 2010 (talk)

Abstract
A tutorial, with exercises, on how to integrate your own Python code with the BCI2000 realtime software package.

PDF [BibTex]

Extending BCI2000 Functionality with Your Own C++ Code

Hill, NJ.

Invited lecture at the 7th International BCI2000 Workshop, Pacific Grove, CA, USA, May 2010 (talk)

Abstract
A tutorial, with exercises, on how to use the BCI2000 C++ framework to write your own real-time signal-processing modules.

[BibTex]

Machine-Learning Methods for Decoding Intentional Brain States

Hill, NJ.

Symposium "Non-Invasive Brain Computer Interfaces: Current Developments and Applications" (BIOMAG), March 2010 (talk)

Abstract
Brain-computer interfaces (BCI) work by making the user perform a specific mental task, such as imagining moving body parts or performing some other covert mental activity, or attending to a particular stimulus out of an array of options, in order to encode their intention into a measurable brain signal. Signal-processing and machine-learning techniques are then used to decode the measured signal to identify the encoded mental state and hence extract the user’s initial intention. The high-noise, high-dimensional nature of brain signals makes robust decoding techniques a necessity. Generally, the approach has been to use relatively simple feature extraction techniques, such as template matching and band-power estimation, coupled to simple linear classifiers. This has led to a prevailing view among applied BCI researchers that (sophisticated) machine-learning is irrelevant since “it doesn’t matter what classifier you use once your features are extracted.” Using examples from our own MEG and EEG experiments, I’ll demonstrate how machine-learning principles can be applied in order to improve BCI performance, if they are formulated in a domain-specific way. The result is a type of data-driven analysis that is more than “just” classification, and can be used to find better feature extractors.

PDF Web [BibTex]

PAC-Bayesian Analysis in Unsupervised Learning

Seldin, Y.

Foundations and New Trends of PAC Bayesian Learning Workshop, March 2010 (talk)

PDF Web [BibTex]

Learning Motor Primitives for Robotics

Kober, J., Peters, J.

EVENT Lab: Reinforcement Learning in Robotics and Virtual Reality, January 2010 (talk)

Abstract
The acquisition and self-improvement of novel motor skills is among the most important problems in robotics. Motor primitives offer one of the most promising frameworks for the application of machine learning techniques in this context. Employing the Dynamic Systems Motor primitives originally introduced by Ijspeert et al. (2003), appropriate learning algorithms for a concerted approach of both imitation and reinforcement learning are presented. Using these algorithms new motor skills, i.e., Ball-in-a-Cup, Ball-Paddling and Dart-Throwing, are learned.

[BibTex]

From Motor Learning to Interaction Learning in Robots

Sigaud, O., Peters, J.

pages: 538, Studies in Computational Intelligence ; 264, (Editors: O Sigaud, J Peters), Springer, Berlin, Germany, January 2010 (book)

Abstract
From an engineering standpoint, the increasing complexity of robotic systems and the increasing demand for more autonomous robots have made learning essential. This book is largely based on the successful workshop "From motor to interaction learning in robots" held at the IEEE/RSJ International Conference on Intelligent Robot Systems. The major aim of the book is to give students interested in the topics described above a chance to get started faster, and researchers a helpful compendium.

Web DOI [BibTex]


2006


A Kernel Method for the Two-Sample-Problem

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

20th Annual Conference on Neural Information Processing Systems (NIPS), December 2006 (talk)

Abstract
We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. We show that the test statistic can be computed in $O(m^2)$ time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.

PDF [BibTex]
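
The test statistic described in the abstract above, the distance between the two sample means after mapping into an RKHS, can be sketched as follows. This is a minimal illustration of the biased squared-distance estimate only; it does not implement either of the two test thresholds from the talk. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd2_biased(x, y, sigma=1.0):
    """Biased estimate of the squared RKHS distance between the sample means."""
    # Three kernel matrices, so the cost is quadratic in the sample size.
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()
```

A value near zero is consistent with both samples coming from the same distribution, while a large value suggests they differ. Turning the statistic into a test requires a threshold, such as the bounds referred to in the abstract.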

Ab-initio gene finding using machine learning

Schweikert, G., Zeller, G., Zien, A., Ong, C., de Bona, F., Sonnenburg, S., Phillips, P., Rätsch, G.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Web [BibTex]

Reinforcement Learning by Reward-Weighted Regression

Peters, J.

NIPS Workshop: Towards a New Reinforcement Learning?, December 2006 (talk)

Web [BibTex]

Graph boosting for molecular QSAR analysis

Saigo, H., Kadowaki, T., Kudo, T., Tsuda, K.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Abstract
We propose a new boosting method that systematically combines graph mining and mathematical programming-based machine learning. Informative and interpretable subgraph features are greedily found by a series of graph mining calls. Due to our mathematical programming formulation, subgraph features and pre-calculated real-valued features are seamlessly integrated. We tested our algorithm on a quantitative structure-activity relationship (QSAR) problem, which is essentially a regression problem over a given set of chemical compounds. In benchmark experiments, the prediction accuracy of our method compared favorably with the best results reported on each dataset.

Web [BibTex]

Inferring Causal Directions by Evaluating the Complexity of Conditional Distributions

Sun, X., Janzing, D., Schölkopf, B.

NIPS Workshop on Causality and Feature Selection, December 2006 (talk)

Abstract
We propose a new approach to infer the causal structure that has generated the observed statistical dependences among n random variables. The idea is that the factorization of the joint measure of cause and effect into P(cause)P(effect|cause) leads typically to simpler conditionals than non-causal factorizations. To evaluate the complexity of the conditionals we have tried two methods. First, we have compared them to those which maximize the conditional entropy subject to the observed first and second moments since we consider the latter as the simplest conditionals. Second, we have fitted the data with conditional probability measures being exponents of functions in an RKHS space and defined the complexity by a Hilbert-space semi-norm. Such a complexity measure has several properties that are useful for our purpose. We describe some encouraging results with both methods applied to real-world data. Moreover, we have combined constraint-based approaches to causal discovery (i.e., methods using only information on conditional statistical dependences) with our method in order to distinguish between causal hypotheses which are equivalent with respect to the imposed independences. Furthermore, we compare the performance to Bayesian approaches to causal inference.

Web [BibTex]


Learning Optimal EEG Features Across Time, Frequency and Space

Farquhar, J., Hill, J., Schölkopf, B.

NIPS Workshop on Current Trends in Brain-Computer Interfacing, December 2006 (talk)

PDF Web [BibTex]

Semi-Supervised Learning

Zien, A.

Advanced Methods in Sequence Analysis Lectures, November 2006 (talk)

Web [BibTex]

A Machine Learning Approach for Determining the PET Attenuation Map from Magnetic Resonance Images

Hofmann, M., Steinke, F., Judenhofer, M., Claussen, C., Schölkopf, B., Pichler, B.

IEEE Medical Imaging Conference, November 2006 (talk)

Abstract
A promising new combination in multimodality imaging is MR-PET, where the high soft tissue contrast of Magnetic Resonance Imaging (MRI) and the functional information of Positron Emission Tomography (PET) are combined. Although many technical problems have recently been solved, it is still an open problem to determine the attenuation map from the available MR scan, as the MR intensities are not directly related to the attenuation values. One standard approach is an atlas registration where the atlas MR image is aligned with the patient MR thus also yielding an attenuation image for the patient. We also propose another approach, which to our knowledge has not been tried before: Using Support Vector Machines we predict the attenuation value directly from the local image information. We train this well-established machine learning algorithm using small image patches. Although both approaches sometimes yielded acceptable results, they also showed their specific shortcomings: The registration often fails with large deformations whereas the prediction approach is problematic when the local image structure is not characteristic enough. However, the failures often do not coincide and integration of both information sources is promising. We therefore developed a combination method extending Support Vector Machines to use not only local image structure but also atlas registered coordinates. We demonstrate the strength of this combination approach on a number of examples.

[BibTex]

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Zien, A.

ECML Discovery Challenge Workshop, September 2006 (talk)

Abstract
After introducing the semi-supervised support vector machine (aka TSVM for "transductive SVM"), a few popular training strategies are briefly presented. Then the assumptions underlying semi-supervised learning are reviewed. Finally, two modern TSVM optimization techniques are applied to the spam filtering data sets of the workshop; it is shown that they can achieve excellent results, if the problem of the data being non-iid can be handled properly.

PDF Web [BibTex]


Semi-Supervised Learning

Chapelle, O., Schölkopf, B., Zien, A.

pages: 508, Adaptive computation and machine learning, MIT Press, Cambridge, MA, USA, September 2006 (book)

Abstract
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research. Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction.

Web [BibTex]

Inferential Structure Determination: Probabilistic determination and validation of NMR structures

Habeck, M.

Gordon Research Conference on Computational Aspects of Biomolecular NMR, September 2006 (talk)

Web [BibTex]

Machine Learning Algorithms for Polymorphism Detection

Schweikert, G., Zeller, G., Clark, R., Ossowski, S., Warthmann, N., Shinn, P., Frazer, K., Ecker, J., Huson, D., Weigel, D., Schölkopf, B., Rätsch, G.

2nd ISCB Student Council Symposium, August 2006 (talk)

Abstract
Analyzing resequencing array data using machine learning, we obtain a genome-wide inventory of polymorphisms in 20 wild strains of Arabidopsis thaliana, including 750,000 single nucleotide polymorphisms (SNPs) and thousands of highly polymorphic regions and deletions. We thus provide an unprecedented resource for the study of natural variation in plants.

Web [BibTex]

Inferential structure determination: Overview and new developments

Habeck, M.

Sixth CCPN Annual Conference: Efficient and Rapid Structure Determination by NMR, July 2006 (talk)

Web [BibTex]

MCMC inference in (Conditionally) Conjugate Dirichlet Process Gaussian Mixture Models

Rasmussen, C., Görür, D.

ICML Workshop on Learning with Nonparametric Bayesian Methods, June 2006 (talk)

Abstract
We compare the predictive accuracy of the Dirichlet Process Gaussian mixture models using conjugate and conditionally conjugate priors and show that better density models result from using the wider class of priors. We explore several MCMC schemes exploiting conditional conjugacy and show their computational merits on several multidimensional density estimation problems.

Web [BibTex]

Sampling for non-conjugate infinite latent feature models

Görür, D., Rasmussen, C.

(Editors: Bernardo, J. M.), 8th Valencia International Meeting on Bayesian Statistics (ISBA), June 2006 (talk)

Abstract
Latent variable models are powerful tools to model the underlying structure in data. Infinite latent variable models can be defined using Bayesian nonparametrics. Dirichlet process (DP) models constitute an example of infinite latent class models in which each object is assumed to belong to one of infinitely many mutually exclusive classes. Recently, the Indian buffet process (IBP) has been defined as an extension of the DP. IBP is a distribution over sparse binary matrices with infinitely many columns which can be used as a distribution for non-exclusive features. Inference using Markov chain Monte Carlo (MCMC) in conjugate IBP models has been previously described; however, requiring conjugacy restricts the use of IBP. We describe an MCMC algorithm for non-conjugate IBP models. Modelling choice behaviour is an important topic in psychology, economics and related fields. Elimination by Aspects (EBA) is a choice model that assumes each alternative has latent features with associated weights that lead to the observed choice outcomes. We formulate a non-parametric version of EBA by using IBP as the prior over the latent binary features. We infer the features of objects that lead to the choice data by using our sampling scheme for inference.

PDF [BibTex]

An Inventory of Sequence Polymorphisms For Arabidopsis

Clark, R., Ossowski, S., Schweikert, G., Rätsch, G., Shinn, P., Zeller, G., Warthmann, N., Fu, G., Hinds, D., Chen, H., Frazer, K., Huson, D., Schölkopf, B., Nordborg, M., Ecker, J., Weigel, D.

17th International Conference on Arabidopsis Research, April 2006 (talk)

Abstract
We have used high-density oligonucleotide arrays to characterize common sequence variation in 20 wild strains of Arabidopsis thaliana that were chosen for maximal genetic diversity. Both strands of each possible SNP of the 119 Mb reference genome were represented on the arrays, which were hybridized with whole genome, isothermally amplified DNA to minimize ascertainment biases. Using two complementary approaches, a model based algorithm, and a newly developed machine learning method, we identified over 550,000 SNPs with a false discovery rate of ~ 0.03 (average of 1 SNP for every 216 bp of the genome). A heuristic algorithm predicted in addition ~700 highly polymorphic or deleted regions per accession. Over 700 predicted polymorphisms with major functional effects (e.g., premature stop codons, or deletions of coding sequence) were validated by dideoxy sequencing. Using this data set, we provide the first systematic description of the types of genes that harbor major effect polymorphisms in natural populations at moderate allele frequencies. The data also provide an unprecedented resource for the study of genetic variation in an experimentally tractable, multicellular model organism.

[BibTex]

Machine Learning and Applications in Biology

Shin, H.

6th Course in Bioinformatics for Molecular Biologist, March 2006 (talk)

Abstract
The emergence of the fields of computational biology and bioinformatics has alleviated the burden of solving many biological problems, saving the time and cost required for experiments and also providing predictions that guide new experiments. Within computational biology, machine learning algorithms have played a central role in dealing with the flood of biological data. The goal of this tutorial is to raise awareness and comprehension of machine learning so that biologists can properly match the task at hand to the corresponding analytical approach. We start by categorizing biological problem settings and introduce the general machine learning schemes that fit best to each of these categories. We then explore representative models in further detail, from traditional statistical models to recent kernel models, presenting several up-to-date research projects in bioinformatics to exemplify how biological questions can benefit from a machine learning approach. Finally, we discuss how cooperation between biologists and machine learners might be made smoother.

PDF [BibTex]

Gaussian Processes for Machine Learning

Rasmussen, CE., Williams, CKI.

pages: 248, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, January 2006 (book)

Abstract
Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.

Web [BibTex]


2003


Learning Control and Planning from the View of Control Theory and Imitation

Peters, J., Schaal, S.

NIPS Workshop "Planning for the Real World: The promises and challenges of dealing with uncertainty", December 2003 (talk)

Abstract
Learning control and planning in high dimensional continuous state-action systems, e.g., as needed in a humanoid robot, has so far been a domain beyond the applicability of generic planning techniques like reinforcement learning and dynamic programming. This talk describes an approach we have taken in order to enable complex robotics systems to learn to accomplish control tasks. Adaptive learning controllers equipped with statistical learning techniques can be used to learn tracking controllers -- missing state information and uncertainty in the state estimates are usually addressed by observers or direct adaptive control methods. Imitation learning is used as an ingredient to seed initial control policies whose output is a desired trajectory suitable to accomplish the task at hand. Reinforcement learning with stochastic policy gradients using a natural gradient forms the third component that allows refining the initial control policy until the task is accomplished. In comparison to general learning control, this approach is highly prestructured and thus more domain specific. However, it seems to be a theoretically clean and feasible strategy for control systems of the complexity that we need to address.

Web [BibTex]

Recurrent neural networks from learning attractor dynamics

Schaal, S., Peters, J.

NIPS Workshop on RNNaissance: Recurrent Neural Networks, December 2003 (talk)

Abstract
Many forms of recurrent neural networks can be understood in terms of dynamic systems theory of difference equations or differential equations. Learning in such systems corresponds to adjusting some internal parameters to obtain a desired time evolution of the network, which can usually be characterized in terms of point attractor dynamics, limit cycle dynamics, or, in some rarer cases, as strange attractor or chaotic dynamics. Finding a stable learning process to adjust the open parameters of the network towards shaping the desired attractor type and basin of attraction has remained a complex task, as the parameter trajectories during learning can lead the system through a variety of undesirable unstable behaviors, such that learning may never succeed. In this presentation, we review a recently developed learning framework for a class of recurrent neural networks that employs a more structured network approach. We assume that the canonical system behavior is known a priori, e.g., it is a point attractor or a limit cycle. With either supervised learning or reinforcement learning, it is possible to acquire the transformation from a simple representative of this canonical behavior (e.g., a 2nd order linear point attractor, or a simple limit cycle oscillator) to the desired highly complex attractor form. For supervised learning, one-shot learning based on locally weighted regression techniques is possible. For reinforcement learning, stochastic policy gradient techniques can be employed. In any case, the recurrent network learned by these methods inherits the stability properties of the simple dynamic system that underlies the nonlinear transformation, such that stability of the learning approach is not a problem. We demonstrate the success of this approach for learning various skills on a humanoid robot, including tasks that require incorporating additional sensory signals as coupling terms to modify the recurrent network evolution on-line.

Web [BibTex]

Statistical Learning Theory

Bousquet, O.

Machine Learning Summer School, August 2003 (talk)

PDF [BibTex]

Remarks on Statistical Learning Theory

Bousquet, O.

Machine Learning Summer School, August 2003 (talk)

PDF [BibTex]

Rademacher and Gaussian averages in Learning Theory

Bousquet, O.

Universite de Marne-la-Vallee, March 2003 (talk)

PDF [BibTex]

Introduction: Robots with Cognition?

Franz, MO.

6, pages: 38, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. Tübinger Wahrnehmungskonferenz (TWK), February 2003 (talk)

Abstract
Using robots as models of cognitive behaviour has a long tradition in robotics. Parallel to the historical development in cognitive science, one observes two major, subsequent waves in cognitive robotics. The first is based on ideas of classical, cognitivist Artificial Intelligence (AI). According to the AI view of cognition as rule-based symbol manipulation, these robots typically try to extract symbolic descriptions of the environment from their sensors that are used to update a common, global world representation from which, in turn, the next action of the robot is derived. The AI approach has been successful in strongly restricted and controlled environments requiring well-defined tasks, e.g. in industrial assembly lines. AI-based robots mostly failed, however, in the unpredictable and unstructured environments that have to be faced by mobile robots. This has provoked the second wave in cognitive robotics which tries to achieve cognitive behaviour as an emergent property from the interaction of simple, low-level modules. Robots of the second wave are called animats as their architecture is designed to closely model aspects of real animals. Using only simple reactive mechanisms and Hebbian-type or evolutionary learning, the resulting animats often outperformed the highly complex AI-based robots in tasks such as obstacle avoidance, corridor following etc. While successful in generating robust, insect-like behaviour, typical animats are limited to stereotyped, fixed stimulus-response associations. If one adopts the view that cognition requires a flexible, goal-dependent choice of behaviours and planning capabilities (H.A. Mallot, Kognitionswissenschaft, 1999, 40-48) then it appears that cognitive behaviour cannot emerge from a collection of purely reactive modules. It rather requires environmentally decoupled structures that work without directly engaging the actions that they are concerned with.
This poses the current challenge to cognitive robotics: How can we build cognitive robots that show the robustness and the learning capabilities of animats without falling back into the representational paradigm of AI? The speakers of the symposium present their approaches to this question in the context of robot navigation and sensorimotor learning. In the first talk, Prof. Helge Ritter introduces a robot system for imitation learning capable of exploring various alternatives in simulation before actually performing a task. The second speaker, Angelo Arleo, develops a model of spatial memory in rat navigation based on his electrophysiological experiments. He validates the model on a mobile robot which, in some navigation tasks, shows a performance comparable to that of the real rat. A similar model of spatial memory is used to investigate the mechanisms of territory formation in a series of robot experiments presented by Prof. Hanspeter Mallot. In the last talk, we return to the domain of sensorimotor learning where Ralf Möller introduces his approach to generate anticipatory behaviour by learning forward models of sensorimotor relationships.

Web [BibTex]


2002


Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

Schölkopf, B., Smola, A.

pages: 644, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, December 2002, Parts of this book, including an introduction to kernel methods, can be downloaded here. (book)

Abstract
In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs, kernels, for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics. Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.

Web [BibTex]
