2008


Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

Lampert, C., Blaschko, M., Hofmann, T.

In CVPR 2008, pages: 1-8, IEEE Computer Society, Los Alamitos, CA, USA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2008, Best paper award (inproceedings)

Abstract
Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To perform localization, one can take a sliding window approach, but this strongly increases the computational cost, because the classifier function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch-and-bound scheme that allows efficient maximization of a large class of classifier functions over all possible subimages. It converges to a globally optimal solution typically in sublinear time. We show how our method is applicable to different object detection and retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest neighbor classifiers based on the χ²-distance. We demonstrate state-of-the-art performance of the resulting systems on the UIUC Cars dataset, the PASCAL VOC 2006 dataset and in the PASCAL VOC 2007 competition.
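
As an illustration of the branch-and-bound idea in the abstract above, the following minimal Python sketch searches over sets of rectangles for the box with the highest total score. It assumes the classifier decomposes into a sum of per-pixel scores, as a linear classifier over local features would. The function names, the interval-based rectangle sets, and the bound (positive scores over the largest box plus negative scores over the smallest box) are assumptions of this sketch, not necessarily the authors' implementation.

```python
import heapq


def integral_image(grid):
    """Summed-area table with one row and column of zero padding."""
    h, w = len(grid), len(grid[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = grid[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def box_sum(ii, top, bottom, left, right):
    """Sum over an inclusive box; an empty box contributes zero."""
    if top > bottom or left > right:
        return 0.0
    return (ii[bottom + 1][right + 1] - ii[top][right + 1]
            - ii[bottom + 1][left] + ii[top][left])


def efficient_subwindow_search(scores):
    """Return (best_score, (top, bottom, left, right)) over all boxes."""
    h, w = len(scores), len(scores[0])
    pos = integral_image([[max(v, 0.0) for v in row] for row in scores])
    neg = integral_image([[min(v, 0.0) for v in row] for row in scores])

    def upper_bound(state):
        (t_lo, t_hi), (b_lo, b_hi), (l_lo, l_hi), (r_lo, r_hi) = state
        # Positive scores of the largest box plus negative scores of the
        # smallest box can never underestimate any box in the set.
        return (box_sum(pos, t_lo, b_hi, l_lo, r_hi)
                + box_sum(neg, t_hi, b_lo, l_hi, r_lo))

    # A state holds one interval each for top, bottom, left and right.
    start = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-upper_bound(start), start)]
    while heap:
        neg_bound, state = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in state]
        k = widths.index(max(widths))
        if widths[k] == 0:
            # A single box: its bound equals its exact score.
            top, bottom, left, right = (lo for lo, _ in state)
            return -neg_bound, (top, bottom, left, right)
        lo, hi = state[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):
            child = state[:k] + (part,) + state[k + 1:]
            # Keep only sets that still contain at least one valid box.
            if child[0][0] <= child[1][1] and child[2][0] <= child[3][1]:
                heapq.heappush(heap, (-upper_bound(child), child))
    raise ValueError("score map must be non-empty")


demo = [[-1, -1, -1, -1],
        [-1, 2, 3, -1],
        [-1, 1, -2, -1],
        [-1, -1, -1, -1]]
print(efficient_subwindow_search(demo))
```

Because the most promising set is always expanded first and the bound is exact for a single box, the first single box popped from the queue is a global maximizer, without scoring every candidate window.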

PDF PDF Web DOI [BibTex]

Computed Torque Control with Nonparametric Regression Models

Nguyen-Tuong, D., Seeger, M., Peters, J.

In ACC 2008, pages: 212-217, IEEE Service Center, Piscataway, NJ, USA, 2008 American Control Conference, June 2008 (inproceedings)

Abstract
Computed torque control allows the design of considerably more precise, energy-efficient and compliant controls for robots. However, the major obstacle is the requirement of an accurate model for torque generation, which cannot be obtained in some cases using rigid-body formulations due to unmodeled nonlinearities, such as complex friction or actuator dynamics. In such cases, models approximated from robot data present an appealing alternative. In this paper, we compare two nonparametric regression methods for model approximation, i.e., locally weighted projection regression (LWPR) and Gaussian process regression (GPR). While locally weighted regression was employed for real-time model estimation in learning adaptive control, Gaussian process regression has not been used in control to date due to its high computational requirements. The comparison includes the assessment of model approximation for both regression methods using data originating from a SARCOS robot arm, as well as an evaluation of the robot tracking performance in computed torque control employing the approximated models. Our results show that GPR can be applied for real-time control, achieving higher accuracy. However, for online learning, LWPR is superior owing to its lower computational requirements.
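
To make the role of the model in computed torque control concrete, here is a minimal Python sketch of one common form of the control law on a simulated one-joint pendulum. The pendulum parameters, gains, and function names are hypothetical, and the exact inverse dynamics stands in for a learned regression model (LWPR or GPR in the paper); it is not the authors' controller.

```python
import math

# Hypothetical pendulum: mass, length, viscous friction, gravity.
M, L, B, G = 1.0, 1.0, 0.1, 9.81


def inverse_dynamics(q, qd, qdd):
    """Torque needed for acceleration qdd (stand-in for a learned model)."""
    return M * L * L * qdd + B * qd + M * G * L * math.sin(q)


def forward_dynamics(q, qd, tau):
    """Acceleration of the simulated plant under torque tau."""
    return (tau - B * qd - M * G * L * math.sin(q)) / (M * L * L)


def computed_torque(model, q, qd, q_des, qd_des, qdd_des, kp=100.0, kd=20.0):
    """Feed a PD-corrected reference acceleration through the inverse model."""
    qdd_ref = qdd_des + kd * (qd_des - qd) + kp * (q_des - q)
    return model(q, qd, qdd_ref)


def simulate(model, q_des=1.0, steps=3000, dt=0.001):
    """Regulate the joint to q_des with explicit Euler integration."""
    q, qd = 0.0, 0.0
    for _ in range(steps):
        tau = computed_torque(model, q, qd, q_des, 0.0, 0.0)
        qdd = forward_dynamics(q, qd, tau)
        q, qd = q + dt * qd, qd + dt * qdd
    return q


def inertia_only_model(q, qd, qdd):
    """Deliberately poor model that ignores friction and gravity."""
    return M * L * L * qdd


print("accurate model  :", simulate(inverse_dynamics))
print("inertia-only    :", simulate(inertia_only_model))
```

With an accurate model the tracking error decays like a linear second-order system, whereas the inertia-only model leaves a steady-state offset, which is the motivation for learning the model from data.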

PDF Web DOI [BibTex]

Multi-Classification by Categorical Features via Clustering

Seldin, Y., Tishby, N.

In Proceedings of the 25th International Conference on Machine Learning (ICML 2008), pages: 920-927, 25th International Conference on Machine Learning (ICML), June 2008 (inproceedings)

Abstract
We derive a generalization bound for multi-classification schemes based on grid clustering in categorical parameter product spaces. Grid clustering partitions the parameter space in the form of a Cartesian product of partitions for each of the parameters. The derived bound provides a means to evaluate clustering solutions in terms of the generalization power of a built-on classifier. For classification based on a single feature the bound serves to find a globally optimal classification rule. Comparison of the generalization power of individual features can then be used for feature ranking. Our experiments show that in this role the bound is much more precise than mutual information or normalized correlation indices.

PDF Web [BibTex]

A Kernel Test of Nonlinear Granger Causality

Sun, X.

In Proceedings of the Workshop on Inference and Estimation in Probabilistic Time-Series Models, pages: 79-89, (Editors: Barber, D. , A. T. Cemgil, S. Chiappa), Isaac Newton Institute for Mathematical Sciences, Cambridge, United Kingdom, Workshop on Inference and Estimation in Probabilistic Time-Series Models, June 2008 (inproceedings)

Abstract
We present a novel test of nonlinear Granger causality in bivariate time series. The trace norm of conditional covariance operators is used to capture the prediction errors. Based on this measure, a subsampling-based multiple testing procedure tests the prediction improvement of one time series by the other one. The distributional properties of the resulting p-values reveal the direction of Granger causality. Encouraging results of experiments with simulated and real-world data support our approach.

PDF [BibTex]

Thin-Plate Splines Between Riemannian Manifolds

Steinke, F., Hein, M., Schölkopf, B.

Workshop on Geometry and Statistics of Shapes, June 2008 (talk)

Abstract
With the help of differential geometry we describe a framework to define a thin-plate-spline-like energy for maps between arbitrary Riemannian manifolds. The so-called Eells energy only depends on the intrinsic geometry of the input and output manifolds, but not on their respective representations. The energy can then be used for regression between manifolds; we present results for cases where the outputs are rotations, sets of angles, or points on 3D surfaces. In the future we plan to also target regression where the output is an element of "shape space", understood as a Riemannian manifold. One could also further explore the meaning of the Eells energy when applied to diffeomorphisms between shapes, especially with regard to its potential use as a distance measure between shapes that does not depend on the embedding or the parametrisation of the shapes.

Web [BibTex]

Bayesian Color Constancy Revisited

Gehler, P., Rother, C., Blake, A., Minka, T., Sharp, T.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, June 2008, http://dx.doi.org/10.1109/CVPR.2008.4587765 (inproceedings)

website+code+data pdf [BibTex]

Real-time Learning of Resolved Velocity Control on a Mitsubishi PA-10

Peters, J., Nguyen-Tuong, D.

In ICRA 2008, pages: 2872-2877, IEEE Service Center, Piscataway, NJ, USA, 2008 IEEE International Conference on Robotics and Automation, May 2008 (inproceedings)

Abstract
Learning inverse kinematics has long fascinated the robot learning community. While humans acquire this transformation to complicated tool spaces with ease, it is not a straightforward application for supervised learning algorithms because the learning problem is non-convex. However, the key insight that the problem can be considered convex in small local regions allows the application of locally linear learning methods. Nevertheless, the local solution of the problem depends on the data distribution, which can result in inconsistent global solutions with large model discontinuities. While this problem can be treated in various ways in offline learning, it poses a serious problem for online learning. Previous approaches to the real-time learning of inverse kinematics avoid this problem using smart data generation, such that the learner biases its own solution. Such biased solutions can result in premature convergence, and from the resulting solution it is often hard to understand what has been learned in that local region. This paper addresses this problem by presenting a learning algorithm which can deal with this inconsistency through re-weighting the data online. Furthermore, we show that our algorithms work not only in simulation, but also in real-time learning on a physical Mitsubishi PA-10 robot arm.

PDF Web DOI [BibTex]

State Space Compression with Predictive Representations

Boularias, A., Izadi, M., Chaib-Draa, B.

In Flairs 2008, pages: 41-46, (Editors: Wilson, D. C., H. C. Lane), AAAI Press, Menlo Park, CA, USA, 21st International Florida Artificial Intelligence Research Society Conference, May 2008 (inproceedings)

Abstract
Current studies have demonstrated that the representational power of predictive state representations (PSRs) is at least equal to that of partially observable Markov decision processes (POMDPs). Meanwhile, early steps in planning and generalization with PSRs suggest substantial improvements compared to POMDPs. However, the lack of practical algorithms for learning these representations severely restricts their applicability. The computational inefficiency of exact PSR learning methods naturally leads to the exploration of various approximation methods that can provide a good set of core tests with less computational effort. In this paper, we address this problem in an optimization framework. In particular, our approach aims to minimize the potential error that may be caused by missing a number of core tests. We provide an analysis of the error caused by this compression and present an empirical evaluation illustrating the performance of this approach.

PDF Web [BibTex]

Learning resolved velocity control

Peters, J.

2008 IEEE International Conference on Robotics and Automation (ICRA), May 2008 (talk)

Web [BibTex]

Causal inference from statistical data

Sun, X.

Biologische Kybernetik, Technische Hochschule Karlsruhe, Karlsruhe, Germany, April 2008 (phdthesis)

Web [BibTex]

Pairwise Correlations and Multineuronal Firing Patterns in Primary Visual Cortex

Berens, P.

Biologische Kybernetik, Eberhard Karls Universität Tübingen, Tübingen, Germany, April 2008 (diplomathesis)

[BibTex]

Bayesian methods for protein structure determination

Habeck, M.

Machine Learning in Structural Bioinformatics, April 2008 (talk)

Web [BibTex]

Graph Mining with Variational Dirichlet Process Mixture Models

Tsuda, K., Kurihara, K.

In SDM 2008, pages: 432-442, (Editors: Zaki, M. J.), Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 8th SIAM International Conference on Data Mining, April 2008 (inproceedings)

Abstract
Graph data such as chemical compounds and XML documents are becoming more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high dimensionality of graphs: when a graph is represented as a binary feature vector of indicators of all possible subgraph patterns, the dimensionality becomes too large for usual statistical methods. We propose a nonparametric Bayesian method for clustering graphs and selecting salient patterns at the same time. Variational inference is adopted here, because sampling is not applicable due to the extremely high dimensionality. The feature set minimizing the free energy is efficiently collected with the DFS code tree, where the generation of useless subgraphs is suppressed by a tree pruning condition. In experiments, our method is compared with a simpler approach based on frequent subgraph mining, and graph kernels.

PDF Web [BibTex]

Model-Based Reinforcement Learning with Continuous States and Actions

Deisenroth, M., Rasmussen, C., Peters, J.

In ESANN 2008, pages: 19-24, (Editors: Verleysen, M. ), d-side, Evere, Belgium, European Symposium on Artificial Neural Networks, April 2008 (inproceedings)

Abstract
Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.

PDF Web [BibTex]

Learning Inverse Dynamics: A Comparison

Nguyen-Tuong, D., Peters, J., Seeger, M., Schölkopf, B.

In Advances in Computational Intelligence and Learning: Proceedings of the European Symposium on Artificial Neural Networks, pages: 13-18, (Editors: M Verleysen), d-side, Evere, Belgium, 16th European Symposium on Artificial Neural Networks (ESANN), April 2008 (inproceedings)

Abstract
While it is well known that models can enhance control performance in terms of precision or energy efficiency, their practical application has often been limited by the complexity of manually obtaining sufficiently accurate models. In the past, learning has proven a viable alternative to using a combination of rigid-body dynamics and handcrafted approximations of nonlinearities. However, a major open question is which nonparametric learning method is best suited for learning dynamics. Traditionally, locally weighted projection regression (LWPR) has been the standard method, as it is capable of online, real-time learning for very complex robots. However, while LWPR has had significant impact on learning in robotics, alternative nonparametric regression methods such as support vector regression (SVR) and Gaussian process regression (GPR) offer interesting alternatives with fewer open parameters and potentially higher accuracy. In this paper, we evaluate these three alternatives for model learning. Our comparison consists of an evaluation of the learning quality of each regression method using original data from a SARCOS robot arm, as well as of the robot tracking performance when employing the learned models. The results show that GPR and SVR achieve superior learning precision and can be applied for real-time control, obtaining higher accuracy. However, for online learning, LWPR is the better method due to its lower computational requirements.

PDF Web [BibTex]

Development and Application of a Python Scripting Framework for BCI2000

Schreiner, T.

Biologische Kybernetik, Eberhard-Karls-Universität Tübingen, Tübingen, Germany, January 2008 (diplomathesis)

[BibTex]

Efficient and Invariant Regularisation with Application to Computer Graphics

Walder, CJ.

Biologische Kybernetik, University of Queensland, Brisbane, Australia, January 2008 (phdthesis)

Abstract
This thesis develops the theory and practice of reproducing kernel methods. Many functional inverse problems which arise in, for example, machine learning and computer graphics, have been treated with practical success using methods based on a reproducing kernel Hilbert space perspective. This perspective is often theoretically convenient, in that many functional analysis problems reduce to linear algebra problems in these spaces. Somewhat more complex is the case of conditionally positive definite kernels, and we provide an introduction to both cases, deriving in a particularly elementary manner some key results for the conditionally positive definite case. A common complaint of the practitioner is the long running time of these kernel-based algorithms. We provide novel ways of alleviating these problems by essentially using a non-standard function basis which yields computational advantages. That said, by doing so we must also forgo the aforementioned theoretical conveniences, and hence need some additional analysis, which we provide in order to make the approach practicable. We demonstrate that the method leads to state-of-the-art performance on the problem of surface reconstruction from points. We also provide some analysis of kernels invariant to transformations such as translation and dilation, and show that this indicates the value of learning algorithms which use conditionally positive definite kernels. Correspondingly, we provide a few approaches for making such algorithms practicable. We do this either by modifying the kernel, or by directly solving problems with conditionally positive definite kernels, which had previously only been solved with positive definite kernels. We demonstrate the advantage of this approach, in particular by attaining state-of-the-art classification performance with only one free parameter.

PDF [BibTex]


2007


Reaction graph kernels for discovering missing enzymes in the plant secondary metabolism

Saigo, H., Hattori, M., Tsuda, K.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Secondary metabolic pathways in plants are important for finding druggable candidate enzymes. However, there are many enzymes whose functions are still undiscovered, especially in organism-specific metabolic pathways. We propose reaction graph kernels for automatically assigning EC numbers to unknown enzymatic reactions in a metabolic network. Experiments are carried out on the KEGG/REACTION database, and our method successfully predicted the first three digits of the EC number with 83% accuracy. We also exhaustively predicted missing enzymatic functions in the plant secondary metabolism pathways, and evaluated the biochemical validity of our results.

Web [BibTex]

Positional Oligomer Importance Matrices

Sonnenburg, S., Zien, A., Philips, P., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences, above all of DNA and proteins. In many cases, the most accurate classifiers are obtained by training SVMs with complex sequence kernels, for instance for transcription starts or splice sites. However, an often criticized downside of SVMs with complex kernels is that it is very hard for humans to understand the learned decision rules and to derive biological insights from them. To close this gap, we introduce the concept of positional oligomer importance matrices (POIMs) and develop an efficient algorithm for their computation. We demonstrate how they overcome the limitations of sequence logos, and how they can be used to find relevant motifs for different biological phenomena in a straightforward way. Note that the concept of POIMs is not limited to interpreting SVMs, but is applicable to general k-mer based scoring systems.

Web [BibTex]

Machine Learning Algorithms for Polymorphism Detection

Schweikert, G., Zeller, G., Weigel, D., Schölkopf, B., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Web [BibTex]

An Automated Combination of Kernels for Predicting Protein Subcellular Localization

Zien, A., Ong, C.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We utilize an extension of the multiclass support vector machine (SVM) method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets, and show that we perform better than the current state of the art. Furthermore, our method provides some insights as to which features are most useful for determining subcellular localization, which are in agreement with biological reasoning.

Web [BibTex]

Challenges in Brain-Computer Interface Development: Induction, Measurement, Decoding, Integration

Hill, NJ.

Invited keynote talk at the launch of BrainGain, the Dutch BCI research consortium, November 2007 (talk)

Abstract
I'll present a perspective on Brain-Computer Interface development from Tübingen. Some of the benefits promised by BCI technology lie in the near foreseeable future, and some further away. Our motivation is to make BCI technology feasible for the people who could benefit from what it has to offer soon: namely, people in the "completely locked-in" state. I'll mention some of the challenges of working with this user group, and explain the specific directions they have motivated us to take in developing experimental methods, algorithms, and software.

[BibTex]

Towards compliant humanoids: an experimental assessment of suitable task space position/orientation controllers

Nakanishi, J., Mistry, M., Peters, J., Schaal, S.

In IROS 2007, 2007, pages: 2520-2527, (Editors: Grant, E. , T. C. Henderson), IEEE Service Center, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems, November 2007 (inproceedings)

Abstract
Compliant control will be a prerequisite for humanoid robotics if these robots are supposed to work safely and robustly in human and/or dynamic environments. One view of compliant control is that a robot should control a minimal number of degrees-of-freedom (DOFs) directly, i.e., those relevant DOFs for the task, and keep the remaining DOFs maximally compliant, usually in the null space of the task. This view naturally leads to task space control. However, surprisingly few implementations of task space control can be found in actual humanoid robots. This paper makes a first step towards assessing the usefulness of task space controllers for humanoids by investigating which choices of controllers are available and what inherent control characteristics they have. This treatment concerns position and orientation control, where the latter is based on a quaternion formulation. Empirical evaluations on an anthropomorphic Sarcos master arm illustrate the robustness of the different controllers as well as the ease of implementing and tuning them. Our extensive empirical results demonstrate that simpler task space controllers, e.g., classical resolved motion rate control or resolved acceleration control, can be quite advantageous in the face of inevitable modeling errors in model-based control, and that well-chosen formulations are easy to implement and quite robust, such that they are useful for humanoids.
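
The classical resolved motion rate control mentioned in the abstract above can be sketched for a planar two-link arm as follows. This is only an illustration: the link lengths, gain, step size, and function names are hypothetical, and position-only control of a non-redundant arm is assumed, so neither the null-space terms nor the quaternion-based orientation control evaluated in the paper appear here.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths


def forward_kinematics(q):
    """End-effector position of a planar two-link arm."""
    return (L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1]))


def jacobian(q):
    """2x2 Jacobian of the end-effector position w.r.t. joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def resolved_rate_step(q, x_des, gain=5.0, dt=0.01):
    """One Euler step of qdot = J^{-1} * gain * (x_des - x)."""
    x = forward_kinematics(q)
    vx, vy = gain * (x_des[0] - x[0]), gain * (x_des[1] - x[1])
    (a, b), (c, d) = jacobian(q)
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("arm is at a kinematic singularity")
    qd0 = (d * vx - b * vy) / det
    qd1 = (-c * vx + a * vy) / det
    return (q[0] + dt * qd0, q[1] + dt * qd1)


def reach(q, x_des, steps=500):
    """Iterate the rate controller towards a fixed task-space target."""
    for _ in range(steps):
        q = resolved_rate_step(q, x_des)
    return q


q_final = reach((0.3, 0.8), (1.0, 1.0))
print(forward_kinematics(q_final))
```

The controller only needs the kinematic Jacobian, not a dynamics model, which is one reason such simple schemes can tolerate modeling errors.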

PDF Web DOI [BibTex]

Sistema avanzato per la classificazione delle aree agricole in immagini ad elevata risoluzione geometrica: applicazione al territorio del Trentino

Arnoldi, E., Bruzzone, L., Carlin, L., Pedron, L., Persello, C.

In pages: 1-6, 11. Conferenza Nazionale ASITA, November 2007 (inproceedings)

PDF Web [BibTex]

Some Theoretical Aspects of Human Categorization Behavior: Similarity and Generalization

Jäkel, F.

Biologische Kybernetik, Eberhard-Karls-Universität Tübingen, Tübingen, Germany, November 2007, passed with "ausgezeichnet", summa cum laude, published online (phdthesis)

PDF [BibTex]

Statistical Learning Theory Approaches to Clustering

Jegelka, S.

Biologische Kybernetik, Eberhard-Karls-Universität Tübingen, Tübingen, Germany, November 2007 (diplomathesis)

PDF [BibTex]

Performance Stabilization and Improvement in Graph-based Semi-supervised Learning with Ensemble Method and Graph Sharpening

Choi, I., Shin, H.

In Korean Data Mining Society Conference, pages: 257-262, Korean Data Mining Society, Seoul, Korea, Korean Data Mining Society Conference, November 2007 (inproceedings)

PDF [BibTex]

Policy Learning for Robotics

Peters, J.

14th International Conference on Neural Information Processing (ICONIP), November 2007 (talk)

Web [BibTex]

Hilbert Space Representations of Probability Distributions

Gretton, A.

2nd Workshop on Machine Learning and Optimization at the ISM, October 2007 (talk)

Abstract
Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each - this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem, by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique: thus comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike. A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, as opposed to French text on the same topic but otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.

PDF Web [BibTex]

Discriminative Subsequence Mining for Action Classification

Nowozin, S., BakIr, G., Tsuda, K.

In ICCV 2007, pages: 1919-1923, IEEE Computer Society, Los Alamitos, CA, USA, 11th IEEE International Conference on Computer Vision, October 2007 (inproceedings)

Abstract
Recent approaches to action classification in videos have used sparse spatio-temporal words encoding local appearance around interesting movements. Most of these approaches use a histogram representation, discarding the temporal order among features. But this ordering information can contain important information about the action itself, e.g. consider the sport disciplines of hurdle race and long jump, where the global temporal order of motions (running, jumping) is important to discriminate between the two. In this work we propose to use a sequential representation which retains this temporal order. Further, we introduce Discriminative Subsequence Mining to find optimal discriminative subsequence patterns. In combination with the LPBoost classifier, this amounts to simultaneously learning a classification function and performing feature selection in the space of all possible feature sequences. The resulting classifier linearly combines a small number of interpretable decision functions, each checking for the presence of a single discriminative pattern. The classifier is benchmarked on the KTH action classification data set and outperforms the best known results in the literature.

PDF Web DOI [BibTex]

Regression with Intervals

Kashima, H., Yamazaki, K., Saigo, H., Inokuchi, A.

International Workshop on Data-Mining and Statistical Science (DMSS2007), October 2007, JSAI Incentive Award. Talk was given by Hisashi Kashima. (talk)

Web [BibTex]

Unsupervised Classification for non-invasive Brain-Computer-Interfaces

Eren, S., Grosse-Wentrup, M., Buss, M.

In Automed 2007, pages: 65-66, VDI Verlag, Düsseldorf, Germany, Automed Workshop, October 2007 (inproceedings)

Abstract
Non-invasive Brain-Computer-Interfaces (BCIs) are devices that infer the intention of human subjects from signals generated by the central nervous system and recorded outside the skull, e.g., by electroencephalography (EEG). They can be used to enable basic communication for patients who are not able to communicate by normal means, e.g., due to neuro-degenerative diseases such as amyotrophic lateral sclerosis (ALS) (see [Vaughan2003] for a review). One challenge in research on BCIs is minimizing the training time prior to usage of the BCI. Since EEG patterns vary across subjects, it is usually necessary to record a number of trials in which the intention of the user is known to train a classifier. This classifier is subsequently used to infer the intention of the BCI user. In this paper, we present the application of an unsupervised classification method to a binary non-invasive BCI based on motor imagery. The result is a BCI that does not require any training, since the mapping from EEG pattern changes to the intention of the user is learned online by the BCI without any feedback. We present experimental results from six healthy subjects, three of whom display classification errors below 15%. We conclude that unsupervised BCIs are a viable option, but not yet as reliable as supervised BCIs. The rest of this paper is organized as follows. In the Methods section, we first introduce the experimental paradigm. This is followed by a description of the methods used for spatial filtering, feature extraction, and unsupervised classification. We then present the experimental results, and conclude the paper with a brief discussion.

PDF Web [BibTex]

A Hilbert Space Embedding for Distributions

Smola, A., Gretton, A., Song, L., Schölkopf, B.

In Algorithmic Learning Theory, Lecture Notes in Computer Science 4754, pages: 13-31, (Editors: M Hutter and RA Servedio and E Takimoto), Springer, Berlin, Germany, 18th International Conference on Algorithmic Learning Theory (ALT), October 2007 (inproceedings)

Abstract
We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space. Applications of this technique can be found in two-sample tests, which are used for determining whether two sets of observations arise from the same distribution, covariate shift correction, local learning, measures of independence, and density estimation.
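
One way to see how distributions can be compared without density estimation is the squared distance between empirical kernel mean embeddings, often called the maximum mean discrepancy (MMD). The Python sketch below is a minimal illustration under assumptions of its own: a Gaussian kernel with bandwidth 1, the simple biased estimator, and synthetic one-dimensional samples; the function names are hypothetical and the paper's tests and estimators are not reproduced.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs from the two samples."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Squared RKHS distance between the two empirical mean embeddings."""
    return (mean_kernel(xs, xs, kernel)
            + mean_kernel(ys, ys, kernel)
            - 2.0 * mean_kernel(xs, ys, kernel))


rng = random.Random(0)
same_a = [(rng.gauss(0.0, 1.0),) for _ in range(100)]
same_b = [(rng.gauss(0.0, 1.0),) for _ in range(100)]
shifted = [(rng.gauss(2.0, 1.0),) for _ in range(100)]

print("same distribution   :", mmd2_biased(same_a, same_b))
print("shifted distribution:", mmd2_biased(same_a, shifted))
```

A two-sample test would compare such a statistic against a threshold derived from its null distribution; choosing that threshold is outside the scope of this sketch.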

PDF PDF DOI [BibTex]

Cluster Identification in Nearest-Neighbor Graphs

Maier, M., Hein, M., von Luxburg, U.

In ALT 2007, pages: 196-210, (Editors: Hutter, M. , R. A. Servedio, E. Takimoto), Springer, Berlin, Germany, 18th International Conference on Algorithmic Learning Theory, October 2007 (inproceedings)

Abstract
Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are "identified": that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict "optimal" values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.
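
The difference between the two graph types in the abstract above can be illustrated with a small Python sketch that builds both graphs and counts connected components. The toy data, the value of k, and all function names are assumptions made for illustration; the probabilistic bounds derived in the paper are not implemented here.

```python
def knn_lists(points, k):
    """For each point, the index set of its k nearest other points."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    lists = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sq_dist(p, points[j]))
        lists.append(set(others[:k]))
    return lists


def knn_graph(points, k, mutual):
    """Adjacency sets of the mutual (AND) or symmetric (OR) kNN graph."""
    nbrs = knn_lists(points, k)
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            i_lists_j, j_lists_i = j in nbrs[i], i in nbrs[j]
            linked = (i_lists_j and j_lists_i) if mutual else (i_lists_j or j_lists_i)
            if linked:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def connected_components(adj):
    """Connected components as sorted index lists, via depth-first search."""
    seen, comps = set(), []
    for start in range(len(adj)):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(sorted(comp))
    return comps


# Two clusters on a line: four points near 0 and three points near 10.
points = [(0,), (1,), (2,), (3,), (10,), (11,), (12,)]
print("symmetric:", connected_components(knn_graph(points, 3, mutual=False)))
print("mutual   :", connected_components(knn_graph(points, 3, mutual=True)))
```

In this toy example with k = 3, each point of the smaller cluster must reach into the larger one for its third neighbor, so the symmetric graph merges the clusters while the mutual graph keeps them apart.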

PDF PDF DOI [BibTex]

Inducing Metric Violations in Human Similarity Judgements

Laub, J., Macke, J., Müller, K., Wichmann, F.

In Advances in Neural Information Processing Systems 19, pages: 777-784, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Attempting to model human categorization and similarity judgements is both a very interesting and an exceedingly difficult challenge. Some of the difficulty arises because of conflicting evidence on whether human categorization and similarity judgements should or should not be modelled as operating on a mental representation that is essentially metric. Intuitively, this has a strong appeal as it would allow (dis)similarity to be represented geometrically as distance in some internal space. Here we show how a single stimulus, carefully constructed in a psychophysical experiment, introduces l2 violations in what used to be an internal similarity space that could be adequately modelled as Euclidean. We term this one influential data point a conflictual judgement. We present an algorithm for analysing such data and identifying the crucial point. Thus there may not be a strict dichotomy between a metric and a non-metric internal space, but rather degrees to which potentially large subsets of stimuli are represented metrically, with a small subset causing a global violation of metricity.

PDF Web [BibTex]

Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods

Seeger, M.

In Advances in Neural Information Processing Systems 19, pages: 1233-1240, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We propose a highly efficient framework for kernel multi-class models with a large and structured set of classes. Kernel parameters are learned automatically by maximizing the cross-validation log likelihood, and predictive probabilities are estimated. We demonstrate our approach on large scale text classification tasks with hierarchical class structure, achieving state-of-the-art results in an order of magnitude less time than previous work.

PDF Web [BibTex]

A Local Learning Approach for Clustering

Wu, M., Schölkopf, B.

In Advances in Neural Information Processing Systems 19, pages: 1529-1536, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We present a local learning approach for clustering. The basic idea is that a good clustering result should have the property that the cluster label of each data point can be well predicted based on its neighboring data and their cluster labels, using current supervised learning methods. An optimization problem is formulated such that its solution has the above property. Relaxation and eigen-decomposition are applied to solve this optimization problem. We also briefly investigate the parameter selection issue and provide a simple parameter selection method for the proposed algorithm. Experimental results are provided to validate the effectiveness of the proposed approach.

PDF Web [BibTex]

MR-Based PET Attenuation Correction: Method and Validation

Hofmann, M., Steinke, F., Scheel, V., Brady, M., Schölkopf, B., Pichler, B.

Joint Molecular Imaging Conference, September 2007 (talk)

Abstract
PET/MR combines the high soft tissue contrast of Magnetic Resonance Imaging (MRI) and the functional information of Positron Emission Tomography (PET). For quantitative PET information, correction of tissue photon attenuation is mandatory. Usually in conventional PET, the attenuation map is obtained from a transmission scan, which uses a rotating source, or from the CT scan in case of combined PET/CT. In the case of a PET/MR scanner, there is insufficient space for the rotating source and ideally one would want to calculate the attenuation map from the MR image instead. Since MR images provide information about proton density of the different tissue types, it is not trivial to use this data for PET attenuation correction. We present a method for predicting the PET attenuation map from a given MR image, using a combination of atlas-registration and recognition of local patterns. Using "leave one out cross validation" we show on a database of 16 MR-CT image pairs that our method reliably allows estimating the CT image from the MR image. Subsequently, as in PET/CT, the PET attenuation map can be predicted from the CT image. On an additional dataset of MR/CT/PET triplets we quantitatively validate that our approach allows PET quantification with an error that is smaller than what would be clinically significant. We demonstrate our approach on T1-weighted human brain scans. However, the presented methods are more general and current research focuses on applying the established methods to human whole body PET/MRI applications.

PDF Web [BibTex]

Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces

Grosse-Wentrup, M., Gramann, K., Buss, M.

In Advances in Neural Information Processing Systems 19, pages: 537-544, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
The performance of EEG-based Brain-Computer-Interfaces (BCIs) critically depends on the extraction of features from the EEG carrying information relevant for the classification of different mental states. For BCIs employing imaginary movements of different limbs, the method of Common Spatial Patterns (CSP) has been shown to achieve excellent classification results. The CSP-algorithm however suffers from a lack of robustness, requiring training data without artifacts for good performance. To overcome this lack of robustness, we propose an adaptive spatial filter that replaces the training data in the CSP approach by a-priori information. More specifically, we design an adaptive spatial filter that maximizes the ratio of the variance of the electric field originating in a predefined region of interest (ROI) and the overall variance of the measured EEG. Since it is known that the component of the EEG used for discriminating imaginary movements originates in the motor cortex, we design two adaptive spatial filters with the ROIs centered in the hand areas of the left and right motor cortex. We then use these to classify EEG data recorded during imaginary movements of the right and left hand of three subjects, and show that the adaptive spatial filters outperform the CSP-algorithm, enabling classification rates of up to 94.7 % without artifact rejection.

PDF Web [BibTex]

Branch and Bound for Semi-Supervised Support Vector Machines

Chapelle, O., Sindhwani, V., Keerthi, S.

In Advances in Neural Information Processing Systems 19, pages: 217-224, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Semi-supervised SVMs (S3VMs) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms.

PDF Web [BibTex]

A Kernel Method for the Two-Sample-Problem

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

In Advances in Neural Information Processing Systems 19, pages: 513-520, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. The test statistic can be computed in $O(m^2)$ time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.

PDF Web [BibTex]

An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models

Keerthi, S., Sindhwani, V., Chapelle, O.

In Advances in Neural Information Processing Systems 19, pages: 673-680, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be very efficiently done; often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.

PDF Web [BibTex]

Learning Dense 3D Correspondence

Steinke, F., Schölkopf, B., Blanz, V.

In Advances in Neural Information Processing Systems 19, pages: 1313-1320, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Establishing correspondence between distinct objects is an important and nontrivial task: correctness of the correspondence hinges on properties which are difficult to capture in an a priori criterion. While previous work has used a priori criteria which in some cases led to very good results, the present paper explores whether it is possible to learn a combination of features that, for a given training set of aligned human heads, characterizes the notion of correct correspondence. By optimizing this criterion, we are then able to compute correspondence and morphs for novel heads.

PDF Web [BibTex]

Optimal Dominant Motion Estimation using Adaptive Search of Transformation Space

Ulges, A., Lampert, CH., Keysers, D., Breuel, TM.

In DAGM 2007, pages: 204-215, (Editors: Hamprecht, F. A., C. Schnörr, B. Jähne), Springer, Berlin, Germany, 29th Annual Symposium of the German Association for Pattern Recognition, September 2007 (inproceedings)

Abstract
The extraction of a parametric global motion from a motion field is a task with several applications in video processing. We present two probabilistic formulations of the problem and carry out optimization using the RAST algorithm, a geometric matching method novel to motion estimation in video. RAST uses an exhaustive and adaptive search of transformation space and thus gives -- in contrast to local sampling optimization techniques used in the past -- a globally optimal solution. Among other applications, our framework can thus be used as a source of ground truth for benchmarking motion estimation algorithms. Our main contributions are: first, the novel combination of a state-of-the-art MAP criterion for dominant motion estimation with a search procedure that guarantees global optimality. Second, experimental results that illustrate the superior performance of our approach on synthetic flow fields as well as real-world video streams. Third, a significant speedup of the search achieved by extending the model with an additional smoothness prior.

PDF Web DOI [BibTex]

Solving Deep Memory POMDPs with Recurrent Policy Gradients

Wierstra, D., Förster, A., Peters, J., Schmidhuber, J.

In ICANN'07, pages: 697-706, Springer, Berlin, Germany, International Conference on Artificial Neural Networks, September 2007 (inproceedings)

Abstract
This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.

PDF PDF DOI [BibTex]

Output Grouping using Dirichlet Mixtures of Linear Gaussian State-Space Models

Chiappa, S., Barber, D.

In ISPA 2007, pages: 446-451, IEEE Computer Society, Los Alamitos, CA, USA, 5th International Symposium on Image and Signal Processing and Analysis, September 2007 (inproceedings)

Abstract
We consider a model to cluster the components of a vector time-series. The task is to assign each component of the vector time-series to a single cluster, basing this assignment on the simultaneous dynamical similarity of the component to other components in the cluster. This is in contrast to the more familiar task of clustering a set of time-series based on global measures of their similarity. The model is based on a Dirichlet Mixture of Linear Gaussian State-Space models (LGSSMs), in which each LGSSM is treated with a prior to encourage the simplest explanation. The resulting model is approximated using a ‘collapsed’ variational Bayes implementation.

PDF Web DOI [BibTex]

Manifold Denoising

Hein, M., Maier, M.

In Advances in Neural Information Processing Systems 19, pages: 561-568, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the problem of denoising a noisily sampled submanifold $M$ in $R^d$, where the submanifold $M$ is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover, using the denoising algorithm as a pre-processing method, we can improve the results of a semi-supervised learning algorithm.

PDF Web [BibTex]

How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye movements

Kienzle, W., Schölkopf, B., Wichmann, F., Franz, M.

In Pattern Recognition, pages: 405-414, (Editors: FA Hamprecht and C Schnörr and B Jähne), Springer, Berlin, Germany, 29th Annual Symposium of the German Association for Pattern Recognition (DAGM), September 2007 (inproceedings)

Abstract
Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.

PDF Web DOI [BibTex]

Bayesian Inference for Sparse Generalized Linear Models

Seeger, M., Gerwinn, S., Bethge, M.

In ECML 2007, pages: 298-309, Lecture Notes in Computer Science ; 4701, (Editors: Kok, J. N., J. Koronacki, R. Lopez de Mantaras, S. Matwin, D. Mladenic, A. Skowron), Springer, Berlin, Germany, 18th European Conference on Machine Learning, September 2007 (inproceedings)

Abstract
We present a framework for efficient, accurate approximate Bayesian inference in generalized linear models (GLMs), based on the expectation propagation (EP) technique. The parameters can be endowed with a factorizing prior distribution, encoding properties such as sparsity or non-negativity. The central role of posterior log-concavity in Bayesian GLMs is emphasized and related to stability issues in EP. In particular, we use our technique to infer the parameters of a point process model for neuronal spiking data from multiple electrodes, demonstrating significantly superior predictive performance when a sparsity assumption is enforced via a Laplace prior distribution.

PDF DOI [BibTex]

Implicit Surfaces with Globally Regularised and Compactly Supported Basis Functions

Walder, C., Schölkopf, B., Chapelle, O.

In Advances in Neural Information Processing Systems 19, pages: 273-280, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the problem of constructing a function whose zero set is to represent a surface, given sample points with surface normal vectors. The contributions include a novel means of regularising multi-scale compactly supported basis functions that leads to the desirable properties previously only associated with fully supported bases, and we show equivalence to a Gaussian process with modified covariance function. We also provide a regularisation framework for simpler and more direct treatment of surface normals, along with a corresponding generalisation of the representer theorem. We demonstrate the techniques on 3D problems of up to 14 million data points, as well as 4D time series data.

PDF Web [BibTex]
