Header logo is ei


2003


no image
Kernel Hebbian Algorithm for Iterative Kernel Principal Component Analysis

Kim, K., Franz, M., Schölkopf, B.

(109), MPI f. biologische Kybernetik, Tuebingen, June 2003 (techreport)

Abstract
A new method for performing a kernel principal component analysis is proposed. By kernelizing the generalized Hebbian algorithm, one can iteratively estimate the principal components in a reproducing kernel Hilbert space with only linear order memory complexity. The derivation of the method, a convergence proof, and preliminary applications in image hyperresolution are presented. In addition, we discuss the extension of the method to the online learning of kernel principal components.

PDF [BibTex]

2003

PDF [BibTex]


no image
Learning with Local and Global Consistency

Zhou, D., Bousquet, O., Lal, T., Weston, J., Schölkopf, B.

(112), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, June 2003 (techreport)

Abstract
We consider the learning problem in the transductive setting. Given a set of points of which only some are labeled, the goal is to predict the label of the unlabeled points. A principled clue to solve such a learning problem is the consistency assumption that a classifying function should be sufficiently smooth with respect to the structure revealed by these known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.

[BibTex]

[BibTex]


no image
Time Complexity Analysis of Fast Pattern Selection Algorithm for SVM

Shin, H., Cho, S.

In Proc. of the Korean Data Mining Conference, pages: 221-231, Korean Data Mining Conference, June 2003 (inproceedings)

[BibTex]

[BibTex]


no image
Ladungsträgerdynamik in optisch angeregten GaAs-Quantendrähten:Relaxation und Transport

Pfingsten, T.

Biologische Kybernetik, Institut für Festkörpertheorie, WWU Münster, June 2003 (diplomathesis)

PDF [BibTex]

PDF [BibTex]


no image
Dealing with large Diagonals in Kernel Matrices

Weston, J., Schölkopf, B., Eskin, E., Leslie, C., Noble, W.

Annals of the Institute of Statistical Mathematics, 55(2):391-408, June 2003 (article)

Abstract
In kernel methods, all the information about the training data is contained in the Gram matrix. If this matrix has large diagonal values, which arises for many types of kernels, then kernel methods do not perform well: We propose and test several methods for dealing with this problem by reducing the dynamic range of the matrix while preserving the positive definiteness of the Hessian of the quadratic programming problem that one has to solve when training a Support Vector Machine, which is a common kernel approach for pattern recognition.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
The Metric Nearness Problem with Applications

Dhillon, I., Sra, S., Tropp, J.

Univ. of Texas at Austin, June 2003 (techreport)

GZIP [BibTex]

GZIP [BibTex]


no image
Implicit Wiener Series

Franz, M., Schölkopf, B.

(114), Max Planck Institute for Biological Cybernetics, June 2003 (techreport)

Abstract
The Wiener series is one of the standard methods to systematically characterize the nonlinearity of a neural system. The classical estimation method of the expansion coefficients via cross-correlation suffers from severe problems that prevent its application to high-dimensional and strongly nonlinear systems. We propose a new estimation method based on regression in a reproducing kernel Hilbert space that overcomes these problems. Numerical experiments show performance advantages in terms of convergence, interpretability and system size that can be handled.

PDF [BibTex]

PDF [BibTex]


no image
Machine Learning approaches to protein ranking: discriminative, semi-supervised, scalable algorithms

Weston, J., Leslie, C., Elisseeff, A., Noble, W.

(111), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, June 2003 (techreport)

Abstract
A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST which uses heuristic alignment of a profile built using the large unlabeled database. It has been shown that such use of global information via an unlabeled data improves over a local measure derived from a basic pairwise alignment such as performed by PSI-BLAST's predecessor, BLAST. In this article we look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.

PDF [BibTex]

PDF [BibTex]


no image
Fast Pattern Selection for Support Vector Classifiers

Shin, H., Cho, S.

In PAKDD 2003, pages: 376-387, (Editors: Whang, K.-Y. , J. Jeon, K. Shim, J. Srivastava), Springer, Berlin, Germany, 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining, May 2003 (inproceedings)

Abstract
Training SVM requires large memory and long cpu time when the pattern set is large. To alleviate the computational burden in SVM training, we propose a fast preprocessing algorithm which selects only the patterns near the decision boundary. Preliminary simulation results were promising: Up to two orders of magnitude, training time reduction was achieved including the preprocessing, without any loss in classification accuracies.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
The em Algorithm for Kernel Matrix Completion with Auxiliary Data

Tsuda, K., Akaho, S., Asai, K.

Journal of Machine Learning Research, 4, pages: 67-81, May 2003 (article)

PDF [BibTex]

PDF [BibTex]


no image
Scaling Reinforcement Learning Paradigms for Motor Control

Peters, J., Vijayakumar, S., Schaal, S.

In JSNC 2003, 10, pages: 1-7, 10th Joint Symposium on Neural Computation (JSNC), May 2003 (inproceedings)

Abstract
Reinforcement learning offers a general framework to explain reward related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high dimensional movement systems and mainly operate in discrete, low dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation – a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of unstabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid notor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade’s ‘average natural policy gradient’ is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum in Riemannian space of the cost function. The algorithm outperforms nonnatural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensionally continuous state-action systems. Keywords: Reinforcement learning, neurodynamic programming, actorcritic methods, policy gradient methods, natural policy gradient

PDF Web [BibTex]

PDF Web [BibTex]


no image
Constructing Descriptive and Discriminative Non-linear Features: Rayleigh Coefficients in Kernel Feature Spaces

Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Smola, A., Müller, K.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):623-628, May 2003 (article)

Abstract
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher‘s discriminant and oriented PCA using support vector kernel functions. Extensive simulations show the utility of our approach.

DOI [BibTex]

DOI [BibTex]


no image
The Geometry Of Kernel Canonical Correlation Analysis

Kuss, M., Graepel, T.

(108), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, May 2003 (techreport)

Abstract
Canonical correlation analysis (CCA) is a classical multivariate method concerned with describing linear dependencies between sets of variables. After a short exposition of the linear sample CCA problem and its analytical solution, the article proceeds with a detailed characterization of its geometry. Projection operators are used to illustrate the relations between canonical vectors and variates. The article then addresses the problem of CCA between spaces spanned by objects mapped into kernel feature spaces. An exact solution for this kernel canonical correlation (KCCA) problem is derived from a geometric point of view. It shows that the expansion coefficients of the canonical vectors in their respective feature space can be found by linear CCA in the basis induced by kernel principal component analysis. The effect of mappings into higher dimensional feature spaces is considered critically since it simplifies the CCA problem in general. Then two regularized variants of KCCA are discussed. Relations to other methods are illustrated, e.g., multicategory kernel Fisher discriminant analysis, kernel principal component regression and possible applications thereof in blind source separation.

PDF [BibTex]

PDF [BibTex]


no image
A unifying computational framework for optimization and dynamic systemsapproaches to motor control

Mohajerian, P., Peters, J., Ijspeert, A., Schaal, S.

10th Joint Symposium on Neural Computation (JSNC 2003), 10, pages: 1, May 2003 (poster)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Kernel-based nonlinear blind source separation

Harmeling, S., Ziehe, A., Kawanabe, M., Müller, K.

Neural Computation, 15(5):1089-1124, May 2003 (article)

Abstract
We propose kTDSEP, a kernel-based algorithm for nonlinear blind source separation (BSS). It combines complementary research fields: kernel feature spaces and BSS using temporal information. This yields an efficient algorithm for nonlinear BSS with invertible nonlinearity. Key assumptions are that the kernel feature space is chosen rich enough to approximate the nonlinearity and that signals of interest contain temporal information. Both assumptions are fulfilled for a wide set of real-world applications. The algorithm works as follows: First, the data are (implicitly) mapped to a high (possibly infinite)—dimensional kernel feature space. In practice, however, the data form a smaller submanifold in feature space—even smaller than the number of training data points—a fact that has already been used by, for example, reduced set techniques for support vector machines. We propose to adapt to this effective dimension as a preprocessing step and to construct an orthonormal basis of this submanifold. The latter dimension-reduction step is essential for making the subsequent application of BSS methods computationally and numerically tractable. In the reduced space, we use a BSS algorithm that is based on second-order temporal decorrelation. Finally, we propose a selection procedure to obtain the original sources from the extracted nonlinear components automatically. Experiments demonstrate the excellent performance and efficiency of our kTDSEP algorithm for several problems of nonlinear BSS and for more than two sources.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A case based comparison of identification with neural network and Gaussian process models.

Kocijan, J., Banko, B., Likar, B., Girard, A., Murray-Smith, R., Rasmussen, CE.

In Proceedings of the International Conference on Intelligent Control Systems and Signal Processing ICONS 2003, 1, pages: 137-142, (Editors: Ruano, E.A.), Proceedings of the International Conference on Intelligent Control Systems and Signal Processing ICONS, April 2003 (inproceedings)

Abstract
In this paper an alternative approach to black-box identification of non-linear dynamic systems is compared with the more established approach of using artificial neural networks. The Gaussian process prior approach is a representative of non-parametric modelling approaches. It was compared on a pH process modelling case study. The purpose of modelling was to use the model for control design. The comparison revealed that even though Gaussian process models can be effectively used for modelling dynamic systems caution has to be axercised when signals are selected.

PDF [BibTex]

PDF [BibTex]


no image
Kernel Methods for Classification and Signal Separation

Gretton, A.

pages: 226, Biologische Kybernetik, University of Cambridge, Cambridge, April 2003 (phdthesis)

PostScript [BibTex]

PostScript [BibTex]


no image
On-Line One-Class Support Vector Machines. An Application to Signal Segmentation

Gretton, A., Desobry, ..

In IEEE ICASSP Vol. 2, pages: 709-712, IEEE ICASSP, April 2003 (inproceedings)

Abstract
In this paper, we describe an efficient algorithm to sequentially update a density support estimate obtained using one-class support vector machines. The solution provided is an exact solution, which proves to be far more computationally attractive than a batch approach. This deterministic technique is applied to the problem of audio signal segmentation, with simulations demonstrating the computational performance gain on toy data sets, and the accuracy of the segmentation on audio signals.

PostScript [BibTex]

PostScript [BibTex]


no image
The Kernel Mutual Information

Gretton, A., Herbrich, R., Smola, A.

Max Planck Institute for Biological Cybernetics, April 2003 (techreport)

Abstract
We introduce two new functions, the kernel covariance (KC) and the kernel mutual information (KMI), to measure the degree of independence of several continuous random variables. The former is guaranteed to be zero if and only if the random variables are pairwise independent; the latter shares this property, and is in addition an approximate upper bound on the mutual information, as measured near independence, and is based on a kernel density estimate. We show that Bach and Jordan‘s kernel generalised variance (KGV) is also an upper bound on the same kernel density estimate, but is looser. Finally, we suggest that the addition of a regularising term in the KGV causes it to approach the KMI, which motivates the introduction of this regularisation. The performance of the KC and KMI is verified in the context of instantaneous independent component analysis (ICA), by recovering both artificial and real (musical) signals following linear mixing.

PostScript [BibTex]

PostScript [BibTex]


no image
Blind separation of post-nonlinear mixtures using gaussianizing transformations and temporal decorrelation

Ziehe, A., Kawanabe, M., Harmeling, S., Müller, K.

In ICA 2003, pages: 269-274, (Editors: Amari, S.-I. , A. Cichocki, S. Makino, N. Murata), 4th International Symposium on Independent Component Analysis and Blind Signal Separation, April 2003 (inproceedings)

Abstract
At the previous workshop (ICA2001) we proposed the ACE-TD method that reduces the post-nonlinear blind source separation problem (PNL BSS) to a linear BSS problem. The method utilizes the Alternating Conditional Expectation (ACE) algorithm to approximately invert the (post-){non-linear} functions. In this contribution, we propose an alternative procedure called Gaussianizing transformation, which is motivated by the fact that linearly mixed signals before nonlinear transformation are approximately Gaussian distributed. This heuristic, but simple and efficient procedure yields similar results as the ACE method and can thus be used as a fast and effective equalization method. After equalizing the nonlinearities, temporal decorrelation separation (TDSEP) allows us to recover the source signals. Numerical simulations on realistic examples are performed to compare "Gauss-TD" with "ACE-TD".

PDF Web [BibTex]

PDF Web [BibTex]


no image
The Kernel Mutual Information

Gretton, A., Herbrich, R., Smola, A.

In IEEE ICASSP Vol. 4, pages: 880-883, IEEE ICASSP, April 2003 (inproceedings)

Abstract
We introduce a new contrast function, the kernel mutual information (KMI), to measure the degree of independence of continuous random variables. This contrast function provides an approximate upper bound on the mutual information, as measured near independence, and is based on a kernel density estimate of the mutual information between a discretised approximation of the continuous random variables. We show that Bach and Jordan‘s kernel generalised variance (KGV) is also an upper bound on the same kernel density estimate, but is looser. Finally, we suggest that the addition of a regularising term in the KGV causes it to approach the KMI, which motivates the introduction of this regularisation.

PostScript [BibTex]

PostScript [BibTex]


no image
Analysing ICA component by injection noise

Harmeling, S., Meinecke, F., Müller, K.

In ICA 2003, pages: 149-154, (Editors: Amari, S.-I. , A. Cichocki, S. Makino, N. Murata), 4th International Symposium on Independent Component Analysis and Blind Signal Separation, April 2003 (inproceedings)

Abstract
Usually, noise is considered to be destructive. We present a new method that constructively injects noise to assess the reliability and the group structure of empirical ICA components. Simulations show that the true root-mean squared angle distances between the real sources and some source estimates can be approximated by our method. In a toy experiment, we see that we are also able to reveal the underlying group structure of extracted ICA components. Furthermore, an experiment with fetal ECG data demonstrates that our approach is useful for exploratory data analysis of real-world data.

PDF Web [BibTex]

PDF Web [BibTex]


no image
A Unifying Computational Framework for Optimization and Dynamic Systems Approaches to Motor Control

Mohajerian, P., Peters, J., Ijspeert, A., Schaal, S.

13th Annual Neural Control of Movement Meeting 2003, 13, pages: 1, April 2003 (poster)

[BibTex]

[BibTex]


no image
Tractable Inference for Probabilistic Data Models

Csato, L., Opper, M., Winther, O.

Complexity, 8(4):64-68, April 2003 (article)

Abstract
We present an approximation technique for probabilistic data models with a large number of hidden variables, based on ideas from statistical physics. We give examples for two nontrivial applications. © 2003 Wiley Periodicals, Inc.

PDF GZIP Web [BibTex]

PDF GZIP Web [BibTex]


no image
Feature selection and transduction for prediction of molecular bioactivity for drug design

Weston, J., Perez-Cruz, F., Bousquet, O., Chapelle, O., Elisseeff, A., Schölkopf, B.

Bioinformatics, 19(6):764-771, April 2003 (article)

Abstract
Motivation: In drug discovery a key task is to identify characteristics that separate active (binding) compounds from inactive (non-binding) ones. An automated prediction system can help reduce resources necessary to carry out this task. Results: Two methods for prediction of molecular bioactivity for drug design are introduced and shown to perform well in a data set previously studied as part of the KDD (Knowledge Discovery and Data Mining) Cup 2001. The data is characterized by very few positive examples, a very large number of features (describing three-dimensional properties of the molecules) and rather different distributions between training and test data. Two techniques are introduced specifically to tackle these problems: a feature selection method for unbalanced data and a classifier which adapts to the distribution of the the unlabeled test data (a so-called transductive method). We show both techniques improve identification performance and in conjunction provide an improvement over using only one of the techniques. Our results suggest the importance of taking into account the characteristics in this data which may also be relevant in other problems of a similar type.

Web [BibTex]


no image
Rademacher and Gaussian averages in Learning Theory

Bousquet, O.

Universite de Marne-la-Vallee, March 2003 (talk)

PDF [BibTex]

PDF [BibTex]


no image
Use of the Zero-Norm with Linear Models and Kernel Methods

Weston, J., Elisseeff, A., Schölkopf, B., Tipping, M.

Journal of Machine Learning Research, 3, pages: 1439-1461, March 2003 (article)

Abstract
We explore the use of the so-called zero-norm of the parameters of linear models in learning. Minimization of such a quantity has many uses in a machine learning context: for variable or feature selection, minimizing training error and ensuring sparsity in solutions. We derive a simple but practical method for achieving these goals and discuss its relationship to existing techniques of minimizing the zero-norm. The method boils down to implementing a simple modification of vanilla SVM, namely via an iterative multiplicative rescaling of the training data. Applications we investigate which aid our discussion include variable and feature selection on biological microarray data, and multicategory classification.

PDF PostScript PDF [BibTex]

PDF PostScript PDF [BibTex]


no image
Phase Information and the Recognition of Natural Images

Braun, D., Wichmann, F., Gegenfurtner, K.

6, pages: 138, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), February 2003 (poster)

Abstract
Fourier phase plays an important role in determining image structure. For example, when the phase spectrum of an image showing a ower is swapped with the phase spectrum of an image showing a tank, then we will usually perceive a tank in the resulting image, even though the amplitude spectrum is still that of the ower. Also, when the phases of an image are randomly swapped across frequencies, the resulting image becomes impossible to recognize. Our goal was to evaluate the e ect of phase manipulations in a more quantitative manner. On each trial subjects viewed two images of natural scenes. The subject had to indicate which one of the two images contained an animal. The spectra of the images were manipulated by adding random phase noise at each frequency. The phase noise was uniformly distributed in the interval [;+], where  was varied between 0 degree and 180 degrees. Image pairs were displayed for 100 msec. Subjects were remarkably resistant to the addition of phase noise. Even with [120; 120] degree noise, subjects still were at a level of 75% correct. The introduction of phase noise leads to a reduction of image contrast. Subjects were slightly better than a simple prediction based on this contrast reduction. However, when contrast response functions were measured in the same experimental paradigm, we found that performance in the phase noise experiment was signi cantly lower than that predicted by the corresponding contrast reduction.

Web [BibTex]

Web [BibTex]


no image
Introduction: Robots with Cognition?

Franz, MO.

6, pages: 38, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), February 2003 (talk)

Abstract
Using robots as models of cognitive behaviour has a long tradition in robotics. Parallel to the historical development in cognitive science, one observes two major, subsequent waves in cognitive robotics. The first is based on ideas of classical, cognitivist Artificial Intelligence (AI). According to the AI view of cognition as rule-based symbol manipulation, these robots typically try to extract symbolic descriptions of the environment from their sensors that are used to update a common, global world representation from which, in turn, the next action of the robot is derived. The AI approach has been successful in strongly restricted and controlled environments requiring well-defined tasks, e.g. in industrial assembly lines. AI-based robots mostly failed, however, in the unpredictable and unstructured environments that have to be faced by mobile robots. This has provoked the second wave in cognitive robotics which tries to achieve cognitive behaviour as an emergent property from the interaction of simple, low-level modules. Robots of the second wave are called animats as their architecture is designed to closely model aspects of real animals. Using only simple reactive mechanisms and Hebbian-type or evolutionary learning, the resulting animats often outperformed the highly complex AI-based robots in tasks such as obstacle avoidance, corridor following etc. While successful in generating robust, insect-like behaviour, typical animats are limited to stereotyped, fixed stimulus-response associations. If one adopts the view that cognition requires a flexible, goal-dependent choice of behaviours and planning capabilities (H.A. Mallot, Kognitionswissenschaft, 1999, 40-48) then it appears that cognitive behaviour cannot emerge from a collection of purely reactive modules. It rather requires environmentally decoupled structures that work without directly engaging the actions that it is concerned with. This poses the current challenge to cognitive robotics: How can we build cognitive robots that show the robustness and the learning capabilities of animats without falling back into the representational paradigm of AI? The speakers of the symposium present their approaches to this question in the context of robot navigation and sensorimotor learning. In the first talk, Prof. Helge Ritter introduces a robot system for imitation learning capable of exploring various alternatives in simulation before actually performing a task. The second speaker, Angelo Arleo, develops a model of spatial memory in rat navigation based on his electrophysiological experiments. He validates the model on a mobile robot which, in some navigation tasks, shows a performance comparable to that of the real rat. A similar model of spatial memory is used to investigate the mechanisms of territory formation in a series of robot experiments presented by Prof. Hanspeter Mallot. In the last talk, we return to the domain of sensorimotor learning where Ralf M{\"o}ller introduces his approach to generate anticipatory behaviour by learning forward models of sensorimotor relationships.

Web [BibTex]

Web [BibTex]


no image
Expectation Maximization for Clustering on Hyperspheres

Banerjee, A., Dhillon, I., Ghosh, J., Sra, S.

Univ. of Texas at Austin, February 2003 (techreport)

GZIP [BibTex]

GZIP [BibTex]


no image
Constraints measures and reproduction of style in robot imitation learning

Bakir, GH., Ilg, W., Franz, MO., Giese, M.

6, pages: 70, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), February 2003 (poster)

Abstract
Imitation learning is frequently discussed as a method for generating complex behaviors in robots by imitating human actors. The kinematic and the dynamic properties of humans and robots are typically quite di erent, however. For this reason observed human trajectories cannot be directly transferred to robots, even if their geometry is humanoid. Instead the human trajectory must be approximated by trajectories that can be realized by the robot. During this approximation deviations from the human trajectory may arise that change the style of the executed movement. Alternatively, the style of the movement might be well reproduced, but the imitated trajectory might be suboptimal with respect to di erent constraint measures from robotics control, leading to non-robust behavior. Goal of the presented work is to quantify this trade-o between \imitation quality" and constraint compatibility for the imitation of complex writing movements. In our experiment, we used trajectory data from human writing movements (see the abstract of Ilg et al. in this volume). The human trajectories were mapped onto robot trajectories by minimizing an error measure that integrates constraints that are important for the imitation of movement style and a regularizing constraint that ensures smooth joint trajectories with low velocities. In a rst experiment, both the end-e ector position and the shoulder angle of the robot were optimized in order to achieve good imitation together with accurate control of the end-e ector position. In a second experiment only the end-e ector trajectory was imitated whereas the motion of the elbow joint was determined using the optimal inverse kinematic solution for the robot. For both conditions di erent constraint measures (dexterity and relative jointlimit distances) and a measure for imitation quality were assessed. By controling the weight of the regularization term we can vary continuously between robot behavior optimizing imitation quality, and behavior minimizing joint velocities.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Study of Human Classification using Psychophysics and Machine Learning

Graf, A., Wichmann, F., Bülthoff, H., Schölkopf, B.

6, pages: 149, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), Febuary 2003 (poster)

Abstract
We attempt to reach a better understanding of classi cation in humans using both psychophysical and machine learning techniques. In our psychophysical paradigm the stimuli presented to the human subjects are modi ed using machine learning algorithms according to their responses. Frontal views of human faces taken from a processed version of the MPI face database are employed for a gender classi cation task. The processing assures that all heads have same mean intensity, same pixel-surface area and are centered. This processing stage is followed by a smoothing of the database in order to eliminate, as much as possible, scanning artifacts. Principal Component Analysis is used to obtain a low-dimensional representation of the faces in the database. A subject is asked to classify the faces and experimental parameters such as class (i.e. female/male), con dence ratings and reaction times are recorded. A mean classi cation error of 14.5% is measured and, on average, 0.5 males are classi ed as females and 21.3females as males. The mean reaction time for the correctly classi ed faces is 1229 +- 252 [ms] whereas the incorrectly classi ed faces have a mean reaction time of 1769 +- 304 [ms] showing that the reaction times increase with the subject's classi- cation error. Reaction times are also shown to decrease with increasing con dence, both for the correct and incorrect classi cations. Classi cation errors, reaction times and con dence ratings are then correlated to concepts of machine learning such as separating hyperplane obtained when considering Support Vector Machines, Relevance Vector Machines, boosted Prototype and K-means Learners. Elements near the separating hyperplane are found to be classi ed with more errors than those away from it. In addition, the subject's con dence increases when moving away from the hyperplane. A preliminary analysis on the available small number of subjects indicates that K-means classi cation seems to re ect the subject's classi cation behavior best. The above learnersare then used to generate \special" elements, or representations, of the low-dimensional database according to the labels given by the subject. A memory experiment follows where the representations are shown together with faces seen or unseen during the classi cation experiment. This experiment aims to assess the representations by investigating whether some representations, or special elements, are classi ed as \seen before" despite that they never appeared in the classi cation experiment, possibly hinting at their use during human classi cation.

PDF Web [BibTex]

PDF Web [BibTex]


no image
A Representation of Complex Movement Sequences Based on Hierarchical Spatio-Temporal Correspondence for Imitation Learning in Robotics

Ilg, W., Bakir, GH., Franz, MO., Giese, M.

6, pages: 74, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), February 2003 (poster)

Abstract
Imitation learning of complex movements has become a popular topic in neuroscience, as well as in robotics. A number of conceptual as well as practical problems are still unsolved. One example is the determination of the aspects of movements which are relevant for imitation. Problems concerning the movement representation are twofold: (1) The movement characteristics of observed movements have to be transferred from the perceptual level to the level of generated actions. (2) Continuous spaces of movements with variable styles have to be approximated based on a limited number of learned example sequences. Therefore, one has to use representation with a high generalisation capability. We present methods for the representation of complex movement sequences that addresses these questions in the context of the imitation learning of writing movements using a robot arm with human-like geometry. For the transfer of complex movements from perception to action we exploit a learning-based method that represents complex action sequences by linear combination of prototypical examples (Ilg and Giese, BMCV 2002). The method of hierarchical spatio-temporal morphable models (HSTMM) decomposes action sequences automatically into movement primitives. These primitives are modeled by linear combinations of a small number of learned example trajectories. The learned spatio-temporal models are suitable for the analysis and synthesis of long action sequences, which consist of movement primitives with varying style parameters. The proposed method is illustrated by imitation learning of complex writing movements. Human trajectories were recorded using a commercial motion capture system (VICON). In the rst step the recorded writing sequences are decomposed into movement primitives. These movement primitives can be analyzed and changed in style by de ning linear combinations of prototypes with di erent linear weight combinations. Our system can imitate writing movements of di erent actors, synthesize new writing styles and can even exaggerate the writing movements of individual actors. Words and writing movements of the robot look very natural, and closely match the natural styles. These preliminary results makes the proposed method promising for further applications in learning-based robotics. In this poster we focus on the acquisition of the movement representation (identi cation and segmentation of movement primitives, generation of new writing styles by spatio-temporal morphing). The transfer of the generated writing movements to the robot considering the given kinematic and dynamic constraints is discussed in Bakir et al (this volume).

PDF Web [BibTex]

PDF Web [BibTex]


no image
Hierarchical Spatio-Temporal Morphable Models for Representation of complex movements for Imitation Learning

Ilg, W., Bakir, GH., Franz, MO., Giese, M.

In 11th International Conference on Advanced Robotics, (2):453-458, (Editors: Nunes, U., A. de Almeida, A. Bejczy, K. Kosuge and J.A.T. Machado), 11th International Conference on Advanced Robotics, January 2003 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Modeling Data using Directional Distributions

Dhillon, I., Sra, S.

Univ. of Texas at Austin, January 2003 (techreport)

GZIP [BibTex]

GZIP [BibTex]


no image
Hyperkernels

Ong, CS., Smola, AJ., Williamson, RC.

In pages: 495-502, 2003 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
An Introduction to Variable and Feature Selection.

Guyon, I., Elisseeff, A.

Journal of Machine Learning, 3, pages: 1157-1182, 2003 (article)

[BibTex]

[BibTex]


no image
Feature Selection for Support Vector Machines by Means of Genetic Algorithms

Fröhlich, H., Chapelle, O., Schölkopf, B.

In 15th IEEE International Conference on Tools with AI, pages: 142-148, 15th IEEE International Conference on Tools with AI, 2003 (inproceedings)

[BibTex]

[BibTex]


no image
Propagation of Uncertainty in Bayesian Kernel Models - Application to Multiple-Step Ahead Forecasting

Quiñonero-Candela, J., Girard, A., Larsen, J., Rasmussen, CE.

In IEEE International Conference on Acoustics, Speech and Signal Processing, 2, pages: 701-704, IEEE International Conference on Acoustics, Speech and Signal Processing, 2003 (inproceedings)

Abstract
The object of Bayesian modelling is the predictive distribution, which in a forecasting scenario enables improved estimates of forecasted values and their uncertainties. In this paper we focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaussian Process and the Relevance Vector Machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting. The capability of the method is demonstrated for forecasting of time-series and compared to approximate methods.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Unsupervised Clustering of Images using their Joint Segmentation

Seldin, Y., Starik, S., Werman, M.

In The 3rd International Workshop on Statistical and Computational Theories of Vision (SCTV 2003), pages: 1-24, 3rd International Workshop on Statistical and Computational Theories of Vision (SCTV), 2003 (inproceedings)

PDF Web [BibTex]

PDF Web [BibTex]


no image
A Note on Parameter Tuning for On-Line Shifting Algorithms

Bousquet, O.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2003 (techreport)

Abstract
In this short note, building on ideas of M. Herbster [2] we propose a method for automatically tuning the parameter of the FIXED-SHARE algorithm proposed by Herbster and Warmuth [3] in the context of on-line learning with shifting experts. We show that this can be done with a memory requirement of $O(nT)$ and that the additional loss incurred by the tuning is the same as the loss incurred for estimating the parameter of a Bernoulli random variable.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Dynamics of a rigid body in a Stokes fluid

Gonzalez, O., Graf, ABA., Maddocks, JH.

Journal of Fluid Mechanics, 2003 (article) Accepted

[BibTex]

[BibTex]


no image
A novel transient heater-foil technique for liquid crystal experiments on film cooled surfaces

Vogel, G., Graf, ABA., von Wolfersdorf, J., Weigand, B.

ASME Journal of Turbomachinery, 125, pages: 529-537, 2003 (article)

PDF [BibTex]

PDF [BibTex]


no image
Support Vector Machines

Schölkopf, B., Smola, A.

In Handbook of Brain Theory and Neural Networks (2nd edition), pages: 1119-1125, (Editors: MA Arbib), MIT Press, Cambridge, MA, USA, 2003 (inbook)

[BibTex]

[BibTex]


no image
Prediction at an Uncertain Input for Gaussian Processes and Relevance Vector Machines - Application to Multiple-Step Ahead Time-Series Forecasting

Quiñonero-Candela, J., Girard, A., Rasmussen, C.

(IMM-2003-18), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2003 (techreport)

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Large margin Methods in Label Sequence Learning

Altun, Y.

Brown University, Providence, RI, USA, 2003 (mastersthesis)

[BibTex]

[BibTex]


no image
Kernel Methods and Their Applications to Signal Processing

Bousquet, O., Perez-Cruz, F.

In Proceedings. (ICASSP ‘03), Special Session on Kernel Methods, pages: 860 , ICASSP, 2003 (inproceedings)

Abstract
Recently introduced in Machine Learning, the notion of kernels has drawn a lot of interest as it allows to obtain non-linear algorithms from linear ones in a simple and elegant manner. This, in conjunction with the introduction of new linear classification methods such as the Support Vector Machines has produced significant progress. The successes of such algorithms is now spreading as they are applied to more and more domains. Many Signal Processing problems, by their non-linear and high-dimensional nature may benefit from such techniques. We give an overview of kernel methods and their recent applications.

PDF PostScript [BibTex]

PDF PostScript [BibTex]