2001


Four-legged Walking Gait Control Using a Neuromorphic Chip Interfaced to a Support Vector Learning Algorithm

Still, S., Schölkopf, B., Hepp, K., Douglas, R.

In Advances in Neural Information Processing Systems 13, pages: 741-747, (Editors: TK Leen and TG Dietterich and V Tresp), MIT Press, Cambridge, MA, USA, 14th Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
To control the walking gaits of a four-legged robot we present a novel neuromorphic VLSI chip that coordinates the relative phasing of the robot's legs similar to how spinal Central Pattern Generators are believed to control vertebrate locomotion [3]. The chip controls the leg movements by driving motors with time varying voltages which are the outputs of a small network of coupled oscillators. The characteristics of the chip's output voltages depend on a set of input parameters. The relationship between input parameters and output voltages can be computed analytically for an idealized system. In practice, however, this ideal relationship is only approximately true due to transistor mismatch and offsets.

PDF Web [BibTex]

Algorithmic Stability and Generalization Performance

Bousquet, O., Elisseeff, A.

In Advances in Neural Information Processing Systems 13, pages: 196-202, (Editors: Leen, T.K., T.G. Dietterich, V. Tresp), MIT Press, Cambridge, MA, USA, Fourteenth Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
We present a novel way of obtaining PAC-style bounds on the generalization error of learning algorithms, explicitly using their stability properties. A stable learner is one for which the learned solution does not change much for small changes in the training set. The bounds we obtain do not depend on any measure of the complexity of the hypothesis space (e.g. VC dimension) but rather depend on how the learning algorithm searches this space, and can thus be applied even when the VC dimension is infinite. We demonstrate that regularization networks possess the required stability property and apply our method to obtain new bounds on their generalization performance.

PDF Web [BibTex]

The Kernel Trick for Distances

Schölkopf, B.

In Advances in Neural Information Processing Systems 13, pages: 301-307, (Editors: TK Leen and TG Dietterich and V Tresp), MIT Press, Cambridge, MA, USA, 14th Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
A method is described which, like the kernel trick in support vector machines (SVMs), lets us generalize distance-based algorithms to operate in feature spaces, usually nonlinearly related to the input space. This is done by identifying a class of kernels which can be represented as norm-based distances in Hilbert spaces. It turns out that the common kernel algorithms, such as SVMs and kernel PCA, are in fact distance-based algorithms and can be run with that class of kernels, too. As well as providing a useful new insight into how these algorithms work, the present work can form the basis for conceiving new algorithms.
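
As an illustration of the idea summarized above, the distance between two points in a kernel-induced feature space can be computed from kernel evaluations alone, using the standard identity ||Φ(x) − Φ(y)||² = k(x, x) − 2 k(x, y) + k(y, y) for a positive definite kernel k. The sketch below is not taken from the paper: the Gaussian kernel, the function names, and the `gamma` parameter are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2) (illustrative choice)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def feature_space_distance(x, y, kernel=gaussian_kernel):
    """Distance between the feature-space images of x and y, using kernel values only."""
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative values caused by rounding
```

Any distance-based procedure (for example a nearest-neighbour rule) could call `feature_space_distance` in place of the Euclidean distance; the explicit feature map Φ is never needed.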

PDF Web [BibTex]

Vicinal Risk Minimization

Chapelle, O., Weston, J., Bottou, L., Vapnik, V.

In Advances in Neural Information Processing Systems 13, pages: 416-422, (Editors: Leen, T.K., T.G. Dietterich, V. Tresp), MIT Press, Cambridge, MA, USA, Fourteenth Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
The Vicinal Risk Minimization principle establishes a bridge between generative models and methods derived from the Structural Risk Minimization Principle such as Support Vector Machines or Statistical Regularization. We explain how VRM provides a framework which integrates a number of existing algorithms, such as Parzen windows, Support Vector Machines, Ridge Regression, Constrained Logistic Classifiers and Tangent-Prop. We then show how the approach implies new algorithms for solving problems usually associated with generative models. New algorithms are described for dealing with pattern recognition problems with very different pattern distributions and dealing with unlabeled data. Preliminary empirical results are presented.

PDF Web [BibTex]

Feature Selection for SVMs

Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., Vapnik, V.

In Advances in Neural Information Processing Systems 13, pages: 668-674, (Editors: Leen, T.K., T.G. Dietterich, V. Tresp), MIT Press, Cambridge, MA, USA, Fourteenth Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.

PDF Web [BibTex]

Occam’s Razor

Rasmussen, CE., Ghahramani, Z.

In Advances in Neural Information Processing Systems 13, pages: 294-300, (Editors: Leen, T.K., T.G. Dietterich, V. Tresp), MIT Press, Cambridge, MA, USA, Fourteenth Annual Neural Information Processing Systems Conference (NIPS), April 2001 (inproceedings)

Abstract
The Bayesian paradigm apparently only sometimes gives rise to Occam's Razor; at other times very large models perform well. We give simple examples of both kinds of behaviour. The two views are reconciled when measuring complexity of functions, rather than of the machinery used to implement them. We analyze the complexity of functions for some linear-in-the-parameter models that are equivalent to Gaussian Processes, and always find Occam's Razor at work.

PDF Web [BibTex]

Pattern Selection Using the Bias and Variance of Ensemble

Shin, H., Cho, S.

Journal of the Korean Institute of Industrial Engineers, 28(1):112-127, March 2001 (article)

Abstract
A useful pattern is a pattern that contributes much to learning. For a classification problem those patterns near the class boundary surfaces carry more information to the classifier. For a regression problem the ones near the estimated surface carry more information. In both cases, the usefulness is defined only for those patterns either without error or with negligible error. Using only the useful patterns gives several benefits. First, computational complexity in memory and time for learning is decreased. Second, overfitting is avoided even when the learner is over-sized. Third, learning results in more stable learners. In this paper, we propose a pattern “utility index” that measures the utility of an individual pattern. The utility index is based on the bias and variance of a pattern trained by a network ensemble. In classification, the pattern with a low bias and a high variance gets a high score. In regression, on the other hand, the one with a low bias and a low variance gets a high score. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group. Only the high-score group is then used for training. The proposed method is tested on synthetic and real-world benchmark datasets. The proposed approach gives a better or at least similar performance.

[BibTex]

Structure and Functionality of a Designed p53 Dimer.

Davison, TS., Nie, X., Ma, W., Lin, Y., Kay, C., Benchimol, S., Arrowsmith, C.

Journal of Molecular Biology, 307(2):605-617, March 2001 (article)

Abstract
P53 is a homotetrameric tumor suppressor protein involved in transcriptional control of genes that regulate cell proliferation and death. In order to probe the role that oligomerization plays in this capacity, we have previously designed and characterized a series of p53 proteins with altered oligomeric states through hydrophilic substitution of residues Met340 or Leu344 in the normally tetrameric oligomerization domain. Although such mutations have little effect on the overall secondary structural content of the oligomerization domain, both solubility and the resistance to thermal denaturation are substantially reduced relative to that of the wild-type domain. Here, we report the design and characterization of a double-mutant p53 with alterations of residues at positions Met340 and Leu344. The double-mutations Met340Glu/Leu344Lys and Met340Gln/Leu344Arg resulted in distinct dimeric forms of the protein. Furthermore, we have verified by NMR structure determination that the double-mutant Met340Gln/Leu344Arg is essentially a "half-tetramer". Analysis of the in vivo activities of full-length p53 oligomeric mutants reveals that while cell-cycle arrest requires tetrameric p53, transcriptional transactivation activity of monomers and dimers retain roughly background and half of the wild-type activity, respectively.

Web [BibTex]

An Introduction to Kernel-Based Learning Algorithms

Müller, K., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.

IEEE Transactions on Neural Networks, 12(2):181-201, March 2001 (article)

Abstract
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples of successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel-based learning in supervised and unsupervised scenarios including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

DOI [BibTex]

Estimating the support of a high-dimensional distribution.

Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A., Williamson, R.

Neural Computation, 13(7):1443-1471, March 2001 (article)

Abstract
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.

Web DOI [BibTex]

An Improved Training Algorithm for Kernel Fisher Discriminants

Mika, S., Schölkopf, B., Smola, A.

In Proceedings AISTATS, pages: 98-104, (Editors: T Jaakkola and T Richardson), Morgan Kaufmann, San Francisco, CA, Artificial Intelligence and Statistics (AISTATS), January 2001 (inproceedings)

Web [BibTex]

The psychometric function: II. Bootstrap-based confidence intervals and sampling

Wichmann, F., Hill, N.

Perception and Psychophysics, 63 (8), pages: 1314-1329, 2001 (article)

PDF [BibTex]

Nonstationary Signal Classification using Support Vector Machines

Gretton, A., Davy, M., Doucet, A., Rayner, P.

In 11th IEEE Workshop on Statistical Signal Processing, pages: 305-305, 11th IEEE Workshop on Statistical Signal Processing, 2001 (inproceedings)

Abstract
In this paper, we demonstrate the use of support vector (SV) techniques for the binary classification of nonstationary sinusoidal signals with quadratic phase. We briefly describe the theory underpinning SV classification, and introduce the Cohen's group time-frequency representation, which is used to process the non-stationary signals so as to define the classifier input space. We show that the SV classifier outperforms alternative classification methods on this processed data.

PostScript [BibTex]

Enhanced User Authentication through Typing Biometrics with Artificial Neural Networks and K-Nearest Neighbor Algorithm

Wong, FWMH., Supian, ASM., Ismail, AF., Lai, WK., Ong, CS.

In 2001 (inproceedings)

[BibTex]

Predicting the Nonlinear Dynamics of Biological Neurons using Support Vector Machines with Different Kernels

Frontzek, T., Lal, TN., Eckmiller, R.

In Proceedings of the International Joint Conference on Neural Networks (IJCNN'2001) Washington DC, 2, pages: 1492-1497, Proceedings of the International Joint Conference on Neural Networks (IJCNN'2001) Washington DC, 2001 (inproceedings)

Abstract
Based on biological data we examine the ability of Support Vector Machines (SVMs) with Gaussian, polynomial and tanh kernels to learn and predict the nonlinear dynamics of single biological neurons. We show that SVMs for regression learn the dynamics of the pyloric dilator neuron of the Australian crayfish, and we determine the optimal SVM parameters with regard to the test error. Compared to conventional RBF networks and MLPs, SVMs with Gaussian kernels learned faster and performed a better iterated one-step-ahead prediction with regard to training and test error. From a biological point of view SVMs are notably better at predicting the most important part of the dynamics, where the membrane potential is driven by superimposed synaptic inputs to the threshold for the oscillatory peak.

PDF [BibTex]

Computationally Efficient Face Detection

Romdhani, S., Torr, P., Schölkopf, B., Blake, A.

In Computer Vision, ICCV 2001, vol. 2, (73):695-700, IEEE, 8th International Conference on Computer Vision, 2001 (inproceedings)

DOI [BibTex]

Design and Verification of Supervisory Controller of High-Speed Train

Yoo, SP., Lee, DY., Son, HI.

In IEEE International Symposium on Industrial Electronics, pages: 1290-1295, IEEE Operations Center, Piscataway, NJ, USA, IEEE International Symposium on Industrial Electronics (ISIE), 2001 (inproceedings)

Abstract
A high-level supervisory controller is required to monitor, control, and diagnose the low-level controllers of the high-speed train. The supervisory controller controls the low-level controllers by monitoring input and output signals and events, and the high-speed train can be modeled as a discrete event system (DES). The high-speed train is modeled with automata, and the high-level control specification is defined. The supervisory controller is designed using the high-speed train model and the control specification. The designed supervisory controller is verified and evaluated in simulation using a computer-aided software engineering (CASE) tool, Object GEODE.

Web DOI [BibTex]

Cerebellar Control of Robot Arms

Peters, J.

Biologische Kybernetik, Technische Universität München, München, Germany, 2001 (diplomathesis)

[BibTex]

The psychometric function: I. Fitting, sampling and goodness-of-fit

Wichmann, F., Hill, N.

Perception and Psychophysics, 63 (8), pages: 1293-1313, 2001 (article)

Abstract
The psychometric function relates an observer's performance to an independent variable, usually some physical quantity of a stimulus in a psychophysical task. This paper, together with its companion paper (Wichmann & Hill, 2001), describes an integrated approach to (1) fitting psychometric functions, (2) assessing the goodness of fit, and (3) providing confidence intervals for the function's parameters and other estimates derived from them, for the purposes of hypothesis testing. The present paper deals with the first two topics, describing a constrained maximum-likelihood method of parameter estimation and developing several goodness-of-fit tests. Using Monte Carlo simulations, we deal with two specific difficulties that arise when fitting functions to psychophysical data. First, we note that human observers are prone to stimulus-independent errors (or lapses). We show that failure to account for this can lead to serious biases in estimates of the psychometric function's parameters and illustrate how the problem may be overcome. Second, we note that psychophysical data sets are usually rather small by the standards required by most of the commonly applied statistical tests. We demonstrate the potential errors of applying traditional $\chi^2$ methods to psychophysical data and advocate use of Monte Carlo resampling techniques that do not rely on asymptotic theory. We have made available the software to implement our methods.
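
To make the role of lapses concrete, a psychometric function with a lapse rate is commonly written as ψ(x) = γ + (1 − γ − λ) F(x), where γ is the guess rate, λ the lapse rate, and F a sigmoid. The sketch below assumes a logistic F and a binomial log-likelihood without the constant binomial coefficient. The parameter names, the logistic choice, and the clamping constant `eps` are assumptions for illustration; this is not the authors' software.

```python
import math

def psi(x, alpha, beta, gamma=0.5, lam=0.0):
    """Psychometric function with guess rate gamma and lapse rate lam.

    Assumes a logistic sigmoid F with location alpha and scale beta.
    """
    f = 1.0 / (1.0 + math.exp(-(x - alpha) / beta))
    return gamma + (1.0 - gamma - lam) * f

def log_likelihood(data, alpha, beta, gamma=0.5, lam=0.0, eps=1e-12):
    """Binomial log-likelihood (up to a constant) of (x, n_correct, n_trials) blocks."""
    total = 0.0
    for x, k, n in data:
        # Clamp to avoid log(0) when the predicted probability saturates.
        p = min(max(psi(x, alpha, beta, gamma, lam), eps), 1.0 - eps)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total
```

With a block such as `(10.0, 49, 50)`, where one error occurs at a high stimulus level, fixing `lam=0.0` makes that single error very costly for the same `alpha` and `beta`, whereas a small positive `lam` absorbs it. This is one way to see why ignoring lapses can bias the fitted parameters.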

PDF [BibTex]

On Unsupervised Learning of Mixtures of Markov Sources

Seldin, Y.

Biologische Kybernetik, The Hebrew University of Jerusalem, Israel, 2001 (diplomathesis)

PDF [BibTex]

Towards Learning Path Planning for Solving Complex Robot Tasks

Frontzek, T., Lal, TN., Eckmiller, R.

In Proceedings of the International Conference on Artificial Neural Networks (ICANN'2001) Vienna, pages: 943-950, Proceedings of the International Conference on Artificial Neural Networks (ICANN'2001) Vienna, 2001 (inproceedings)

Abstract
For solving complex robot tasks it is necessary to incorporate path planning methods that are able to operate within different high-dimensional configuration spaces containing an unknown number of obstacles. Based on the Advanced A*-algorithm (AA*), which uses expansion matrices instead of a simple expansion logic, we propose a further improvement of AA* that enables it to learn directly from sample planning tasks. This is done by inserting weights into the expansion matrix which are modified according to a special learning rule. For an exemplary planning task we show that Adaptive AA* learns movement vectors which allow larger movements than the initial ones into well-defined directions of the configuration space. Compared to standard approaches planning times are clearly reduced.

PDF [BibTex]

Learning to predict the leave-one-out error of kernel based classifiers

Tsuda, K., Rätsch, G., Mika, S., Müller, K.

In International Conference on Artificial Neural Networks, ICANN'01, (LNCS 2130):331-338, (Editors: G. Dorffner, H. Bischof and K. Hornik), International Conference on Artificial Neural Networks, ICANN'01, 2001 (inproceedings)

PDF [BibTex]

A kernel approach for vector quantization with guaranteed distortion bounds

Tipping, M., Schölkopf, B.

In Artificial Intelligence and Statistics, pages: 129-134, (Editors: T Jaakkola and T Richardson), Morgan Kaufmann, San Francisco, CA, USA, 8th International Conference on Artificial Intelligence and Statistics (AI and STATISTICS), 2001 (inproceedings)

[BibTex]

Incorporating Invariances in Non-Linear Support Vector Machines

Chapelle, O., Schölkopf, B.

Max Planck Institute for Biological Cybernetics / Biowulf Technologies, 2001 (techreport)

Abstract
We consider the problem of how to incorporate in the Support Vector Machine (SVM) framework invariances given by some a priori known transformations under which the data should be invariant. It extends some previous work which was only applicable with linear SVMs and we show on a digit recognition task that the proposed approach is superior to the traditional Virtual Support Vector method.

PostScript [BibTex]

Unsupervised Segmentation and Classification of Mixtures of Markovian Sources

Seldin, Y., Bejerano, G., Tishby, N.

In The 33rd Symposium on the Interface of Computing Science and Statistics (Interface 2001 - Frontiers in Data Mining and Bioinformatics), pages: 1-15, 33rd Symposium on the Interface of Computing Science and Statistics (Interface - Frontiers in Data Mining and Bioinformatics), 2001 (inproceedings)

Abstract
We describe a novel algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources, first presented in [SBT01]. The algorithm is based on competitive learning between Markov models, when implemented as Prediction Suffix Trees [RST96] using the MDL principle. By applying a model clustering procedure, based on rate distortion theory combined with deterministic annealing, we obtain a hierarchical segmentation of sequences between alternating Markov sources. The method is applied successfully to unsupervised segmentation of multilingual texts into languages where it is able to infer correctly both the number of languages and the language switching points. When applied to protein sequence families (results of the [BSMT01] work), we demonstrate the method's ability to identify biologically meaningful sub-sequences within the proteins, which correspond to signatures of important functional sub-units called domains. Our approach to protein classification (through the obtained signatures) is shown to have both conceptual and practical advantages over the currently used methods.

PDF Web [BibTex]

Tracking a Small Set of Experts by Mixing Past Posteriors

Bousquet, O., Warmuth, M.

In Proceedings of the 14th Annual Conference on Computational Learning Theory, Lecture Notes in Computer Science, 2111, pages: 31-47, Proceedings of the 14th Annual Conference on Computational Learning Theory, Lecture Notes in Computer Science, 2001 (inproceedings)

Abstract
In this paper, we examine on-line learning problems in which the target concept is allowed to change over time. In each trial a master algorithm receives predictions from a large set of $n$ experts. Its goal is to predict almost as well as the best sequence of such experts chosen off-line by partitioning the training sequence into $k+1$ sections and then choosing the best expert for each section. We build on methods developed by Herbster and Warmuth and consider an open problem posed by Freund where the experts in the best partition are from a small pool of size $m$. Since $k \gg m$ the best expert shifts back and forth between the experts of the small pool. We propose algorithms that solve this open problem by mixing the past posteriors maintained by the master algorithm. We relate the number of bits needed for encoding the best partition to the loss bounds of the algorithms. Instead of paying $\log n$ for choosing the best expert in each section we first pay $\log {n\choose m}$ bits in the bounds for identifying the pool of $m$ experts and then $\log m$ bits per new section. In the bounds we also pay twice for encoding the boundaries of the sections.

PDF PostScript [BibTex]

Learning and Prediction of the Nonlinear Dynamics of Biological Neurons with Support Vector Machines

Frontzek, T., Lal, TN., Eckmiller, R.

In Proceedings of the International Conference on Artificial Neural Networks (ICANN'2001), pages: 390-398, Proceedings of the International Conference on Artificial Neural Networks (ICANN'2001), 2001 (inproceedings)

Abstract
Based on biological data we examine the ability of Support Vector Machines (SVMs) with Gaussian kernels to learn and predict the nonlinear dynamics of single biological neurons. We show that SVMs for regression learn the dynamics of the pyloric dilator neuron of the Australian crayfish, and we determine the optimal SVM parameters with regard to the test error. Compared to conventional RBF networks, SVMs learned faster and performed a better iterated one-step-ahead prediction with regard to training and test error. From a biological point of view SVMs are notably better at predicting the most important part of the dynamics, where the membrane potential is driven by superimposed synaptic inputs to the threshold for the oscillatory peak.

PDF [BibTex]

Estimating a Kernel Fisher Discriminant in the Presence of Label Noise

Lawrence, N., Schölkopf, B.

In 18th International Conference on Machine Learning, pages: 306-313, (Editors: CE Brodley and A Pohoreckyj Danyluk), Morgan Kaufmann, San Francisco, CA, USA, 18th International Conference on Machine Learning (ICML), 2001 (inproceedings)

Web [BibTex]

A Generalized Representer Theorem

Schölkopf, B., Herbrich, R., Smola, A.

In Lecture Notes in Computer Science, Vol. 2111, (2111):416-426, LNCS, (Editors: D Helmbold and R Williamson), Springer, Berlin, Germany, Annual Conference on Computational Learning Theory (COLT/EuroCOLT), 2001 (inproceedings)

[BibTex]

Bound on the Leave-One-Out Error for Density Support Estimation using nu-SVMs

Gretton, A., Herbrich, R., Schölkopf, B., Smola, A., Rayner, P.

University of Cambridge, 2001 (techreport)

[BibTex]

Markovian domain fingerprinting: statistical segmentation of protein sequences

Bejerano, G., Seldin, Y., Margalit, H., Tishby, N.

Bioinformatics, 17(10):927-934, 2001 (article)

PDF Web [BibTex]

Unsupervised Sequence Segmentation by a Mixture of Switching Variable Memory Markov Sources

Seldin, Y., Bejerano, G., Tishby, N.

In Proceedings of the 18th International Conference on Machine Learning (ICML 2001), pages: 513-520, 18th International Conference on Machine Learning (ICML), 2001 (inproceedings)

Abstract
We present a novel information-theoretic algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources. The algorithm is based on competitive learning between Markov models, when implemented as Prediction Suffix Trees (Ron et al., 1996) using the MDL principle. By applying a model clustering procedure, based on rate distortion theory combined with deterministic annealing, we obtain a hierarchical segmentation of sequences between alternating Markov sources. The algorithm seems to be self-regulated and automatically avoids over-segmentation. The method is applied successfully to unsupervised segmentation of multilingual texts into languages where it is able to infer correctly both the number of languages and the language switching points. When applied to protein sequence families, we demonstrate the method's ability to identify biologically meaningful sub-sequences within the proteins, which correspond to important functional sub-units called domains.

PDF [BibTex]

The control structure of artificial creatures

Zhou, D., Dai, R.

Artificial Life and Robotics, 5(3), 2001, invited article (article)

Web [BibTex]

Support Vector Regression for Black-Box System Identification

Gretton, A., Doucet, A., Herbrich, R., Rayner, P., Schölkopf, B.

In 11th IEEE Workshop on Statistical Signal Processing, pages: 341-344, IEEE Signal Processing Society, Piscataway, NJ, USA, 11th IEEE Workshop on Statistical Signal Processing, 2001 (inproceedings)

Abstract
In this paper, we demonstrate the use of support vector regression (SVR) techniques for black-box system identification. These methods derive from statistical learning theory, and are of great theoretical and practical interest. We briefly describe the theory underpinning SVR, and compare support vector methods with other approaches using radial basis networks. Finally, we apply SVR to modeling the behaviour of a hydraulic robot arm, and show that SVR improves on previously published results.

PostScript [BibTex]

Bound on the Leave-One-Out Error for 2-Class Classification using nu-SVMs

Gretton, A., Herbrich, R., Schölkopf, B., Rayner, P.

University of Cambridge, 2001, Updated May 2003 (literature review expanded) (techreport)

Abstract
Three estimates of the leave-one-out error for $\nu$-support vector (SV) machine binary classifiers are presented. Two of the estimates are based on the geometrical concept of the span, which was introduced in the context of bounding the leave-one-out error for $C$-SV machine binary classifiers, while the third is based on optimisation over the criterion used to train the $\nu$-support vector classifier. It is shown that the estimates presented herein provide informative and efficient approximations of the generalisation behaviour, in both a toy example and benchmark data sets. The proof strategies in the $\nu$-SV context are also compared with those used to derive leave-one-out error estimates in the $C$-SV case.

PostScript [BibTex]

Inference Principles and Model Selection

Buhmann, J., Schölkopf, B.

(01301), Dagstuhl Seminar, 2001 (techreport)

Web [BibTex]

Kernel Machine Based Learning for Multi-View Face Detection and Pose Estimation

Cheng, Y., Fu, Q., Gu, L., Li, S., Schölkopf, B., Zhang, H.

In Proceedings Computer Vision, 2001, Vol. 2, pages: 674-679, IEEE Computer Society, 8th International Conference on Computer Vision (ICCV), 2001 (inproceedings)

DOI [BibTex]

Some kernels for structured data

Bartlett, P., Schölkopf, B.

Biowulf Technologies, 2001 (techreport)

[BibTex]

Support Vector Machines: Theorie und Anwendung auf Prädiktion epileptischer Anfälle auf der Basis von EEG-Daten

Lal, TN.

Biologische Kybernetik, Institut für Angewandte Mathematik, Universität Bonn, 2001, Advised by Prof. Dr. S. Albeverio (diplomathesis)

ZIP [BibTex]

1997


Comparing support vector machines with Gaussian kernels to radial basis function classifiers

Schölkopf, B., Sung, K., Burges, C., Girosi, F., Niyogi, P., Poggio, T., Vapnik, V.

IEEE Transactions on Signal Processing, 45(11):2758-2765, November 1997 (article)

Abstract
The support vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights, and threshold that minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by k-means clustering, and the weights are computed using error backpropagation. We consider three machines, namely, a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the United States postal service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system. The SV approach is thus not only theoretically well-founded but also superior in a practical application.

Web DOI [BibTex]

The view-graph approach to visual navigation and spatial memory

Mallot, H., Franz, M., Schölkopf, B., Bülthoff, H.

In Artificial Neural Networks: ICANN ’97, pages: 751-756, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
This paper describes a purely visual navigation scheme based on two elementary mechanisms (piloting and guidance) and a graph structure combining individual navigation steps controlled by these mechanisms. In robot experiments in real environments, both mechanisms have been tested, piloting in an open environment and guidance in a maze with restricted movement opportunities. The results indicate that navigation and path planning can be brought about with these simple mechanisms. We argue that the graph of local views (snapshots) is a general and biologically plausible means of representing space and integrating the various mechanisms of map behaviour.

PDF PDF DOI [BibTex]

Predicting time series with support vector machines

Müller, K., Smola, A., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.

In Artificial Neural Networks: ICANN ’97, pages: 999-1004, (Editors: Schölkopf, B., C.J.C. Burges, A.J. Smola), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ε-insensitive loss and (ii) Huber's robust loss function, and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey-Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by 29%.
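
The two cost functions named in this abstract can be illustrated with a minimal sketch. The parameterization below (a tube half-width `eps` and a Huber threshold `delta`, with their default values) is an assumption chosen for illustration and is not taken from the paper, which may scale the Huber loss differently.

```python
def eps_insensitive_loss(residual, eps=0.1):
    """Zero inside the tube |residual| <= eps, linear outside it."""
    return max(0.0, abs(residual) - eps)


def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear for large ones.

    The two branches meet continuously at |residual| == delta.
    """
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


# Hypothetical prediction errors, used only to show the shape of each loss.
residuals = [-2.0, -0.05, 0.0, 0.5, 3.0]
eps_values = [eps_insensitive_loss(r) for r in residuals]
huber_values = [huber_loss(r) for r in residuals]
```

Small errors inside the tube cost nothing under the ε-insensitive loss, while both losses grow only linearly for large errors, which limits the influence of outliers.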

PDF DOI [BibTex]

Predicting time series with support vector machines

Müller, K., Smola, A., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.

In Artificial neural networks: ICANN ’97, pages: 999-1004, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ε-insensitive loss and (ii) Huber's robust loss function, and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey-Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by 29%.

PDF PDF DOI [BibTex]

Kernel principal component analysis

Schölkopf, B., Smola, A., Müller, K.

In Artificial neural networks: ICANN ’97, LNCS, vol. 1327, pages: 583-588, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance, the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
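
The idea in this abstract can be sketched in a few lines of plain Python: build a kernel (Gram) matrix, center it in feature space, and take its leading eigenvector. This is only an illustrative sketch, not the authors' implementation. The helper names, the sample points, the degree-2 polynomial kernel, and the use of power iteration instead of a full eigendecomposition are all assumptions made here for brevity.

```python
import math


def poly_kernel(x, y, degree=2):
    """Homogeneous polynomial kernel (x . y)^degree."""
    return sum(a * b for a, b in zip(x, y)) ** degree


def centered_gram(points, kernel):
    """Gram matrix of the points after centering them in feature space."""
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in K]  # equals column means: K is symmetric
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def leading_eigenpair(M, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix,
    found by power iteration."""
    n = len(M)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Mv))
    return eigenvalue, v


def first_kernel_component(points, kernel=poly_kernel):
    """Project the training points onto the first nonlinear principal component."""
    Kc = centered_gram(points, kernel)
    lam, v = leading_eigenpair(Kc)
    # Scale the coefficients so the component has unit length in feature space.
    alpha = [x / math.sqrt(lam) for x in v]
    n = len(points)
    projections = [sum(Kc[i][j] * alpha[j] for j in range(n)) for i in range(n)]
    return lam, projections


# Hypothetical 2-D sample points, used only for illustration.
points = [(0.0, 1.0), (1.0, 0.5), (2.0, 2.0), (3.0, 1.5), (4.0, 4.5)]
lam, projections = first_kernel_component(points)
```

The feature map is never computed explicitly; only kernel evaluations between pairs of points are needed, which is what makes the computation feasible in high-dimensional feature spaces.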

PDF DOI [BibTex]

Homing by parameterized scene matching

Franz, M., Schölkopf, B., Bülthoff, H.

In Proceedings of the 4th European Conference on Artificial Life, pages: 236-245, (Editors: P Husbands and I Harvey), MIT Press, Cambridge, MA, USA, 4th European Conference on Artificial Life (ECAL97), July 1997 (inproceedings)

Abstract
In visual homing tasks, animals as well as robots can compute their movements from the current view and a snapshot taken at a home position. Solving this problem exactly would require knowledge about the distances to visible landmarks, information that is not directly available to passive vision systems. We propose a homing scheme that dispenses with accurate distance information by using parameterized disparity fields. These are obtained from an approximation that incorporates prior knowledge about perspective distortions of the visual environment. A mathematical analysis proves that the approximation does not prevent the scheme from approaching the goal with arbitrary accuracy. Mobile robot experiments are used to demonstrate the practical feasibility of the approach.

PDF [BibTex]

Improving the accuracy and speed of support vector learning machines

Burges, C., Schölkopf, B.

In Advances in Neural Information Processing Systems 9, pages: 375-381, (Editors: M Mozer and MJ Jordan and T Petsche), MIT Press, Cambridge, MA, USA, Tenth Annual Conference on Neural Information Processing Systems (NIPS), May 1997 (inproceedings)

Abstract
Support Vector Learning Machines (SVM) are finding application in pattern recognition, regression estimation, and operator inversion for ill-posed problems. Against this very general backdrop, any methods for improving the generalization performance, or for improving the speed in test phase, of SVMs are of increasing interest. In this paper we combine two such techniques on a pattern recognition problem. The method for improving generalization performance (the "virtual support vector" method) does so by incorporating known invariances of the problem. This method achieves a drop in the error rate on 10,000 NIST test digit images from 1.4% to 1%. The method for improving the speed (the "reduced set" method) does so by approximating the support vector decision surface. We apply this method to achieve a factor of fifty speedup in test phase over the virtual support vector machine. The combined approach yields a machine which is both 22 times faster than the original machine, and which has better generalization performance, achieving 1.1% error. The virtual support vector method is applicable to any SVM problem with known invariances. The reduced set method is applicable to any support vector machine.

PDF Web [BibTex]

Homing by parameterized scene matching

Franz, M., Schölkopf, B., Bülthoff, H.

(46), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 1997 (techreport)

Abstract
In visual homing tasks, animals as well as robots can compute their movements from the current view and a snapshot taken at a home position. Solving this problem exactly would require knowledge about the distances to visible landmarks, information that is not directly available to passive vision systems. We propose a homing scheme that dispenses with accurate distance information by using parameterized disparity fields. These are obtained from an approximation that incorporates prior knowledge about perspective distortions of the visual environment. A mathematical analysis proves that the approximation does not prevent the scheme from approaching the goal with arbitrary accuracy. Mobile robot experiments are used to demonstrate the practical feasibility of the approach.

[BibTex]
