Header logo is ei


2018


no image
Die kybernetische Revolution

Schölkopf, B.

15-Mar-2018, Süddeutsche Zeitung, 2018 (misc)

link (url) [BibTex]

2018

link (url) [BibTex]


no image
Maschinelles Lernen: Entwicklung ohne Grenzen?

Schökopf, B.

In Mit Optimismus in die Zukunft schauen. Künstliche Intelligenz - Chancen und Rahmenbedingungen, pages: 26-34, (Editors: Bender, G. and Herbrich, R. and Siebenhaar, K.), B&S Siebenhaar Verlag, 2018 (incollection)

[BibTex]

[BibTex]


no image
Methods in Psychophysics

Wichmann, F. A., Jäkel, F.

In Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience, 5 (Methodology), 7, 4th, John Wiley & Sons, Inc., 2018 (inbook)

[BibTex]

[BibTex]


no image
Transfer Learning for BCIs

Jayaram, V., Fiebig, K., Peters, J., Grosse-Wentrup, M.

In Brain–Computer Interfaces Handbook, pages: 425-442, 22, (Editors: Chang S. Nam, Anton Nijholt and Fabien Lotte), CRC Press, 2018 (incollection)

Project Page [BibTex]

Project Page [BibTex]

2007


no image
Bayesian Estimators for Robins-Ritov’s Problem

Harmeling, S., Toussaint, M.

(EDI-INF-RR-1189), School of Informatics, University of Edinburgh, October 2007 (techreport)

Abstract
Bayesian or likelihood-based approaches to data analysis became very popular in the field of Machine Learning. However, there exist theoretical results which question the general applicability of such approaches; among those a result by Robins and Ritov which introduce a specific example for which they prove that a likelihood-based estimator will fail (i.e. it does for certain cases not converge to a true parameter estimate, even given infinite data). In this paper we consider various approaches to formulate likelihood-based estimators in this example, basically by considering various extensions of the presumed generative model of the data. We can derive estimators which are very similar to the classical Horvitz-Thompson and which also account for a priori knowledge of an observation probability function.

PDF [BibTex]

2007

PDF [BibTex]


no image
Support Vector Machine Learning for Interdependent and Structured Output Spaces

Altun, Y., Hofmann, T., Tsochantaridis, I.

In Predicting Structured Data, pages: 85-104, Advances in neural information processing systems, (Editors: Bakir, G. H. , T. Hofmann, B. Schölkopf, A. J. Smola, B. Taskar, S. V. N. Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Web [BibTex]

Web [BibTex]


no image
Brisk Kernel ICA

Jegelka, S., Gretton, A.

In Large Scale Kernel Machines, pages: 225-250, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
Recent approaches to independent component analysis have used kernel independence measures to obtain very good performance in ICA, particularly in areas where classical methods experience difficulty (for instance, sources with near-zero kurtosis). In this chapter, we compare two efficient extensions of these methods for large-scale problems: random subsampling of entries in the Gram matrices used in defining the independence measures, and incomplete Cholesky decomposition of these matrices. We derive closed-form, efficiently computable approximations for the gradients of these measures, and compare their performance on ICA using both artificial and music data. We show that kernel ICA can scale up to much larger problems than yet attempted, and that incomplete Cholesky decomposition performs better than random sampling.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Training a Support Vector Machine in the Primal

Chapelle, O.

In Large Scale Kernel Machines, pages: 29-50, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007, This is a slightly updated version of the Neural Computation paper (inbook)

Abstract
Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and that there is no reason to ignore this possibility. On the contrary, from the primal point of view new families of algorithms for large scale SVM training can be investigated.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Approximation Methods for Gaussian Process Regression

Quiñonero-Candela, J., Rasmussen, CE., Williams, CKI.

In Large-Scale Kernel Machines, pages: 203-223, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
A wealth of computationally efficient approximation methods for Gaussian process regression have been recently proposed. We give a unifying overview of sparse approximations, following Quiñonero-Candela and Rasmussen (2005), and a brief review of approximate matrix-vector multiplication methods.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning with Transformation Invariant Kernels

Walder, C., Chapelle, O.

(165), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2007 (techreport)

Abstract
Abstract. This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite (c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm for arbitrary c.p.d. kernels. For the thin-plate kernel this leads to a classifier with only one parameter (the amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even though the Gaussian involves a second parameter (the length scale).

PDF [BibTex]

PDF [BibTex]


no image
Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference

Schölkopf, B., Platt, J., Hofmann, T.

Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006), pages: 1690, MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (proceedings)

Abstract
The annual Neural Information Processing Systems (NIPS) conference is the flagship meeting on neural computation and machine learning. It draws a diverse group of attendees--physicists, neuroscientists, mathematicians, statisticians, and computer scientists--interested in theoretical and applied aspects of modeling, simulating, and building neural-like or intelligent systems. The presentations are interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, brain imaging, vision, speech and signal processing, reinforcement learning, and applications. Only twenty-five percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. This volume contains the papers presented at the December 2006 meeting, held in Vancouver.

Web [BibTex]

Web [BibTex]


no image
Trading Convexity for Scalability

Collobert, R., Sinz, F., Weston, J., Bottou, L.

In Large Scale Kernel Machines, pages: 275-300, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
Convex learning algorithms, such as Support Vector Machines (SVMs), are often seen as highly desirable because they offer strong practical properties and are amenable to theoretical analysis. However, in this work we show how nonconvexity can provide scalability advantages over convexity. We show how concave-convex programming can be applied to produce (i) faster SVMs where training errors are no longer support vectors, and (ii) much faster Transductive SVMs.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Scalable Semidefinite Programming using Convex Perturbations

Kulis, B., Sra, S., Jegelka, S.

(TR-07-47), University of Texas, Austin, TX, USA, September 2007 (techreport)

Abstract
Several important machine learning problems can be modeled and solved via semidefinite programs. Often, researchers invoke off-the-shelf software for the associated optimization, which can be inappropriate for many applications due to computational and storage requirements. In this paper, we introduce the use of convex perturbations for semidefinite programs (SDPs). Using a particular perturbation function, we arrive at an algorithm for SDPs that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB, b) it is a first-order method which makes it scalable, c) it can easily exploit the structure of a particular SDP to gain efficiency (e.g., when the constraint matrices are low-rank). We demonstrate on several machine learning applications that the proposed algorithm is effective in finding fast approximations to large-scale SDPs.

PDF [BibTex]

PDF [BibTex]


no image
Density Estimation of Structured Outputs in Reproducing Kernel Hilbert Spaces

Altun, Y., Smola, A.

In Predicting Structured Data, pages: 283-300, Advances in neural information processing systems, (Editors: BakIr, G. H., T. Hofmann, B. Schölkopf, A. J. Smola, B. Taskar, S. V.N. Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
In this paper we study the problem of estimating conditional probability distributions for structured output prediction tasks in Reproducing Kernel Hilbert Spaces. More specically, we prove decomposition results for undirected graphical models, give constructions for kernels, and show connections to Gaussian Process classi- cation. Finally we present ecient means of solving the optimization problem and apply this to label sequence learning. Experiments on named entity recognition and pitch accent prediction tasks demonstrate the competitiveness of our approach.

Web [BibTex]

Web [BibTex]


no image
Classifying Event-Related Desynchronization in EEG, ECoG and MEG signals

Hill, N., Lal, T., Tangermann, M., Hinterberger, T., Widman, G., Elger, C., Schölkopf, B., Birbaumer, N.

In Toward Brain-Computer Interfacing, pages: 235-260, Neural Information Processing, (Editors: G Dornhege and J del R Millán and T Hinterberger and DJ McFarland and K-R Müller), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Joint Kernel Maps

Weston, J., Bakir, G., Bousquet, O., Mann, T., Noble, W., Schölkopf, B.

In Predicting Structured Data, pages: 67-84, Advances in neural information processing systems, (Editors: GH Bakir and T Hofmann and B Schölkopf and AJ Smola and B Taskar and SVN Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Web [BibTex]

Web [BibTex]


no image
Brain-Computer Interfaces for Communication in Paralysis: A Clinical Experimental Approach

Hinterberger, T., Nijboer, F., Kübler, A., Matuz, T., Furdea, A., Mochty, U., Jordan, M., Lal, T., Hill, J., Mellinger, J., Bensch, M., Tangermann, M., Widman, G., Elger, C., Rosenstiel, W., Schölkopf, B., Birbaumer, N.

In Toward Brain-Computer Interfacing, pages: 43-64, Neural Information Processing, (Editors: G. Dornhege and J del R Millán and T Hinterberger and DJ McFarland and K-R Müller), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Sparse Multiscale Gaussian Process Regression

Walder, C., Kim, K., Schölkopf, B.

(162), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs fixed. We generalise this for the case of Gaussian covariance function, by basing our computations on m Gaussian basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis functions and any given criteria, this additional flexibility permits approximations no worse and typically better than was previously possible. Although we focus on g.p. regression, the central idea is applicable to all kernel based algorithms, such as the support vector machine. We perform gradient based optimisation of the marginal likelihood, which costs O(m2n) time where n is the number of data points, and compare the method to various other sparse g.p. methods. Our approach outperforms the other methods, particularly for the case of very few basis functions, i.e. a very high sparsity ratio.

PDF [BibTex]

PDF [BibTex]


no image
Efficient Subwindow Search for Object Localization

Blaschko, M., Hofmann, T., Lampert, C.

(164), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object location or extent within in the image. We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem. We show how our method is applicable to a variety object detection frameworks and demonstrate its performance by applying it to the popular bag of visual words model, achieving competitive results on the PASCAL VOC 2006 dataset.

PDF [BibTex]

PDF [BibTex]


no image
Pattern detection

Blake, A., Romdhani, S., Schölkopf, B., Torr, P. H. S.

United States Patent, No 7236626, June 2007 (patent)

[BibTex]

[BibTex]


no image
Cluster Identification in Nearest-Neighbor Graphs

Maier, M., Hein, M., von Luxburg, U.

(163), Max-Planck-Institute for Biological Cybernetics, Tübingen, Germany, May 2007 (techreport)

Abstract
Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are ``identified‘‘: that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict ``optimal‘‘ values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.

PDF [BibTex]

PDF [BibTex]


no image
Exploring model selection techniques for nonlinear dimensionality reduction

Harmeling, S.

(EDI-INF-RR-0960), School of Informatics, University of Edinburgh, March 2007 (techreport)

Abstract
Nonlinear dimensionality reduction (NLDR) methods have become useful tools for practitioners who are faced with the analysis of high-dimensional data. Of course, not all NLDR methods are equally applicable to a particular dataset at hand. Thus it would be useful to come up with model selection criteria that help to choose among different NLDR algorithms. This paper explores various approaches to this problem and evaluates them on controlled data sets. Comprehensive experiments will show that model selection scores based on stability are not useful, while scores based on Gaussian processes are helpful for the NLDR problem.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Probabilistic Structure Calculation

Rieping, W., Habeck, M., Nilges, M.

In Structure and Biophysics: New Technologies for Current Challenges in Biology and Beyond, pages: 81-98, NATO Security through Science Series, (Editors: Puglisi, J. D.), Springer, Berlin, Germany, March 2007 (inbook)

Web DOI [BibTex]

Web DOI [BibTex]


no image
Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models: a Variational Approach

Chiappa, S., Barber, D.

(161), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, March 2007 (techreport)

Abstract
We describe two related models to cluster multidimensional time-series under the assumption of an underlying linear Gaussian dynamical process. In the first model, times-series are assigned to the same cluster when they show global similarity in their dynamics, while in the second model times-series are assigned to the same cluster when they show simultaneous similarity. Both models are based on Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models in order to (semi) automatically determine an appropriate number of components in the mixture, and to additionally bias the components to a parsimonious parameterization. The resulting models are formally intractable and to deal with this we describe a deterministic approximation based on a novel implementation of Variational Bayes.

PDF [BibTex]

PDF [BibTex]


no image
Modeling data using directional distributions: Part II

Sra, S., Jain, P., Dhillon, I.

(TR-07-05), University of Texas, Austin, TX, USA, February 2007 (techreport)

Abstract
High-dimensional data is central to most data mining applications, and only recently has it been modeled via directional distributions. In [Banerjee et al., 2003] the authors introduced the use of the von Mises-Fisher (vMF) distribution for modeling high-dimensional directional data, particularly for text and gene expression analysis. The vMF distribution is one of the simplest directional distributions. TheWatson, Bingham, and Fisher-Bingham distributions provide distri- butions with an increasing number of parameters and thereby commensurately increased modeling power. This report provides a followup study to the initial development in [Banerjee et al., 2003] by presenting Expectation Maximization (EM) procedures for estimating parameters of a mixture of Watson (moW) distributions. The numerical challenges associated with parameter estimation for both of these distributions are significantly more difficult than for the vMF distribution. We develop new numerical approximations for estimating the parameters permitting us to model real- life data more accurately. Our experimental results establish that for certain data sets improved modeling power translates into better results.

PDF [BibTex]

PDF [BibTex]


no image
Automatic 3D Face Reconstruction from Single Images or Video

Breuer, P., Kim, K., Kienzle, W., Blanz, V., Schölkopf, B.

(160), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2007 (techreport)

Abstract
This paper presents a fully automated algorithm for reconstructing a textured 3D model of a face from a single photograph or a raw video stream. The algorithm is based on a combination of Support Vector Machines (SVMs) and a Morphable Model of 3D faces. After SVM face detection, individual facial features are detected using a novel regression-and classification-based approach, and probabilistically plausible configurations of features are selected to produce a list of candidates for several facial feature positions. In the next step, the configurations of feature points are evaluated using a novel criterion that is based on a Morphable Model and a combination of linear projections. Finally, the feature points initialize a model-fitting procedure of the Morphable Model. The result is a high-resolution 3D surface model.

PDF [BibTex]

PDF [BibTex]


no image
On the Pre-Image Problem in Kernel Methods

BakIr, G., Schölkopf, B., Weston, J.

In Kernel Methods in Bioengineering, Signal and Image Processing, pages: 284-302, (Editors: G Camps-Valls and JL Rojo-Álvarez and M Martínez-Ramón), Idea Group Publishing, Hershey, PA, USA, January 2007 (inbook)

Abstract
In this chapter we are concerned with the problem of reconstructing patterns from their representation in feature space, known as the pre-image problem. We review existing algorithms and propose a learning based approach. All algorithms are discussed regarding their usability and complexity and evaluated on an image denoising application.

DOI [BibTex]

DOI [BibTex]


no image
Mathematik der Wahrnehmung: Wendepunkte

Wichman, F., Ernst, MO.

Akademische Mitteilungen zw{\"o}lf: F{\"u}nf Sinne, pages: 32-37, 2007 (misc)

[BibTex]

[BibTex]


no image
Some comments on ν-SVM

Dinuzzo, F., De Nicolao, G.

In A tribute to Antonio Lepschy, pages: -, (Editors: Picci, G. , M. E. Valcher), Edizione Libreria Progetto, Padova, Italy, 2007 (inbook)

[BibTex]

[BibTex]


no image
Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

PDF link (url) [BibTex]

PDF link (url) [BibTex]

2004


no image
Fast Binary and Multi-Output Reduced Set Selection

Weston, J., Bakir, G.

(132), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, November 2004 (techreport)

Abstract
We propose fast algorithms for reducing the number of kernel evaluations in the testing phase for methods such as Support Vector Machines (SVM) and Ridge Regression (RR). For non-sparse methods such as RR this results in significantly improved prediction time. For binary SVMs, which are already sparse in their expansion, the pay off is mainly in the cases of noisy or large-scale problems. However, we then further develop our method for multi-class problems where, after choosing the expansion to find vectors which describe all the hyperplanes jointly, we again achieve significant gains.

PostScript [BibTex]

2004

PostScript [BibTex]


no image
Joint Kernel Maps

Weston, J., Schölkopf, B., Bousquet, O., Mann, .., Noble, W.

(131), Max-Planck-Institute for Biological Cybernetics, Tübingen, November 2004 (techreport)

PDF [BibTex]

PDF [BibTex]


no image
Pattern detection methods and systems and face detection methods and systems

Blake, A., Romdhani, S., Schölkopf, B., Torr, P. H. S.

United States Patent, No 6804391, October 2004 (patent)

[BibTex]

[BibTex]


no image
Advanced Lectures on Machine Learning

Bousquet, O., von Luxburg, U., Rätsch, G.

ML Summer Schools 2003, LNAI 3176, pages: 240, Springer, Berlin, Germany, ML Summer Schools, September 2004 (proceedings)

Abstract
Machine Learning has become a key enabling technology for many engineering applications, investigating scientific questions and theoretical problems alike. To stimulate discussions and to disseminate new results, a summer school series was started in February 2002, the documentation of which is published as LNAI 2600. This book presents revised lectures of two subsequent summer schools held in 2003 in Canberra, Australia, and in T{\"u}bingen, Germany. The tutorial lectures included are devoted to statistical learning theory, unsupervised learning, Bayesian inference, and applications in pattern recognition; they provide in-depth overviews of exciting new developments and contain a large number of references. Graduate students, lecturers, researchers and professionals alike will find this book a useful resource in learning and teaching machine learning.

Web [BibTex]

Web [BibTex]


no image
Pattern Recognition: 26th DAGM Symposium, LNCS, Vol. 3175

Rasmussen, C., Bülthoff, H., Giese, M., Schölkopf, B.

Proceedings of the 26th Pattern Recognition Symposium (DAGM‘04), pages: 581, Springer, Berlin, Germany, 26th Pattern Recognition Symposium, August 2004 (proceedings)

Web DOI [BibTex]

Web DOI [BibTex]


no image
Semi-Supervised Induction

Yu, K., Tresp, V., Zhou, D.

(141), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, August 2004 (techreport)

Abstract
Considerable progress was recently achieved on semi-supervised learning, which differs from the traditional supervised learning by additionally exploring the information of the unlabelled examples. However, a disadvantage of many existing methods is that it does not generalize to unseen inputs. This paper investigates learning methods that effectively make use of both labelled and unlabelled data to build predictive functions, which are defined on not just the seen inputs but the whole space. As a nice property, the proposed method allows effcient training and can easily handle new test points. We validate the method based on both toy data and real world data sets.

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
On Hausdorff Distance Measures

Shapiro, MD., Blaschko, MB.

Department of Computer Science, University of Massachusetts Amherst, August 2004 (techreport)

[BibTex]

[BibTex]


no image
Object categorization with SVM: kernels for local features

Eichhorn, J., Chapelle, O.

(137), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
In this paper, we propose to combine an efficient image representation based on local descriptors with a Support Vector Machine classifier in order to perform object categorization. For this purpose, we apply kernels defined on sets of vectors. After testing different combinations of kernel / local descriptors, we have been able to identify a very performant one.

PDF [BibTex]

PDF [BibTex]


no image
Hilbertian Metrics and Positive Definite Kernels on Probability Measures

Hein, M., Bousquet, O.

(126), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
We investigate the problem of defining Hilbertian metrics resp. positive definite kernels on probability measures, continuing previous work. This type of kernels has shown very good results in text classification and has a wide range of possible applications. In this paper we extend the two-parameter family of Hilbertian metrics of Topsoe such that it now includes all commonly used Hilbertian metrics on probability measures. This allows us to do model selection among these metrics in an elegant and unified way. Second we investigate further our approach to incorporate similarity information of the probability space into the kernel. The analysis provides a better understanding of these kernels and gives in some cases a more efficient way to compute them. Finally we compare all proposed kernels in two text and one image classification problem.

PDF [BibTex]

PDF [BibTex]


no image
Kernels, Associated Structures and Generalizations

Hein, M., Bousquet, O.

(127), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
This paper gives a survey of results in the mathematical literature on positive definite kernels and their associated structures. We concentrate on properties which seem potentially relevant for Machine Learning and try to clarify some results that have been misused in the literature. Moreover we consider different lines of generalizations of positive definite kernels. Namely we deal with operator-valued kernels and present the general framework of Hilbertian subspaces of Schwartz which we use to introduce kernels which are distributions. Finally indefinite kernels and their associated reproducing kernel spaces are considered.

PDF [BibTex]

PDF [BibTex]


no image
Analysis of differential gene expression in healthy and osteoarthritic cartilage and isolated chondrocytes by microarray analysis

Aigner, T., Saas, J., Zien, A., Zimmer, R., Gebhard, P., Knorr, T.

In Volume 1: Cellular and Molecular Tools, pages: 109-128, (Editors: Sabatini, M., P. Pastoureau and F. De Ceuninck), Humana Press, July 2004 (inbook)

Abstract
The regulation of chondrocytes in osteoarthritic cartilage and the expression of specific gene products by these cells during early-onset and late-stage osteoarthritis are not well characterized. With the introduction of cDNA array technology, the measurement of thousands of different genes in one small tissue sample can be carried out. Interpretation of gene expression analyses in articular cartilage is aided by the fact that this tissue contains only one cell type in both normal and diseased conditions. However, care has to be taken not to over- and misinterpret results, and some major challenges must be overcome in order to utilize the potential of this technology properly in the field of osteoarthritis.

Web [BibTex]

Web [BibTex]


no image
Triangle Fixing Algorithms for the Metric Nearness Problem

Dhillon, I., Sra, S., Tropp, J.

Univ. of Texas at Austin, June 2004 (techreport)

PDF [BibTex]

PDF [BibTex]


no image
Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference

Thrun, S., Saul, L., Schölkopf, B.

Proceedings of the Seventeenth Annual Conference on Neural Information Processing Systems (NIPS 2003), pages: 1621, MIT Press, Cambridge, MA, USA, 17th Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (proceedings)

Abstract
The annual Neural Information Processing (NIPS) conference is the flagship meeting on neural computation. It draws a diverse group of attendees—physicists, neuroscientists, mathematicians, statisticians, and computer scientists. The presentations are interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, brain imaging, vision, speech and signal processing, reinforcement learning and control, emerging technologies, and applications. Only thirty percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. This volume contains all the papers presented at the 2003 conference.

Web [BibTex]

Web [BibTex]


no image
Distributed Command Execution

Stark, S., Berlin, M.

In BSD Hacks: 100 industrial-strength tips & tools, pages: 152-152, (Editors: Lavigne, Dru), O’Reilly, Beijing, May 2004 (inbook)

Abstract
Often you want to execute a command not only on one computer, but on several at once. For example, you might want to report the current statistics on a group of managed servers or update all of your web servers at once.

[BibTex]

[BibTex]


no image
Kamerakalibrierung und Tiefenschätzung: Ein Vergleich von klassischer Bündelblockausgleichung und statistischen Lernalgorithmen

Sinz, FH.

Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Tübingen, Germany, March 2004 (techreport)

Abstract
Die Arbeit verleicht zwei Herangehensweisen an das Problem der Sch{\"a}tzung der r{\"a}umliche Position eines Punktes aus den Bildkoordinaten in zwei verschiedenen Kameras. Die klassische Methode der B{\"u}ndelblockausgleichung modelliert zwei Einzelkameras und sch{\"a}tzt deren {\"a}ußere und innere Orientierung mit einer iterativen Kalibrationsmethode, deren Konvergenz sehr stark von guten Startwerten abh{\"a}ngt. Die Tiefensch{\"a}tzung eines Punkts geschieht durch die Invertierung von drei der insgesamt vier Projektionsgleichungen der Einzalkameramodelle. Die zweite Methode benutzt Kernel Ridge Regression und Support Vector Regression, um direkt eine Abbildung von den Bild- auf die Raumkoordinaten zu lernen. Die Resultate zeigen, daß der Ansatz mit maschinellem Lernen, neben einer erheblichen Vereinfachung des Kalibrationsprozesses, zu h{\"o}heren Positionsgenaugikeiten f{\"u}hren kann.

PDF [BibTex]

PDF [BibTex]


no image
Gaussian Processes in Machine Learning

Rasmussen, CE.

In 3176, pages: 63-71, Lecture Notes in Computer Science, (Editors: Bousquet, O., U. von Luxburg and G. Rätsch), Springer, Heidelberg, 2004, Copyright by Springer (inbook)

Abstract
We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperparameters using the marginal likelihood. We explain the practical advantages of Gaussian Process and end with conclusions and a look at the current trends in GP work.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Multivariate Regression with Stiefel Constraints

Bakir, G., Gretton, A., Franz, M., Schölkopf, B.

(128), MPI for Biological Cybernetics, Spemannstr 38, 72076, Tuebingen, 2004 (techreport)

Abstract
We introduce a new framework for regression between multi-dimensional spaces. Standard methods for solving this problem typically reduce the problem to one-dimensional regression by choosing features in the input and/or output spaces. These methods, which include PLS (partial least squares), KDE (kernel dependency estimation), and PCR (principal component regression), select features based on different a-priori judgments as to their relevance. Moreover, loss function and constraints are chosen not primarily on statistical grounds, but to simplify the resulting optimisation. By contrast, in our approach the feature construction and the regression estimation are performed jointly, directly minimizing a loss function that we specify, subject to a rank constraint. A major advantage of this approach is that the loss is no longer chosen according to the algorithmic requirements, but can be tailored to the characteristics of the task at hand; the features will then be optimal with respect to this objective. Our approach also allows for the possibility of using a regularizer in the optimization. Finally, by processing the observations sequentially, our algorithm is able to work on large scale problems.

PDF [BibTex]

PDF [BibTex]


no image
Local Alignment Kernels for Biological Sequences

Vert, J., Saigo, H., Akutsu, T.

In Kernel Methods in Computational Biology, pages: 131-153, MIT Press, Cambridge, MA,, 2004 (inbook)

Web [BibTex]

Web [BibTex]


no image
Learning from Labeled and Unlabeled Data Using Random Walks

Zhou, D., Schölkopf, B.

Max Planck Institute for Biological Cybernetics, 2004 (techreport)

Abstract
We consider the general problem of learning from labeled and unlabeled data. Given a set of points, some of them are labeled, and the remaining points are unlabeled. The goal is to predict the labels of the unlabeled points. Any supervised learning algorithm can be applied to this problem, for instance, Support Vector Machines (SVMs). The problem of our interest is if we can implement a classifier which uses the unlabeled data information in some way and has higher accuracy than the classifiers which use the labeled data only. Recently we proposed a simple algorithm, which can substantially benefit from large amounts of unlabeled data and demonstrates clear superiority to supervised learning methods. In this paper we further investigate the algorithm using random walks and spectral graph theory, which shed light on the key steps in this algorithm.

PDF PostScript [BibTex]

PDF PostScript [BibTex]