Header logo is ei


2008


no image
Correlational Spectral Clustering

Blaschko, MB., Lampert, CH.

In CVPR 2008, pages: 1-8, IEEE Computer Society, Los Alamitos, CA, USA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2008 (inproceedings)

Abstract
We present a new method for spectral clustering with paired data based on kernel canonical correlation analysis, called correlational spectral clustering. Paired data are common in real world data sources, such as images with text captions. Traditional spectral clustering algorithms either assume that data can be represented by a single similarity measure, or by co-occurrence matrices that are then used in biclustering. In contrast, the proposed method uses separate similarity measures for each data representation, and allows for projection of previously unseen data that are only observed in one representation (e.g. images but not text). We show that this algorithm generalizes traditional spectral clustering algorithms and show consistent empirical improvement over spectral clustering on a variety of datasets of images with associated text.

PDF Web DOI [BibTex]

2008

PDF Web DOI [BibTex]


no image
Approximate Dynamic Programming with Gaussian Processes

Deisenroth, M., Peters, J., Rasmussen, C.

In ACC 2008, pages: 4480-4485, IEEE Service Center, Piscataway, NJ, USA, 2008 American Control Conference, June 2008 (inproceedings)

Abstract
In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains. Hence, approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and strongly depends on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate globally optimal closed-loop policy. In GPDP, value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal statefeedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes. A binary classifier selects one Gaussian process to predict the optimal control signal. We show that GPDP is able to yield an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing up, a complex nonlinear control problem.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

Lampert, C., Blaschko, M., Hofmann, T.

In CVPR 2008, pages: 1-8, IEEE Computer Society, Los Alamitos, CA, USA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2008, Best paper award (inproceedings)

Abstract
Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To perform localization, one can take a sliding window approach, but this strongly increases the computational cost, because the classifier function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branchand- bound scheme that allows efficient maximization of a large class of classifier functions over all possible subimages. It converges to a globally optimal solution typically in sublinear time. We show how our method is applicable to different object detection and retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest neighbor classifiers based on the 2-distance. We demonstrate state-of-the-art performance of the resulting systems on the UIUC Cars dataset, the PASCAL VOC 2006 dataset and in the PASCAL VOC 2007 competition.

PDF PDF Web DOI [BibTex]

PDF PDF Web DOI [BibTex]


no image
Computed Torque Control with Nonparametric Regression Models

Nguyen-Tuong, D., Seeger, M., Peters, J.

In ACC 2008, pages: 212-217, IEEE Service Center, Piscataway, NJ, USA, 2008 American Control Conference, June 2008 (inproceedings)

Abstract
Computed torque control allows the design of considerably more precise, energy-efficient and compliant controls for robots. However, the major obstacle is the requirement of an accurate model for torque generation, which cannot be obtained in some cases using rigid-body formulations due to unmodeled nonlinearities, such as complex friction or actuator dynamics. In such cases, models approximated from robot data present an appealing alternative. In this paper, we compare two nonparametric regression methods for model approximation, i.e., locally weighted projection regression (LWPR) and Gaussian process regression (GPR). While locally weighted regression was employed for real-time model estimation in learning adaptive control, Gaussian process regression has not been used in control to-date due to high computational requirements. The comparison includes the assessment of model approximation for both regression methods using data originated from SARCOS robot arm, as well as an evaluation of the robot tracking p erformance in computed torque control employing the approximated models. Our results show that GPR can be applied for real-time control achieving higher accuracy. However, for the online learning LWPR is superior by reason of lower computational requirements.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Multi-Classification by Categorical Features via Clustering

Seldin, Y., Tishby, N.

In In the proceedings of the 25th International Conference on Machine Learning (ICML 2008), pages: 920-927, 25th International Conference on Machine Learning (ICML), June 2008 (inproceedings)

Abstract
We derive a generalization bound for multi-classification schemes based on grid clustering in categorical parameter product spaces. Grid clustering partitions the parameter space in the form of a Cartesian product of partitions for each of the parameters. The derived bound provides a means to evaluate clustering solutions in terms of the generalization power of a built-on classifier. For classification based on a single feature the bound serves to find a globally optimal classification rule. Comparison of the generalization power of individual features can then be used for feature ranking. Our experiments show that in this role the bound is much more precise than mutual information or normalized correlation indices.

PDF Web [BibTex]

PDF Web [BibTex]


no image
A Kernel Test of Nonlinear Granger Causality

Sun, X.

In Proceedings of the Workshop on Inference and Estimation in Probabilistic Time-Series Models, pages: 79-89, (Editors: Barber, D. , A. T. Cemgil, S. Chiappa), Isaac Newton Institute for Mathematical Sciences, Cambridge, United Kingdom, Workshop on Inference and Estimation in Probabilistic Time-Series Models, June 2008 (inproceedings)

Abstract
We present a novel test of nonlinear Granger causality in bivariate time series. The trace norm of conditional covariance operators is used to capture the prediction errors. Based on this measure, a subsampling-based multiple testing procedure tests the prediction improvement of one time series by the other one. The distributional properties of the resulting p-values reveal the direction of Granger causality. Encouraging results of experiments with simulated and real-world data support our approach.

PDF [BibTex]

PDF [BibTex]


no image
Thin-Plate Splines Between Riemannian Manifolds

Steinke, F., Hein, M., Schölkopf, B.

Workshop on Geometry and Statistics of Shapes, June 2008 (talk)

Abstract
With the help of differential geometry we describe a framework to define a thin-plate spline like energy for maps between arbitrary Riemannian manifolds. The so-called Eells energy only depends on the intrinsic geometry of the input and output manifold, but not on their respective representation. The energy can then be used for regression between manifolds, we present results for cases where the outputs are rotations, sets of angles, or points on 3D surfaces. In the future we plan to also target regression where the output is an element of "shape space", understood as a Riemannian manifold. One could also further explore the meaning of the Eells energy when applied to diffeomorphisms between shapes, especially with regard to its potential use as a distance measure between shapes that does not depend on the embedding or the parametrisation of the shapes.

Web [BibTex]

Web [BibTex]


Thumb xl teaser
Bayesian Color Constancy Revisited

Gehler, P., Rother, C., Blake, A., Minka, T., Sharp, T.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, June 2008, http://dx.doi.org/10.1109/CVPR.2008.4587765 (inproceedings)

website+code+data pdf [BibTex]

website+code+data pdf [BibTex]


no image
Real-time Learning of Resolved Velocity Control on a Mitsubishi PA-10

Peters, J., Nguyen-Tuong, D.

In ICRA 2008, pages: 2872-2877, IEEE Service Center, Piscataway, NJ, USA, 2008 IEEE International Conference on Robotics and Automation, May 2008 (inproceedings)

Abstract
Learning inverse kinematics has long been fascinating the robot learning community. While humans acquire this transformation to complicated tool spaces with ease, it is not a straightforward application for supervised learning algorithms due to non-convex learning problem. However, the key insight that the problem can be considered convex in small local regions allows the application of locally linear learning methods. Nevertheless, the local solution of the problem depends on the data distribution which can result into inconsistent global solutions with large model discontinuities. While this problem can be treated in various ways in offline learning, it poses a serious problem for online learning. Previous approaches to the real-time learning of inverse kinematics avoid this problem using smart data generation, such as the learner biasses its own solution. Such biassed solutions can result into premature convergence, and from the resulting solution it is often hard to understand what has been learned in tha t local region. This paper improves and solves this problem by presenting a learning algorithm which can deal with this inconsistency through re-weighting the data online. Furthermore, we show that our algorithms work not only in simulation, but we present real-time learning results on a physical Mitsubishi PA-10 robot arm.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
State Space Compression with Predictive Representations

Boularias, A., Izadi, M., Chaib-Draa, B.

In Flairs 2008, pages: 41-46, (Editors: Wilson, D. C., H. C. Lane), AAAI Press, Menlo Park, CA, USA, 21st International Florida Artificial Intelligence Research Society Conference, May 2008 (inproceedings)

Abstract
Current studies have demonstrated that the representational power of predictive state representations (PSRs) is at least equal to the one of partially observable Markov decision processes (POMDPs). This is while early steps in planning and generalization with PSRs suggest substantial improvements compared to POMDPs. However, lack of practical algorithms for learning these representations severely restricts their applicability. The computational inefficiency of exact PSR learning methods naturally leads to the exploration of various approximation methods that can provide a good set of core tests through less computational effort. In this paper, we address this problem in an optimization framework. In particular, our approach aims to minimize the potential error that may be caused by missing a number of core tests. We provide analysis of the error caused by this compression and present an empirical evaluation illustrating the performance of this approach.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning resolved velocity control

Peters, J.

2008 IEEE International Conference on Robotics and Automation (ICRA), May 2008 (talk)

Web [BibTex]

Web [BibTex]


no image
Bayesian methods for protein structure determination

Habeck, M.

Machine Learning in Structural Bioinformatics, April 2008 (talk)

Web [BibTex]

Web [BibTex]


no image
Graph Mining with Variational Dirichlet Process Mixture Models

Tsuda, K., Kurihara, K.

In SDM 2008, pages: 432-442, (Editors: Zaki, M. J.), Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 8th SIAM International Conference on Data Mining, April 2008 (inproceedings)

Abstract
Graph data such as chemical compounds and XML documents are getting more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high dimensionality of graphs, namely, when a graph is represented as a binary feature vector of indicators of all possible subgraph patterns, the dimensionality gets too large for usual statistical methods. We propose a nonparametric Bayesian method for clustering graphs and selecting salient patterns at the same time. Variational inference is adopted here, because sampling is not applicable due to extremely high dimensionality. The feature set minimizing the free energy is efficiently collected with the DFS code tree, where the generation of useless subgraphs is suppressed by a tree pruning condition. In experiments, our method is compared with a simpler approach based on frequent subgraph mining, and graph kernels.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Model-Based Reinforcement Learning with Continuous States and Actions

Deisenroth, M., Rasmussen, C., Peters, J.

In ESANN 2008, pages: 19-24, (Editors: Verleysen, M. ), d-side, Evere, Belgium, European Symposium on Artificial Neural Networks, April 2008 (inproceedings)

Abstract
Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning Inverse Dynamics: A Comparison

Nguyen-Tuong, D., Peters, J., Seeger, M., Schölkopf, B.

In Advances in Computational Intelligence and Learning: Proceedings of the European Symposium on Artificial Neural Networks, pages: 13-18, (Editors: M Verleysen), d-side, Evere, Belgium, 16th European Symposium on Artificial Neural Networks (ESANN), April 2008 (inproceedings)

Abstract
While it is well-known that model can enhance the control performance in terms of precision or energy efficiency, the practical application has often been limited by the complexities of manually obtaining sufficiently accurate models. In the past, learning has proven a viable alternative to using a combination of rigid-body dynamics and handcrafted approximations of nonlinearities. However, a major open question is what nonparametric learning method is suited best for learning dynamics? Traditionally, locally weighted projection regression (LWPR), has been the standard method as it is capable of online, real-time learning for very complex robots. However, while LWPR has had significant impact on learning in robotics, alternative nonparametric regression methods such as support vector regression (SVR) and Gaussian processes regression (GPR) offer interesting alternatives with fewer open parameters and potentially higher accuracy. In this paper, we evaluate these three alternatives for model learning. Our comparison consists out of the evaluation of learning quality for each regression method using original data from SARCOS robot arm, as well as the robot tracking performance employing learned models. The results show that GPR and SVR achieve a superior learning precision and can be applied for real-time control obtaining higher accuracy. However, for the online learning LWPR presents the better method due to its lower computational requirements.

PDF Web [BibTex]

PDF Web [BibTex]

2002


no image
Gender Classification of Human Faces

Graf, A., Wichmann, F.

In Biologically Motivated Computer Vision, pages: 1-18, (Editors: Bülthoff, H. H., S.W. Lee, T. A. Poggio and C. Wallraven), Springer, Berlin, Germany, Second International Workshop on Biologically Motivated Computer Vision (BMCV), November 2002 (inproceedings)

Abstract
This paper addresses the issue of combining pre-processing methods—dimensionality reduction using Principal Component Analysis (PCA) and Locally Linear Embedding (LLE)—with Support Vector Machine (SVM) classification for a behaviorally important task in humans: gender classification. A processed version of the MPI head database is used as stimulus set. First, summary statistics of the head database are studied. Subsequently the optimal parameters for LLE and the SVM are sought heuristically. These values are then used to compare the original face database with its processed counterpart and to assess the behavior of a SVM with respect to changes in illumination and perspective of the face images. Overall, PCA was superior in classification performance and allowed linear separability.

PDF PDF DOI [BibTex]

2002

PDF PDF DOI [BibTex]


no image
Insect-Inspired Estimation of Self-Motion

Franz, MO., Chahl, JS.

In Biologically Motivated Computer Vision, (2525):171-180, LNCS, (Editors: Bülthoff, H.H. , S.W. Lee, T.A. Poggio, C. Wallraven), Springer, Berlin, Germany, Second International Workshop on Biologically Motivated Computer Vision (BMCV), November 2002 (inproceedings)

Abstract
The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an optimal linear estimator incorporating prior knowledge about the environment. The optimal estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates turn out to be less reliable.

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Combining sensory Information to Improve Visualization

Ernst, M., Banks, M., Wichmann, F., Maloney, L., Bülthoff, H.

In Proceedings of the Conference on Visualization ‘02 (VIS ‘02), pages: 571-574, (Editors: Moorhead, R. , M. Joy), IEEE, Piscataway, NJ, USA, IEEE Conference on Visualization (VIS '02), October 2002 (inproceedings)

Abstract
Seemingly effortlessly the human brain reconstructs the three-dimensional environment surrounding us from the light pattern striking the eyes. This seems to be true across almost all viewing and lighting conditions. One important factor for this apparent easiness is the redundancy of information provided by the sensory organs. For example, perspective distortions, shading, motion parallax, or the disparity between the two eyes' images are all, at least partly, redundant signals which provide us with information about the three-dimensional layout of the visual scene. Our brain uses all these different sensory signals and combines the available information into a coherent percept. In displays visualizing data, however, the information is often highly reduced and abstracted, which may lead to an altered perception and therefore a misinterpretation of the visualized data. In this panel we will discuss mechanisms involved in the combination of sensory information and their implications for simulations using computer displays, as well as problems resulting from current display technology such as cathode-ray tubes.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Sampling Techniques for Kernel Methods

Achlioptas, D., McSherry, F., Schölkopf, B.

In Advances in neural information processing systems 14 , pages: 335-342, (Editors: TG Dietterich and S Becker and Z Ghahramani), MIT Press, Cambridge, MA, USA, 15th Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations.

PDF Web [BibTex]

PDF Web [BibTex]


no image
The Infinite Hidden Markov Model

Beal, MJ., Ghahramani, Z., Rasmussen, CE.

In Advances in Neural Information Processing Systems 14, pages: 577-584, (Editors: Dietterich, T.G. , S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite - consider, for example, symbols being possible words appearing in English text.

PDF Web [BibTex]

PDF Web [BibTex]


no image
A new discriminative kernel from probabilistic models

Tsuda, K., Kawanabe, M., Rätsch, G., Sonnenburg, S., Müller, K.

In Advances in Neural Information Processing Systems 14, pages: 977-984, (Editors: Dietterich, T.G. , S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so called \Fisher kernel" has been combined with discriminative classi ers such as SVM and applied successfully in e.g. DNA and protein analysis. Whereas the Fisher kernel (FK) is calculated from the marginal log-likelihood, we propose the TOP kernel derived from Tangent vectors Of Posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments our new discriminative TOP kernel compares favorably to the Fisher kernel.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Incorporating Invariances in Non-Linear Support Vector Machines

Chapelle, O., Schölkopf, B.

In Advances in Neural Information Processing Systems 14, pages: 609-616, (Editors: TG Dietterich and S Becker and Z Ghahramani), MIT Press, Cambridge, MA, USA, 15th Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
The choice of an SVM kernel corresponds to the choice of a representation of the data in a feature space and, to improve performance, it should therefore incorporate prior knowledge such as known transformation invariances. We propose a technique which extends earlier work and aims at incorporating invariances in nonlinear kernels. We show on a digit recognition task that the proposed approach is superior to the Virtual Support Vector method, which previously had been the method of choice.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Kernel feature spaces and nonlinear blind source separation

Harmeling, S., Ziehe, A., Kawanabe, M., Müller, K.

In Advances in Neural Information Processing Systems 14, pages: 761-768, (Editors: Dietterich, T. G., S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Algorithms for Learning Function Distinguishable Regular Languages

Fernau, H., Radl, A.

In Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, pages: 64-73, (Editors: Caelli, T. , A. Amin, R. P.W. Duin, M. Kamel, D. de Ridder), Springer, Berlin, Germany, Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, August 2002 (inproceedings)

Abstract
Function distinguishable languages were introduced as a new methodology of defining characterizable subclasses of the regular languages which are learnable from text. Here, we give details on the implementation and the analysis of the corresponding learning algorithms. We also discuss problems which might occur in practical applications.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Decision Boundary Pattern Selection for Support Vector Machines

Shin, H., Cho, S.

In Proc. of the Korean Data Mining Conference, pages: 33-41, Korean Data Mining Conference, May 2002 (inproceedings)

[BibTex]

[BibTex]


no image
k-NN based Pattern Selection for Support Vector Classifiers

Shin, H., Cho, S.

In Proc. of the Korean Industrial Engineers Conference, pages: 645-651, Korean Industrial Engineers Conference, May 2002 (inproceedings)

[BibTex]

[BibTex]


no image
Microarrays: How Many Do You Need?

Zien, A., Fluck, J., Zimmer, R., Lengauer, T.

In RECOMB 2002, pages: 321-330, ACM Press, New York, NY, USA, Sixth Annual International Conference on Research in Computational Molecular Biology, April 2002 (inproceedings)

Abstract
We estimate the number of microarrays that is required in order to gain reliable results from a common type of study: the pairwise comparison of different classes of samples. Current knowlegde seems to suffice for the construction of models that are realistic with respect to searches for individual differentially expressed genes. Such models allow to investigate the dependence of the required number of samples on the relevant parameters: the biological variability of the samples within each class; the fold changes in expression; the detection sensitivity of the microarrays; and the acceptable error rates of the results. We supply experimentalists with general conclusions as well as a freely accessible Java applet at http://cartan.gmd.de/~zien/classsize/ for fine tuning simulations to their particular actualities. Since the situation can be assumed to be very similar for large scale proteomics and metabolomics studies, our methods and results might also apply there.

Web DOI [BibTex]

Web DOI [BibTex]


no image
Pattern Selection for Support Vector Classifiers

Shin, H., Cho, S.

In Ideal 2002, pages: 97-103, (Editors: Yin, H. , N. Allinson, R. Freeman, J. Keane, S. Hubbard), Springer, Berlin, Germany, Third International Conference on Intelligent Data Engineering and Automated Learning, January 2002 (inproceedings)

Abstract
SVMs tend to take a very long time to train with a large data set. If "redundant" patterns are identified and deleted in pre-processing, the training time could be reduced significantly. We propose a k-nearest neighbors(k-NN) based pattern selection method. The method tries to select the patterns that are near the decision boundary and that are correctly labeled. The simulations over synthetic data sets showed promising results: (1) By converting a non-separable problem to a separable one, the search for an optimal error tolerance parameter became unnecessary. (2) SVM training time decreased by two orders of magnitude without any loss of accuracy. (3) The redundant SVs were substantially reduced.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
The leave-one-out kernel

Tsuda, K., Kawanabe, M.

In Artificial Neural Networks -- ICANN 2002, 2415, pages: 727-732, LNCS, (Editors: Dorronsoro, J. R.), Artificial Neural Networks -- ICANN, 2002 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Localized Rademacher Complexities

Bartlett, P., Bousquet, O., Mendelson, S.

In Proceedings of the 15th annual conference on Computational Learning Theory, pages: 44-58, Proceedings of the 15th annual conference on Computational Learning Theory, 2002 (inproceedings)

Abstract
We investigate the behaviour of global and local Rademacher averages. We present new error bounds which are based on the local averages and indicate how data-dependent local averages can be estimated without {it a priori} knowledge of the class at hand.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Film Cooling: A Comparative Study of Different Heaterfoil Configurations for Liquid Crystals Experiments

Vogel, G., Graf, ABA., Weigand, B.

In ASME TURBO EXPO 2002, Amsterdam, GT-2002-30552, ASME TURBO EXPO, Amsterdam, 2002 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Some Local Measures of Complexity of Convex Hulls and Generalization Bounds

Bousquet, O., Koltchinskii, V., Panchenko, D.

In Proceedings of the 15th annual conference on Computational Learning Theory, Proceedings of the 15th annual conference on Computational Learning Theory, 2002 (inproceedings)

Abstract
We investigate measures of complexity of function classes based on continuity moduli of Gaussian and Rademacher processes. For Gaussian processes, we obtain bounds on the continuity modulus on the convex hull of a function class in terms of the same quantity for the class itself. We also obtain new bounds on generalization error in terms of localized Rademacher complexities. This allows us to prove new results about generalization performance for convex hulls in terms of characteristics of the base class. As a byproduct, we obtain a simple proof of some of the known bounds on the entropy of convex hulls.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
A kernel approach for learning from almost orthogonal patterns

Schölkopf, B., Weston, J., Eskin, E., Leslie, C., Noble, W.

In Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science, 2430/2431, pages: 511-528, Lecture Notes in Computer Science, (Editors: T Elomaa and H Mannila and H Toivonen), Springer, Berlin, Germany, 13th European Conference on Machine Learning (ECML) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'2002), 2002 (inproceedings)

PostScript DOI [BibTex]

PostScript DOI [BibTex]


no image
Infinite Mixtures of Gaussian Process Experts

Rasmussen, CE., Ghahramani, Z.

In (Editors: Dietterich, Thomas G.; Becker, Suzanna; Ghahramani, Zoubin), 2002 (inproceedings)

Abstract
We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using a input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets -- thus potentially overcoming two of the biggest hurdles with GP models. Simulations show the viability of this approach.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Marginalized kernels for RNA sequence data analysis

Kin, T., Tsuda, K., Asai, K.

In Genome Informatics 2002, pages: 112-122, (Editors: Lathtop, R. H.; Nakai, K.; Miyano, S.; Takagi, T.; Kanehisa, M.), Genome Informatics, 2002, (Best Paper Award) (inproceedings)

Web [BibTex]

Web [BibTex]


no image
Luminance Artifacts on CRT Displays

Wichmann, F.

In IEEE Visualization, pages: 571-574, (Editors: Moorhead, R.; Gross, M.; Joy, K. I.), IEEE Visualization, 2002 (inproceedings)

Abstract
Most visualization panels today are still built around cathode-ray tubes (CRTs), certainly on personal desktops at work and at home. Whilst capable of producing pleasing images for common applications ranging from email writing to TV and DVD presentation, it is as well to note that there are a number of nonlinear transformations between input (voltage) and output (luminance) which distort the digital and/or analogue images send to a CRT. Some of them are input-independent and hence easy to fix, e.g. gamma correction, but others, such as pixel interactions, depend on the content of the input stimulus and are thus harder to compensate for. CRT-induced image distortions cause problems not only in basic vision research but also for applications where image fidelity is critical, most notably in medicine (digitization of X-ray images for diagnostic purposes) and in forms of online commerce, such as the online sale of images, where the image must be reproduced on some output device which will not have the same transfer function as the customer's CRT. I will present measurements from a number of CRTs and illustrate how some of their shortcomings may be problematic for the aforementioned applications.

[BibTex]

[BibTex]

1999


no image
Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites in DNA

Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lemmen, C., Smola, A., Lengauer, T., Müller, K.

In German Conference on Bioinformatics (GCB 1999), October 1999 (inproceedings)

Abstract
In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points from which regions encoding pro­ teins start, the so­called translation initiation sites (TIS). This can be modeled as a classification prob­ lem. We demonstrate the power of support vector machines (SVMs) for this task, and show how to suc­ cessfully incorporate biological prior knowledge by engineering an appropriate kernel function.

Web [BibTex]

1999

Web [BibTex]


no image
Shrinking the tube: a new support vector regression algorithm

Schölkopf, B., Bartlett, P., Smola, A., Williamson, R.

In Advances in Neural Information Processing Systems 11, pages: 330-336 , (Editors: MS Kearns and SA Solla and DA Cohn), MIT Press, Cambridge, MA, USA, 12th Annual Conference on Neural Information Processing Systems (NIPS), June 1999 (inproceedings)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Semiparametric support vector and linear programming machines

Smola, A., Friess, T., Schölkopf, B.

In Advances in Neural Information Processing Systems 11, pages: 585-591 , (Editors: MS Kearns and SA Solla and DA Cohn), MIT Press, Cambridge, MA, USA, Twelfth Annual Conference on Neural Information Processing Systems (NIPS), June 1999 (inproceedings)

Abstract
Semiparametric models are useful tools in the case where domain knowledge exists about the function to be estimated or emphasis is put onto understandability of the model. We extend two learning algorithms - Support Vector machines and Linear Programming machines to this case and give experimental results for SV machines.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Kernel PCA and De-noising in feature spaces

Mika, S., Schölkopf, B., Smola, A., Müller, K., Scholz, M., Rätsch, G.

In Advances in Neural Information Processing Systems 11, pages: 536-542 , (Editors: MS Kearns and SA Solla and DA Cohn), MIT Press, Cambridge, MA, USA, 12th Annual Conference on Neural Information Processing Systems (NIPS), June 1999 (inproceedings)

Abstract
Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal component analysis. This gives rise to the question how to use nonlinear features for data compression, reconstruction, and de-noising, applications common in linear PCA. This is a nontrivial task, as the results provided by kernel PCA live in some high dimensional feature space and need not have pre-images in input space. This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real world data.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Classifying LEP data with support vector algorithms.

Vannerem, P., Müller, K., Smola, A., Schölkopf, B., Söldner-Rembold, S.

In Artificial Intelligence in High Energy Nuclear Physics 99, Artificial Intelligence in High Energy Nuclear Physics 99, 1999 (inproceedings)

[BibTex]

[BibTex]


no image
Classification on proximity data with LP-machines

Graepel, T., Herbrich, R., Schölkopf, B., Smola, A., Bartlett, P., Müller, K., Obermayer, K., Williamson, R.

In Artificial Neural Networks, 1999. ICANN 99, 470, pages: 304-309, Conference Publications , IEEE, 9th International Conference on Artificial Neural Networks, 1999 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Kernel-dependent support vector error bounds

Schölkopf, B., Shawe-Taylor, J., Smola, A., Williamson, R.

In Artificial Neural Networks, 1999. ICANN 99, 470, pages: 103-108 , Conference Publications , IEEE, 9th International Conference on Artificial Neural Networks, 1999 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Linear programs for automatic accuracy control in regression

Smola, A., Schölkopf, B., Rätsch, G.

In Artificial Neural Networks, 1999. ICANN 99, 470, pages: 575-580 , Conference Publications , IEEE, 9th International Conference on Artificial Neural Networks, 1999 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Regularized principal manifolds.

Smola, A., Williamson, R., Mika, S., Schölkopf, B.

In Lecture Notes in Artificial Intelligence, Vol. 1572, 1572, pages: 214-229 , Lecture Notes in Artificial Intelligence, (Editors: P Fischer and H-U Simon), Springer, Berlin, Germany, Computational Learning Theory: 4th European Conference, 1999 (inproceedings)

[BibTex]

[BibTex]


no image
Entropy numbers, operators and support vector kernels.

Williamson, R., Smola, A., Schölkopf, B.

In Lecture Notes in Artificial Intelligence, Vol. 1572, 1572, pages: 285-299, Lecture Notes in Artificial Intelligence, (Editors: P Fischer and H-U Simon), Springer, Berlin, Germany, Computational Learning Theory: 4th European Conference, 1999 (inproceedings)

[BibTex]

[BibTex]


no image
Is the Hippocampus a Kalman Filter?

Bousquet, O., Balakrishnan, K., Honavar, V.

In Proceedings of the Pacific Symposium on Biocomputing, 3, pages: 619-630, Proceedings of the Pacific Symposium on Biocomputing, 1999 (inproceedings)

[BibTex]

[BibTex]


no image
A Comparison of Artificial Neural Networks and Cluster Analysis for Typing Biometrics Authentication

Maisuria, K., Ong, CS., Lai, .

In unknown, pages: 9999-9999, International Joint Conference on Neural Networks, 1999 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Fisher discriminant analysis with kernels

Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Müller, K.

In Proceedings of the 1999 IEEE Signal Processing Society Workshop, 9, pages: 41-48, (Editors: Y-H Hu and J Larsen and E Wilson and S Douglas), IEEE, Neural Networks for Signal Processing IX, 1999 (inproceedings)

DOI [BibTex]

DOI [BibTex]