Header logo is ei


2004


no image
Attentional Modulation of Auditory Event-Related Potentials in a Brain-Computer Interface

Hill, J., Lal, T., Bierig, K., Birbaumer, N., Schölkopf, B.

In BioCAS04, (S3/5/INV- S3/17-20):4, IEEE Computer Society, Los Alamitos, CA, USA, 2004 IEEE International Workshop on Biomedical Circuits and Systems, December 2004 (inproceedings)

Abstract
Motivated by the particular problems involved in communicating with "locked-in" paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user‘s EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.

PDF Web DOI [BibTex]

2004

PDF Web DOI [BibTex]


no image
Using kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method

Honkela, A., Harmeling, S., Lundqvist, L., Valpola, H.

In ICA 2004, pages: 790-797, (Editors: Puntonet, C. G., A. Prieto), Springer, Berlin, Germany, Fifth International Conference on Independent Component Analysis and Blind Signal Separation, October 2004 (inproceedings)

Abstract
The variational Bayesian nonlinear blind source separation method introduced by Lappalainen and Honkela in 2000 is initialised with linear principal component analysis (PCA). Because of the multilayer perceptron (MLP) network used to model the nonlinearity, the method is susceptible to local minima and therefore sensitive to the initialisation used. As the method is used for nonlinear separation, the linear initialisation may in some cases lead it astray. In this paper we study the use of kernel PCA (KPCA) in the initialisation. KPCA is a rather straightforward generalisation of linear PCA and it is much faster to compute than the variational Bayesian method. The experiments show that it can produce significantly better initialisations than linear PCA. Additionally, the model comparison methods provided by the variational Bayesian framework can be easily applied to compare different kernels.

DOI [BibTex]

DOI [BibTex]


no image
Robust ICA for Super-Gaussian Sources

Meinecke, F., Harmeling, S., Müller, K.

In ICA 2004, pages: 217-224, (Editors: Puntonet, C. G., A. Prieto), Springer, Berlin, Germany, Fifth International Conference on Independent Component Analysis and Blind Signal Separation, October 2004 (inproceedings)

Abstract
Most ICA algorithms are sensitive to outliers. Instead of robustifying existing algorithms by outlier rejection techniques, we show how a simple outlier index can be used directly to solve the ICA problem for super-Gaussian source signals. This ICA method is outlier-robust by construction and can be used for standard ICA as well as for over-complete ICA (i.e. more source signals than observed signals (mixtures)).

DOI [BibTex]

DOI [BibTex]


no image
Pattern detection methods and systems and face detection methods and systems

Blake, A., Romdhani, S., Schölkopf, B., Torr, P. H. S.

United States Patent, No 6804391, October 2004 (patent)

[BibTex]

[BibTex]


no image
Modelling Spikes with Mixtures of Factor Analysers

Görür, D., Rasmussen, C., Tolias, A., Sinz, F., Logothetis, N.

In Pattern Recognition, pages: 391-398, LNCS 3175, (Editors: Rasmussen, C. E. , H.H. Bülthoff, B. Schölkopf, M.A. Giese), Springer, Berlin, Germany, 26th DAGM Symposium, September 2004 (inproceedings)

Abstract
Identifying the action potentials of individual neurons from extracellular recordings, known as spike sorting, is a challenging problem. We consider the spike sorting problem using a generative model,mixtures of factor analysers, which concurrently performs clustering and feature extraction. The most important advantage of this method is that it quantifies the certainty with which the spikes are classified. This can be used as a means for evaluating the quality of clustering and therefore spike isolation. Using this method, nearly simultaneously occurring spikes can also be modelled which is a hard task for many of the spike sorting methods. Furthermore, modelling the data with a generative model allows us to generate simulated data.

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Learning Depth From Stereo

Sinz, F., Candela, J., BakIr, G., Rasmussen, C., Franz, M.

In 26th DAGM Symposium, pages: 245-252, LNCS 3175, (Editors: Rasmussen, C. E., H. H. Bülthoff, B. Schölkopf, M. A. Giese), Springer, Berlin, Germany, 26th DAGM Symposium, September 2004 (inproceedings)

Abstract
We compare two approaches to the problem of estimating the depth of a point in space from observing its image position in two different cameras: 1.~The classical photogrammetric approach explicitly models the two cameras and estimates their intrinsic and extrinsic parameters using a tedious calibration procedure; 2.~A generic machine learning approach where the mapping from image to spatial coordinates is directly approximated by a Gaussian Process regression. Our results show that the generic learning approach, in addition to simplifying the procedure of calibration, can lead to higher depth accuracies than classical calibration although no specific domain knowledge is used.

PDF PostScript Web [BibTex]

PDF PostScript Web [BibTex]


no image
Stability of Hausdorff-based Distance Measures

Shapiro, MD., Blaschko, MB.

In VIIP, pages: 1-6, VIIP, September 2004 (inproceedings)

[BibTex]

[BibTex]


no image
Learning to Find Graph Pre-Images

BakIr, G., Zien, A., Tsuda, K.

In Pattern Recognition, pages: 253-261, (Editors: Rasmussen, C. E., H. H. Bülthoff, B. Schölkopf, M. A. Giese), Springer, Berlin, Germany, 26th DAGM Symposium, August 2004 (inproceedings)

Abstract
The recent development of graph kernel functions has made it possible to apply well-established machine learning methods to graphs. However, to allow for analyses that yield a graph as a result, it is necessary to solve the so-called pre-image problem: to reconstruct a graph from its feature space representation induced by the kernel. Here, we suggest a practical solution to this problem.

PostScript PDF DOI [BibTex]

PostScript PDF DOI [BibTex]


no image
Gaussian Process Classification for Segmenting and Annotating Sequences

Altun, Y., Hofmann, T., Smola, A.

In Proceedings of the 21st International Conference on Machine Learning (ICML 2004), pages: 25-32, (Editors: Greiner, R. , D. Schuurmans), ACM Press, New York, USA, 21st International Conference on Machine Learning (ICML), July 2004 (inproceedings)

Abstract
Many real-world classification tasks involve the prediction of multiple, inter-dependent class labels. A prototypical case of this sort deals with prediction of a sequence of labels for a sequence of observations. Such problems arise naturally in the context of annotating and segmenting observation sequences. This paper generalizes Gaussian Process classification to predict multiple labels by taking dependencies between neighboring labels into account. Our approach is motivated by the desire to retain rigorous probabilistic semantics, while overcoming limitations of parametric methods like Conditional Random Fields, which exhibit conceptual and computational difficulties in high-dimensional input spaces. Experiments on named entity recognition and pitch accent prediction tasks demonstrate the competitiveness of our approach.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Learning with Non-Positive Kernels

Ong, CS., Mary, X., Canu, S., Smola, AJ.

In ICML 2004, pages: 81-81, ACM Press, New York, NY, USA, Twenty-First International Conference on Machine Learning, July 2004 (inproceedings)

Abstract
n this paper we show that many kernel methods can be adapted to deal with indefinite kernels, that is, kernels which are not positive semidefinite. They do not satisfy Mercer‘s condition and they induce associated functional spaces called Reproducing Kernel Kre&icaron;n Spaces (RKKS), a generalization of Reproducing Kernel Hilbert Spaces (RKHS).Machine learning in RKKS shares many "nice" properties of learning in RKHS, such as orthogonality and projection. However, since the kernels are indefinite, we can no longer minimize the loss, instead we stabilize it. We show a general representer theorem for constrained stabilization and prove generalization bounds by computing the Rademacher averages of the kernel class. We list several examples of indefinite kernels and investigate regularization methods to solve spline interpolation. Some preliminary experiments with indefinite kernels for spline smoothing are reported for truncated spectral factorization, Landweber-Fridman iterations, and MR-II.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Exponential Families for Conditional Random Fields

Altun, Y., Smola, A., Hofmann, T.

In Proceedings of the 20th Annual Conference on Uncertainty in Artificial Intelligence (UAI 2004), pages: 2-9, (Editors: Chickering, D.M. , J.Y. Halpern), Morgan Kaufmann, San Francisco, CA, USA, 20th Annual Conference on Uncertainty in Artificial Intelligence (UAI), July 2004 (inproceedings)

Abstract
In this paper we define conditional random fields in reproducing kernel Hilbert spaces and show connections to Gaussian Process classification. More specifically, we prove decomposition results for undirected graphical models and we give constructions for kernels. Finally we present efficient means of solving the optimization problem using reduced rank decompositions and we show how stationarity can be exploited efficiently in the optimization process.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Using Conditional Random Fields to Predict Pitch Accent in Conversational Speech

Gregory, M., Altun, Y.

In pages: 677-684, (Editors: Scott, D. , W. Daelemans, M. Walker), ACL, East Stroudsburg, PA, USA, 42nd Annual Meeting of the Association for Computational Linguistics (ACL), July 2004 (inproceedings)

Abstract
The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural sounding speech, while automatic detection of accents can contribute to better wordlevel recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influence pitch accent placement in natural, conversational speech in a sequence labeling setting. We introduce Conditional Random Fields (CRFs) to pitch accent prediction task in order to incorporate these factors efficiently in a sequence model. We demonstrate the usefulness and the incremental effect of these factors in a sequence model by performing experiments on hand labeled data from the Switchboard Corpus. Our model outperforms the baseline and previous models of pitch accent prediction on the Switchboard Corpus.

Web [BibTex]

Web [BibTex]


no image
Support vector machine learning for interdependent and structured output spaces

Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.

In pages: 1-8, (Editors: Greiner, R. , D. Schuurmans), AAAI Press, Menlo Park, CA, USA, Twenty-first International Conference on Machine Learning (ICML), July 2004 (inproceedings)

Web DOI [BibTex]

Web DOI [BibTex]


no image
PAC-Bayesian Generic Chaining

Audibert, J., Bousquet, O.

In Advances in Neural Information Processing Systems 16, pages: 1125-1132 , (Editors: Thrun, S., L.K. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
There exist many different generalization error bounds for classification. Each of these bounds contains an improvement over the others for certain situations. Our goal is to combine these different improvements into a single bound. In particular we combine the PAC-Bayes approach introduced by McAllester, which is interesting for averaging classifiers, with the optimal union bound provided by the generic chaining technique developed by Fernique and Talagrand. This combination is quite natural since the generic chaining is based on the notion of majorizing measures, which can be considered as priors on the set of classifiers, and such priors also arise in the PAC-bayesian setting.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Prediction on Spike Data Using Kernel Algorithms

Eichhorn, J., Tolias, A., Zien, A., Kuss, M., Rasmussen, C., Weston, J., Logothetis, N., Schölkopf, B.

In Advances in Neural Information Processing Systems 16, pages: 1367-1374, (Editors: S Thrun and LK Saul and B Schölkopf), MIT Press, Cambridge, MA, USA, 17th Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We report and compare the performance of different learning algorithms based on data from cortical recordings. The task is to predict the orientation of visual stimuli from the activity of a population of simultaneously recorded neurons. We compare several ways of improving the coding of the input (i.e., the spike data) as well as of the output (i.e., the orientation), and report the results obtained using different kernel algorithms.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Warped Gaussian Processes

Snelson, E., Rasmussen, CE., Ghahramani, Z.

In Advances in Neural Information Processing Systems 16, pages: 337-344, (Editors: Thrun, S., L.K. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We generalise the Gaussian process (GP) framework for regression by learning a nonlinear transformation of the GP outputs. This allows for non-Gaussian processes and non-Gaussian noise. The learning algorithm chooses a nonlinear transformation such that transformed data is well-modelled by a GP. This can be seen as including a preprocessing transformation as an integral part of the probabilistic modelling problem, rather than as an ad-hoc step. We demonstrate on several real regression problems that learning the transformation can lead to significantly better performance than using a regular GP, or a GP with a fixed transformation.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Ranking on Data Manifolds

Zhou, D., Weston, J., Gretton, A., Bousquet, O., Schölkopf, B.

In Advances in neural information processing systems 16, pages: 169-176, (Editors: S Thrun and L Saul and B Schölkopf), MIT Press, Cambridge, MA, USA, 17th Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
The Google search engine has enjoyed a huge success with its web page ranking algorithm, which exploits global, rather than local, hyperlink structure of the web using random walks. Here we propose a simple universal ranking algorithm for data lying in the Euclidean space, such as text or image data. The core idea of our method is to rank the data with respect to the intrinsic manifold structure collectively revealed by a great amount of data. Encouraging experimental results from synthetic, image, and text data illustrate the validity of our method.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Gaussian Processes in Reinforcement Learning

Rasmussen, C., Kuss, M.

In Advances in Neural Information Processing Systems 16, pages: 751-759, (Editors: Thrun, S., L. K. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We exploit some useful properties of Gaussian process (GP) regression models for reinforcement learning in continuous state spaces and discrete time. We demonstrate how the GP model allows evaluation of the value function in closed form. The resulting policy iteration algorithm is demonstrated on a simple problem with a two dimensional state space. Further, we speculate that the intrinsic ability of GP models to characterise distributions of functions would allow the method to capture entire distributions over future values instead of merely their expectation, which has traditionally been the focus of much of reinforcement learning.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning with Local and Global Consistency

Zhou, D., Bousquet, O., Lal, T., Weston, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 16, pages: 321-328, (Editors: S Thrun and LK Saul and B Schölkopf), MIT Press, Cambridge, MA, USA, 17th Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning to Find Pre-Images

Bakir, G., Weston, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 16, pages: 449-456, (Editors: S Thrun and LK Saul and B Schölkopf), MIT Press, Cambridge, MA, USA, 17th Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We consider the problem of reconstructing patterns from a feature map. Learning algorithms using kernels to operate in a reproducing kernel Hilbert space (RKHS) express their solutions in terms of input points mapped into the RKHS. We introduce a technique based on kernel principal component analysis and regression to reconstruct corresponding patterns in the input space (aka pre-images) and review its performance in several applications requiring the construction of pre-images. The introduced technique avoids difficult and/or unstable numerical optimization, is easy to implement and, unlike previous methods, permits the computation of pre-images in discrete input spaces.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Measure Based Regularization

Bousquet, O., Chapelle, O., Hein, M.

In Advances in Neural Information Processing Systems 16, pages: 1221-1228, (Editors: Thrun, S., L. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We address in this paper the question of how the knowledge of the marginal distribution $P(x)$ can be incorporated in a learning algorithm. We suggest three theoretical methods for taking into account this distribution for regularization and provide links to existing graph-based semi-supervised learning algorithms. We also propose practical implementations.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Insights from Machine Learning Applied to Human Visual Classification

Graf, A., Wichmann, F.

In Advances in Neural Information Processing Systems 16, pages: 905-912, (Editors: Thrun, S., L. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
We attempt to understand visual classification in humans using both psychophysical and machine learning techniques. Frontal views of human faces were used for a gender classification task. Human subjects classified the faces and their gender judgment, reaction time and confidence rating were recorded. Several hyperplane learning algorithms were used on the same classification task using the Principal Components of the texture and flowfield representation of the faces. The classification performance of the learning algorithms was estimated using the face database with the true gender of the faces as labels, and also with the gender estimated by the subjects. We then correlated the human responses to the distance of the stimuli to the separating hyperplane of the learning algorithms. Our results suggest that human classification can be modeled by some hyperplane algorithms in the feature space we used. For classification, the brain needs more processing for stimuli close to that hyperplane than for those further away.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Image Construction by Linear Programming

Tsuda, K., Rätsch, G.

In Advances in Neural Information Processing Systems 16, pages: 57-64, (Editors: Thrun, S., L.K. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
A common way of image denoising is to project a noisy image to the subspace of admissible images made for instance by PCA. However, a major drawback of this method is that all pixels are updated by the projection, even when only a few pixels are corrupted by noise or occlusion. We propose a new method to identify the noisy pixels by 1-norm penalization and update the identified pixels only. The identification and updating of noisy pixels are formulated as one linear program which can be solved efficiently. Especially, one can apply the ν-trick to directly specify the fraction of pixels to be reconstructed. Moreover, we extend the linear program to be able to exploit prior knowledge that occlusions often appear in contiguous blocks (e.g. sunglasses on faces). The basic idea is to penalize boundary points and interior points of the occluded area differently. We are able to show the ν-property also for this extended LP leading a method which is easy to use. Experimental results impressively demonstrate the power of our approach.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Semi-Supervised Protein Classification using Cluster Kernels

Weston, J., Leslie, C., Zhou, D., Elisseeff, A., Noble, W.

In Advances in Neural Information Processing Systems 16, pages: 595-602, (Editors: Thrun, S., L.K. Saul, B. Schölkopf), MIT Press, Cambridge, MA, USA, Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), June 2004 (inproceedings)

Abstract
A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data --- examples with known 3D structures, organized into structural classes --- while in practice, unlabeled data is far more plentiful. In this work, we develop simple and scalable cluster kernel techniques for incorporating unlabeled data into the representation of protein sequences. We show that our methods greatly improve the classification performance of string kernels and outperform standard approaches for using unlabeled data, such as adding close homologs of the positive examples to the training data. We achieve equal or superior performance to previously presented cluster kernel methods while achieving far greater computational efficiency.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Kernel Hebbian Algorithm for single-frame super-resolution

Kim, K., Franz, M., Schölkopf, B.

In Computer Vision - ECCV 2004, LNCS vol. 3024, pages: 135-149, (Editors: A Leonardis and H Bischof), Springer, Berlin, Germany, 8th European Conference on Computer Vision (ECCV), May 2004 (inproceedings)

Abstract
This paper presents a method for single-frame image super-resolution using an unsupervised learning technique. The required prior knowledge about the high-resolution images is obtained from Kernel Principal Component Analysis (KPCA). The original form of KPCA, however, can be only applied to strongly restricted image classes due to the limited number of training examples that can be processed. We therefore propose a new iterative method for performing KPCA, the {em Kernel Hebbian Algorithm}. By kernelizing the Generalized Hebbian Algorithm, one can iteratively estimate the Kernel Principal Components with only linear order memory complexity. The resulting super-resolution algorithm shows a comparable performance to the existing supervised methods on images containing faces and natural scenes.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Pattern Selection for SVM based "Futures Trading System"

Sun, J., Cho, S., Shin, H.

In Proc. of the Korean Data Mining Conference, pages: 175-183, Korean Data Mining Society Conference, April 2004 (inproceedings)

[BibTex]

[BibTex]


no image
Minimum Sum-Squared Residue based clustering of Gene Expression Data

Cho, H., Guan, Y., Dhillon, I., Sra, S.

In SIAM Data Mining, pages: 00-00, SDM, April 2004 (inproceedings)

GZIP [BibTex]

GZIP [BibTex]


no image
Preservation of Neighborhood Relation under Input to Feature Space Mapping in SVM Training

Shin, H., Cho, S.

In Proc. of the 33rd International Conference on Computers and Industrial Engineering (C&IE 2004), pages: 1-10, the 33rd International Conference on Computers and Industrial Engineering (C&IE), April 2004, in CD (inproceedings)

[BibTex]

[BibTex]


no image
Unifying Colloborative and Content-Based Filtering.

Basilico, J., Hofmann, T.

In ACM International Conference Proceeding Series, pages: 65 , (Editors: Greiner, R. , D. Schuurmans), ACM Press, New York, USA, ICLM, 2004 (inproceedings)

Abstract
Collaborative and content-based filtering are two paradigms that have been applied in the context of recommender systems and user preference prediction. This paper proposes a novel, unified approach that systematically integrates all available training information such as past user-item ratings as well as attributes of items or users to learn a prediction function. The key ingredient of our method is the design of a suitable kernel or similarity function between user-item pairs that allows simultaneous generalization across the user and item dimensions. We propose an on-line algorithm (JRank) that generalizes perceptron learning. Experimental results on the EachMovie data set show significant improvements over standard approaches.

PDF [BibTex]

PDF [BibTex]


no image
Clustering Protein Sequence and Structure Space with Infinite Gaussian Mixture Models

Dubey, A., Hwang, S., Rangel, C., Rasmussen, CE., Ghahramani, Z., Wild, DL.

In Pacific Symposium on Biocomputing 2004; Vol. 9, pages: 399-410, World Scientific Publishing, Singapore, Pacific Symposium on Biocomputing, 2004 (inproceedings)

Abstract
We describe a novel approach to the problem of automatically clustering protein sequences and discovering protein families, subfamilies etc., based on the thoery of infinite Gaussian mixture models. This method allows the data itself to dictate how many mixture components are required to model it, and provides a measure of the probability that two proteins belong to the same cluster. We illustrate our methods with application to three data sets: globin sequences, globin sequences with known tree-dimensional structures and G-pretein coupled receptor sequences. The consistency of the clusters indicate that that our methods is producing biologically meaningful results, which provide a very good indication of the underlying families and subfamilies. With the inclusion of secondary structure and residue solvent accessibility information, we obtain a classification of sequences of known structure which reflects and extends their SCOP classifications. A supplementary web site containing larger versions of the figures is available at http://public.kgi.edu/~wild/PSB04

PDF [BibTex]

PDF [BibTex]


no image
Efficient Approximations for Support Vector Machines in Object Detection

Kienzle, W., BakIr, G., Franz, M., Schölkopf, B.

In DAGM 2004, pages: 54-61, (Editors: CE Rasmussen and HH Bülthoff and B Schölkopf and MA Giese), Springer, Berlin, Germany, Pattern Recognition, Proceedings of the 26th DAGM Symposium, 2004 (inproceedings)

Abstract
We present a new approximation scheme for support vector decision functions in object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller so-called reduced set of synthetic points. Instead of finding the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic vectors such that the resulting approximation can be evaluated via separable filters. Applications that require scanning an entire image can benefit from this representation: when using separable filters, the average computational complexity for evaluating a reduced set vector on a test patch of size (h x w) drops from O(hw) to O(h+w). We show experimental results on handwritten digits and face detection.

PDF [BibTex]

PDF [BibTex]


no image
Kernel Methods for Manifold Estimation

Schölkopf, B.

In Proceedings in Computational Statistics, pages: 441-452, (Editors: J Antoch), Physica-Verlag/Springer, Heidelberg, Germany, COMPSTAT, 2004 (inproceedings)

[BibTex]

[BibTex]


no image
A Regularization Framework for Learningfrom Graph Data

Zhou, D., Schölkopf, B.

In ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields, pages: 132-137, ICML, 2004 (inproceedings)

Abstract
The data in many real-world problems can be thought of as a graph, such as the web, co-author networks, and biological networks. We propose a general regularization framework on graphs, which is applicable to the classification, ranking, and link prediction problems. We also show that the method can be explained as lazy random walks. We evaluate the method on a number of experiments.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
A kernel view of the dimensionality reduction of manifolds

Ham, J., Lee, D., Mika, S., Schölkopf, B.

In Proceedings of the Twenty-First International Conference on Machine Learning, pages: 369-376, (Editors: CE Brodley), ACM, New York, NY, USA, ICML, 2004, also appeared as MPI-TR 110 (inproceedings)

Abstract
We interpret several well-known algorithms for dimensionality reduction of manifolds as kernel methods. Isomap, graph Laplacian eigenmap, and locally linear embedding (LLE) all utilize local neighborhood information to construct a global embedding of the manifold. We show how all three algorithms can be described as kernel PCA on specially constructed Gram matrices, and illustrate the similarities and differences between the algorithms with representative examples.

PDF [BibTex]

PDF [BibTex]


no image
Protein Functional Class Prediction with a Combined Graph

Shin, H., Tsuda, K., Schölkopf, B.

In Proceedings of the Korean Data Mining Conference, pages: 200-219, Proceedings of the Korean Data Mining Conference, 2004 (inproceedings)

Abstract
In bioinformatics, there exist multiple descriptions of graphs for the same set of genes or proteins. For instance, in yeast systems, graph edges can represent different relationships such as protein-protein interactions, genetic interactions, or co-participation in a protein complex, etc. Relying on similarities between nodes, each graph can be used independently for prediction of protein function. However, since different graphs contain partly independent and partly complementary information about the problem at hand, one can enhance the total information extracted by combining all graphs. In this paper, we propose a method for integrating multiple graphs within a framework of semi-supervised learning. The method alternates between minimizing the objective function with respect to network output and with respect to combining weights. We apply the method to the task of protein functional class prediction in yeast. The proposed method performs significantly better than the same algorithm trained on any single graph.

PDF [BibTex]

PDF [BibTex]


no image
Learning from Labeled and Unlabeled Data Using Random Walks

Zhou, D., Schölkopf, B.

In Pattern Recognition, Proceedings of the 26th DAGM Symposium, pages: 237-244, (Editors: Rasmussen, C.E., H.H. Bülthoff, M.A. Giese and B. Schölkopf), Pattern Recognition, Proceedings of the 26th DAGM Symposium, 2004 (inproceedings)

Abstract
We consider the general problem of learning from labeled and unlabeled data. Given a set of points, some of them are labeled, and the remaining points are unlabeled. The goal is to predict the labels of the unlabeled points. Any supervised learning algorithm can be applied to this problem, for instance, Support Vector Machines (SVMs). The problem of our interest is if we can implement a classifier which uses the unlabeled data information in some way and has higher accuracy than the classifiers which use the labeled data only. Recently we proposed a simple algorithm, which can substantially benefit from large amounts of unlabeled data and demonstrates clear superiority to supervised learning methods. In this paper we further investigate the algorithm using random walks and spectral graph theory, which shed light on the key steps in this algorithm.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Multivariate Regression via Stiefel Manifold Constraints

BakIr, G., Gretton, A., Franz, M., Schölkopf, B.

In Lecture Notes in Computer Science, Vol. 3175, pages: 262-269, (Editors: CE Rasmussen and HH Bülthoff and B Schölkopf and MA Giese), Springer, Berlin, Germany, Pattern Recognition, Proceedings of the 26th DAGM Symposium, 2004 (inproceedings)

Abstract
We introduce a learning technique for regression between high-dimensional spaces. Standard methods typically reduce this task to many one-dimensional problems, with each output dimension considered independently. By contrast, in our approach the feature construction and the regression estimation are performed jointly, directly minimizing a loss function that we specify, subject to a rank constraint. A major advantage of this approach is that the loss is no longer chosen according to the algorithmic requirements, but can be tailored to the characteristics of the task at hand; the features will then be optimal with respect to this objective, and dependence between the outputs can be exploited.

PostScript [BibTex]

PostScript [BibTex]


no image
Implicit estimation of Wiener series

Franz, M., Schölkopf, B.

In Machine Learning for Signal Processing XIV, Proc. 2004 IEEE Signal Processing Society Workshop, pages: 735-744, (Editors: A Barros and J Principe and J Larsen and T Adali and S Douglas), IEEE, New York, Machine Learning for Signal Processing XIV, Proc. IEEE Signal Processing Society Workshop, 2004 (inproceedings)

Abstract
The Wiener series is one of the standard methods to systematically characterize the nonlinearity of a system. The classical estimation method of the expansion coefficients via cross-correlation suffers from severe problems that prevent its application to high-dimensional and strongly nonlinear systems. We propose an implicit estimation method based on regression in a reproducing kernel Hilbert space that alleviates these problems. Experiments show performance advantages in terms of convergence, interpretability, and system sizes that can be handled.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Hilbertian Metrics on Probability Measures and their Application in SVM’s

Hein, H., Lal, T., Bousquet, O.

In Pattern Recognition, Proceedings of th 26th DAGM Symposium, 3175, pages: 270-277, Lecture Notes in Computer Science, (Editors: Rasmussen, C. E., H. H. Bülthoff, M. Giese and B. Schölkopf), Pattern Recognition, Proceedings of th 26th DAGM Symposium, 2004 (inproceedings)

Abstract
The goal of this article is to investigate the field of Hilbertian metrics on probability measures. Since they are very versatile and can therefore be applied in various problems they are of great interest in kernel methods. Quit recently Tops{o}e and Fuglede introduced a family of Hilbertian metrics on probability measures. We give basic properties of the Hilbertian metrics of this family and other used metrics in the literature. Then we propose an extension of the considered metrics which incorporates structural information of the probability space into the Hilbertian metric. Finally we compare all proposed metrics in an image and text classification problem using histogram data.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Gasussian process model based predictive control

Kocijan, J., Murray-Smith, R., Rasmussen, CE., Girard, A.

In Proceedings of the ACC 2004, pages: 2214-2219, Proceedings of the ACC, 2004 (inproceedings)

Abstract
Gaussian process models provide a probabilistic non-parametric modelling approach for black-box identi cation of non-linear dynamic systems. The Gaussian processes can highlight areas of the input space where prediction quality is poor, due to the lack of data or its complexity, by indicating the higher variance around the predicted mean. Gaussian process models contain noticeably less coef cients to be optimised. This paper illustrates possible application of Gaussian process models within model-based predictive control. The extra information provided within Gaussian process model is used in predictive control, where optimisation of control signal takes the variance information into account. The predictive control principle is demonstrated on control of pH process benchmark.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
A New Variational Framework for Rigid-Body Alignment

Kato, T., Tsuda, K., Tomii, K., Asai, K.

In Joint IAPR International Workshops on Syntactical and Structural Pattern Recognition (SSPR 2004) and Statistical Pattern Recognition (SPR 2004), pages: 171-179, (Editors: Fred, A.,T. Caelli, R.P.W. Duin, A. Campilho and D. de Ridder), Joint IAPR International Workshops on Syntactical and Structural Pattern Recognition (SSPR) and Statistical Pattern Recognition (SPR), 2004 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Practical Method for Blind Inversion of Wiener Systems

Zhang, K., Chan, L.

In Proceedings of International Joint Conference on Neural Networks (IJCNN 2004), pages: 2163-2168, International Joint Conference on Neural Networks (IJCNN), 2004, Volume 3 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Semi-supervised kernel regression using whitened function classes

Franz, M., Kwon, Y., Rasmussen, C., Schölkopf, B.

In Pattern Recognition, Proceedings of the 26th DAGM Symposium, Lecture Notes in Computer Science, Vol. 3175, LNCS 3175, pages: 18-26, (Editors: CE Rasmussen and HH Bülthoff and MA Giese and B Schölkopf), Springer, Berlin, Gerrmany, 26th DAGM Symposium, 2004 (inproceedings)

Abstract
The use of non-orthonormal basis functions in ridge regression leads to an often undesired non-isotropic prior in function space. In this study, we investigate an alternative regularization technique that results in an implicit whitening of the basis functions by penalizing directions in function space with a large prior variance. The regularization term is computed from unlabelled input data that characterizes the input distribution. Tests on two datasets using polynomial basis functions showed an improved average performance compared to standard ridge regression.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Maximal Margin Classification for Metric Spaces

Hein, M., Bousquet, O.

In Learning Theory and Kernel Machines, pages: 72-86, (Editors: Schölkopf, B. and Warmuth, M. K.), Springer, Heidelberg, Germany, 16. Annual Conference on Computational Learning Theory / COLT Kernel, 2004 (inproceedings)

Abstract
In this article we construct a maximal margin classification algorithm for arbitrary metric spaces. At first we show that the Support Vector Machine (SVM) is a maximal margin algorithm for the class of metric spaces where the negative squared distance is conditionally positive definite (CPD). This means that the metric space can be isometrically embedded into a Hilbert space, where one performs linear maximal margin separation. We will show that the solution only depends on the metric, but not on the kernel. Following the framework we develop for the SVM, we construct an algorithm for maximal margin classification in arbitrary metric spaces. The main difference compared with SVM is that we no longer embed isometrically into a Hilbert space, but a Banach space. We further give an estimate of the capacity of the function class involved in this algorithm via Rademacher averages. We recover an algorithm of Graepel et al. [6].

PDF PostScript PDF DOI [BibTex]

PDF PostScript PDF DOI [BibTex]


no image
On the Convergence of Spectral Clustering on Random Samples: The Normalized Case

von Luxburg, U., Bousquet, O., Belkin, M.

In Proceedings of the 17th Annual Conference on Learning Theory, pages: 457-471, Proceedings of the 17th Annual Conference on Learning Theory, 2004 (inproceedings)

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Learning Movement Primitives

Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.

In 11th International Symposium on Robotics Research (ISRR2003), pages: 561-572, (Editors: Dario, P. and Chatila, R.), Springer, ISRR, 2004, clmc (inproceedings)

Abstract
This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstration. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB, and learning biped walking from demonstration in simulation, including self-improvement of the movement patterns towards energy efficiency through resonance tuning.

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2000


no image
A real-time model of the human knee for application in virtual orthopaedic trainer

Peters, J., Riener, R.

In Proceedings of the 10th International Conference on BioMedical Engineering (ICBME 2000), 10, pages: 1-2, 10th International Conference on BioMedical Engineering (ICBME) , December 2000 (inproceedings)

Abstract
In this paper a real-time capable computational model of the human knee is presented. The model describes the passive elastic joint characteristics in six degrees-of-freedom (DOF). A black-box approach was chosen, where experimental data were approximated by piecewise polynomial functions. The knee model has been applied in a the Virtual Orthopaedic Trainer, which can support training of physical knee evaluation required for diagnosis and surgical planning.

PDF Web [BibTex]

2000

PDF Web [BibTex]


no image
On Designing an Automated Malaysian Stemmer for the Malay Language

Tai, SY., Ong, CS., Abullah, NA.

In Fifth International Workshop on Information Retrieval with Asian Languages, pages: 207-208, ACM Press, New York, NY, USA, Fifth International Workshop on Information Retrieval with Asian Languages, October 2000 (inproceedings)

Abstract
Online and interactive information retrieval systems are likely to play an increasing role in the Malay Language community. To facilitate and automate the process of matching morphological term variants, a stemmer focusing on common affix removal algorithms is proposed as part of the design of an information retrieval system for the Malay Language. Stemming is a morphological process of normalizing word tokens down to their essential roots. The proposed stemmer strips prefixes and suffixes off the word. The experiment conducted with web sites selected from the World Wide Web has exhibited substantial improvements in the number of words indexed.

PostScript Web DOI [BibTex]

PostScript Web DOI [BibTex]


no image
Ensemble of Specialized Networks based on Input Space Partition

Shin, H., Lee, H., Cho, S.

In Proc. of the Korean Operations Research and Management Science Conference, pages: 33-36, Korean Operations Research and Management Science Conference, October 2000 (inproceedings)

[BibTex]

[BibTex]


no image
DES Approach Failure Recovery of Pump-valve System

Son, HI., Kim, KW., Lee, S.

In Korean Society of Precision Engineering (KSPE) Conference, pages: 647-650, Annual Meeting of the Korean Society of Precision Engineering (KSPE), October 2000 (inproceedings)

PDF [BibTex]

PDF [BibTex]