

2002


Stability and Generalization

Bousquet, O., Elisseeff, A.

Journal of Machine Learning Research, 2, pages: 499-526, 2002 (article)

Abstract
We define notions of stability for learning algorithms and show how to use these notions to derive generalization error bounds based on the empirical error and the leave-one-out error. The methods we use can be applied in the regression framework as well as in the classification one when the classifier is obtained by thresholding a real-valued function. We study the stability properties of large classes of learning algorithms such as regularization based algorithms. In particular we focus on Hilbert space regularization and Kullback-Leibler regularization. We demonstrate how to apply the results to SVM for regression and classification.

PDF PostScript [BibTex]

Subspace information criterion for non-quadratic regularizers – model selection for sparse regressors

Tsuda, K., Sugiyama, M., Müller, K.

IEEE Trans Neural Networks, 13(1):70-80, 2002 (article)

PDF [BibTex]

Modeling splicing sites with pairwise correlations

Arita, M., Tsuda, K., Asai, K.

Bioinformatics, 18(Suppl 2):27-34, 2002 (article)

PDF [BibTex]

Observations on the Nyström Method for Gaussian Process Prediction

Williams, C., Rasmussen, C., Schwaighofer, A., Tresp, V.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2002 (techreport)

Abstract
A number of methods for speeding up Gaussian Process (GP) prediction have been proposed, including the Nyström method of Williams and Seeger (2001). In this paper we focus on two issues: (1) the relationship of the Nyström method to the Subset of Regressors method (Poggio and Girosi 1990; Luo and Wahba, 1997) and (2) understanding in what circumstances the Nyström approximation would be expected to provide a good approximation to exact GP regression.

PostScript [BibTex]

Perfusion Quantification using Gaussian Process Deconvolution

Andersen, IK., Szymkowiak, A., Rasmussen, CE., Hanson, LG., Marstrand, JR., Larsson, HBW., Hansen, LK.

Magnetic Resonance in Medicine, (48):351-361, 2002 (article)

Abstract
The quantification of perfusion using dynamic susceptibility contrast MR imaging requires deconvolution to obtain the residual impulse-response function (IRF). Here, a method using a Gaussian process for deconvolution, GPD, is proposed. The fact that the IRF is smooth is incorporated as a constraint in the method. The GPD method, which automatically estimates the noise level in each voxel, has the advantage that model parameters are optimized automatically. The GPD is compared to singular value decomposition (SVD) using a common threshold for the singular values and to SVD using a threshold optimized according to the noise level in each voxel. The comparison is carried out using artificial data as well as using data from healthy volunteers. It is shown that GPD is comparable to SVD with a variable optimized threshold when determining the maximum of the IRF, which is directly related to the perfusion. GPD provides a better estimate of the entire IRF. As the signal to noise ratio increases or the time resolution of the measurements increases, GPD is shown to be superior to SVD. This is also found for large distribution volumes.

PDF PostScript [BibTex]

Tracking a Small Set of Experts by Mixing Past Posteriors

Bousquet, O., Warmuth, M.

Journal of Machine Learning Research, 3, pages: 363-396, (Editors: Long, P.), 2002 (article)

Abstract
In this paper, we examine on-line learning problems in which the target concept is allowed to change over time. In each trial a master algorithm receives predictions from a large set of n experts. Its goal is to predict almost as well as the best sequence of such experts chosen off-line by partitioning the training sequence into k+1 sections and then choosing the best expert for each section. We build on methods developed by Herbster and Warmuth and consider an open problem posed by Freund where the experts in the best partition are from a small pool of size m. Since k >> m, the best expert shifts back and forth between the experts of the small pool. We propose algorithms that solve this open problem by mixing the past posteriors maintained by the master algorithm. We relate the number of bits needed for encoding the best partition to the loss bounds of the algorithms. Instead of paying log n for choosing the best expert in each section we first pay log (n choose m) bits in the bounds for identifying the pool of m experts and then log m bits per new section. In the bounds we also pay twice for encoding the boundaries of the sections.

PDF PostScript [BibTex]

A femoral arteriovenous shunt facilitates arterial whole blood sampling in animals

Weber, B., Burger, C., Biro, P., Buck, A.

Eur J Nucl Med Mol Imaging, 29, pages: 319-323, 2002 (article)

[BibTex]

Some Local Measures of Complexity of Convex Hulls and Generalization Bounds

Bousquet, O., Koltchinskii, V., Panchenko, D.

In Proceedings of the 15th annual conference on Computational Learning Theory, 2002 (inproceedings)

Abstract
We investigate measures of complexity of function classes based on continuity moduli of Gaussian and Rademacher processes. For Gaussian processes, we obtain bounds on the continuity modulus on the convex hull of a function class in terms of the same quantity for the class itself. We also obtain new bounds on generalization error in terms of localized Rademacher complexities. This allows us to prove new results about generalization performance for convex hulls in terms of characteristics of the base class. As a byproduct, we obtain a simple proof of some of the known bounds on the entropy of convex hulls.

PDF PostScript [BibTex]

Contrast discrimination with pulse-trains in pink noise

Henning, G., Bird, C., Wichmann, F.

Journal of the Optical Society of America A, 19(7), pages: 1259-1266, 2002 (article)

Abstract
Detection performance was measured with sinusoidal and pulse-train gratings. Although the 2.09-c/deg pulse-train, or line gratings, contained at least 8 harmonics all at equal contrast, they were no more detectable than their most detectable component. The addition of broadband pink noise designed to equalize the detectability of the components of the pulse train made the pulse train about a factor of four more detectable than any of its components. However, in contrast-discrimination experiments, with a pedestal or masking grating of the same form and phase as the signal and 15% contrast, the noise did not affect the discrimination performance of the pulse train relative to that obtained with its sinusoidal components. We discuss the implications of these observations for models of early vision, in particular the implications for possible sources of internal noise.

PDF [BibTex]

A kernel approach for learning from almost orthogonal patterns

Schölkopf, B., Weston, J., Eskin, E., Leslie, C., Noble, W.

In Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science, 2430/2431, pages: 511-528, (Editors: T Elomaa and H Mannila and H Toivonen), Springer, Berlin, Germany, 13th European Conference on Machine Learning (ECML) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'2002), 2002 (inproceedings)

PostScript DOI [BibTex]

Optimal linear estimation of self-motion - a real-world test of a model of fly tangential neurons

Franz, MO.

SAB 02 Workshop, Robotics as theoretical biology, 7th meeting of the International Society for Simulation of Adaptive Behaviour (SAB), (Editors: Prescott, T.; Webb, B.), 2002 (poster)

Abstract
The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion (see example in Fig.1). We examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an optimal linear estimator incorporating prior knowledge both about the distance distribution of the environment, and about the noise and self-motion statistics of the sensor. The optimal estimator is tested on a gantry carrying an omnidirectional vision sensor that can be moved along three translational and one rotational degree of freedom. The experiments indicate that the proposed approach yields accurate results for rotation estimates, independently of the current translation and scene layout. Translation estimates, however, turned out to be sensitive to simultaneous rotation and to the particular distance distribution of the scene. The gantry experiments confirm that the receptive field organization of the tangential neurons allows them, as an ensemble, to extract self-motion from the optic flow.

PDF [BibTex]

Choosing Multiple Parameters for Support Vector Machines

Chapelle, O., Vapnik, V., Bousquet, O., Mukherjee, S.

Machine Learning, 46(1):131-159, 2002 (article)

Abstract
The problem of automatically tuning multiple parameters for pattern recognition Support Vector Machines (SVM) is considered. This is done by minimizing some estimates of the generalization error of SVMs using a gradient descent algorithm over the set of parameters. Usual methods for choosing parameters, based on exhaustive search, become intractable as soon as the number of parameters exceeds two. Some experimental results assess the feasibility of our approach for a large number of parameters (more than 100) and demonstrate an improvement of generalization performance.

PDF PostScript [BibTex]

Infinite Mixtures of Gaussian Process Experts

Rasmussen, CE., Ghahramani, Z.

In (Editors: Dietterich, Thomas G.; Becker, Suzanna; Ghahramani, Zoubin), 2002 (inproceedings)

Abstract
We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets -- thus potentially overcoming two of the biggest hurdles with GP models. Simulations show the viability of this approach.

PDF PostScript [BibTex]

Marginalized kernels for RNA sequence data analysis

Kin, T., Tsuda, K., Asai, K.

In Genome Informatics 2002, pages: 112-122, (Editors: Lathrop, R. H.; Nakai, K.; Miyano, S.; Takagi, T.; Kanehisa, M.), Genome Informatics, 2002, (Best Paper Award) (inproceedings)

Web [BibTex]

Luminance Artifacts on CRT Displays

Wichmann, F.

In IEEE Visualization, pages: 571-574, (Editors: Moorhead, R.; Gross, M.; Joy, K. I.), IEEE Visualization, 2002 (inproceedings)

Abstract
Most visualization panels today are still built around cathode-ray tubes (CRTs), certainly on personal desktops at work and at home. Whilst capable of producing pleasing images for common applications ranging from email writing to TV and DVD presentation, it is as well to note that there are a number of nonlinear transformations between input (voltage) and output (luminance) which distort the digital and/or analogue images sent to a CRT. Some of them are input-independent and hence easy to fix, e.g. gamma correction, but others, such as pixel interactions, depend on the content of the input stimulus and are thus harder to compensate for. CRT-induced image distortions cause problems not only in basic vision research but also for applications where image fidelity is critical, most notably in medicine (digitization of X-ray images for diagnostic purposes) and in forms of online commerce, such as the online sale of images, where the image must be reproduced on some output device which will not have the same transfer function as the customer's CRT. I will present measurements from a number of CRTs and illustrate how some of their shortcomings may be problematic for the aforementioned applications.

[BibTex]


1996


The DELVE user manual

Rasmussen, CE., Neal, RM., Hinton, GE., van Camp, D., Revow, M., Ghahramani, Z., Kustra, R., Tibshirani, R.

Department of Computer Science, University of Toronto, December 1996 (techreport)

Abstract
This manual describes the preliminary release of the DELVE environment. Some features described here have not yet been implemented, as noted. Support for regression tasks is presently somewhat more developed than that for classification tasks. We recommend that you exercise caution when using this version of DELVE for real work, as it is possible that bugs remain in the software. We hope that you will send us reports of any problems you encounter, as well as any other comments you may have on the software or manual, at the e-mail address below. Please mention the version number of the manual and/or the software with any comments you send.

GZIP [BibTex]

Nonlinear Component Analysis as a Kernel Eigenvalue Problem

Schölkopf, B., Smola, A., Müller, K.

(44), Max Planck Institute for Biological Cybernetics Tübingen, December 1996, This technical report has also been published elsewhere (techreport)

Abstract
We describe a new method for performing a nonlinear form of Principal Component Analysis. By the use of integral operator kernel functions, we can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance the space of all possible 5-pixel products in 16 x 16 images. We give the derivation of the method, along with a discussion of other techniques which can be made nonlinear with the kernel approach; and present first experimental results on nonlinear feature extraction for pattern recognition.

[BibTex]

Quality Prediction of Steel Products using Neural Networks

Shin, H., Jhee, W.

In Proc. of the Korean Expert System Conference, pages: 112-124, Korean Expert System Society Conference, November 1996 (inproceedings)

[BibTex]

Comparison of view-based object recognition algorithms using realistic 3D models

Blanz, V., Schölkopf, B., Bülthoff, H., Burges, C., Vapnik, V., Vetter, T.

In Artificial Neural Networks: ICANN 96, LNCS, vol. 1112, pages: 251-256, Lecture Notes in Computer Science, (Editors: C von der Malsburg and W von Seelen and JC Vorbrüggen and B Sendhoff), Springer, Berlin, Germany, 6th International Conference on Artificial Neural Networks, July 1996 (inproceedings)

Abstract
Two view-based object recognition algorithms are compared: (1) a heuristic algorithm based on oriented filters, and (2) a support vector learning machine trained on low-resolution images of the objects. Classification performance is assessed using a high number of images generated by a computer graphics system under precisely controlled conditions. Training- and test-images show a set of 25 realistic three-dimensional models of chairs from viewing directions spread over the upper half of the viewing sphere. The percentage of correct identification of all 25 objects is measured.

PDF PDF DOI [BibTex]

Learning View Graphs for Robot Navigation

Franz, M., Schölkopf, B., Georg, P., Mallot, H., Bülthoff, H.

(33), Max Planck Institute for Biological Cybernetics, Tübingen, July 1996 (techreport)

Abstract
We present a purely vision-based scheme for learning a parsimonious representation of an open environment. Using simple exploration behaviours, our system constructs a graph of appropriately chosen views. To navigate between views connected in the graph, we employ a homing strategy inspired by findings of insect ethology. Simulations and robot experiments demonstrate the feasibility of the proposed approach.

[BibTex]

Incorporating invariances in support vector learning machines

Schölkopf, B., Burges, C., Vapnik, V.

In Artificial Neural Networks: ICANN 96, LNCS vol. 1112, pages: 47-52, (Editors: C von der Malsburg and W von Seelen and JC Vorbrüggen and B Sendhoff), Springer, Berlin, Germany, 6th International Conference on Artificial Neural Networks, July 1996, volume 1112 of Lecture Notes in Computer Science (inproceedings)

Abstract
Developed only recently, support vector learning machines achieve high generalization ability by minimizing a bound on the expected test error; however, so far there existed no way of adding knowledge about invariances of a classification problem at hand. We present a method of incorporating prior knowledge about transformation invariances by applying transformations to support vectors, the training examples most critical for determining the classification boundary.

PDF DOI [BibTex]

A practical Monte Carlo implementation of Bayesian learning

Rasmussen, CE.

In Advances in Neural Information Processing Systems 8, pages: 598-604, (Editors: Touretzky, D.S. , M.C. Mozer, M.E. Hasselmo), MIT Press, Cambridge, MA, USA, Ninth Annual Conference on Neural Information Processing Systems (NIPS), June 1996 (inproceedings)

Abstract
A practical method for Bayesian training of feed-forward neural networks using sophisticated Monte Carlo methods is presented and evaluated. In reasonably small amounts of computer time this approach outperforms other state-of-the-art methods on 5 data-limited tasks from real world domains.

PDF Web [BibTex]

Gaussian Processes for Regression

Williams, CKI., Rasmussen, CE.

In Advances in neural information processing systems 8, pages: 514-520, (Editors: Touretzky, D.S. , M.C. Mozer, M.E. Hasselmo), MIT Press, Cambridge, MA, USA, Ninth Annual Conference on Neural Information Processing Systems (NIPS), June 1996 (inproceedings)

Abstract
The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior over functions. We investigate the use of a Gaussian process prior over functions, which permits the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters have been tested on a number of challenging problems and have produced excellent results.

PDF Web [BibTex]

Evaluation of Gaussian Processes and other Methods for Non-Linear Regression

Rasmussen, CE.

Biologische Kybernetik, Graduate Department of Computer Science, University of Toronto, 1996 (phdthesis)

PostScript [BibTex]

Künstliches Lernen

Schölkopf, B.

In Komplexe adaptive Systeme, Forum für Interdisziplinäre Forschung, 15, pages: 93-117, Forum für interdisziplinäre Forschung, (Editors: S Bornholdt and PH Feindt), Röll, Dettelbach, 1996 (inbook)

[BibTex]

Aktives Erwerben eines Ansichtsgraphen zur diskreten Repräsentation offener Umwelten.

Franz, M., Schölkopf, B., Mallot, H., Bülthoff, H.

Fortschritte der Künstlichen Intelligenz, pages: 138-147, (Editors: M. Thielscher and S.-E. Bornscheuer), 1996 (poster)

PDF PostScript [BibTex]

Does motion-blur facilitate motion detection?

Wichmann, F., Henning, G.

OSA Conference Program, pages: S127, 1996 (poster)

Abstract
Retinal-image motion induces the perceptual loss of high spatial-frequency content - motion blur - that affects broadband stimuli. The relative detectability of motion blur and motion itself, measured in 2-AFC experiments, shows that, although the blur associated with motion can be detected, motion itself is the more effective cue.

[BibTex]


1993


Presynaptic and Postsynaptic Competition in models for the Development of Neuromuscular Connections

Rasmussen, CE., Willshaw, DJ.

Biological Cybernetics, 68, pages: 409-419, 1993 (article)

Abstract
The development of the nervous system involves in many cases interactions on a local scale rather than the execution of a fully specified genetic blueprint. The problem is to discover the nature of these interactions and the factors on which they depend. The withdrawal of polyinnervation in developing muscle is an example where such competitive interactions play an important role. We examine the possible types of competition in formal models that have plausible biological implementations. By relating the behaviour of the models to the anatomical and physiological findings we show that a model that incorporates two types of competition is superior to others. Analysis suggests that the phenomenon of intrinsic withdrawal is a side effect of the competitive mechanisms rather than a separate non-competitive feature. Full scale computer simulations have been used to confirm the capabilities of this model.

PostScript [BibTex]

Cartesian Dynamics of Simple Molecules: X Linear Quadratomics (C∞v Symmetry).

Anderson, A., Davison, T., Nagi, N., Schlueter, S.

Spectroscopy Letters, 26, pages: 509-522, 1993 (article)

[BibTex]
