Header logo is ei



no image
Domain Adaptation with Conditional Transferable Components

Gong, M., Zhang, K., Liu, T., Tao, D., Glymour, C., Schölkopf, B.

Proceedings of the 33nd International Conference on Machine Learning (ICML), 48, pages: 2839-2848, JMLR Workshop and Conference Proceedings, (Editors: Balcan, M.-F. and Weinberger, K. Q.), June 2016 (conference)

link (url) [BibTex]

link (url) [BibTex]


no image
Learning Causal Interaction Network of Multivariate Hawkes Processes

Etesami, S., Kiyavash, N., Zhang, K., Singhal, K.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), June 2016, poster presentation (conference)

[BibTex]

[BibTex]


no image
Efficient Large-scale Approximate Nearest Neighbor Search on the GPU

Wieschollek, P., Wang, O., Sorkine-Hornung, A., Lensch, H. P. A.

29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 2027 - 2035, IEEE, June 2016 (conference)

DOI [BibTex]

DOI [BibTex]


no image
On the Identifiability and Estimation of Functional Causal Models in the Presence of Outcome-Dependent Selection

Zhang, K., Zhang, J., Huang, B., Schölkopf, B., Glymour, C.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 825-834, (Editors: Ihler, A. and Janzing, D.), AUAI Press, June 2016 (conference)

link (url) [BibTex]

link (url) [BibTex]


Thumb xl screen shot 2018 10 09 at 11.42.49
Active Uncertainty Calibration in Bayesian ODE Solvers

Kersting, H., Hennig, P.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 309-318, (Editors: Ihler, A. and Janzing, D.), AUAI Press, June 2016 (conference)

Abstract
There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al.~introduced a sampling-based class of methods that are `well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al.~pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


no image
The Arrow of Time in Multivariate Time Serie

Bauer, S., Schölkopf, B., Peters, J.

Proceedings of the 33rd International Conference on Machine Learning (ICML), 48, pages: 2043-2051, JMLR Workshop and Conference Proceedings, (Editors: Balcan, M. F. and Weinberger, K. Q.), JMLR, June 2016 (conference)

link (url) [BibTex]

link (url) [BibTex]


no image
A Kernel Test for Three-Variable Interactions with Random Processes

Rubenstein, P. K., Chwialkowski, K. P., Gretton, A.

Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence (UAI), (Editors: Ihler, Alexander T. and Janzing, Dominik), June 2016 (conference)

PDF Supplement Arxiv [BibTex]

PDF Supplement Arxiv [BibTex]


no image
Continuous Deep Q-Learning with Model-based Acceleration

Gu, S., Lillicrap, T., Sutskever, I., Levine, S.

Proceedings of the 33nd International Conference on Machine Learning (ICML), 48, pages: 2829-2838, JMLR Workshop and Conference Proceedings, (Editors: Maria-Florina Balcan and Kilian Q. Weinberger), JMLR.org, June 2016 (conference)

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
Bounded Rational Decision-Making in Feedforward Neural Networks

Leibfried, F, Braun, D

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 407-416, June 2016 (conference)

Abstract
Bounded rational decision-makers transform sensory input into motor output under limited computational resources. Mathematically, such decision-makers can be modeled as information-theoretic channels with limited transmission rate. Here, we apply this formalism for the first time to multilayer feedforward neural networks. We derive synaptic weight update rules for two scenarios, where either each neuron is considered as a bounded rational decision-maker or the network as a whole. In the update rules, bounded rationality translates into information-theoretically motivated types of regularization in weight space. In experiments on the MNIST benchmark classification task for handwritten digits, we show that such information-theoretic regularization successfully prevents overfitting across different architectures and attains results that are competitive with other recent techniques like dropout, dropconnect and Bayes by backprop, for both ordinary and convolutional neural networks.

[BibTex]

[BibTex]


no image
Batch Bayesian Optimization via Local Penalization

González, J., Dai, Z., Hennig, P., Lawrence, N.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 648-657, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), May 2016 (conference)

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
MuProp: Unbiased Backpropagation for Stochastic Neural Networks

Gu, S., Levine, S., Sutskever, I., Mnih, A.

4th International Conference on Learning Representations (ICLR), May 2016 (conference)

Arxiv [BibTex]

Arxiv [BibTex]


no image
An Improved Cognitive Brain-Computer Interface for Patients with Amyotrophic Lateral Sclerosis

Hohmann, M. R., Fomina, T., Jayaram, V., Förster, C., Just, J., M., S., Schölkopf, B., Schöls, L., Grosse-Wentrup, M.

Proceedings of the Sixth International BCI Meeting, pages: 44, (Editors: Müller-Putz, G. R. and Huggins, J. E. and Steyrl, D.), BCI, May 2016 (conference)

DOI [BibTex]

DOI [BibTex]


no image
Movement Primitives with Multiple Phase Parameters

Ewerton, M., Maeda, G., Neumann, G., Kisner, V., Kollegger, G., Wiemeyer, J., Peters, J.

IEEE International Conference on Robotics and Automation (ICRA), pages: 201-206, IEEE, May 2016 (conference)

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
TerseSVM : A Scalable Approach for Learning Compact Models in Large-scale Classification

Babbar, R., Muandet, K., Schölkopf, B.

Proceedings of the 2016 SIAM International Conference on Data Mining (SDM), pages: 234-242, (Editors: Sanjay Chawla Venkatasubramanian and Wagner Meira), May 2016 (conference)

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Thumb xl 2pamcompressed
A Lightweight Robotic Arm with Pneumatic Muscles for Robot Learning

Büchler, D., Ott, H., Peters, J.

Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 4086-4092, IEEE, IEEE International Conference on Robotics and Automation, May 2016 (conference)

ICRA16final DOI Project Page [BibTex]

ICRA16final DOI Project Page [BibTex]


Thumb xl untitled
Probabilistic Approximate Least-Squares

Bartels, S., Hennig, P.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 676-684, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C. ), May 2016 (conference)

Abstract
Least-squares and kernel-ridge / Gaussian process regression are among the foundational algorithms of statistics and machine learning. Famously, the worst-case cost of exact nonparametric regression grows cubically with the data-set size; but a growing number of approximations have been developed that estimate good solutions at lower cost. These algorithms typically return point estimators, without measures of uncertainty. Leveraging recent results casting elementary linear algebra operations as probabilistic inference, we propose a new approximate method for nonparametric least-squares that affords a probabilistic uncertainty estimate over the error between the approximate and exact least-squares solution (this is not the same as the posterior variance of the associated Gaussian process regressor). This allows estimating the error of the least-squares solution on a subset of the data relative to the full-data solution. The uncertainty can be used to control the computational effort invested in the approximation. Our algorithm has linear cost in the data-set size, and a simple formal form, so that it can be implemented with a few lines of code in programming languages with linear algebra functionality.

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


no image
Learning soft task priorities for control of redundant robots

Modugno, V., Neumann, G., Rueckert, E., Oriolo, G., Peters, J., Ivaldi, S.

IEEE International Conference on Robotics and Automation (ICRA), pages: 221-226, IEEE, May 2016 (conference)

DOI [BibTex]

DOI [BibTex]


no image
On the Reliability of Information and Trustworthiness of Web Sources in Wikipedia

Tabibian, B., Farajtabar, M., Valera, I., Song, L., Schölkopf, B., Gomez Rodriguez, M.

Wikipedia workshop at the 10th International AAAI Conference on Web and Social Media (ICWSM), May 2016 (conference)

[BibTex]

[BibTex]


Thumb xl 2016 peer grading
Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines

Sajjadi, M. S. M., Alamgir, M., von Luxburg, U.

Proceedings of the 3rd ACM conference on Learning @ Scale, pages: 369-378, (Editors: Haywood, J. and Aleven, V. and Kay, J. and Roll, I.), ACM, L@S, April 2016, (An earlier version of this paper had been presented at the ICML 2015 workshop for Machine Learning for Education.) (conference)

Arxiv Peer-Grading dataset request [BibTex]

Arxiv Peer-Grading dataset request [BibTex]


no image
Fabular: Regression Formulas As Probabilistic Programming

Borgström, J., Gordon, A. D., Ouyang, L., Russo, C., Ścibior, A., Szymczak, M.

Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages: 271-283, POPL ’16, ACM, January 2016 (conference)

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Thumb xl heteroscedasticgp
Modeling Variability of Musculoskeletal Systems with Heteroscedastic Gaussian Processes

Büchler, D., Calandra, R., Peters, J.

Workshop on Neurorobotics, Neural Information Processing Systems (NIPS), 2016 (conference)

NIPS16Neurorobotics [BibTex]

NIPS16Neurorobotics [BibTex]


no image
Screening Rules for Convex Problems

Raj, A., Olbrich, J., Gärtner, B., Schölkopf, B., Jaggi, M.

2016 (unpublished) Submitted

[BibTex]

[BibTex]


no image
Causal and statistical learning

Schölkopf, B., Janzing, D., Lopez-Paz, D.

Oberwolfach Reports, 13(3):1896-1899, (Editors: A. Christmann and K. Jetter and S. Smale and D.-X. Zhou), 2016 (conference)

DOI [BibTex]

DOI [BibTex]

2002


no image
Gender Classification of Human Faces

Graf, A., Wichmann, F.

In Biologically Motivated Computer Vision, pages: 1-18, (Editors: Bülthoff, H. H., S.W. Lee, T. A. Poggio and C. Wallraven), Springer, Berlin, Germany, Second International Workshop on Biologically Motivated Computer Vision (BMCV), November 2002 (inproceedings)

Abstract
This paper addresses the issue of combining pre-processing methods—dimensionality reduction using Principal Component Analysis (PCA) and Locally Linear Embedding (LLE)—with Support Vector Machine (SVM) classification for a behaviorally important task in humans: gender classification. A processed version of the MPI head database is used as stimulus set. First, summary statistics of the head database are studied. Subsequently the optimal parameters for LLE and the SVM are sought heuristically. These values are then used to compare the original face database with its processed counterpart and to assess the behavior of a SVM with respect to changes in illumination and perspective of the face images. Overall, PCA was superior in classification performance and allowed linear separability.

PDF PDF DOI [BibTex]

2002

PDF PDF DOI [BibTex]


no image
Insect-Inspired Estimation of Self-Motion

Franz, MO., Chahl, JS.

In Biologically Motivated Computer Vision, (2525):171-180, LNCS, (Editors: Bülthoff, H.H. , S.W. Lee, T.A. Poggio, C. Wallraven), Springer, Berlin, Germany, Second International Workshop on Biologically Motivated Computer Vision (BMCV), November 2002 (inproceedings)

Abstract
The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an optimal linear estimator incorporating prior knowledge about the environment. The optimal estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates turn out to be less reliable.

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Combining sensory Information to Improve Visualization

Ernst, M., Banks, M., Wichmann, F., Maloney, L., Bülthoff, H.

In Proceedings of the Conference on Visualization ‘02 (VIS ‘02), pages: 571-574, (Editors: Moorhead, R. , M. Joy), IEEE, Piscataway, NJ, USA, IEEE Conference on Visualization (VIS '02), October 2002 (inproceedings)

Abstract
Seemingly effortlessly the human brain reconstructs the three-dimensional environment surrounding us from the light pattern striking the eyes. This seems to be true across almost all viewing and lighting conditions. One important factor for this apparent easiness is the redundancy of information provided by the sensory organs. For example, perspective distortions, shading, motion parallax, or the disparity between the two eyes' images are all, at least partly, redundant signals which provide us with information about the three-dimensional layout of the visual scene. Our brain uses all these different sensory signals and combines the available information into a coherent percept. In displays visualizing data, however, the information is often highly reduced and abstracted, which may lead to an altered perception and therefore a misinterpretation of the visualized data. In this panel we will discuss mechanisms involved in the combination of sensory information and their implications for simulations using computer displays, as well as problems resulting from current display technology such as cathode-ray tubes.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Sampling Techniques for Kernel Methods

Achlioptas, D., McSherry, F., Schölkopf, B.

In Advances in neural information processing systems 14 , pages: 335-342, (Editors: TG Dietterich and S Becker and Z Ghahramani), MIT Press, Cambridge, MA, USA, 15th Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations.

PDF Web [BibTex]

PDF Web [BibTex]


no image
The Infinite Hidden Markov Model

Beal, MJ., Ghahramani, Z., Rasmussen, CE.

In Advances in Neural Information Processing Systems 14, pages: 577-584, (Editors: Dietterich, T.G. , S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite - consider, for example, symbols being possible words appearing in English text.

PDF Web [BibTex]

PDF Web [BibTex]


no image
A new discriminative kernel from probabilistic models

Tsuda, K., Kawanabe, M., Rätsch, G., Sonnenburg, S., Müller, K.

In Advances in Neural Information Processing Systems 14, pages: 977-984, (Editors: Dietterich, T.G. , S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so called \Fisher kernel" has been combined with discriminative classi ers such as SVM and applied successfully in e.g. DNA and protein analysis. Whereas the Fisher kernel (FK) is calculated from the marginal log-likelihood, we propose the TOP kernel derived from Tangent vectors Of Posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments our new discriminative TOP kernel compares favorably to the Fisher kernel.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Incorporating Invariances in Non-Linear Support Vector Machines

Chapelle, O., Schölkopf, B.

In Advances in Neural Information Processing Systems 14, pages: 609-616, (Editors: TG Dietterich and S Becker and Z Ghahramani), MIT Press, Cambridge, MA, USA, 15th Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
The choice of an SVM kernel corresponds to the choice of a representation of the data in a feature space and, to improve performance, it should therefore incorporate prior knowledge such as known transformation invariances. We propose a technique which extends earlier work and aims at incorporating invariances in nonlinear kernels. We show on a digit recognition task that the proposed approach is superior to the Virtual Support Vector method, which previously had been the method of choice.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Kernel feature spaces and nonlinear blind source separation

Harmeling, S., Ziehe, A., Kawanabe, M., Müller, K.

In Advances in Neural Information Processing Systems 14, pages: 761-768, (Editors: Dietterich, T. G., S. Becker, Z. Ghahramani), MIT Press, Cambridge, MA, USA, Fifteenth Annual Neural Information Processing Systems Conference (NIPS), September 2002 (inproceedings)

Abstract
In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Algorithms for Learning Function Distinguishable Regular Languages

Fernau, H., Radl, A.

In Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, pages: 64-73, (Editors: Caelli, T. , A. Amin, R. P.W. Duin, M. Kamel, D. de Ridder), Springer, Berlin, Germany, Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, August 2002 (inproceedings)

Abstract
Function distinguishable languages were introduced as a new methodology of defining characterizable subclasses of the regular languages which are learnable from text. Here, we give details on the implementation and the analysis of the corresponding learning algorithms. We also discuss problems which might occur in practical applications.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Decision Boundary Pattern Selection for Support Vector Machines

Shin, H., Cho, S.

In Proc. of the Korean Data Mining Conference, pages: 33-41, Korean Data Mining Conference, May 2002 (inproceedings)

[BibTex]

[BibTex]


no image
k-NN based Pattern Selection for Support Vector Classifiers

Shin, H., Cho, S.

In Proc. of the Korean Industrial Engineers Conference, pages: 645-651, Korean Industrial Engineers Conference, May 2002 (inproceedings)

[BibTex]

[BibTex]


no image
Microarrays: How Many Do You Need?

Zien, A., Fluck, J., Zimmer, R., Lengauer, T.

In RECOMB 2002, pages: 321-330, ACM Press, New York, NY, USA, Sixth Annual International Conference on Research in Computational Molecular Biology, April 2002 (inproceedings)

Abstract
We estimate the number of microarrays that is required in order to gain reliable results from a common type of study: the pairwise comparison of different classes of samples. Current knowlegde seems to suffice for the construction of models that are realistic with respect to searches for individual differentially expressed genes. Such models allow to investigate the dependence of the required number of samples on the relevant parameters: the biological variability of the samples within each class; the fold changes in expression; the detection sensitivity of the microarrays; and the acceptable error rates of the results. We supply experimentalists with general conclusions as well as a freely accessible Java applet at http://cartan.gmd.de/~zien/classsize/ for fine tuning simulations to their particular actualities. Since the situation can be assumed to be very similar for large scale proteomics and metabolomics studies, our methods and results might also apply there.

Web DOI [BibTex]

Web DOI [BibTex]


no image
Pattern Selection for Support Vector Classifiers

Shin, H., Cho, S.

In Ideal 2002, pages: 97-103, (Editors: Yin, H. , N. Allinson, R. Freeman, J. Keane, S. Hubbard), Springer, Berlin, Germany, Third International Conference on Intelligent Data Engineering and Automated Learning, January 2002 (inproceedings)

Abstract
SVMs tend to take a very long time to train with a large data set. If "redundant" patterns are identified and deleted in pre-processing, the training time could be reduced significantly. We propose a k-nearest neighbors(k-NN) based pattern selection method. The method tries to select the patterns that are near the decision boundary and that are correctly labeled. The simulations over synthetic data sets showed promising results: (1) By converting a non-separable problem to a separable one, the search for an optimal error tolerance parameter became unnecessary. (2) SVM training time decreased by two orders of magnitude without any loss of accuracy. (3) The redundant SVs were substantially reduced.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
The leave-one-out kernel

Tsuda, K., Kawanabe, M.

In Artificial Neural Networks -- ICANN 2002, 2415, pages: 727-732, LNCS, (Editors: Dorronsoro, J. R.), Artificial Neural Networks -- ICANN, 2002 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Localized Rademacher Complexities

Bartlett, P., Bousquet, O., Mendelson, S.

In Proceedings of the 15th annual conference on Computational Learning Theory, pages: 44-58, Proceedings of the 15th annual conference on Computational Learning Theory, 2002 (inproceedings)

Abstract
We investigate the behaviour of global and local Rademacher averages. We present new error bounds which are based on the local averages and indicate how data-dependent local averages can be estimated without {it a priori} knowledge of the class at hand.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Film Cooling: A Comparative Study of Different Heaterfoil Configurations for Liquid Crystals Experiments

Vogel, G., Graf, ABA., Weigand, B.

In ASME TURBO EXPO 2002, Amsterdam, GT-2002-30552, ASME TURBO EXPO, Amsterdam, 2002 (inproceedings)

PDF [BibTex]

PDF [BibTex]


no image
Some Local Measures of Complexity of Convex Hulls and Generalization Bounds

Bousquet, O., Koltchinskii, V., Panchenko, D.

In Proceedings of the 15th annual conference on Computational Learning Theory, Proceedings of the 15th annual conference on Computational Learning Theory, 2002 (inproceedings)

Abstract
We investigate measures of complexity of function classes based on continuity moduli of Gaussian and Rademacher processes. For Gaussian processes, we obtain bounds on the continuity modulus on the convex hull of a function class in terms of the same quantity for the class itself. We also obtain new bounds on generalization error in terms of localized Rademacher complexities. This allows us to prove new results about generalization performance for convex hulls in terms of characteristics of the base class. As a byproduct, we obtain a simple proof of some of the known bounds on the entropy of convex hulls.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
A kernel approach for learning from almost orthogonal patterns

Schölkopf, B., Weston, J., Eskin, E., Leslie, C., Noble, W.

In Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science, 2430/2431, pages: 511-528, Lecture Notes in Computer Science, (Editors: T Elomaa and H Mannila and H Toivonen), Springer, Berlin, Germany, 13th European Conference on Machine Learning (ECML) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'2002), 2002 (inproceedings)

PostScript DOI [BibTex]

PostScript DOI [BibTex]


no image
Infinite Mixtures of Gaussian Process Experts

Rasmussen, CE., Ghahramani, Z.

In (Editors: Dietterich, Thomas G.; Becker, Suzanna; Ghahramani, Zoubin), 2002 (inproceedings)

Abstract
We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using a input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets -- thus potentially overcoming two of the biggest hurdles with GP models. Simulations show the viability of this approach.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Marginalized kernels for RNA sequence data analysis

Kin, T., Tsuda, K., Asai, K.

In Genome Informatics 2002, pages: 112-122, (Editors: Lathtop, R. H.; Nakai, K.; Miyano, S.; Takagi, T.; Kanehisa, M.), Genome Informatics, 2002, (Best Paper Award) (inproceedings)

Web [BibTex]

Web [BibTex]


no image
Luminance Artifacts on CRT Displays

Wichmann, F.

In IEEE Visualization, pages: 571-574, (Editors: Moorhead, R.; Gross, M.; Joy, K. I.), IEEE Visualization, 2002 (inproceedings)

Abstract
Most visualization panels today are still built around cathode-ray tubes (CRTs), certainly on personal desktops at work and at home. Whilst capable of producing pleasing images for common applications ranging from email writing to TV and DVD presentation, it is as well to note that there are a number of nonlinear transformations between input (voltage) and output (luminance) which distort the digital and/or analogue images send to a CRT. Some of them are input-independent and hence easy to fix, e.g. gamma correction, but others, such as pixel interactions, depend on the content of the input stimulus and are thus harder to compensate for. CRT-induced image distortions cause problems not only in basic vision research but also for applications where image fidelity is critical, most notably in medicine (digitization of X-ray images for diagnostic purposes) and in forms of online commerce, such as the online sale of images, where the image must be reproduced on some output device which will not have the same transfer function as the customer's CRT. I will present measurements from a number of CRTs and illustrate how some of their shortcomings may be problematic for the aforementioned applications.

[BibTex]

[BibTex]

1997


no image
The view-graph approach to visual navigation and spatial memory

Mallot, H., Franz, M., Schölkopf, B., Bülthoff, H.

In Artificial Neural Networks: ICANN ’97, pages: 751-756, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
This paper describes a purely visual navigation scheme based on two elementary mechanisms (piloting and guidance) and a graph structure combining individual navigation steps controlled by these mechanisms. In robot experiments in real environments, both mechanisms have been tested, piloting in an open environment and guidance in a maze with restricted movement opportunities. The results indicate that navigation and path planning can be brought about with these simple mechanisms. We argue that the graph of local views (snapshots) is a general and biologically plausible means of representing space and integrating the various mechanisms of map behaviour.

PDF PDF DOI [BibTex]

1997

PDF PDF DOI [BibTex]


no image
Predicting time series with support vector machines

Müller, K., Smola, A., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.

In Artificial Neural Networks: ICANN’97, pages: 999-1004, (Editors: Schölkopf, B. , C.J.C. Burges, A.J. Smola), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks , October 1997 (inproceedings)

Abstract
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an e insensitive loss and (ii) Huber's robust loss function and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by a factor of 29%.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Predicting time series with support vectur machines

Müller, K., Smola, A., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.

In Artificial neural networks: ICANN ’97, pages: 999-1004, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks , October 1997 (inproceedings)

Abstract
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an e insensitive loss and (ii) Huber's robust loss function and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by a factor of 29%.

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Kernel principal component analysis

Schölkopf, B., Smola, A., Müller, K.

In Artificial neural networks: ICANN ’97, LNCS, vol. 1327, pages: 583-588, (Editors: W Gerstner and A Germond and M Hasler and J-D Nicoud), Springer, Berlin, Germany, 7th International Conference on Artificial Neural Networks, October 1997 (inproceedings)

Abstract
A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in highdimensional feature spaces, related to input space by some nonlinear map; for instance the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.

PDF DOI [BibTex]

PDF DOI [BibTex]