2012


Robot Skill Learning

Peters, J., Kober, J., Mülling, K., Nguyen-Tuong, D., Kroemer, O.

In 20th European Conference on Artificial Intelligence, pages: 40-45, ECAI, 2012 (inproceedings)

PDF DOI [BibTex]

Towards a learning-theoretic analysis of spike-timing dependent plasticity

Balduzzi, D., Besserve, M.

In Advances in Neural Information Processing Systems 25, pages: 2465-2473, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

Recording and Playback of Camera Shake: Benchmarking Blind Deconvolution with a Real-World Database

Köhler, R., Hirsch, M., Mohler, B., Schölkopf, B., Harmeling, S.

In Computer Vision - ECCV 2012, LNCS Vol. 7578, pages: 27-40, (Editors: A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid), Springer, Berlin, Germany, 12th European Conference on Computer Vision, ECCV, 2012 (inproceedings)

Abstract
Motion blur due to camera shake is one of the predominant sources of degradation in handheld photography. Single image blind deconvolution (BD) or motion deblurring aims at restoring a sharp latent image from the blurred recorded picture without knowing the camera motion that took place during the exposure. BD is a long-standing problem, but has attracted much attention recently, culminating in several algorithms able to restore photos degraded by real camera motion in high quality. In this paper, we present a benchmark dataset for motion deblurring that allows quantitative performance evaluation and comparison of recent approaches featuring non-uniform blur models. To this end, we record and analyse real camera motion, which is played back on a robot platform such that we can record a sequence of sharp images sampling the six-dimensional camera motion trajectory. The goal of deblurring is to recover one of these sharp images, and our dataset contains all information to assess how closely various algorithms approximate that goal. In a comprehensive comparison, we evaluate state-of-the-art single image BD algorithms incorporating uniform and non-uniform blur models.

PDF DOI [BibTex]

Towards identifying and validating cognitive correlates in a passive Brain-Computer Interface for detecting Loss of Control

Zander, TO., Pape, AA.

In Proceedings of the 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, EMBC, 2012 (inproceedings)

[BibTex]

Neural correlates of workload and puzzlement during loss of control

Pape, AA., Gerjets, P., Zander, TO.

In Meeting of the EARLI SIG 22 Neuroscience and Education, 2012 (inproceedings)

[BibTex]

Hypothesis testing using pairwise distances and associated kernels

Sejdinovic, D., Gretton, A., Sriperumbudur, B., Fukumizu, K.

In Proceedings of the 29th International Conference on Machine Learning, pages: 1111-1118, (Editors: J Langford and J Pineau), Omnipress, New York, NY, USA, ICML, 2012 (inproceedings)

PDF [BibTex]

Efficient Training of Graph-Regularized Multitask SVMs

Widmer, C., Kloft, M., Görnitz, N., Rätsch, G.

In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML/PKDD 2012, LNCS Vol. 7523, pages: 633-647, (Editors: PA Flach and T De Bie and N Cristianini), Springer, Berlin, Germany, ECML, 2012 (inproceedings)

DOI [BibTex]

Hilbert Space Embeddings of POMDPs

Nishiyama, Y., Boularias, A., Gretton, A., Fukumizu, K.

In Conference on Uncertainty in Artificial Intelligence (UAI), 2012 (inproceedings)

PDF Web [BibTex]

Learning Throwing and Catching Skills

Kober, J., Mülling, K., Peters, J.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 5167-5168, IROS, 2012 (inproceedings)

PDF DOI [BibTex]

Maximally Informative Interaction Learning for Scene Exploration

van Hoof, H., Kroemer, O., Ben Amor, H., Peters, J.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 5152-5158, IROS, 2012 (inproceedings)

PDF DOI [BibTex]

Investigating the Neural Basis of Brain-Computer Interface (BCI)-based Stroke Rehabilitation

Meyer, T., Peters, J., Zander, T., Brötz, D., Soekadar, S., Schölkopf, B., Grosse-Wentrup, M.

In International Conference on NeuroRehabilitation (ICNR), pages: 617-621, (Editors: JL Pons, D Torricelli, and M Pajaro), Springer, Berlin, Germany, ICNR, 2012 (inproceedings)

PDF [BibTex]

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

Ortega, P., Grau-Moya, J., Genewein, T., Balduzzi, D., Braun, D.

In Advances in Neural Information Processing Systems 25, pages: 3014-3022, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

Algorithms for Learning Markov Field Policies

Boularias, A., Kroemer, O., Peters, J.

In Advances in Neural Information Processing Systems 25, pages: 2186-2194, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

Semi-Supervised Domain Adaptation with Copulas

Lopez-Paz, D., Hernandez-Lobato, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 25, pages: 674-682, (Editors: P Bartlett, FCN Pereira, CJC. Burges, L Bottou, and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

Gradient Weights help Nonparametric Regressors

Kpotufe, S., Boularias, A.

In Advances in Neural Information Processing Systems 25, pages: 2870-2878, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

A Blind Deconvolution Approach for Pseudo CT Prediction from MR Image Pairs

Hirsch, M., Hofmann, M., Mantlik, F., Pichler, B., Schölkopf, B., Habeck, M.

In 19th IEEE International Conference on Image Processing (ICIP), pages: 2953-2956, IEEE, ICIP, 2012 (inproceedings)

DOI [BibTex]

A mixed model approach for joint genetic analysis of alternatively spliced transcript isoforms using RNA-Seq data

Rakitsch, B., Lippert, C., Topa, H., Borgwardt, KM., Honkela, A., Stegle, O.

2012 (inproceedings), submitted

Web [BibTex]

Evaluation of marginal likelihoods via the density of states

Habeck, M.

In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2012), 22, pages: 486-494, (Editors: N Lawrence and M Girolami), JMLR: W&CP 22, AISTATS, 2012 (inproceedings)

Abstract
Bayesian model comparison involves the evaluation of the marginal likelihood, the expectation of the likelihood under the prior distribution. Typically, this high-dimensional integral over all model parameters is approximated using Markov chain Monte Carlo methods. Thermodynamic integration is a popular method to estimate the marginal likelihood by using samples from annealed posteriors. Here we show that there exists a robust and flexible alternative. The new method estimates the density of states, which counts the number of states associated with a particular value of the likelihood. If the density of states is known, computation of the marginal likelihood reduces to a one-dimensional integral. We outline a maximum likelihood procedure to estimate the density of states from annealed posterior samples. We apply our method to various likelihoods and show that it is superior to thermodynamic integration in that it is more flexible with regard to the annealing schedule and the family of bridging distributions. Finally, we discuss the relation of our method with Skilling's nested sampling.
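
The reduction to a one-dimensional integral can be written out as follows. This is only a sketch in assumed notation (prior \pi(\theta), likelihood L(\theta), negative log-likelihood E(\theta), prior-weighted density of states g(E)); the paper's own symbols and normalisation may differ.

```latex
% Assumed notation: E(\theta) = -\log L(\theta), prior \pi(\theta).
\[
  g(E) = \int \delta\bigl(E - E(\theta)\bigr)\, \pi(\theta)\, \mathrm{d}\theta ,
\]
\[
  Z = \int L(\theta)\, \pi(\theta)\, \mathrm{d}\theta
    = \int g(E)\, e^{-E}\, \mathrm{d}E .
\]
```

Once g(E) has been estimated, Z follows from a single integral over the scalar E rather than over all model parameters.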

PDF [BibTex]

Distributed multisensory signals acquisition and analysis in dyadic interactions

Tawari, A., Tran, C., Doshi, A., Zander, TO.

In Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems Extended Abstracts, pages: 2261-2266, (Editors: JA Konstan and EH Chi and K Höök), ACM, New York, NY, USA, CHI, 2012 (inproceedings)

DOI [BibTex]

Measuring Cognitive Load by means of EEG-data - how detailed is the picture we can get?

Scharinger, C., Cierniak, G., Walter, C., Zander, TO., Gerjets, P.

In Meeting of the EARLI SIG 22 Neuroscience and Education, 2012 (inproceedings)

[BibTex]

Optimal kernel choice for large-scale two-sample tests

Gretton, A., Sriperumbudur, B., Sejdinovic, D., Strathmann, H., Balakrishnan, S., Pontil, M., Fukumizu, K.

In Advances in Neural Information Processing Systems 25, pages: 1214-1222, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF [BibTex]

On the Hardness of Domain Adaptation and the Utility of Unlabeled Target Samples

Ben-David, S., Urner, R.

In Algorithmic Learning Theory - 23rd International Conference, 7568, pages: 139-153, Lecture Notes in Computer Science, (Editors: Bshouty, NH. and Stoltz, G and Vayatis, N and Zeugmann, T), Springer Berlin Heidelberg, ALT, 2012 (inproceedings)

link (url) DOI [BibTex]

Domain Adaptation–Can Quantity compensate for Quality?

Ben-David, S., Shalev-Shwartz, S., Urner, R.

In International Symposium on Artificial Intelligence and Mathematics, ISAIM, 2012 (inproceedings)

link (url) [BibTex]

Learning from Weak Teachers

Urner, R., Ben-David, S., Shamir, O.

In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, 22, pages: 1252-1260, (Editors: Lawrence, N. and Girolami, M.), JMLR, AISTATS, 2012 (inproceedings)

link (url) [BibTex]

Adaptive Coding of Actions and Observations

Ortega, PA, Braun, DA

pages: 1-4, NIPS Workshop on Information in Perception and Action, December 2012 (conference)

Abstract
The application of expected utility theory to construct adaptive agents is both computationally intractable and statistically questionable. To overcome these difficulties, agents need the ability to delay the choice of the optimal policy to a later stage when they have learned more about the environment. How should agents do this optimally? An information-theoretic answer to this question is given by the Bayesian control rule—the solution to the adaptive coding problem when there are not only observations but also actions. This paper reviews the central ideas behind the Bayesian control rule.

link (url) [BibTex]

Free Energy and the Generalized Optimality Equations for Sequential Decision Making

Ortega, PA, Braun, DA

pages: 1-10, 10th European Workshop on Reinforcement Learning (EWRL), July 2012 (conference)

Abstract
The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision-rules such as Expectimax, Minimax and Expectiminimax. We show how these decision-rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the amount of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
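
The limit behaviour mentioned in the abstract can be illustrated at a single decision node. The sketch below assumes one commonly used form of the free-energy certainty equivalent, (1/beta) * log sum_i p_i * exp(beta * u_i); the function name and the choice of this exact functional are assumptions for illustration, not taken from the paper.

```python
import math


def free_energy_value(utilities, probs, beta):
    """Certainty equivalent (1/beta) * log(sum_i p_i * exp(beta * u_i)).

    beta plays the role of the node's resource parameter (assumed form):
    beta -> 0 gives the expectation, beta -> +inf the maximum,
    beta -> -inf the minimum over outcomes with non-zero probability.
    """
    if beta == 0:
        return sum(p * u for p, u in zip(probs, utilities))
    pairs = [(u, p) for u, p in zip(utilities, probs) if p > 0]
    # Shift the exponents for numerical stability (log-sum-exp trick).
    shift = max(beta * u for u, _ in pairs)
    total = sum(p * math.exp(beta * u - shift) for u, p in pairs)
    return (shift + math.log(total)) / beta


utilities = [1.0, 2.0, 3.0]
probs = [0.2, 0.3, 0.5]
for beta in (-1000.0, 1e-6, 1000.0):
    print(beta, free_energy_value(utilities, probs, beta))
```

With a large negative beta the value approaches the worst outcome (a Minimax-style node), with beta near zero it approaches the expected utility (an Expectimax-style chance node), and with a large positive beta it approaches the best outcome.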

link (url) [BibTex]

2003


How to Deal with Large Dataset, Class Imbalance and Binary Output in SVM based Response Model

Shin, H., Cho, S.

In Proc. of the Korean Data Mining Conference, pages: 93-107, Korean Data Mining Conference, December 2003, Best Paper Award (inproceedings)

Abstract
Various machine learning methods have made a rapid transition to response modeling in search of improved performance, and the support vector machine (SVM) has been attracting much attention lately. This paper presents an SVM response model. We focus specifically on how to circumvent practical obstacles: how to deal with the class imbalance problem, how to produce scores from an SVM classifier for lift chart analysis, and how to evaluate the models on accuracy and profit. To cope with the intractability of SVM training caused by large marketing datasets, a previously proposed pattern selection algorithm is introduced. SVM training has a time complexity that is cubic in the training set size. The pattern selection algorithm picks out important training patterns before SVM response modeling. We compared SVM training results between the pattern selection algorithm and random sampling. Three aspects of SVM response models were evaluated: accuracy, lift chart analysis, and computational efficiency. The SVM trained with selected patterns showed high accuracy, a high uplift in profit and in response rate, and high computational efficiency.

PDF [BibTex]

Bayesian Monte Carlo

Rasmussen, CE., Ghahramani, Z.

In Advances in Neural Information Processing Systems 15, pages: 489-496, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
We investigate Bayesian alternatives to classical Monte Carlo methods for evaluating integrals. Bayesian Monte Carlo (BMC) allows the incorporation of prior knowledge, such as smoothness of the integrand, into the estimation. In a simple problem we show that this outperforms any classical importance sampling method. We also attempt more challenging multidimensional integrals involved in computing marginal likelihoods of statistical models (a.k.a. partition functions and model evidences). We find that Bayesian Monte Carlo outperformed Annealed Importance Sampling, although for very high dimensional problems or problems with massive multimodality BMC may be less adequate. One advantage of the Bayesian approach to Monte Carlo is that samples can be drawn from any distribution. This allows for the possibility of active design of sample points so as to maximise information gain.

PDF Web [BibTex]

On the Complexity of Learning the Kernel Matrix

Bousquet, O., Herrmann, D.

In Advances in Neural Information Processing Systems 15, pages: 399-406, (Editors: Becker, S., S. Thrun, K. Obermayer), The MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
We investigate data-based procedures for selecting the kernel when learning with Support Vector Machines. We provide generalization error bounds by estimating the Rademacher complexities of the corresponding function classes. In particular we obtain a complexity bound for function classes induced by kernels with given eigenvectors, i.e., we allow the spectrum to vary and keep the eigenvectors fixed. This bound is only a logarithmic factor bigger than the complexity of the function class induced by a single kernel. However, optimizing the margin over such classes leads to overfitting. We thus propose a suitable way of constraining the class. We use an efficient algorithm to solve the resulting optimization problem, present preliminary experimental results, and compare them to an alignment-based approach.

PDF Web [BibTex]

Control, Planning, Learning, and Imitation with Dynamic Movement Primitives

Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.

In IROS 2003, pages: 1-21, Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE International Conference on Intelligent Robots and Systems, October 2003 (inproceedings)

PDF [BibTex]

Discriminative Learning for Label Sequences via Boosting

Altun, Y., Hofmann, T., Johnson, M.

In Advances in Neural Information Processing Systems 15, pages: 977-984, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
This paper investigates a boosting approach to discriminative learning of label sequences based on a sequence rank loss function.

PDF Web [BibTex]

Multiple-step ahead prediction for non linear dynamic systems: A Gaussian Process treatment with propagation of the uncertainty

Girard, A., Rasmussen, CE., Quiñonero-Candela, J., Murray-Smith, R.

In Advances in Neural Information Processing Systems 15, pages: 529-536, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
We consider the problem of multi-step ahead prediction in time series analysis using the non-parametric Gaussian process model. k-step ahead forecasting of a discrete-time non-linear dynamic system can be performed by doing repeated one-step ahead predictions. For a state-space model of the form y_t = f(y_{t-1},...,y_{t-L}), the prediction of y at time t + k is based on the point estimates of the previous outputs. In this paper, we show how, using an analytical Gaussian approximation, we can formally incorporate the uncertainty about intermediate regressor values, thus updating the uncertainty on the current prediction.
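
The baseline described above, repeated one-step-ahead prediction on point estimates, can be sketched as below. This is the naive scheme only, not the uncertainty propagation the paper proposes; `iterate_one_step` and the toy predictor `f` are hypothetical names, with `f` standing in for something like the posterior mean of a fitted model.

```python
def iterate_one_step(f, history, k):
    """Naive k-step-ahead forecast: feed each point prediction back as input.

    f       -- one-step predictor taking the lags (y_{t-1}, ..., y_{t-L})
    history -- the last L observed outputs, oldest first
    k       -- number of steps to forecast
    """
    window = list(history)
    n_lags = len(window)
    preds = []
    for _ in range(k):
        y_next = f(window[::-1])              # most recent lag first
        preds.append(y_next)
        window = (window + [y_next])[-n_lags:]  # slide the lag window
    return preds


# Toy linear predictor with L = 2 lags (an assumption for illustration).
f = lambda lags: 0.5 * lags[0] + 0.25 * lags[1]
print(iterate_one_step(f, [1.0, 2.0], 3))  # [1.25, 1.125, 0.875]
```

Each predicted value is treated as if it had been observed exactly, which is the step at which this scheme discards uncertainty about the intermediate regressor values.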

PDF Web [BibTex]

Cluster Kernels for Semi-Supervised Learning

Chapelle, O., Weston, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 15, pages: 585-592, (Editors: S Becker and S Thrun and K Obermayer), MIT Press, Cambridge, MA, USA, 16th Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
We propose a framework to incorporate unlabeled data in a kernel classifier, based on the idea that two points in the same cluster are more likely to have the same label. This is achieved by modifying the eigenspectrum of the kernel matrix. Experimental results assess the validity of this approach.

PDF Web [BibTex]

Mismatch String Kernels for SVM Protein Classification

Leslie, C., Eskin, E., Weston, J., Noble, W.

In Advances in Neural Information Processing Systems 15, pages: 1417-1424, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure sequence similarity based on shared occurrences of k-length subsequences, counted with up to m mismatches, and do not rely on any generative model for the positive training sequences. We compute the kernels efficiently using a mismatch tree data structure and report experiments on a benchmark SCOP dataset, where we show that the mismatch kernel used with an SVM classifier performs as well as the Fisher kernel, the most successful method for remote homology detection, while achieving considerable computational savings.
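
To make the kernel definition concrete, here is a brute-force sketch of a (k, m)-mismatch kernel. It enumerates every k-mer over the alphabet, so it is exponential in k and only suitable for toy inputs; it does not use the mismatch tree data structure from the paper. The function names and the two-letter alphabet are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_features(seq, k, m, alphabet):
    """For each k-mer beta, count the k-mers of seq within m mismatches of beta."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    feats = Counter()
    for beta in map("".join, product(alphabet, repeat=k)):
        count = sum(1 for alpha in kmers if hamming(alpha, beta) <= m)
        if count:
            feats[beta] = count
    return feats


def mismatch_kernel(s, t, k, m, alphabet):
    """Inner product of the two mismatch feature vectors."""
    fs = mismatch_features(s, k, m, alphabet)
    ft = mismatch_features(t, k, m, alphabet)
    return sum(v * ft[beta] for beta, v in fs.items())


print(mismatch_kernel("AAB", "ABB", k=2, m=0, alphabet="AB"))  # 1
print(mismatch_kernel("AAB", "ABB", k=2, m=1, alphabet="AB"))  # 9
```

With m = 0 only the exactly shared 2-mer "AB" contributes; allowing one mismatch lets near-matching 2-mers contribute as well, which raises the similarity score.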

PDF Web [BibTex]

Incremental Gaussian Processes

Quinonero Candela, J., Winther, O.

In Advances in Neural Information Processing Systems 15, pages: 1001-1008, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
In this paper, we consider Tipping's relevance vector machine (RVM) and formalize an incremental training strategy as a variant of the expectation-maximization (EM) algorithm that we call subspace EM. Working with a subset of active basis functions, the sparsity of the RVM solution will ensure that the number of basis functions and thereby the computational complexity is kept low. We also introduce a mean field approach to the intractable classification model that is expected to give a very good approximation to exact Bayesian inference and contains the Laplace approximation as a special case. We test the algorithms on two large data sets with O(10^3-10^4) examples. The results indicate that Bayesian learning of large data sets, e.g. the MNIST database, is realistic.

PDF Web [BibTex]

Kernel Dependency Estimation

Weston, J., Chapelle, O., Elisseeff, A., Schölkopf, B., Vapnik, V.

In Advances in Neural Information Processing Systems 15, pages: 873-880, (Editors: S Becker and S Thrun and K Obermayer), MIT Press, Cambridge, MA, USA, 16th Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

PDF Web [BibTex]

Derivative observations in Gaussian Process models of dynamic systems

Solak, E., Murray-Smith, R., Leithead, WE., Leith, D., Rasmussen, CE.

In Advances in Neural Information Processing Systems 15, pages: 1033-1040, (Editors: Becker, S., S. Thrun and K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
Gaussian processes provide an approach to nonparametric modelling which allows a straightforward combination of function and derivative observations in an empirical model. This is of particular importance in identification of nonlinear dynamic systems from experimental data. 1) It allows us to combine derivative information, and associated uncertainty with normal function observations into the learning and inference process. This derivative information can be in the form of priors specified by an expert or identified from perturbation data close to equilibrium. 2) It allows a seamless fusion of multiple local linear models in a consistent manner, inferring consistent models and ensuring that integrability constraints are met. 3) It improves dramatically the computational efficiency of Gaussian process models for dynamic system identification, by summarising large quantities of near-equilibrium data by a handful of linearisations, reducing the training set size - traditionally a problem for Gaussian process models.

PDF Web [BibTex]

Linear Combinations of Optic Flow Vectors for Estimating Self-Motion: a Real-World Test of a Neural Model

Franz, MO., Chahl, JS.

In Advances in Neural Information Processing Systems 15, pages: 1319-1326, (Editors: Becker, S., S. Thrun and K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution of the environment, and about the noise and self-motion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates turn out to be less reliable.

PDF Web [BibTex]

Clustering with the Fisher score

Tsuda, K., Kawanabe, M., Müller, K.

In Advances in Neural Information Processing Systems 15, pages: 729-736, (Editors: Becker, S., S. Thrun, K. Obermayer), MIT Press, Cambridge, MA, USA, Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), October 2003 (inproceedings)

Abstract
Recently the Fisher score (or the Fisher kernel) has been increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of the log-likelihood of a probabilistic model. This paper gives a theoretical analysis of how class information is preserved in the space of the Fisher score. It turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. We therefore develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit the important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).

PDF Web [BibTex]

Large Margin Methods for Label Sequence Learning

Altun, Y., Hofmann, T.

pages: 993-996, International Speech Communication Association, Bonn, Germany, 8th European Conference on Speech Communication and Technology (EuroSpeech), September 2003 (inproceedings)

Web [BibTex]

Fast Pattern Selection Algorithm for Support Vector Classifiers: "Time Complexity Analysis"

Shin, H., Cho, S.

In Lecture Notes in Computer Science (LNCS 2690), LNCS 2690, pages: 1008-1015, Springer-Verlag, Heidelberg, The 4th International Conference on Intelligent Data Engineering (IDEAL), September 2003 (inproceedings)

Abstract
Training an SVM requires large memory and long CPU time when the pattern set is large. To alleviate the computational burden in SVM training, we propose a fast preprocessing algorithm which selects only the patterns near the decision boundary. The time complexity of the proposed algorithm is much smaller than that of the naive M^2 algorithm.

PDF [BibTex]

Marginalized Kernels between Labeled Graphs

Kashima, H., Tsuda, K., Inokuchi, A.

In 20th International Conference on Machine Learning, pages: 321-328, (Editors: Fawcett, T. and N. Mishra), 20th International Conference on Machine Learning, August 2003 (inproceedings)

PDF [BibTex]

Sparse Gaussian Processes: inference, subspace identification and model selection

Csato, L., Opper, M.

In Proceedings, pages: 1-6, (Editors: Van den Hof, Wahlberg), The Netherlands, 13th IFAC Symposium on System Identification, August 2003, electronic version; Index ThA02-2 (inproceedings)

Abstract
Gaussian Process (GP) inference is a probabilistic kernel method where the GP is treated as a latent function. The inference is carried out using Bayesian online learning and its extension to the more general iterative approach which we call TAP/EP learning. Sparsity is introduced in this context to make the TAP/EP method applicable to large datasets. We address the prohibitive scaling of the number of parameters by defining a subset of the training data that is used as the support of the GP; thus the number of required parameters is independent of the training set size, similar to the case of "Support" or "Relevance" Vectors. An advantage of the full probabilistic treatment is that it allows the computation of the marginal data likelihood or evidence, leading to hyper-parameter estimation within the GP inference. An EM algorithm to choose the hyper-parameters is proposed. The TAP/EP learning is the E-step and the M-step then updates the hyper-parameters. Due to the sparse E-step the resulting algorithm does not involve manipulation of large matrices. The presented algorithm is applicable to a wide variety of likelihood functions. We present results of applying the algorithm to classification and nonstandard regression problems for artificial and real datasets.

PDF GZIP [BibTex]

Adaptive, Cautious, Predictive control with Gaussian Process Priors

Murray-Smith, R., Sbarbaro, D., Rasmussen, CE., Girard, A.

In Proceedings of the 13th IFAC Symposium on System Identification, pages: 1195-1200, (Editors: Van den Hof, P., B. Wahlberg and S. Weiland), Proceedings of the 13th IFAC Symposium on System Identification, August 2003 (inproceedings)

Abstract
Nonparametric Gaussian Process models, a Bayesian statistics approach, are used to implement a nonlinear adaptive control law. Predictions, including propagation of the state uncertainty, are made over a k-step horizon. The expected value of a quadratic cost function is minimised over this prediction horizon, without ignoring the variance of the model predictions. The general method and its main features are illustrated on a simulation example.

PDF [BibTex]

Generative Model-based Clustering of Directional Data

Banerjee, A., Dhillon, I., Ghosh, J., Sra, S.

In Proc. ACM SIGKDD, pages: 00-00, KDD, August 2003 (inproceedings)

GZIP [BibTex]

Hidden Markov Support Vector Machines

Altun, Y., Tsochantaridis, I., Hofmann, T.

pages: 4-11, (Editors: Fawcett, T., N. Mishra), AAAI Press, Menlo Park, CA, USA, Twentieth International Conference on Machine Learning (ICML), August 2003 (inproceedings)

Web [BibTex]

How Many Neighbors To Consider in Pattern Pre-selection for Support Vector Classifiers?

Shin, H., Cho, S.

In Proc. of INNS-IEEE International Joint Conference on Neural Networks (IJCNN 2003), pages: 565-570, IJCNN, July 2003 (inproceedings)

Abstract
Training support vector classifiers (SVC) requires large memory and long CPU time when the pattern set is large. To alleviate the computational burden in SVC training, we previously proposed a preprocessing algorithm which selects only the patterns in the overlap region around the decision boundary, based on neighborhood properties [8], [9], [10]. The entropy of the class labels of each pattern's k nearest neighbors was used to estimate the pattern's proximity to the decision boundary. The value of the parameter k is critical, yet has so far been determined in a rather ad hoc fashion. In this paper we propose a systematic procedure to determine k and show its effectiveness through experiments.
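
The neighborhood criterion can be sketched as follows: compute the entropy of the class labels among each pattern's k nearest neighbors and keep the patterns whose neighborhood is mixed. The function names, the brute-force neighbor search, and the "entropy greater than zero" selection rule are assumptions for illustration; the abstract does not specify the exact selection threshold or how k is chosen.

```python
import math
from collections import Counter


def knn_label_entropy(points, labels, k):
    """Entropy (in bits) of the class labels among each point's k nearest neighbors."""
    entropies = []
    for i, p in enumerate(points):
        # Brute-force neighbor search by squared Euclidean distance.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])),
        )
        neighbors = others[:k]
        n = len(neighbors)
        counts = Counter(labels[j] for j in neighbors)
        entropies.append(-sum((c / n) * math.log2(c / n) for c in counts.values()))
    return entropies


def select_boundary_patterns(points, labels, k):
    """Indices of patterns whose neighborhood contains more than one class."""
    return [i for i, h in enumerate(knn_label_entropy(points, labels, k)) if h > 0]


points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
labels = [0, 0, 0, 1, 1, 1]
print(select_boundary_patterns(points, labels, k=2))  # [2, 3]
```

In this toy example only the two patterns adjacent to the class change have mixed neighborhoods, so only they would be passed on to SVC training.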

PDF [BibTex]

On the Representation, Learning and Transfer of Spatio-Temporal Movement Characteristics

Ilg, W., Bakir, GH., Mezger, J., Giese, MA.

In Humanoids Proceedings, pages: 0-0, Humanoids Proceedings, July 2003, electronic version (inproceedings)

Abstract
In this paper we present a learning-based approach for the modelling of complex movement sequences. Based on the method of Spatio-Temporal Morphable Models (STMMs), we derive a hierarchical algorithm that, in a first step, automatically identifies movement elements in movement sequences based on a coarse spatio-temporal description, and in a second step models these movement primitives by approximation through linear combinations of learned example movement trajectories. We describe the different steps of the algorithm and show how it can be applied for modelling and synthesis of complex sequences of human movements that contain movement elements with variable style. The proposed method is demonstrated on different applications of movement representation relevant for imitation learning of movement styles in humanoid robotics.

PDF [BibTex]

Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences

Altun, Y., Johnson, M., Hofmann, T.

pages: 145-152, (Editors: Collins, M., M. Steedman), ACL, East Stroudsburg, PA, USA, Conference on Empirical Methods in Natural Language Processing (EMNLP), July 2003 (inproceedings)

Abstract
Discriminative models have been of interest in the NLP community in recent years. Previous research has shown that they are advantageous over generative models. In this paper, we investigate how different objective functions and optimization methods affect the performance of the classifiers in the discriminative learning framework. We focus on the sequence labelling problem, particularly POS tagging and NER tasks. Our experiments show that changing the objective function is not as effective as changing the features included in the model.

Web [BibTex]

Time Complexity Analysis of Fast Pattern Selection Algorithm for SVM

Shin, H., Cho, S.

In Proc. of the Korean Data Mining Conference, pages: 221-231, Korean Data Mining Conference, June 2003 (inproceedings)

[BibTex]
