UDP Communication channel design of master-slave robot system

Hong, A., Cho, JH., Wang, H., Lee, DY.

In pages: 231-232, 2010 KSME Conference, June 2010 (inproceedings)

[BibTex]

Telling cause from effect based on high-dimensional observations

Janzing, D., Hoyer, P., Schölkopf, B.

In Proceedings of the 27th International Conference on Machine Learning, pages: 479-486, (Editors: J Fürnkranz and T Joachims), International Machine Learning Society, Madison, WI, USA, ICML, June 2010 (inproceedings)

Abstract
We describe a method for inferring linear causal relations among multi-dimensional variables. The idea is to use an asymmetry between the distributions of cause and effect that occurs if the covariance matrix of the cause and the structure matrix mapping the cause to the effect are independently chosen. The method applies to both stochastic and deterministic causal relations, provided that the dimensionality is sufficiently high (in some experiments, 5 was enough). It is applicable to Gaussian as well as non-Gaussian data.

PDF Web [BibTex]

A scalable trust-region algorithm with application to mixed-norm regression

Kim, D., Sra, S., Dhillon, I.

In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pages: 519-526, (Editors: Fürnkranz, J. , T. Joachims), International Machine Learning Society, Madison, WI, USA, 27th International Conference on Machine Learning (ICML), June 2010 (inproceedings)

Abstract
We present a new algorithm for minimizing a convex loss function subject to regularization. Our framework applies to numerous problems in machine learning and statistics; notably, for sparsity-promoting regularizers such as ℓ1 or ℓ1,∞ norms, it enables efficient computation of sparse solutions. Our approach is based on the trust-region framework with nonsmooth objectives, which allows us to build on known results to provide convergence analysis. We avoid the computational overheads associated with the conventional Hessian approximation used by trust-region methods by instead using a simple separable quadratic approximation. This approximation also enables use of proximity operators for tackling nonsmooth regularizers. We illustrate the versatility of our resulting algorithm by specializing it to three mixed-norm regression problems: group lasso [36], group logistic regression [21], and multi-task lasso [19]. We experiment with both synthetic and real-world large-scale data; our method is seen to be competitive, robust, and scalable.
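The proximity operators mentioned in this abstract can be illustrated with two standard closed forms. The following is a minimal sketch, not the paper's trust-region algorithm: the function names `prox_l1` and `prox_group_l2` are hypothetical, and the ℓ2 group norm is an assumption chosen because it is the usual group lasso penalty (the abstract itself names ℓ1 and ℓ1,∞).

```python
import math


def prox_l1(v, lam):
    """Soft-thresholding: proximity operator of lam * ||v||_1."""
    out = []
    for x in v:
        if x > lam:
            out.append(x - lam)   # shrink positive entries toward zero
        elif x < -lam:
            out.append(x + lam)   # shrink negative entries toward zero
        else:
            out.append(0.0)       # small entries are set exactly to zero
    return out


def prox_group_l2(v, lam):
    """Block soft-thresholding: proximity operator of lam * ||v||_2.

    Assumption: one call handles one group of coefficients.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        return [0.0] * len(v)     # the whole group is zeroed out
    scale = 1.0 - lam / norm
    return [scale * x for x in v]
```

Both operators produce exact zeros, which is why proximal steps yield sparse (or group-sparse) solutions.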

PDF Web [BibTex]

The Influence of the Image Basis on Modeling and Steganalysis Performance

Schwamberger, V., Le, P., Schölkopf, B., Franz, M.

In Information Hiding, pages: 133-144, (Editors: R Böhme and PWL Fong and R Safavi-Naini), Springer, Berlin, Germany, 12th international Workshop (IH), June 2010 (inproceedings)

Abstract
We compare two image bases with respect to their capabilities for image modeling and steganalysis. The first basis consists of wavelets, the second is a Laplacian pyramid. Both bases are used to decompose the image into subbands where the local dependency structure is modeled with a linear Bayesian estimator. Similar to existing approaches, the image model is used to predict coefficient values from their neighborhoods, and the final classification step uses statistical descriptors of the residual. Our findings are counter-intuitive at first sight: although Laplacian pyramids have better image modeling capabilities than wavelets, steganalysis based on wavelets is much more successful. We present a number of experiments that suggest possible explanations for this result.

PDF Web DOI [BibTex]

A PAC-Bayesian Analysis of Co-clustering, Graph Clustering, and Pairwise Clustering

Seldin, Y.

In ICML 2010 Workshop on Social Analytics: Learning from human interactions, pages: 1-5, ICML Workshop on Social Analytics: Learning from human interactions, June 2010 (inproceedings)

Abstract
We briefly review the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008, 2009, 2010), which provided generalization guarantees and regularization terms absent in the preceding formulations of this problem and achieved state-of-the-art prediction results on the MovieLens collaborative filtering task. Inspired by this analysis, we formulate weighted graph clustering as a prediction problem: given a subset of edge weights, we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. Following the lines of (Seldin and Tishby, 2010), we derive PAC-Bayesian generalization bounds for graph clustering. The bounds show that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues.

PDF Web [BibTex]

Solving Large-Scale Nonnegative Least Squares

Sra, S.

16th Conference of the International Linear Algebra Society (ILAS), June 2010 (talk)

Abstract
We study the fundamental problem of nonnegative least squares. This problem was apparently introduced by Lawson and Hanson [1] under the name NNLS. As is evident from its name, NNLS seeks least-squares solutions that are also nonnegative. Owing to its wide applicability, numerous algorithms have been derived for NNLS, beginning from the active-set approach of Lawson and Hanson [1] leading up to the sophisticated interior-point method of Bellavia et al. [2]. We present a new algorithm for NNLS that combines projected subgradients with the non-monotonic gradient descent idea of Barzilai and Borwein [3]. Our resulting algorithm is called BBSG, and we guarantee its convergence by exploiting properties of NNLS in conjunction with projected subgradients. BBSG is surprisingly simple and scales well to large problems. We substantiate our claims by empirically evaluating BBSG and comparing it with established convex solvers and specialized NNLS algorithms. The numerical results suggest that BBSG is a practical method for solving large-scale NNLS problems.
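To make the ingredients named in this abstract concrete, the sketch below combines projection onto the nonnegative orthant with a Barzilai-Borwein step size for min ||Ax - b||² subject to x ≥ 0. It is an illustrative assumption, not the BBSG algorithm itself: the function name `nnls_projected_bb` is hypothetical, and the sketch omits whatever safeguards the talk uses to guarantee convergence, so plain projected BB steps like these may fail to converge on some problems.

```python
def nnls_projected_bb(A, b, iters=200):
    """Projected gradient with Barzilai-Borwein steps for NNLS (sketch only).

    A is a list of rows, b a list of floats. No convergence safeguard is included.
    """
    m, n = len(A), len(A[0])

    def grad(x):
        # Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b).
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        return [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]

    x = [0.0] * n
    g = grad(x)
    frob = sum(a * a for row in A for a in row)
    step = 1.0 / frob if frob > 0 else 1.0      # conservative initial step
    for _ in range(iters):
        # Gradient step followed by projection onto x >= 0.
        x_new = [max(0.0, xi - step * gi) for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [p - q for p, q in zip(x_new, x)]   # change in iterate
        y = [p - q for p, q in zip(g_new, g)]   # change in gradient
        ss = sum(v * v for v in s)
        sy = sum(p * q for p, q in zip(s, y))
        x, g = x_new, g_new
        if ss == 0.0:                           # iterate stopped moving
            break
        if sy > 0.0:
            step = ss / sy                      # Barzilai-Borwein step size
    return x
```

For example, with rows `[1, 0]`, `[0, 1]`, `[1, 1]` and right-hand side `[1, -1, 0]`, the unconstrained least-squares solution has a negative second coordinate, and the constrained minimizer is `[0.5, 0]`.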

PDF PDF [BibTex]

Matrix Approximation Problems

Sra, S.

EU Regional School: Rheinisch-Westfälische Technische Hochschule Aachen, May 2010 (talk)

PDF AVI [BibTex]

BCI2000 and Python

Hill, NJ.

Invited lecture at the 7th International BCI2000 Workshop, Pacific Grove, CA, USA, May 2010 (talk)

Abstract
A tutorial, with exercises, on how to integrate your own Python code with the BCI2000 realtime software package.

PDF [BibTex]

Extending BCI2000 Functionality with Your Own C++ Code

Hill, NJ.

Invited lecture at the 7th International BCI2000 Workshop, Pacific Grove, CA, USA, May 2010 (talk)

Abstract
A tutorial, with exercises, on how to use the BCI2000 C++ framework to write your own real-time signal-processing modules.

[BibTex]

Apprenticeship learning via soft local homomorphisms

Boularias, A., Chaib-Draa, B.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 2971-2976, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

Abstract
We consider the problem of apprenticeship learning when the expert's demonstration covers only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient solution to this problem based on the assumption that the expert is optimally acting in a Markov Decision Process (MDP). However, past work on IRL requires an accurate estimate of the frequency of encountering each feature of the states when the robot follows the expert's policy. Given that the complete policy of the expert is unknown, the feature frequencies can only be empirically estimated from the demonstrated trajectories. In this paper, we propose to use a transfer method, known as soft homomorphism, in order to generalize the expert's policy to unvisited regions of the state space. The generalized policy can be used either as the robot's final policy, or to calculate the feature frequencies within an IRL algorithm. Empirical results show that our approach is able to learn good policies from a small number of demonstrations.

PDF Web DOI [BibTex]

Using Model Knowledge for Learning Inverse Dynamics

Nguyen-Tuong, D., Peters, J.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 2677-2682, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

Abstract
In recent years, learning models from data has become an increasingly interesting tool for robotics, as it allows straightforward and accurate model approximation. However, in most robot learning approaches, the model is learned from scratch disregarding all prior knowledge about the system. For many complex robot systems, available prior knowledge from advanced physics-based modeling techniques can entail valuable information for model learning that may result in faster learning speed, higher accuracy and better generalization. In this paper, we investigate how parametric physical models (e.g., obtained from rigid body dynamics) can be used to improve the learning performance, and, especially, how semiparametric regression methods can be applied in this context. We present two possible semiparametric regression approaches, where the knowledge of the physical model can either become part of the mean function or of the kernel in a nonparametric Gaussian process regression. We compare the learning performance of these methods first on sampled data and, subsequently, apply the obtained inverse dynamics models in tracking control on a real Barrett WAM. The results show that the semiparametric models learned with rigid body dynamics as prior outperform the standard rigid body dynamics models on real data while generalizing better for unknown parts of the state space.

PDF Web DOI [BibTex]

Coherent Inference on Optimal Play in Game Trees

Hennig, P., Stern, D., Graepel, T.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 326-333, (Editors: Teh, Y.W. , M. Titterington ), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in linear time.

PDF Web [BibTex]

Incremental Sparsification for Real-time Online Model Learning

Nguyen-Tuong, D., Peters, J.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 557-564, (Editors: Teh, Y.W. , M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Online model learning in real-time is required by many applications such as in robot tracking control. It poses a difficult problem, as fast and incremental online regression with large data sets is the essential component which cannot be achieved by straightforward usage of off-the-shelf machine learning methods (such as Gaussian process regression or support vector regression). In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for large scale real-time model learning. The proposed approach combines a sparsification method based on an independence measure with a large scale database. In combination with an incremental learning approach such as sequential support vector regression, we obtain a regression method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real robot emphasizes the applicability of the proposed approach in real-time online model learning for real world systems.

PDF Web [BibTex]

Multitask Learning for Brain-Computer Interfaces

Alamgir, M., Grosse-Wentrup, M., Altun, Y.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 17-24, (Editors: Teh, Y.W. , M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics , May 2010 (inproceedings)

Abstract
Brain-computer interfaces (BCIs) are limited in their applicability in everyday settings by the current necessity to record subject-specific calibration data prior to actual use of the BCI for communication. In this paper, we utilize the framework of multitask learning to construct a BCI that can be used without any subject-specific calibration process. We discuss how this out-of-the-box BCI can be further improved in a computationally efficient manner as subject-specific data becomes available. The feasibility of the approach is demonstrated on two sets of experimental EEG data recorded during a standard two-class motor imagery paradigm from a total of 19 healthy subjects. Specifically, we show that satisfactory classification results can be achieved with zero training data, and combining prior recordings with subject-specific calibration data substantially outperforms using subject-specific data only. Our results further show that transfer between recordings under slightly different experimental setups is feasible.

PDF Web [BibTex]

Identifying Cause and Effect on Discrete Data using Additive Noise Models

Peters, J., Janzing, D., Schölkopf, B.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 597-604, (Editors: YW Teh and M Titterington), JMLR, Cambridge, MA, USA, 13th International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. Whenever the joint distribution P(X, Y) admits such a model in one direction, e.g. Y = f(X) + N with N ⊥ X (noise independent of X), it does not admit the reversed model X = g(Y) + Ñ with Ñ ⊥ Y, as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. We show that this algorithm works both on synthetic and real data sets.

PDF Web [BibTex]

Semi-supervised Learning via Generalized Maximum Entropy

Erkan, A., Altun, Y.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 209-216, (Editors: Teh, Y.W. , M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics , May 2010 (inproceedings)

Abstract
Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorporating unlabeled data via modifications to these potential functions reflecting structural assumptions on the data geometry. The proposed approach leads to a family of discriminative semi-supervised algorithms, that are convex, scalable, inherently multi-class, easy to implement, and that can be kernelized naturally. Experimental evaluation of special cases shows the competitiveness of our methodology.

PDF Web [BibTex]

A New Algorithm for Improving the Resolution of Cryo-EM Density Maps

Hirsch, M., Schölkopf, B., Habeck, M.

In Research in Computational Molecular Biology, Lecture Notes in Bioinformatics, Vol. 6044 , pages: 174-188, (Editors: B Berger), Springer, Berlin, Germany, 14th International Conference on Research in Computational Molecular Biology (RECOMB), May 2010 (inproceedings)

Abstract
Cryo-electron microscopy (cryo-EM) plays an increasingly prominent role in structure elucidation of macromolecular assemblies. Advances in experimental instrumentation and computational power have spawned numerous cryo-EM studies of large biomolecular complexes resulting in the reconstruction of three-dimensional density maps at intermediate and low resolution. In this resolution range, identification and interpretation of structural elements and modeling of biomolecular structure with atomic detail becomes problematic. In this paper, we present a novel algorithm that enhances the resolution of intermediate- and low-resolution density maps. Our underlying assumption is to model the low-resolution density map as a blurred and possibly noise-corrupted version of an unknown high-resolution map that we seek to recover by deconvolution. By exploiting the nonnegativity of both the high-resolution map and blur kernel we derive multiplicative updates reminiscent of those used in nonnegative matrix factorization. Our framework allows for easy incorporation of additional prior knowledge such as smoothness and sparseness, on both the sharpened density map and the blur kernel. A probabilistic formulation enables us to derive updates for the hyperparameters, therefore our approach has no parameter that needs adjustment. We apply the algorithm to simulated three-dimensional electron microscopic data. We show that our method provides better resolved density maps when compared with B-factor sharpening, especially in the presence of noise. Moreover, our method can use additional information provided by homologous structures, which helps to improve the resolution even further.

Web DOI [BibTex]

Movement Templates for Learning of Hitting and Batting

Kober, J., Mülling, K., Krömer, O., Lampert, C., Schölkopf, B., Peters, J.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 853-858, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

PDF Web DOI [BibTex]

Machine-Learning Methods for Decoding Intentional Brain States

Hill, NJ.

Symposium "Non-Invasive Brain Computer Interfaces: Current Developments and Applications" (BIOMAG), March 2010 (talk)

Abstract
Brain-computer interfaces (BCI) work by making the user perform a specific mental task, such as imagining moving body parts or performing some other covert mental activity, or attending to a particular stimulus out of an array of options, in order to encode their intention into a measurable brain signal. Signal-processing and machine-learning techniques are then used to decode the measured signal to identify the encoded mental state and hence extract the user's initial intention. The high-noise, high-dimensional nature of brain signals makes robust decoding techniques a necessity. Generally, the approach has been to use relatively simple feature extraction techniques, such as template matching and band-power estimation, coupled to simple linear classifiers. This has led to a prevailing view among applied BCI researchers that (sophisticated) machine-learning is irrelevant since “it doesn't matter what classifier you use once your features are extracted.” Using examples from our own MEG and EEG experiments, I'll demonstrate how machine-learning principles can be applied in order to improve BCI performance, if they are formulated in a domain-specific way. The result is a type of data-driven analysis that is more than “just” classification, and can be used to find better feature extractors.

PDF Web [BibTex]

PAC-Bayesian Analysis in Unsupervised Learning

Seldin, Y.

Foundations and New Trends of PAC Bayesian Learning Workshop, March 2010 (talk)

PDF Web [BibTex]

Experiments with Motor Primitives to learn Table Tennis

Peters, J., Mülling, K., Kober, J.

In Experimental Robotics, pages: 1-13, (Editors: Khatib, O. , V. Kumar, G. Sukhatme), Springer, Berlin, Germany, 12th International Symposium on Experimental Robotics (ISER), March 2010 (inproceedings)

Web [BibTex]

Causality: Objectives and Assessment

Guyon, I., Janzing, D., Schölkopf, B.

In JMLR Workshop and Conference Proceedings: Volume 6 , pages: 1-42, (Editors: I Guyon and D Janzing and B Schölkopf), MIT Press, Cambridge, MA, USA, Causality: Objectives and Assessment (NIPS Workshop) , February 2010 (inproceedings)

Abstract
The NIPS 2008 workshop on causality provided a forum for researchers from different horizons to share their view on causal modeling and address the difficult question of assessing causal models. There has been a vivid debate on properly separating the notion of causality from particular models such as graphical models, which have been dominating the field in the past few years. Part of the workshop was dedicated to discussing the results of a challenge, which offered a wide variety of applications of causal modeling. We have regrouped in these proceedings the best papers presented. Most lectures were videotaped or recorded. All information regarding the challenge and the lectures can be found at http://www.clopinet.com/isabelle/Projects/NIPS2008/. This introduction provides a synthesis of the findings and a gentle introduction to causality topics, which are the object of active research.

Web [BibTex]

Learning Motor Primitives for Robotics

Kober, J., Peters, J.

EVENT Lab: Reinforcement Learning in Robotics and Virtual Reality, January 2010 (talk)

Abstract
The acquisition and self-improvement of novel motor skills is among the most important problems in robotics. Motor primitives offer one of the most promising frameworks for the application of machine learning techniques in this context. Employing the Dynamic Systems Motor primitives originally introduced by Ijspeert et al. (2003), appropriate learning algorithms for a concerted approach of both imitation and reinforcement learning are presented. Using these algorithms new motor skills, i.e., Ball-in-a-Cup, Ball-Paddling and Dart-Throwing, are learned.

[BibTex]

Leveraging Sequence Classification by Taxonomy-based Multitask Learning

Widmer, C., Leiva, J., Altun, Y., Rätsch, G.

In Research in Computational Molecular Biology, LNCS, Vol. 6044, pages: 522-534, (Editors: B Berger), Springer, Berlin, Germany, 14th Annual International Conference, RECOMB, 2010 (inproceedings)

DOI [BibTex]

Probabilistic latent variable models for distinguishing between cause and effect

Mooij, J., Stegle, O., Janzing, D., Zhang, K., Schölkopf, B.

In Advances in Neural Information Processing Systems 23, pages: 1687-1695, (Editors: J Lafferty and CKI Williams and J Shawe-Taylor and RS Zemel and A Culotta), Curran, Red Hook, NY, USA, 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
We propose a novel method for inferring whether X causes Y or vice versa from joint observations of X and Y. The basic idea is to model the observed data using probabilistic latent variable models, which incorporate the effects of unobserved noise. To this end, we consider the hypothetical effect variable to be a function of the hypothetical cause variable and an independent noise term (not necessarily additive). An important novel aspect of our work is that we do not restrict the model class, but instead put general non-parametric priors on this function and on the distribution of the cause. The causal direction can then be inferred by using standard Bayesian model selection. We evaluate our approach on synthetic data and real-world data and report encouraging results.

PDF Web [BibTex]

JigPheno: Semantic Feature Extraction in biological images

Karaletsos, T., Stegle, O., Winn, J., Borgwardt, K.

In NIPS, Workshop on Machine Learning in Computational Biology, 2010 (inproceedings)

[BibTex]

Nonparametric Tree Graphical Models

Song, L., Gretton, A., Guestrin, C.

In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, Volume 9 , pages: 765-772, (Editors: YW Teh and M Titterington ), JMLR, AISTATS, 2010 (inproceedings)

PDF [BibTex]

Novel machine learning methods for MHC Class I binding prediction

Widmer, C., Toussaint, N., Altun, Y., Kohlbacher, O., Rätsch, G.

In Pattern Recognition in Bioinformatics, pages: 98-109, (Editors: TMH Dijkstra and E Tsivtsivadze and E Marchiori and T Heskes), Springer, Berlin, Germany, 5th IAPR International Conference, PRIB, 2010 (inproceedings)

DOI [BibTex]

Bootstrapping Apprenticeship Learning

Boularias, A., Chaib-Draa, B.

In Advances in Neural Information Processing Systems 23, pages: 289-297, (Editors: Lafferty, J. , C. K.I. Williams, J. Shawe-Taylor, R. S. Zemel, A. Culotta), Curran, Red Hook, NY, USA, Twenty-Fourth Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
We consider the problem of apprenticeship learning where the examples, demonstrated by an expert, cover only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient tool for generalizing the demonstration, based on the assumption that the expert is maximizing a utility function that is a linear combination of state-action features. Most IRL algorithms use a simple Monte Carlo estimation to approximate the expected feature counts under the expert's policy. In this paper, we show that the quality of the learned policies is highly sensitive to the error in estimating the feature counts. To reduce this error, we introduce a novel approach for bootstrapping the demonstration by assuming that: (i), the expert is (near-)optimal, and (ii), the dynamics of the system is known. Empirical results on gridworlds and car racing problems show that our approach is able to learn good policies from a small number of demonstrations.
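The "simple Monte Carlo estimation" of expected feature counts mentioned in this abstract can be sketched as an average of discounted feature sums over the demonstrated trajectories. This is a minimal illustration under stated assumptions, not the paper's bootstrapping method: the function name `empirical_feature_counts`, the discount factor `gamma`, and the representation of a trajectory as a list of per-step feature vectors are all hypothetical.

```python
def empirical_feature_counts(trajectories, gamma=0.9):
    """Average discounted feature sum over demonstrated trajectories.

    trajectories: list of trajectories, each a list of feature vectors
    (one vector per visited state-action pair). gamma is an assumed
    discount factor.
    """
    dim = len(trajectories[0][0])
    totals = [0.0] * dim
    for traj in trajectories:
        discount = 1.0
        for features in traj:
            for k in range(dim):
                totals[k] += discount * features[k]
            discount *= gamma            # later steps count less
    return [t / len(trajectories) for t in totals]
```

With few demonstrations this estimate has high variance, which is the sensitivity the abstract points to.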

PDF Web [BibTex]

Distinguishing Causes from Effects using Nonlinear Acyclic Causal Models

Zhang, K., Hyvärinen, A.

In JMLR Workshop and Conference Proceedings, Volume 6, pages: 157-164, (Editors: I Guyon and D Janzing and B Schölkopf), MIT Press, Cambridge, MA, USA, Causality: Objectives and Assessment (NIPS Workshop), 2010 (inproceedings)

PDF Web [BibTex]

Clustering Based Approach to Learning Regular Expressions over Large Alphabet for Noisy Unstructured Text

Babbar, R., Singh, N.

In Proceedings of the Fourth Workshop on Analytics for Noisy Unstructured Text Data, pages: 43-50, (Editors: R Basili and DP Lopresti and C Ringlstetter and S Roy and KU Schulz and LV Subramaniam), ACM, AND (in conjunction with CIKM), 2010 (inproceedings)

Web [BibTex]

Characteristic Kernels on Structured Domains Excel in Robotics and Human Action Recognition

Danafar, S., Gretton, A., Schmidhuber, J.

In Machine Learning and Knowledge Discovery in Databases, LNCS Vol. 6321, pages: 264-279, (Editors: JL Balcázar and F Bonchi and A Gionis and M Sebag), Springer, Berlin, Germany, ECML PKDD, 2010 (inproceedings)

Abstract
Embedding probability distributions into a sufficiently rich (characteristic) reproducing kernel Hilbert space enables us to take higher order statistics into account. Characterization also retains effective statistical relation between inputs and outputs in regression and classification. Recent works established conditions for characteristic kernels on groups and semigroups. Here we study characteristic kernels on periodic domains, rotation matrices, and histograms. Such structured domains are relevant for homogeneity testing, forward kinematics, forward dynamics, inverse dynamics, etc. Our kernel-based methods with tailored characteristic kernels outperform previous methods on robotics problems and also on a widely used benchmark for recognition of human actions in videos.

DOI [BibTex]

Movement extraction by detecting dynamics switches and repetitions

Chiappa, S., Peters, J.

In Advances in Neural Information Processing Systems 23, pages: 388-396, (Editors: Lafferty, J. , C. K.I. Williams, J. Shawe-Taylor, R. S. Zemel, A. Culotta), Curran, Red Hook, NY, USA, Twenty-Fourth Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
Many time-series such as human movement data consist of a sequence of basic actions, e.g., forehands and backhands in tennis. Automatically extracting and characterizing such actions is an important problem for a variety of different applications. In this paper, we present a probabilistic segmentation approach in which an observed time-series is modeled as a concatenation of segments corresponding to different basic actions. Each segment is generated through a noisy transformation of one of a few hidden trajectories representing different types of movement, with possible time re-scaling. We analyze three different approximation methods for dealing with model intractability, and demonstrate how the proposed approach can successfully segment table tennis movements recorded using a robot arm as haptic input device.

PDF Web [BibTex]

Space-Variant Single-Image Blind Deconvolution for Removing Camera Shake

Harmeling, S., Hirsch, M., Schölkopf, B.

In Advances in Neural Information Processing Systems 23, pages: 829-837, (Editors: J Lafferty and CKI Williams and J Shawe-Taylor and RS Zemel and A Culotta), Curran, Red Hook, NY, USA, 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
Modelling camera shake as a space-invariant convolution simplifies the problem of removing camera shake, but often insufficiently models actual motion blur, such as that due to camera rotation and movements outside the sensor plane, or when objects in the scene have different distances to the camera. In an effort to address these limitations, (i) we introduce a taxonomy of camera shakes, (ii) we build on a recently introduced framework for space-variant filtering by Hirsch et al. and a fast algorithm for single image blind deconvolution for space-invariant filters by Cho and Lee to construct a method for blind deconvolution in the case of space-variant blur, and (iii) we present an experimental setup for evaluation that allows us to take images with real camera shake while at the same time recording the space-variant point spread function corresponding to that blur. Finally, we demonstrate that our method is able to deblur images degraded by spatially-varying blur originating from real camera shake, even without using additional motion sensor information.

PDF Web [BibTex]


Getting lost in space: Large sample analysis of the resistance distance

von Luxburg, U., Radl, A., Hein, M.

In Advances in Neural Information Processing Systems 23, pages: 2622-2630, (Editors: Lafferty, J., C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, A. Culotta), Curran, Red Hook, NY, USA, Twenty-Fourth Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. As an alternative we introduce the amplified commute distance that corrects for the undesired large sample effects.
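As a small illustration of the quantity studied in this abstract, the following Python sketch computes the commute distance exactly on a toy graph by solving the hitting-time equations of a simple random walk. The function names, the adjacency-list input format and the degree-based comparison value vol(G)(1/d_u + 1/d_v) are assumptions made for this example. The abstract does not state the limit expression, so the degree-based form is used here only as an example of a quantity that ignores graph structure.

```python
from fractions import Fraction


def hitting_times(adj, target):
    """Expected number of steps for a simple random walk to first reach
    `target`, from every vertex. `adj` maps each vertex to its neighbours.
    Assumes a connected, undirected graph without self-loops."""
    nodes = [v for v in adj if v != target]
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Augmented system: h(x) - sum_y P(x, y) h(y) = 1, with h(target) = 0.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for x in nodes:
        i = idx[x]
        A[i][i] += 1
        A[i][n] = Fraction(1)
        deg = len(adj[x])
        for y in adj[x]:
            if y != target:
                A[i][idx[y]] -= Fraction(1, deg)
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [a / pv for a in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    h = {target: Fraction(0)}
    for x in nodes:
        h[x] = A[idx[x]][n]
    return h


def commute_distance(adj, u, v):
    """Expected time to walk from u to v and back to u."""
    return hitting_times(adj, v)[u] + hitting_times(adj, u)[v]


def degree_based_value(adj, u, v):
    """Assumed structure-free comparison value: vol(G) * (1/d_u + 1/d_v)."""
    vol = sum(len(nbrs) for nbrs in adj.values())
    return vol * (Fraction(1, len(adj[u])) + Fraction(1, len(adj[v])))


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(commute_distance(triangle, 0, 1), degree_based_value(triangle, 0, 1))
```

On such a tiny graph the two values differ noticeably; the abstract's claim concerns the regime where the graph grows large.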

PDF Web [BibTex]


Distinguishing between cause and effect

Mooij, J., Janzing, D.

In JMLR Workshop and Conference Proceedings: Volume 6, pages: 147-156, (Editors: Guyon, I., D. Janzing, B. Schölkopf), MIT Press, Cambridge, MA, USA, Causality: Objectives and Assessment (NIPS Workshop), 2010 (inproceedings)

Abstract
We describe eight data sets that together formed the CauseEffectPairs task in the Causality Challenge #2: Pot-Luck competition. Each set consists of a sample of a pair of statistically dependent random variables. One variable is known to cause the other one, but this information was hidden from the participants; the task was to identify which of the two variables was the cause and which one the effect, based upon the observed sample. The data sets were chosen such that we expect common agreement on the ground truth. Even though part of the statistical dependences may also be due to hidden common causes, common sense tells us that there is a significant cause-effect relation between the two variables in each pair. We also present baseline results using three different causal inference methods.

PDF Web [BibTex]


Kernel Methods for Detecting the Direction of Time Series

Peters, J., Janzing, D., Gretton, A., Schölkopf, B.

In Advances in Data Analysis, Data Handling and Business Intelligence, pages: 57-66, (Editors: A Fink and B Lausen and W Seidel and A Ultsch), Springer, Berlin, Germany, 32nd Annual Conference of the Gesellschaft für Klassifikation e.V. (GfKl), 2010 (inproceedings)

Abstract
We propose two kernel based methods for detecting the time direction in empirical time series. First we apply a Support Vector Machine on the finite-dimensional distributions of the time series (classification method) by embedding these distributions into a Reproducing Kernel Hilbert Space. For the ARMA method we fit the observed data with an autoregressive moving average process and test whether the regression residuals are statistically independent of the past values. Whenever the dependence in one direction is significantly weaker than in the other we infer the former to be the true one. Both approaches were able to detect the direction of the true generating model for simulated data sets. We also applied our tests to a large number of real world time series. The ARMA method made a decision for a significant fraction of them, in which it was mostly correct, while the classification method did not perform as well, but still exceeded chance level.

PDF Web DOI [BibTex]


Switched Latent Force Models for Movement Segmentation

Alvarez, M., Peters, J., Schölkopf, B., Lawrence, N.

In Advances in Neural Information Processing Systems 23, pages: 55-63, (Editors: J Lafferty and CKI Williams and J Shawe-Taylor and RS Zemel and A Culotta), Curran, Red Hook, NY, USA, 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010 (inproceedings)

Abstract
Latent force models encode the interaction between multiple related dynamical systems in the form of a kernel or covariance function. Each variable to be modeled is represented as the output of a differential equation, and each differential equation is driven by a weighted sum of latent functions with uncertainty given by a Gaussian process prior. In this paper we consider employing the latent force model framework for the problem of determining robot motor primitives. To deal with discontinuities in the dynamical systems or the latent driving force, we introduce an extension of the basic latent force model that switches between different latent functions and potentially different dynamical systems. This creates a versatile representation for robot movements that can capture discrete changes and non-linearities in the dynamics. We give illustrative examples on both synthetic data and striking movements recorded using a Barrett WAM robot as a haptic input device. Our inspiration is robot motor primitives, but we expect our model to have wide application for dynamical systems, including models for human motion capture data and systems biology.

PDF Web [BibTex]


Naïve Security in a Wi-Fi World

Swanson, C., Urner, R., Lank, E.

In Trust Management IV - 4th IFIP WG 11.11 International Conference Proceedings, pages: 32-47, (Editors: Nishigaki, M., Josang, A., Murayama, Y., Marsh, S.), IFIPTM, 2010 (inproceedings)

link (url) DOI [BibTex]

2004


Attentional Modulation of Auditory Event-Related Potentials in a Brain-Computer Interface

Hill, J., Lal, T., Bierig, K., Birbaumer, N., Schölkopf, B.

In BioCAS04, (S3/5/INV- S3/17-20):4, IEEE Computer Society, Los Alamitos, CA, USA, 2004 IEEE International Workshop on Biomedical Circuits and Systems, December 2004 (inproceedings)

Abstract
Motivated by the particular problems involved in communicating with "locked-in" paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user's EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.

PDF Web DOI [BibTex]


Discrete vs. Continuous: Two Sides of Machine Learning

Zhou, D.

October 2004 (talk)

Abstract
We consider the problem of transductive inference. In many real-world problems, unlabeled data is far easier to obtain than labeled data. Hence transductive inference is very significant in many practical problems. According to Vapnik's point of view, one should predict the function value only on the given points directly rather than a function defined on the whole space, the latter being a more complicated problem. Inspired by this idea, we develop discrete calculus on finite discrete spaces, and then build discrete regularization. A family of transductive algorithms is naturally derived from this regularization framework. We validate the algorithms on both synthetic and real-world data, from text/web categorization to bioinformatics problems. A significant by-product of this work is a powerful way of ranking data based on examples, including images, documents, proteins and many other kinds of data. This talk is mainly based on the following contributions: (1) D. Zhou and B. Schölkopf: Transductive Inference with Graphs, MPI Technical Report, August 2004; (2) D. Zhou, B. Schölkopf and T. Hofmann: Semi-supervised Learning on Directed Graphs, NIPS 2004; (3) D. Zhou, O. Bousquet, T.N. Lal, J. Weston and B. Schölkopf: Learning with Local and Global Consistency, NIPS 2003.
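To make the idea of transductive inference on a finite set of points concrete, here is a minimal Python sketch of a graph-based label-spreading iteration in the spirit of contribution (3). It is an illustration under stated assumptions, not the algorithms presented in the talk: the function name, the dense list-of-lists affinity matrix `W`, the parameter values and the fixed iteration count are all choices made for this example. Labels are predicted only on the given vertices, never on unseen points.

```python
import math


def propagate_labels(W, Y, alpha=0.9, iters=200):
    """Spread labels over a graph (hypothetical helper for illustration).

    W: symmetric non-negative affinity matrix with zero diagonal.
    Y: one row per vertex, one column per class; a labeled vertex has a 1
       in its class column, unlabeled vertices are all zeros.
    Returns the predicted class index for every vertex.
    """
    n = len(W)
    k = len(Y[0])
    d = [sum(row) for row in W]
    # Symmetrically normalised affinities S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / math.sqrt(d[i] * d[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        # Each vertex mixes its neighbours' scores with its initial label.
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(k)] for i in range(n)]
    return [max(range(k), key=lambda c: F[i][c]) for i in range(n)]


# Path graph 0-1-2-3-4-5 with only the two end vertices labeled.
n = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
Y = [[0.0, 0.0] for _ in range(n)]
Y[0][0] = 1.0
Y[5][1] = 1.0
print(propagate_labels(W, Y))
```

In this toy case each unlabeled vertex takes the label of the nearer labeled end of the path.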

PDF [BibTex]


Using kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method

Honkela, A., Harmeling, S., Lundqvist, L., Valpola, H.

In ICA 2004, pages: 790-797, (Editors: Puntonet, C. G., A. Prieto), Springer, Berlin, Germany, Fifth International Conference on Independent Component Analysis and Blind Signal Separation, October 2004 (inproceedings)

Abstract
The variational Bayesian nonlinear blind source separation method introduced by Lappalainen and Honkela in 2000 is initialised with linear principal component analysis (PCA). Because of the multilayer perceptron (MLP) network used to model the nonlinearity, the method is susceptible to local minima and therefore sensitive to the initialisation used. As the method is used for nonlinear separation, the linear initialisation may in some cases lead it astray. In this paper we study the use of kernel PCA (KPCA) in the initialisation. KPCA is a rather straightforward generalisation of linear PCA and it is much faster to compute than the variational Bayesian method. The experiments show that it can produce significantly better initialisations than linear PCA. Additionally, the model comparison methods provided by the variational Bayesian framework can be easily applied to compare different kernels.

DOI [BibTex]


Robust ICA for Super-Gaussian Sources

Meinecke, F., Harmeling, S., Müller, K.

In ICA 2004, pages: 217-224, (Editors: Puntonet, C. G., A. Prieto), Springer, Berlin, Germany, Fifth International Conference on Independent Component Analysis and Blind Signal Separation, October 2004 (inproceedings)

Abstract
Most ICA algorithms are sensitive to outliers. Instead of robustifying existing algorithms by outlier rejection techniques, we show how a simple outlier index can be used directly to solve the ICA problem for super-Gaussian source signals. This ICA method is outlier-robust by construction and can be used for standard ICA as well as for over-complete ICA (i.e. more source signals than observed signals (mixtures)).

DOI [BibTex]


Modelling Spikes with Mixtures of Factor Analysers

Görür, D., Rasmussen, C., Tolias, A., Sinz, F., Logothetis, N.

In Pattern Recognition, pages: 391-398, LNCS 3175, (Editors: Rasmussen, C. E., H. H. Bülthoff, B. Schölkopf, M. A. Giese), Springer, Berlin, Germany, 26th DAGM Symposium, September 2004 (inproceedings)

Abstract
Identifying the action potentials of individual neurons from extracellular recordings, known as spike sorting, is a challenging problem. We consider the spike sorting problem using a generative model, mixtures of factor analysers, which concurrently performs clustering and feature extraction. The most important advantage of this method is that it quantifies the certainty with which the spikes are classified. This can be used as a means for evaluating the quality of clustering and therefore spike isolation. Using this method, nearly simultaneously occurring spikes can also be modelled, which is a hard task for many spike sorting methods. Furthermore, modelling the data with a generative model allows us to generate simulated data.

PDF PDF DOI [BibTex]


Learning Depth From Stereo

Sinz, F., Candela, J., Bakır, G., Rasmussen, C., Franz, M.

In 26th DAGM Symposium, pages: 245-252, LNCS 3175, (Editors: Rasmussen, C. E., H. H. Bülthoff, B. Schölkopf, M. A. Giese), Springer, Berlin, Germany, 26th DAGM Symposium, September 2004 (inproceedings)

Abstract
We compare two approaches to the problem of estimating the depth of a point in space from observing its image position in two different cameras: 1. The classical photogrammetric approach explicitly models the two cameras and estimates their intrinsic and extrinsic parameters using a tedious calibration procedure; 2. A generic machine learning approach where the mapping from image to spatial coordinates is directly approximated by a Gaussian Process regression. Our results show that the generic learning approach, in addition to simplifying the procedure of calibration, can lead to higher depth accuracies than classical calibration although no specific domain knowledge is used.
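For orientation, the geometric relationship that an explicit camera model exploits can be sketched in its simplest textbook form. The Python snippet below assumes an idealised, rectified pair of identical pinhole cameras, which is a much stronger assumption than the calibrated setup the abstract describes; the function name and parameters are hypothetical. It is neither the paper's calibration procedure nor its Gaussian Process approach.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth (in metres) of a point seen by two rectified pinhole cameras.

    x_left, x_right: horizontal image coordinates of the point, in pixels.
    focal_px: shared focal length in pixels (assumed known).
    baseline_m: distance between the camera centres in metres (assumed known).
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    # Similar triangles give Z = f * B / d.
    return focal_px * baseline_m / disparity


print(depth_from_disparity(320.0, 300.0, focal_px=800.0, baseline_m=0.1))
```

A learned mapping replaces this closed-form relation, and the calibration it depends on, with a regression from image coordinates to spatial coordinates.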

PDF PostScript Web [BibTex]


Grundlagen von Support Vector Maschinen und Anwendungen in der Bildverarbeitung

Eichhorn, J.

September 2004 (talk)

Abstract
Invited talk at the workshop "Numerical, Statistical and Discrete Methods in Image Processing" at the TU München (in German)

PDF [BibTex]


Stability of Hausdorff-based Distance Measures

Shapiro, MD., Blaschko, MB.

In VIIP, pages: 1-6, VIIP, September 2004 (inproceedings)

[BibTex]