2014


Single-Source Domain Adaptation with Target and Conditional Shift

Zhang, K., Schölkopf, B., Muandet, K., Wang, Z., Zhou, Z., Persello, C.

In Regularization, Optimization, Kernels, and Support Vector Machines, pages: 427-456, 19, Chapman & Hall/CRC Machine Learning & Pattern Recognition, (Editors: Suykens, J. A. K., Signoretto, M. and Argyriou, A.), Chapman and Hall/CRC, Boca Raton, USA, 2014 (inbook)

[BibTex]

Higher-Order Tensors in Diffusion Imaging

Schultz, T., Fuster, A., Ghosh, A., Deriche, R., Florack, L., Lim, L.

In Visualization and Processing of Tensors and Higher Order Descriptors for Multi-Valued Data, pages: 129-161, Mathematics + Visualization, (Editors: Westin, C.-F., Vilanova, A. and Burgeth, B.), Springer, 2014 (inbook)

[BibTex]

Fuzzy Fibers: Uncertainty in dMRI Tractography

Schultz, T., Vilanova, A., Brecheisen, R., Kindlmann, G.

In Scientific Visualization: Uncertainty, Multifield, Biomedical, and Scalable Visualization, pages: 79-92, 8, Mathematics + Visualization, (Editors: Hansen, C. D., Chen, M., Johnson, C. R., Kaufman, A. E. and Hagen, H.), Springer, 2014 (inbook)

[BibTex]

Nonconvex Proximal Splitting with Computational Errors

Sra, S.

In Regularization, Optimization, Kernels, and Support Vector Machines, pages: 83-102, 4, (Editors: Suykens, J. A. K., Signoretto, M. and Argyriou, A.), CRC Press, 2014 (inbook)

[BibTex]

Active Learning - Modern Learning Theory

Balcan, M., Urner, R.

In Encyclopedia of Algorithms, (Editors: Kao, M.-Y.), Springer Berlin Heidelberg, 2014 (incollection)

link (url) DOI [BibTex]

2013


A Review of Performance Variations in SMR-Based Brain–Computer Interfaces (BCIs)

Grosse-Wentrup, M., Schölkopf, B.

In Brain-Computer Interface Research, pages: 39-51, 4, SpringerBriefs in Electrical and Computer Engineering, (Editors: Guger, C., Allison, B. Z. and Edlinger, G.), Springer, 2013 (inbook)

PDF DOI [BibTex]

Semi-supervised learning in causal and anticausal settings

Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J.

In Empirical Inference, pages: 129-141, 13, Festschrift in Honor of Vladimir Vapnik, (Editors: Schölkopf, B., Luo, Z. and Vovk, V.), Springer, 2013 (inbook)

DOI [BibTex]

Tractable large-scale optimization in machine learning

Sra, S.

In Tractability: Practical Approaches to Hard Problems, pages: 202-230, 7, (Editors: Bordeaux, L., Hamadi, Y., Kohli, P. and Mateescu, R.), Cambridge University Press, 2013 (inbook)

[BibTex]

Animating Samples from Gaussian Distributions

Hennig, P.

(8), Max Planck Institute for Intelligent Systems, Tübingen, Germany, 2013 (techreport)

PDF [BibTex]

Maximizing Kepler science return per telemetered pixel: Detailed models of the focal plane in the two-wheel era

Hogg, D. W., Angus, R., Barclay, T., Dawson, R., Fergus, R., Foreman-Mackey, D., Harmeling, S., Hirsch, M., Lang, D., Montet, B. T., Schiminovich, D., Schölkopf, B.

arXiv:1309.0653, 2013 (techreport)

link (url) [BibTex]

Maximizing Kepler science return per telemetered pixel: Searching the habitable zones of the brightest stars

Montet, B. T., Angus, R., Barclay, T., Dawson, R., Fergus, R., Foreman-Mackey, D., Harmeling, S., Hirsch, M., Hogg, D. W., Lang, D., Schiminovich, D., Schölkopf, B.

arXiv:1309.0654, 2013 (techreport)

link (url) [BibTex]

On the Relations and Differences between Popper Dimension, Exclusion Dimension and VC-Dimension

Seldin, Y., Schölkopf, B.

In Empirical Inference - Festschrift in Honor of Vladimir N. Vapnik, pages: 53-57, 6, (Editors: Schölkopf, B., Luo, Z. and Vovk, V.), Springer, 2013 (inbook)

[BibTex]

2010


Computationally efficient algorithms for statistical image processing: Implementation in R

Langovoy, M., Wittich, O.

(2010-053), EURANDOM, Technische Universiteit Eindhoven, December 2010 (techreport)

Abstract
In the series of our earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that made it possible to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

PDF [BibTex]

Fast Convergent Algorithms for Expectation Propagation Approximate Bayesian Inference

Seeger, M., Nickisch, H.

Max Planck Institute for Biological Cybernetics, December 2010 (techreport)

Abstract
We propose a novel algorithm to solve the expectation propagation relaxation of Bayesian inference for continuous-variable graphical models. In contrast to most previous algorithms, our method is provably convergent. By marrying convergent EP ideas from (Opper&Winther 05) with covariance decoupling techniques (Wipf&Nagarajan 08, Nickisch&Seeger 09), it runs at least an order of magnitude faster than the most commonly used EP solver.

Web [BibTex]

Markerless tracking of Dynamic 3D Scans of Faces

Walder, C., Breidt, M., Bülthoff, H., Schölkopf, B., Curio, C.

In Dynamic Faces: Insights from Experiments and Computation, pages: 255-276, (Editors: Curio, C., Bülthoff, H. H. and Giese, M. A.), MIT Press, Cambridge, MA, USA, December 2010 (inbook)

Web [BibTex]

Policy Gradient Methods

Peters, J., Bagnell, J.

In Encyclopedia of Machine Learning, pages: 774-776, (Editors: Sammut, C. and Webb, G. I.), Springer, Berlin, Germany, December 2010 (inbook)

PDF Web DOI [BibTex]

A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering

Seldin, Y.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2010 (techreport)

Abstract
We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues. We derive a bound minimization algorithm and show that it provides good results in real-life problems and that the derived PAC-Bayesian bound is reasonably tight.

PDF Web [BibTex]

Sparse nonnegative matrix approximation: new formulations and algorithms

Tandon, R., Sra, S.

(193), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2010 (techreport)

Abstract
We introduce several new formulations for sparse nonnegative matrix approximation. Subsequently, we solve these formulations by developing generic algorithms. Further, to help select a particular sparse formulation, we briefly discuss the interpretation of each formulation. Finally, preliminary experiments are presented to illustrate the behavior of our formulations and algorithms.

PDF [BibTex]

Robust nonparametric detection of objects in noisy images

Langovoy, M., Wittich, O.

(2010-049), EURANDOM, Technische Universiteit Eindhoven, September 2010 (techreport)

Abstract
We propose a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that makes it possible to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object's interior is required. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. In this paper, we further develop the mathematical formalism of our method and explore important connections to the mathematical theory of percolation and statistical physics. We prove results on consistency and algorithmic complexity of our testing procedure. In addition, we address not only the asymptotic behavior of the method, but also the finite-sample performance of our test.

PDF [BibTex]

Large Scale Variational Inference and Experimental Design for Sparse Generalized Linear Models

Seeger, M., Nickisch, H.

Max Planck Institute for Biological Cybernetics, August 2010 (techreport)

Abstract
Many problems of low-level computer vision and image processing, such as denoising, deconvolution, tomographic reconstruction or super-resolution, can be addressed by maximizing the posterior distribution of a sparse linear model (SLM). We show how higher-order Bayesian decision-making problems, such as optimizing image acquisition in magnetic resonance scanners, can be addressed by querying the SLM posterior covariance, unrelated to the density's mode. We propose a scalable algorithmic framework, with which SLM posteriors over full, high-resolution images can be approximated for the first time, solving a variational optimization problem which is convex iff posterior mode finding is convex. These methods successfully drive the optimization of sampling trajectories for real-world magnetic resonance imaging through Bayesian experimental design, which has not been attempted before. Our methodology provides new insight into similarities and differences between sparse reconstruction and approximate Bayesian inference, and has important implications for compressive sensing of real-world images.

Web [BibTex]


Cooperative Cuts for Image Segmentation

Jegelka, S., Bilmes, J.

(UWEETR-1020-0003), University of Washington, Washington DC, USA, August 2010 (techreport)

Abstract
We propose a novel framework for graph-based cooperative regularization that uses submodular costs on graph edges. We introduce an efficient iterative algorithm to solve the resulting hard discrete optimization problem, and show that it has a guaranteed approximation factor. The edge-submodular formulation is amenable to the same extensions as standard graph cut approaches, and applicable to a range of problems. We apply this method to the image segmentation problem. Specifically, we use it to introduce a discount for homogeneous boundaries in binary image segmentation on very difficult images, namely long thin objects and color and grayscale images with a shading gradient. The experiments show that significant portions of previously truncated objects are now preserved.

Web [BibTex]

Fast algorithms for total-variation based optimization

Barbero, A., Sra, S.

(194), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2010 (techreport)

Abstract
We derive a number of methods to efficiently solve simple optimization problems subject to total-variation (TV) regularization, under different norms of the TV operator and for both 1-dimensional and 2-dimensional data. In spite of the non-smooth, non-separable nature of the TV terms considered, we show that a dual formulation with strong structure can be derived. Taking advantage of this structure we develop adaptations of existing algorithms from the optimization literature, resulting in efficient methods for the problem at hand. Experimental results show that for 1-dimensional data the proposed methods achieve convergence within good accuracy levels in practically linear time, both for L1 and L2 norms. For the more challenging 2-dimensional case a performance of order O(N^2 log^2 N) for N x N inputs is achieved when using the L2 norm. A final section suggests possible extensions and lines of further work.
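To make the idea of a structured dual concrete, here is a minimal sketch for the simplest case: 1-dimensional denoising with an L1 TV penalty, min_x 0.5*||x - y||^2 + lam * sum_i |x[i+1] - x[i]|. Its dual is a box-constrained least-squares problem, solved below by plain projected gradient. This is a generic textbook scheme chosen for brevity, not the algorithms of the report; the function name `tv_denoise_1d`, the step size, and the iteration count are illustrative assumptions.

```python
def tv_denoise_1d(y, lam, iters=500):
    """Sketch: 1-D TV denoising via projected gradient on the dual.

    Dual problem: minimise 0.5 * ||y - D^T u||^2 subject to |u_i| <= lam,
    where (D x)_i = x[i+1] - x[i]. The primal solution is x = y - D^T u.
    """
    n = len(y)
    u = [0.0] * (n - 1)   # one dual variable per pair of neighbours
    step = 0.25           # safe step: eigenvalues of D D^T are below 4

    def primal(u):
        # x_j = y_j - (D^T u)_j, with (D^T u)_j = u[j-1] - u[j]
        x = []
        for j in range(n):
            left = u[j - 1] if j > 0 else 0.0
            right = u[j] if j < n - 1 else 0.0
            x.append(y[j] - (left - right))
        return x

    for _ in range(iters):
        x = primal(u)
        for i in range(n - 1):
            grad_step = u[i] + step * (x[i + 1] - x[i])
            # Projection onto the box [-lam, lam]
            u[i] = max(-lam, min(lam, grad_step))
    return primal(u)


if __name__ == "__main__":
    noisy_step = [0.1, -0.1, 0.05, 1.1, 0.9, 1.0]
    print(tv_denoise_1d(noisy_step, lam=0.2))
```

The only constraint in the dual is a per-coordinate box, so the projection is a simple clip. The sum of the output always equals the sum of the input, because the entries of D^T u telescope to zero.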

PDF [BibTex]

Gaussian Mixture Modeling with Gaussian Process Latent Variable Models

Nickisch, H., Rasmussen, C.

Max Planck Institute for Biological Cybernetics, June 2010 (techreport)

Abstract
Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets.

Web [BibTex]

Generalized Proximity and Projection with Norms and Mixed-norms

Sra, S.

(192), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, May 2010 (techreport)

Abstract
We discuss generalized proximity operators (GPO) and their associated generalized projection problems. On inputs of size n, we show how to efficiently apply GPOs and generalized projections for separable norms and distance-like functions to accuracy ε in O(n log(1/ε)) time. We also derive projection algorithms that run theoretically in O(n log n log(1/ε)) time but can for suitable parameter ranges empirically outperform the O(n log(1/ε)) projection method. The proximity and projection tasks are either separable, and solved directly, or are reduced to a single root-finding step. We highlight that as a byproduct, our analysis also yields an O(n log(1/ε)) (weakly linear-time) procedure for Euclidean projections onto the ℓ1,∞-norm ball; previously only an O(n log n) method was known. We provide empirical evaluation to illustrate the performance of our methods, noting that for the ℓ1,∞-norm projection, our implementation is more than two orders of magnitude faster than the previously known method.
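As an illustration of how a projection can be reduced to a single root-finding step, the sketch below projects a vector onto the plain ℓ1-norm ball by bisecting on a soft-threshold level. Each bisection step costs O(n) and the number of steps grows like log(1/ε). It covers only this simpler ball, not the mixed-norm case treated in the report; `project_l1_ball` and its parameters are hypothetical names.

```python
import math


def project_l1_ball(v, radius=1.0, iters=100):
    """Sketch: Euclidean projection of v onto {x : ||x||_1 <= radius}.

    The projection is a soft-thresholding of v at a level tau chosen so
    that sum_i max(|v_i| - tau, 0) = radius. That sum is monotone in tau,
    so tau can be located by bisection.
    """
    if sum(abs(x) for x in v) <= radius:
        return list(v)                      # already inside the ball
    lo, hi = 0.0, max(abs(x) for x in v)    # the root lies in [lo, hi]
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if sum(max(abs(x) - tau, 0.0) for x in v) > radius:
            lo = tau                        # threshold too small
        else:
            hi = tau                        # threshold large enough
    tau = 0.5 * (lo + hi)
    # Shrink magnitudes by tau and keep the original signs
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


if __name__ == "__main__":
    print(project_l1_ball([3.0, -1.0, 0.5], radius=2.0))
```

A fixed iteration count is used instead of a tolerance test so that the loop always terminates, even when the interval can no longer shrink in floating point.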

PDF [BibTex]

Cooperative Cuts: Graph Cuts with Submodular Edge Weights

Jegelka, S., Bilmes, J.

(189), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, March 2010 (techreport)

Abstract
We introduce a problem we call Cooperative cut, where the goal is to find a minimum-cost graph cut but where a submodular function is used to define the cost of a subset of edges. That is, the cost of an edge that is added to the current cut set C depends on the edges in C. This generalization of the cost in the standard min-cut problem to a submodular cost function immediately makes the problem harder. Not only do we prove NP-hardness even for nonnegative submodular costs, but we also show a lower bound of Omega(|V|^(1/3)) on the approximation factor for the problem. On the positive side, we propose and compare four approximation algorithms with an overall approximation factor of min { |V|/2, |C*|, O( sqrt(|E|) log |V|), |P_max|}, where C* is the optimal solution, and P_max is the longest s, t path across the cut between given s, t. We also introduce additional heuristics for the problem which have attractive properties from the perspective of practical applications and implementations in that existing fast min-cut libraries may be used as subroutines. Both our approximation algorithms and our heuristics appear to do well in practice.

PDF [BibTex]

Learning Continuous Grasp Affordances by Sensorimotor Exploration

Detry, R., Baseski, E., Popovic, M., Touati, Y., Krüger, N., Kroemer, O., Peters, J., Piater, J.

In From Motor Learning to Interaction Learning in Robots, pages: 451-465, Studies in Computational Intelligence ; 264, (Editors: Sigaud, O. and Peters, J.), Springer, Berlin, Germany, January 2010 (inbook)

Abstract
We develop means of learning and representing object grasp affordances probabilistically. By grasp affordance, we refer to an entity that is able to assess whether a given relative object-gripper configuration will yield a stable grasp. These affordances are represented with grasp densities, continuous probability density functions defined on the space of 3D positions and orientations. Grasp densities are registered with a visual model of the object they characterize. They are exploited by aligning them to a target object using visual pose estimation. Grasp densities are refined through experience: A robot “plays” with an object by executing grasps drawn randomly from the object’s grasp density. The robot then uses the outcomes of these grasps to build a richer density through an importance sampling mechanism. Initial grasp densities, called hypothesis densities, are bootstrapped from grasps collected using a motion capture system, or from grasps generated from the visual model of the object. Refined densities, called empirical densities, represent affordances that have been confirmed through physical experience. The applicability of our method is demonstrated by producing empirical densities for two objects with a real robot and its 3-finger hand. Hypothesis densities are created from visual cues and human demonstration.

PDF Web DOI [BibTex]

Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling

Kober, J., Mohler, B., Peters, J.

In From Motor Learning to Interaction Learning in Robots, pages: 209-225, Studies in Computational Intelligence ; 264, (Editors: Sigaud, O. and Peters, J.), Springer, Berlin, Germany, January 2010 (inbook)

Abstract
Traditional motor primitive approaches rely largely on open-loop policies, which can only cope with small perturbations. In this paper, we present a new type of motor primitive policy that serves as a closed-loop policy, together with an appropriate learning algorithm. Our new motor primitives are an augmented version of the dynamical system-based motor primitives (Ijspeert et al., 2002) that incorporates perceptual coupling to external variables. We show that these motor primitives can perform complex tasks such as the Ball-in-a-Cup or Kendama task even with large variances in the initial conditions where a skilled human player would be challenged. We initialize the open-loop policies by imitation learning and the perceptual coupling with a handcrafted solution. We first improve the open-loop policies and subsequently the perceptual coupling using a novel reinforcement learning method which is particularly well-suited for dynamical system-based motor primitives.

PDF Web DOI [BibTex]

From Motor Learning to Interaction Learning in Robots

Sigaud, O., Peters, J.

In From Motor Learning to Interaction Learning in Robots, pages: 1-12, Studies in Computational Intelligence ; 264, (Editors: Sigaud, O. and Peters, J.), Springer, Berlin, Germany, January 2010 (inbook)

Abstract
The number of advanced robot systems has been increasing in recent years yielding a large variety of versatile designs with many degrees of freedom. These robots have the potential of being applicable in uncertain tasks outside well-structured industrial settings. However, the complexity of both systems and tasks is often beyond the reach of classical robot programming methods. As a result, a more autonomous solution for robot task acquisition is needed where robots adaptively adjust their behaviour to the encountered situations and required tasks. Learning approaches pose one of the most appealing ways to achieve this goal. However, while learning approaches are of high importance for robotics, we cannot simply use off-the-shelf methods from the machine learning community as these usually do not scale into the domains of robotics due to excessive computational cost as well as a lack of scalability. Instead, domain appropriate approaches are needed. In this book, we focus on several core domains of robot learning. For accurate task execution, we need motor learning capabilities. For fast learning of the motor tasks, imitation learning offers the most promising approach. Self improvement requires reinforcement learning approaches that scale into the domain of complex robots. Finally, for efficient interaction of humans with robot systems, we will need a form of interaction learning. This chapter provides a general introduction to these issues and briefly presents the contributions of the subsequent chapters to the corresponding research topics.

Web DOI [BibTex]

Real-Time Local GP Model Learning

Nguyen-Tuong, D., Seeger, M., Peters, J.

In From Motor Learning to Interaction Learning in Robots, pages: 193-207, Studies in Computational Intelligence ; 264, (Editors: Sigaud, O. and Peters, J.), Springer, Berlin, Germany, January 2010 (inbook)

Abstract
For many applications in robotics, accurate dynamics models are essential. However, in some applications, e.g., in model-based tracking control, precise dynamics models cannot be obtained analytically for sufficiently complex robot systems. In such cases, machine learning offers a promising alternative for approximating the robot dynamics using measured data. However, standard regression methods such as Gaussian process regression (GPR) suffer from high computational complexity, which to date prevents their use for large numbers of samples or online learning. In this paper, we propose an approximation to the standard GPR using local Gaussian process models inspired by (Vijayakumar et al., 2005; Snelson and Ghahramani, 2007). Due to reduced computational cost, local Gaussian processes (LGP) can be applied for larger sample-sizes and online learning. Comparisons with other nonparametric regressions, e.g., standard GPR, support vector regression (SVR) and locally weighted projection regression (LWPR), show that LGP has high approximation accuracy while being sufficiently fast for real-time online learning.

PDF Web DOI [BibTex]

Machine Learning Methods for Automatic Image Colorization

Charpiat, G., Bezrukov, I., Hofmann, M., Altun, Y., Schölkopf, B.

In Computational Photography: Methods and Applications, pages: 395-418, Digital Imaging and Computer Vision, (Editors: Lukac, R.), CRC Press, Boca Raton, FL, USA, 2010 (inbook)

Abstract
We aim to color greyscale images automatically, without any manual intervention. The color proposition could then be interactively corrected by user-provided color landmarks if necessary. Automatic colorization is nontrivial since there is usually no one-to-one correspondence between color and local texture. The contribution of our framework is that we deal directly with multimodality and estimate, for each pixel of the image to be colored, the probability distribution of all possible colors, instead of choosing the most probable color at the local level. We also predict the expected variation of color at each pixel, thus defining a non-uniform spatial coherency criterion. We then use graph cuts to maximize the probability of the whole colored image at the global level. We work in the L-a-b color space in order to approximate the human perception of distances between colors, and we use machine learning tools to extract as much information as possible from a dataset of colored examples. The resulting algorithm is fast, designed to be more robust to texture noise, and is above all able to deal with ambiguity, in contrast to previous approaches.

PDF Web [BibTex]

Approaches Based on Support Vector Machine to Classification of Remote Sensing Data

Bruzzone, L., Persello, C.

In Handbook of Pattern Recognition and Computer Vision, pages: 329-352, (Editors: Chen, C.H.), ICP, London, UK, 2010 (inbook)

Abstract
This chapter presents an extensive and critical review on the use of kernel methods and in particular of support vector machines (SVMs) in the classification of remote-sensing (RS) data. The chapter recalls the mathematical formulation and the main theoretical concepts related to SVMs, and discusses the motivations at the basis of the use of SVMs in remote sensing. A review on the main applications of SVMs in classification of remote sensing is given, presenting a literature survey on the use of SVMs for the analysis of different kinds of RS images. In addition, the most recent methodological developments related to SVM-based classification techniques in RS are illustrated by focusing on semisupervised, domain adaptation, and context sensitive approaches. Finally, the most promising research directions on SVM in RS are identified and discussed.

Web [BibTex]

Information-theoretic inference of common ancestors

Steudel, B., Ay, N.

Computing Research Repository (CoRR), abs/1010.5720, pages: 18, 2010 (techreport)

Web [BibTex]

2006


A New Projected Quasi-Newton Approach for the Nonnegative Least Squares Problem

Kim, D., Sra, S., Dhillon, I.

(TR-06-54), Univ. of Texas, Austin, December 2006 (techreport)

PDF [BibTex]

Probabilistic inference for solving (PO)MDPs

Toussaint, M., Harmeling, S., Storkey, A.

(934), School of Informatics, University of Edinburgh, December 2006 (techreport)

PDF [BibTex]

Minimal Logical Constraint Covering Sets

Sinz, F., Schölkopf, B.

(155), Max Planck Institute for Biological Cybernetics, Tübingen, December 2006 (techreport)

Abstract
We propose a general framework for computing minimal set covers under a certain class of logical constraints. The underlying idea is to transform the problem into a mathematical program under linear constraints. In this sense it can be seen as a natural extension of the vector quantization algorithm proposed by Tipping and Schölkopf. We show which class of logical constraints can be cast and relaxed into linear constraints and give an algorithm for the transformation.

PDF [BibTex]

Prediction of Protein Function from Networks

Shin, H., Tsuda, K.

In Semi-Supervised Learning, pages: 361-376, Adaptive Computation and Machine Learning, (Editors: Chapelle, O., Schölkopf, B. and Zien, A.), MIT Press, Cambridge, MA, USA, November 2006 (inbook)

Abstract
In computational biology, it is common to represent domain knowledge using graphs. Frequently there exist multiple graphs for the same set of nodes, representing information from different sources, and no single graph is sufficient to predict class labels of unlabelled nodes reliably. One way to enhance reliability is to integrate multiple graphs, since individual graphs are partly independent and partly complementary to each other for prediction. In this chapter, we describe an algorithm to assign weights to multiple graphs within graph-based semi-supervised learning. Both predicting class labels and searching for weights for combining multiple graphs are formulated into one convex optimization problem. The graph-combining method is applied to functional class prediction of yeast proteins. When compared with individual graphs, the combined graph with optimized weights performs significantly better than any single graph. When compared with the semidefinite programming-based support vector machine (SDP/SVM), it shows comparable accuracy in a remarkably short time. Compared with a combined graph with equal-valued weights, our method could select important graphs without loss of accuracy, which implies the desirable property of integration with selectivity.

Web [BibTex]

Discrete Regularization

Zhou, D., Schölkopf, B.

In Semi-Supervised Learning, pages: 237-250, Adaptive Computation and Machine Learning, (Editors: Chapelle, O., Schölkopf, B. and Zien, A.), MIT Press, Cambridge, MA, USA, November 2006 (inbook)

Abstract
Many real-world machine learning problems are situated on finite discrete sets, including dimensionality reduction, clustering, and transductive inference. A variety of approaches for learning from finite sets has been proposed from different motivations and for different problems. In most of those approaches, a finite set is modeled as a graph, in which the edges encode pairwise relationships among the objects in the set. Consequently many concepts and methods from graph theory are adopted. In particular, the graph Laplacian is widely used. In this chapter we present a systemic framework for learning from a finite set represented as a graph. We develop discrete analogues of a number of differential operators, and then construct a discrete analogue of classical regularization theory based on those discrete differential operators. The graph Laplacian based approaches are special cases of this general discrete regularization framework. An important implication of this framework is that we have a wide choice of regularizers on graphs in addition to the widely-used graph Laplacian based one.

PDF Web [BibTex]

New Methods for the P300 Visual Speller

Biessmann, F.

(1), (Editors: Hill, J.), Max-Planck Institute for Biological Cybernetics, Tübingen, Germany, November 2006 (techreport)

PDF [BibTex]

Geometric Analysis of Hilbert Schmidt Independence criterion based ICA contrast function

Shen, H., Jegelka, S., Gretton, A.

(PA006080), National ICT Australia, Canberra, Australia, October 2006 (techreport)

Web [BibTex]

A tutorial on spectral clustering

von Luxburg, U.

(149), Max Planck Institute for Biological Cybernetics, Tübingen, August 2006 (techreport)

Abstract
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. Nevertheless, at first glance spectral clustering looks a bit mysterious, and it is not obvious why it works at all and what it really does. This article is a tutorial introduction to spectral clustering. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
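To give a feel for what a graph-Laplacian method computes, the following is a minimal two-cluster sketch using the unnormalized Laplacian L = D - W. It makes two simplifying assumptions that are not taken from the tutorial: the second eigenvector is found by a hand-written power iteration rather than a linear algebra library, and the final k-means step is replaced by thresholding that eigenvector at zero. The name `spectral_bipartition` is hypothetical.

```python
import random


def spectral_bipartition(W, iters=200, seed=0):
    """Sketch: split a weighted graph in two using the Fiedler vector.

    W is a symmetric weight matrix (list of lists) with a zero diagonal.
    Power iteration on (c*I - L), restricted to vectors orthogonal to the
    constant vector, converges to the eigenvector of L = D - W with the
    second-smallest eigenvalue.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    c = 2.0 * max(deg)   # Gershgorin bound on the largest eigenvalue of L
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # w = (c*I - L) v = (c - deg) * v + W v
        w = [(c - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]            # remove the constant component
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:                      # degenerate input, e.g. no edges
            break
        v = [x / norm for x in w]
    return [1 if x > 0 else 0 for x in v]    # the sign decides the cluster


if __name__ == "__main__":
    # Two triangles joined by one weak edge between nodes 2 and 3
    W = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0.1, 0, 0],
        [0, 0, 0.1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(spectral_bipartition(W))
```

Which cluster receives label 1 depends on the sign of the converged eigenvector, so only the grouping is meaningful, not the label values.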

PDF [BibTex]

Towards the Inference of Graphs on Ordered Vertexes

Zien, A., Raetsch, G., Ong, C.

(150), Max Planck Institute for Biological Cybernetics, Tübingen, August 2006 (techreport)

Abstract
We propose novel methods for machine learning of structured output spaces. Specifically, we consider outputs which are graphs with vertices that have a natural order. We consider the usual adjacency matrix representation of graphs, as well as two other representations for such a graph: (a) decomposing the graph into a set of paths, (b) converting the graph into a single sequence of nodes with labeled edges. For each of the three representations, we propose an encoding and decoding scheme. We also propose an evaluation measure for comparing two graphs.

PDF [BibTex]


Nonnegative Matrix Approximation: Algorithms and Applications

Sra, S., Dhillon, I.

Univ. of Texas, Austin, May 2006 (techreport)

[BibTex]


An Automated Combination of Sequence Motif Kernels for Predicting Protein Subcellular Localization

Zien, A., Ong, C.

(146), Max Planck Institute for Biological Cybernetics, Tübingen, April 2006 (techreport)

Abstract
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. While many predictive computational tools have been proposed, they tend to have complicated architectures and require many design decisions from the developer. We propose an elegant and fully automated approach to building a prediction system for protein subcellular localization. We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We further propose a multiclass support vector machine method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we generalize our method to optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets.

PDF Web [BibTex]


Training a Support Vector Machine in the Primal

Chapelle, O.

(147), Max Planck Institute for Biological Cybernetics, Tübingen, April 2006, The version in the "Large Scale Kernel Machines" book is more up to date. (techreport)

Abstract
Most literature on Support Vector Machines (SVMs) concentrates on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and there is no reason for ignoring it. Moreover, from the primal point of view, new families of algorithms for large scale SVM training can be investigated.

PDF [BibTex]


Cross-Validation Optimization for Structured Hessian Kernel Methods

Seeger, M., Chapelle, O.

Max-Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2006 (techreport)

Abstract
We address the problem of learning hyperparameters in kernel methods for which the Hessian of the objective is structured. We propose an approximation to the cross-validation log likelihood whose gradient can be computed analytically, solving the hyperparameter learning problem efficiently through nonlinear optimization. Crucially, our learning method is based entirely on matrix-vector multiplication primitives with the kernel matrices and their derivatives, allowing straightforward specialization to new kernels or to large datasets. When applied to the problem of multi-way classification, our method scales linearly in the number of classes and gives rise to state-of-the-art results on a remote imaging task.

PDF Web [BibTex]


Combining a Filter Method with SVMs

Lal, T., Chapelle, O., Schölkopf, B.

In Feature Extraction: Foundations and Applications, pages: 439-446, Studies in Fuzziness and Soft Computing ; 207, (Editors: Guyon, I., Nikravesh, M., Gunn, S. and Zadeh, L. A.), Springer, Berlin, Germany, 2006 (inbook)

Abstract
Our goal for the competition (the NIPS 2003 feature selection competition) was to evaluate the usefulness of simple machine learning techniques. We decided to use the correlation criteria as a feature selection method and Support Vector Machines for the classification part. Here we explain how we chose the regularization parameter C of the SVM, how we determined the kernel parameter, and how we estimated the number of features used for each data set. All analyses were carried out on the training sets of the competition data. We choose the data set Arcene as an example to explain the approach step by step. In our view the point of this competition was the construction of a well-performing classifier rather than the systematic analysis of a specific approach. This is why our search for the best classifier was only guided by the described methods and why we deviated from the road map on several occasions. All calculations were done with the software Spider [2004].

PDF DOI [BibTex]


Embedded methods

Lal, T., Chapelle, O., Weston, J., Elisseeff, A.

In Feature Extraction: Foundations and Applications, pages: 137-165, Studies in Fuzziness and Soft Computing ; 207, (Editors: Guyon, I., Gunn, S., Nikravesh, M. and Zadeh, L. A.), Springer, Berlin, Germany, 2006 (inbook)

Abstract
Embedded methods are a relatively new approach to feature selection. Unlike filter methods, which do not incorporate learning, and wrapper approaches, which can be used with arbitrary classifiers, in embedded methods the feature selection part cannot be separated from the learning part. Existing embedded methods are reviewed based on a unifying mathematical framework.

PDF Web [BibTex]


Implicit Wiener Series, Part II: Regularised estimation

Gehler, P., Franz, M.

(148), Max Planck Institute, 2006 (techreport)

pdf [BibTex]

2003


Support Vector Channel Selection in BCI

Lal, T., Schröder, M., Hinterberger, T., Weston, J., Bogdan, M., Birbaumer, N., Schölkopf, B.

(120), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, December 2003 (techreport)

Abstract
When designing a Brain Computer Interface (BCI) system, one can choose from a variety of features that may be useful for classifying brain activity during a mental task. For the special case of classifying EEG signals, we propose the use of the state-of-the-art feature selection algorithms Recursive Feature Elimination [3] and Zero-Norm Optimization [13], which are based on the training of Support Vector Machines (SVM) [11]. These algorithms can provide more accurate solutions than standard filter methods for feature selection [14]. We adapt the methods for the purpose of selecting EEG channels. For a motor imagery paradigm, we show that the number of channels used can be reduced significantly without increasing the classification error. The resulting best channels agree well with the expected underlying cortical activity patterns during the mental tasks. Furthermore, we show how time-dependent, task-specific information can be visualized.

PDF Web [BibTex]


Technical report on Separation methods for nonlinear mixtures

Jutten, C., Karhunen, J., Almeida, L., Harmeling, S.

(D29), EU-Project BLISS, October 2003 (techreport)

PDF [BibTex]
