2009


Policy Search for Motor Primitives

Peters, J., Kober, J.

KI - Zeitschrift Künstliche Intelligenz, 23(3):38-40, August 2009 (article)

Abstract
Many motor skills in humanoid robotics can be learned using parametrized motor primitives from demonstrations. However, most interesting motor learning problems require self-improvement often beyond the reach of current reinforcement learning methods due to the high dimensionality of the state-space. We develop an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to several well-known parametrized policy search methods and show that it outperforms them. We apply it to motor learning problems and show that it can learn the complex Ball-in-a-Cup task using a real Barrett WAM robot arm.

Web [BibTex]

A neurophysiologically plausible population code model for human contrast discrimination

Goris, R., Wichmann, F., Henning, G.

Journal of Vision, 9(7):1-22, July 2009 (article)

Abstract
The pedestal effect is the improvement in the detectability of a sinusoidal grating in the presence of another grating of the same orientation, spatial frequency, and phase—usually called the pedestal. Recent evidence has demonstrated that the pedestal effect is differently modified by spectrally flat and notch-filtered noise: The pedestal effect is reduced in flat noise but virtually disappears in the presence of notched noise (G. B. Henning & F. A. Wichmann, 2007). Here we consider a network consisting of units whose contrast response functions resemble those of the cortical cells believed to underlie human pattern vision and demonstrate that, when the outputs of multiple units are combined by simple weighted summation—a heuristic decision rule that resembles optimal information combination and produces a contrast-dependent weighting profile—the network produces contrast-discrimination data consistent with psychophysical observations: The pedestal effect is present without noise, reduced in broadband noise, but almost disappears in notched noise. These findings follow naturally from the normalization model of simple cells in primary visual cortex, followed by response-based pooling, and suggest that in processing even low-contrast sinusoidal gratings, the visual system may combine information across neurons tuned to different spatial frequencies and orientations.

Web DOI [BibTex]

A Novel Context-Sensitive Semisupervised SVM Classifier Robust to Mislabeled Training Samples

Bruzzone, L., Persello, C.

IEEE Transactions on Geoscience and Remote Sensing, 47(7):2142-2154, July 2009 (article)

Abstract
This paper presents a novel context-sensitive semisupervised support vector machine (CS4VM) classifier, which is aimed at addressing classification problems where the available training set is not fully reliable, i.e., some labeled samples may be associated to the wrong information class (mislabeled patterns). Unlike standard context-sensitive methods, the proposed CS4VM classifier exploits the contextual information of the pixels belonging to the neighborhood system of each training sample in the learning phase to improve the robustness to possible mislabeled training patterns. This is achieved according to both the design of a semisupervised procedure and the definition of a novel contextual term in the cost function associated with the learning of the classifier. In order to assess the effectiveness of the proposed CS4VM and to understand the impact of the addressed problem in real applications, we also present an extensive experimental analysis carried out on training sets that include different percentages of mislabeled patterns having different distributions on the classes. In the analysis, we also study the robustness to mislabeled training patterns of some widely used supervised and semisupervised classification algorithms (i.e., conventional support vector machine (SVM), progressive semisupervised SVM, maximum likelihood, and k-nearest neighbor). Results obtained on a very high resolution image and on a medium resolution image confirm both the robustness and the effectiveness of the proposed CS4VM with respect to standard classification algorithms and allow us to derive interesting conclusions on the effects of mislabeled patterns on different classifiers.

Web DOI [BibTex]

Falsificationism and Statistical Learning Theory: Comparing the Popper and Vapnik-Chervonenkis Dimensions

Corfield, D., Schölkopf, B., Vapnik, V.

Journal for General Philosophy of Science, 40(1):51-58, July 2009 (article)

Abstract
We compare Karl Popper’s ideas concerning the falsifiability of a theory with similar notions from the part of statistical learning theory known as VC-theory. Popper’s notion of the dimension of a theory is contrasted with the apparently very similar VC-dimension. Having located some divergences, we discuss how best to view Popper’s work from the perspective of statistical learning theory, either as a precursor or as aiming to capture a different learning activity.

PDF DOI [BibTex]


Randomized algorithms for statistical image analysis based on percolation theory

Davies, P., Langovoy, M., Wittich, O.

27th European Meeting of Statisticians (EMS), July 2009 (talk)

Abstract
We propose a novel probabilistic method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation and random graph theories (see Grimmett (1999)). We address the problem of detection and estimation of signals in situations where the signal-to-noise ratio is particularly low. We present an algorithm that allows to detect objects of various shapes in noisy images. The algorithm has linear complexity and exponential accuracy. Our algorithm substantially differs from wavelets-based algorithms (see Arias-Castro et al. (2005)). Moreover, we present an algorithm that produces a crude estimate of an object based on the noisy picture. This algorithm also has linear complexity and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedures.

Web PDF [BibTex]

Guest editorial: Special issue on robot learning, Part A

Peters, J., Ng, A.

Autonomous Robots, 27(1):1-2, July 2009 (article)

PDF PDF DOI [BibTex]

A Geometric Approach to Confidence Sets for Ratios: Fieller’s Theorem, Generalizations, and Bootstrap

von Luxburg, U., Franz, V.

Statistica Sinica, 19(3):1095-1117, July 2009 (article)

Abstract
We present a geometric method to determine confidence sets for the ratio E(Y)/E(X) of the means of random variables X and Y. This method reduces the problem of constructing confidence sets for the ratio of two random variables to the problem of constructing confidence sets for the means of one-dimensional random variables. It is valid in a large variety of circumstances. In the case of normally distributed random variables, the so constructed confidence sets coincide with the standard Fieller confidence sets. Generalizations of our construction lead to definitions of exact and conservative confidence sets for very general classes of distributions, provided the joint expectation of (X,Y) exists and the linear combinations of the form aX + bY are well-behaved. Finally, our geometric method allows to derive a very simple bootstrap approach for constructing conservative confidence sets for ratios which perform favorably in certain situations, in particular in the asymmetric heavy-tailed regime.

PDF PDF Web [BibTex]


Learning Motor Primitives for Robotics

Kober, J., Peters, J., Oztop, E.

Advanced Telecommunications Research Center ATR, June 2009 (talk)

Abstract
The acquisition and self-improvement of novel motor skills is among the most important problems in robotics. Motor primitives offer one of the most promising frameworks for the application of machine learning techniques in this context. Employing the Dynamic Systems Motor primitives originally introduced by Ijspeert et al. (2003), appropriate learning algorithms for a concerted approach of both imitation and reinforcement learning are presented. Using these algorithms new motor skills, i.e., Ball-in-a-Cup, Ball-Paddling and Dart-Throwing, are learned.

[BibTex]

Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer

Lampert, C.

IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), June 2009 (talk)

Web [BibTex]

Center-surround patterns emerge as optimal predictors for human saccade targets

Kienzle, W., Franz, M., Schölkopf, B., Wichmann, F.

Journal of Vision, 9(5:7):1-15, May 2009 (article)

Abstract
The human visual system is foveated, that is, outside the central visual field resolution and acuity drop rapidly. Nonetheless much of a visual scene is perceived after only a few saccadic eye movements, suggesting an effective strategy for selecting saccade targets. It has been known for some time that local image structure at saccade targets influences the selection process. However, the question of what the most relevant visual features are is still under debate. Here we show that center-surround patterns emerge as the optimal solution for predicting saccade targets from their local image structure. The resulting model, a one-layer feed-forward network, is surprisingly simple compared to previously suggested models which assume much more complex computations such as multi-scale processing and multiple feature channels. Nevertheless, our model is equally predictive. Furthermore, our findings are consistent with neurophysiological hardware in the superior colliculus. Bottom-up visual saliency may thus not be computed cortically as has been thought previously.

PDF DOI [BibTex]


Influence of Different Assignment Conditions on the Determination of Symmetric Homo-dimeric Structures with ARIA

Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, TE., Nilges, M.

Proteins, 75(3):569-585, May 2009 (article)

Abstract
The ambiguous restraint for iterative assignment (ARIA) approach for NMR structure calculation is evaluated for symmetric homodimeric proteins by assessing the effect of several data analysis and assignment methods on the structure quality. In particular, we study the effects of network anchoring and spin-diffusion correction. The spin-diffusion correction improves the protein structure quality systematically, whereas network anchoring enhances the assignment efficiency by speeding up the convergence and coping with highly ambiguous data. For some homodimeric folds, network anchoring has been proved essential for unraveling both chain and proton assignment ambiguities.

Web DOI [BibTex]

Beamforming in Noninvasive Brain-Computer Interfaces

Grosse-Wentrup, M., Liefhold, C., Gramann, K., Buss, M.

IEEE Transactions on Biomedical Engineering, 56(4):1209-1219, April 2009 (article)

Abstract
Spatial filtering (SF) constitutes an integral part of building EEG-based brain–computer interfaces (BCIs). Algorithms frequently used for SF, such as common spatial patterns (CSPs) and independent component analysis, require labeled training data for identifying filters that provide information on a subject’s intention, which renders these algorithms susceptible to overfitting on artifactual EEG components. In this study, beamforming is employed to construct spatial filters that extract EEG sources originating within predefined regions of interest within the brain. In this way, neurophysiological knowledge on which brain regions are relevant for a certain experimental paradigm can be utilized to construct unsupervised spatial filters that are robust against artifactual EEG components. Beamforming is experimentally compared with CSP and Laplacian spatial filtering (LP) in a two-class motor-imagery paradigm. It is demonstrated that beamforming outperforms CSP and LP on noisy datasets, while CSP and beamforming perform almost equally well on datasets with few artifactual trials. It is concluded that beamforming constitutes an alternative method for SF that might be particularly useful for BCIs used in clinical settings, i.e., in an environment where artifact-free datasets are difficult to obtain.

PDF Web DOI [BibTex]

Constructing Sparse Kernel Machines Using Attractors

Lee, D., Jung, K., Lee, J.

IEEE Transactions on Neural Networks, 20(4):721-729, April 2009 (article)

Abstract
In this brief, a novel method that constructs a sparse kernel machine is proposed. The proposed method generates attractors as sparse solutions from a built-in kernel machine via a dynamical system framework. By readjusting the corresponding coefficients and bias terms, a sparse kernel machine that approximates a conventional kernel machine is constructed. The simulation results show that the constructed sparse kernel machine improves the efficiency of testing phase while maintaining comparable test error.

Web DOI [BibTex]

Optimal construction of k-nearest-neighbor graphs for identifying noisy clusters

Maier, M., Hein, M., von Luxburg, U.

Theoretical Computer Science, 410(19):1749-1764, April 2009 (article)

Abstract
We study clustering algorithms based on neighborhood graphs on a random sample of data points. The question we ask is how such a graph should be constructed in order to obtain optimal clustering results. Which type of neighborhood graph should one choose, mutual k-nearest-neighbor or symmetric k-nearest-neighbor? What is the optimal parameter k? In our setting, clusters are defined as connected components of the t-level set of the underlying probability distribution. Clusters are said to be identified in the neighborhood graph if connected components in the graph correspond to the true underlying clusters. Using techniques from random geometric graph theory, we prove bounds on the probability that clusters are identified successfully, both in a noise-free and in a noisy setting. Those bounds lead to several conclusions. First, k has to be chosen surprisingly high (rather of the order n than of the order log n) to maximize the probability of cluster identification. Secondly, the major difference between the mutual and the symmetric k-nearest-neighbor graph occurs when one attempts to detect the most significant cluster only.

PDF PDF DOI [BibTex]


Overlap and refractory effects in a Brain-Computer Interface speller based on the visual P300 Event-Related Potential

Martens, S., Hill, N., Farquhar, J., Schölkopf, B.

Journal of Neural Engineering, 6(2):1-9, April 2009 (article)

Abstract
We reveal the presence of refractory and overlap effects in the event-related potentials in visual P300 speller datasets, and we show their negative impact on the performance of the system. This finding has important implications for how to encode the letters that can be selected for communication. However, we show that such effects are dependent on stimulus parameters: an alternative stimulus type based on apparent motion suffers less from the refractory effects and leads to an improved letter prediction performance.

PDF DOI [BibTex]


Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions

Bubeck, S., von Luxburg, U.

Journal of Machine Learning Research, 10, pages: 657-698, March 2009 (article)

Abstract
Clustering is often formulated as a discrete optimization problem. The objective is to find, among all partitions of the data set, the best one according to some quality measure. However, in the statistical setting where we assume that the finite data set has been sampled from some underlying space, the goal is not to find the best partition of the given sample, but to approximate the true partition of the underlying space. We argue that the discrete optimization approach usually does not achieve this goal, and instead can lead to inconsistency. We construct examples which provably have this behavior. As in the case of supervised learning, the cure is to restrict the size of the function classes under consideration. For appropriate “small” function classes we can prove very general consistency theorems for clustering optimization schemes. As one particular algorithm for clustering with a restricted function space we introduce “nearest neighbor clustering”. Similar to the k-nearest neighbor classifier in supervised learning, this algorithm can be seen as a general baseline algorithm to minimize arbitrary clustering objective functions. We prove that it is statistically consistent for all commonly used clustering objective functions.

PDF Web [BibTex]


Protein Functional Class Prediction With a Combined Graph

Shin, H., Tsuda, K., Schölkopf, B.

Expert Systems with Applications, 36(2):3284-3292, March 2009 (article)

Abstract
In bioinformatics, there exist multiple descriptions of graphs for the same set of genes or proteins. For instance, in yeast systems, graph edges can represent different relationships such as protein–protein interactions, genetic interactions, or co-participation in a protein complex, etc. Relying on similarities between nodes, each graph can be used independently for prediction of protein function. However, since different graphs contain partly independent and partly complementary information about the problem at hand, one can enhance the total information extracted by combining all graphs. In this paper, we propose a method for integrating multiple graphs within a framework of semi-supervised learning. The method alternates between minimizing the objective function with respect to network output and with respect to combining weights. We apply the method to the task of protein functional class prediction in yeast. The proposed method performs significantly better than the same algorithm trained on any single graph.

Web DOI [BibTex]

Gaussian Process Dynamic Programming

Deisenroth, M., Rasmussen, C., Peters, J.

Neurocomputing, 72(7-9):1508-1524, March 2009 (article)

Abstract
Reinforcement learning (RL) and optimal control of systems with continuous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value-function based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowledge is available, and a classic RL problem, where only very general priors can be used. For the classic optimal control problem, GPDP models the unknown value functions with Gaussian processes and generalizes dynamic programming to continuous-valued states and actions. For the RL problem, GPDP starts from a given initial state and explores the state space using Bayesian active learning. To design a fast learner, available data has to be used efficiently. Hence, we propose to learn probabilistic models of the a priori unknown transition dynamics and the value functions on the fly. In both cases, we successfully apply the resulting continuous-valued controllers to the under-actuated pendulum swing up and analyze the performances of the suggested algorithms. It turns out that GPDP uses data very efficiently and can be applied to problems, where classic dynamic programming would be cumbersome.

PDF PDF DOI [BibTex]

An algebraic characterization of the optimum of regularized kernel methods

Dinuzzo, F., De Nicolao, G.

Machine Learning, 74(3):315-345, March 2009 (article)

Abstract
The representer theorem for kernel methods states that the solution of the associated variational problem can be expressed as the linear combination of a finite number of kernel functions. However, for non-smooth loss functions, the analytic characterization of the coefficients poses nontrivial problems. Standard approaches resort to constrained optimization reformulations which, in general, lack a closed-form solution. Herein, by a proper change of variable, it is shown that, for any convex loss function, the coefficients satisfy a system of algebraic equations in a fixed-point form, which may be directly obtained from the primal formulation. The algebraic characterization is specialized to regression and classification methods and the fixed-point equations are explicitly characterized for many loss functions of practical interest. The consequences of the main result are then investigated along two directions. First, the existence of an unconstrained smooth reformulation of the original non-smooth problem is proven. Second, in the context of SURE (Stein’s Unbiased Risk Estimation), a general formula for the degrees of freedom of kernel regression methods is derived.

PDF DOI [BibTex]

Towards quantitative PET/MRI: a review of MR-based attenuation correction techniques

Hofmann, M., Pichler, B., Schölkopf, B., Beyer, T.

European Journal of Nuclear Medicine and Molecular Imaging, 36(Supplement 1):93-104, March 2009 (article)

Abstract
Introduction: Positron emission tomography (PET) is a fully quantitative technology for imaging metabolic pathways and dynamic processes in vivo. Attenuation correction of raw PET data is a prerequisite for quantification and is typically based on separate transmission measurements. In PET/CT attenuation correction, however, is performed routinely based on the available CT transmission data. Objective: Recently, combined PET/magnetic resonance (MR) has been proposed as a viable alternative to PET/CT. Current concepts of PET/MRI do not include CT-like transmission sources and, therefore, alternative methods of PET attenuation correction must be found. This article reviews existing approaches to MR-based attenuation correction (MR-AC). Most groups have proposed MR-AC algorithms for brain PET studies and more recently also for torso PET/MR imaging. Most MR-AC strategies require the use of complementary MR and transmission images, or morphology templates generated from transmission images. We review and discuss these algorithms and point out challenges for using MR-AC in clinical routine. Discussion: MR-AC is work-in-progress with potentially promising results from a template-based approach applicable to both brain and torso imaging. While efforts are ongoing in making clinically viable MR-AC fully automatic, further studies are required to realize the potential benefits of MR-based motion compensation and partial volume correction of the PET data.

PDF DOI [BibTex]

Generating Spike Trains with Specified Correlation Coefficients

Macke, J., Berens, P., Ecker, A., Tolias, A., Bethge, M.

Neural Computation, 21(2):397-423, February 2009 (article)

Abstract
Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions.

PDF Web DOI [BibTex]

Automatic detection of preclinical neurodegeneration: Presymptomatic Huntington disease

Klöppel, S., Chu, C., Tan, G., Draganski, B., Johnson, H., Paulsen, J., Kienzle, W., Tabrizi, S., Ashburner, J., Frackowiak, R.

Neurology, 72(5):426-431, February 2009 (article)

Abstract
Background: Treatment of neurodegenerative diseases is likely to be most beneficial in the very early, possibly preclinical stages of degeneration. We explored the usefulness of fully automatic structural MRI classification methods for detecting subtle degenerative change. The availability of a definitive genetic test for Huntington disease (HD) provides an excellent metric for judging the performance of such methods in gene mutation carriers who are free of symptoms. Methods: Using the gray matter segment of MRI scans, this study explored the usefulness of a multivariate support vector machine to automatically identify presymptomatic HD gene mutation carriers (PSCs) in the absence of any a priori information. A multicenter data set of 96 PSCs and 95 age- and sex-matched controls was studied. The PSC group was subclassified into three groups based on time from predicted clinical onset, an estimate that is a function of DNA mutation size and age. Results: Subjects with at least a 33% chance of developing unequivocal signs of HD in 5 years were correctly assigned to the PSC group 69% of the time. Accuracy improved to 83% when regions affected by the disease were selected a priori for analysis. Performance was at chance when the probability of developing symptoms in 5 years was less than 10%. Conclusions: Presymptomatic Huntington disease gene mutation carriers close to estimated diagnostic onset were successfully separated from controls on the basis of single anatomic scans, without additional a priori information. Prior information is required to allow separation when degenerative changes are either subtle or variable.

Web [BibTex]

Enumeration of condition-dependent dense modules in protein interaction networks

Georgii, E., Dietmann, S., Uno, T., Pagel, P., Tsuda, K.

Bioinformatics, 25(7):933-940, February 2009 (article)

Abstract
Motivation: Modern systems biology aims at understanding how the different molecular components of a biological cell interact. Often, cellular functions are performed by complexes consisting of many different proteins. The composition of these complexes may change according to the cellular environment, and one protein may be involved in several different processes. The automatic discovery of functional complexes from protein interaction data is challenging. While previous approaches use approximations to extract dense modules, our approach exactly solves the problem of dense module enumeration. Furthermore, constraints from additional information sources such as gene expression and phenotype data can be integrated, so we can systematically mine for dense modules with interesting profiles. Results: Given a weighted protein interaction network, our method discovers all protein sets that satisfy a user-defined minimum density threshold. We employ a reverse search strategy, which allows us to exploit the density criterion in an efficient way. Our experiments show that the novel approach is feasible and produces biologically meaningful results. In comparative validation studies using yeast data, the method achieved the best overall prediction performance with respect to confirmed complexes. Moreover, by enhancing the yeast network with phenotypic and phylogenetic profiles and the human network with tissue-specific expression data, we identified condition-dependent complex variants.

Web DOI [BibTex]

Prototype Classification: Insights from Machine Learning

Graf, A., Bousquet, O., Rätsch, G., Schölkopf, B.

Neural Computation, 21(1):272-300, January 2009 (article)

Abstract
We shed light on the discrimination between patterns belonging to two different classes by casting this decoding problem into a generalized prototype framework. The discrimination process is then separated into two stages: a projection stage that reduces the dimensionality of the data by projecting it on a line and a threshold stage where the distributions of the projected patterns of both classes are separated. For this, we extend the popular mean-of-class prototype classification using algorithms from machine learning that satisfy a set of invariance properties. We report a simple yet general approach to express different types of linear classification algorithms in an identical and easy-to-visualize formal framework using generalized prototypes where these prototypes are used to express the normal vector and offset of the hyperplane. We investigate nonmargin classifiers such as the classical prototype classifier, the Fisher classifier, and the relevance vector machine. We then study hard and soft margin classifiers such as the support vector machine and a boosted version of the prototype classifier. Subsequently, we relate mean-of-class prototype classification to other classification algorithms by showing that the prototype classifier is a limit of any soft margin classifier and that boosting a prototype classifier yields the support vector machine. While giving novel insights into classification per se by presenting a common and unified formalism, our generalized prototype framework also provides an efficient visualization and a principled comparison of machine learning classification.

PDF Web DOI [BibTex]

Automatic classification of brain resting states using fMRI temporal signals

Soldati, N., Robinson, S., Persello, C., Jovicich, J., Bruzzone, L.

Electronics Letters, 45(1):19-21, January 2009 (article)

Abstract
A novel technique is presented for the automatic discrimination between networks of ‘resting states’ of the human brain and physiological fluctuations in functional magnetic resonance imaging (fMRI). The method is based on features identified via a statistical approach to group independent component analysis time courses, which may be extracted from fMRI data. This technique is entirely automatic and, unlike other approaches, uses temporal rather than spatial information. The method achieves 83% accuracy in the identification of resting state networks.

Web DOI [BibTex]

The DICS repository: module-assisted analysis of disease-related gene lists

Dietmann, S., Georgii, E., Antonov, A., Tsuda, K., Mewes, H.

Bioinformatics, 25(6):830-831, January 2009 (article)

Abstract
The DICS database is a dynamic web repository of computationally predicted functional modules from the human protein–protein interaction network. It provides references to the CORUM, DrugBank, KEGG and Reactome pathway databases. DICS can be accessed for retrieving sets of overlapping modules and protein complexes that are significantly enriched in a gene list, thereby providing valuable information about the functional context.

Web DOI [BibTex]

mGene: accurate SVM-based gene finding with an application to nematode genomes

Schweikert, G., Zien, A., Zeller, G., Behr, J., Dieterich, C., Ong, C., Philips, P., De Bona, F., Hartmann, L., Bohlen, A., Krüger, N., Sonnenburg, S., Rätsch, G.

Genome Research, 19(11):2133-43, 2009 (article)

Abstract
We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines in an unprecedented manner the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was proved in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance on nucleotide, exon, and transcript level for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that approximately 2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, out of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.

DOI [BibTex]

Structure and activity of the N-terminal substrate recognition domains in proteasomal ATPases

Djuranovic, S., Hartmann, MD., Habeck, M., Ursinus, A., Zwickl, P., Martin, J., Lupas, AN., Zeth, K.

Molecular Cell, 34(5):580-590, 2009 (article)

Abstract
The proteasome forms the core of the protein quality control system in archaea and eukaryotes and also occurs in one bacterial lineage, the Actinobacteria. Access to its proteolytic compartment is controlled by AAA ATPases, whose N-terminal domains (N domains) are thought to mediate substrate recognition. The N domains of an archaeal proteasomal ATPase, Archaeoglobus fulgidus PAN, and of its actinobacterial homolog, Rhodococcus erythropolis ARC, form hexameric rings, whose subunits consist of an N-terminal coiled coil and a C-terminal OB domain. In ARC-N, the OB domains are duplicated and form separate rings. PAN-N and ARC-N can act as chaperones, preventing the aggregation of heterologous proteins in vitro, and this activity is preserved in various chimeras, even when these include coiled coils and OB domains from unrelated proteins. The structures suggest a molecular mechanism for substrate processing based on concerted radial motions of the coiled coils relative to the OB rings.

DOI [BibTex]

Discussion of: Brownian Distance Covariance

Gretton, A., Fukumizu, K., Sriperumbudur, B.

The Annals of Applied Statistics, 3(4):1285-1294, 2009 (article)

[BibTex]

Efficient factor GARCH models and factor-DCC models

Zhang, K., Chan, L.

Quantitative Finance, 9(1):71-91, 2009 (article)

Abstract
We report that, in the estimation of univariate GARCH or multivariate generalized orthogonal GARCH (GO-GARCH) models, maximizing the likelihood is equivalent to making the standardized residuals as independent as possible. Based on this, we propose three factor GARCH models in the framework of GO-GARCH: independent-factor GARCH exploits factors that are statistically as independent as possible; factors in best-factor GARCH have the largest autocorrelation in their squared values such that their volatilities could be forecast well by univariate GARCH; and factors in conditional-decorrelation GARCH are conditionally as uncorrelated as possible. A convenient two-step method for estimating these models is introduced. Since the extracted factors may still have weak conditional correlations, we further propose factor-DCC models as an extension to the above factor GARCH models with dynamic conditional correlation (DCC) modelling the remaining conditional correlations between factors. Experimental results for the Hong Kong stock market show that conditional-decorrelation GARCH and independent-factor GARCH have better generalization performance than the original GO-GARCH, and that conditional-decorrelation GARCH (among factor GARCH models) and its extension with DCC embedded (among factor-DCC models) behave best.

PDF Web DOI [BibTex]

Non-linear System Identification: Visual Saliency Inferred from Eye-Movement Data

Wichmann, F., Kienzle, W., Schölkopf, B., Franz, M.

Journal of Vision, 9(8):article 32, 2009 (article)

Abstract
For simple visual patterns under the experimenter's control we impose which information, or features, an observer can use to solve a given perceptual task. For natural vision tasks, however, there are typically a multitude of potential features in a given visual scene which the visual system may be exploiting when analyzing it: edges, corners, contours, etc. Here we describe a novel non-linear system identification technique based on modern machine learning methods that allows the critical features an observer uses to be inferred directly from the observer's data. The method neither requires stimuli to be embedded in noise nor is it limited to linear perceptive fields (classification images). We demonstrate our technique by deriving the critical image features observers fixate in natural scenes (bottom-up visual saliency). Unlike previous studies where the relevant structure is determined manually (e.g. by selecting Gabors as visual filters), we do not make any assumptions in this regard, but numerically infer their number and properties from the eye-movement data. We show that center-surround patterns emerge as the optimal solution for predicting saccade targets from local image structure. The resulting model, a one-layer feed-forward network with contrast gain-control, is surprisingly simple compared to previously suggested saliency models. Nevertheless, our model is equally predictive. Furthermore, our findings are consistent with neurophysiological hardware in the superior colliculus. Bottom-up visual saliency may thus not be computed cortically as has been thought previously.

Web DOI [BibTex]


mGene.web: a web service for accurate computational gene finding

Schweikert, G., Behr, J., Zien, A., Zeller, G., Ong, C., Sonnenburg, S., Rätsch, G.

Nucleic Acids Research, 37, pages: W312-6, 2009 (article)

Abstract
We describe mGene.web, a web service for the genome-wide prediction of protein coding genes from eukaryotic DNA sequences. It offers pre-trained models for the recognition of gene structures including untranslated regions in an increasing number of organisms. With mGene.web, users have the additional possibility to train the system with their own data for other organisms at the push of a button, a functionality that will greatly accelerate the annotation of newly sequenced genomes. The system is built in a highly modular way, such that individual components of the framework, like the promoter prediction tool or the splice site predictor, can be used autonomously. The underlying gene finding system mGene is based on discriminative machine learning techniques and its high accuracy has been demonstrated in an international competition on nematode genomes. mGene.web is available at http://www.mgene.org/web; it is free of charge and can be used for eukaryotic genomes of small to moderate size (several hundred Mbp).

DOI [BibTex]

2002


Optimized Support Vector Machines for Nonstationary Signal Classification

Davy, M., Gretton, A., Doucet, A., Rayner, P.

IEEE Signal Processing Letters, 9(12):442-445, December 2002 (article)

Abstract
This letter describes an efficient method to perform nonstationary signal classification. A support vector machine (SVM) algorithm is introduced and its parameters optimised in a principled way. Simulations demonstrate that our low complexity method outperforms state-of-the-art nonstationary signal classification techniques.

PostScript Web DOI [BibTex]

A New Discriminative Kernel from Probabilistic Models

Tsuda, K., Kawanabe, M., Rätsch, G., Sonnenburg, S., Müller, K.

Neural Computation, 14(10):2397-2414, October 2002 (article)

PDF [BibTex]

Functional Genomics of Osteoarthritis

Aigner, T., Bartnik, E., Zien, A., Zimmer, R.

Pharmacogenomics, 3(5):635-650, September 2002 (article)

Web [BibTex]

Constructing Boosting algorithms from SVMs: an application to one-class classification.

Rätsch, G., Mika, S., Schölkopf, B., Müller, K.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1184-1199, September 2002 (article)

Abstract
We show via an equivalence of mathematical programs that a support vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation procedure for a new algorithm—one-class leveraging—starting from the one-class support vector machine (1-SVM). This is a first step toward unsupervised learning in a boosting framework. Building on so-called barrier methods known from the theory of constrained optimization, it returns a function, written as a convex combination of base hypotheses, that characterizes whether a given test point is likely to have been generated from the distribution underlying the training data. Simulations on one-class classification problems demonstrate the usefulness of our approach.

DOI [BibTex]

Co-Clustering of Biological Networks and Gene Expression Data

Hanisch, D., Zien, A., Zimmer, R., Lengauer, T.

Bioinformatics, 18(Suppl 1):145S-154S, July 2002 (article)

Abstract
Motivation: Large scale gene expression data are often analysed by clustering genes based on gene expression data alone, though a priori knowledge in the form of biological networks is available. The use of this additional information promises to improve exploratory analysis considerably. Results: We propose constructing a distance function which combines information from expression data and biological networks. Based on this function, we compute a joint clustering of genes and vertices of the network. This general approach is elaborated for metabolic networks. We define a graph distance function on such networks and combine it with a correlation-based distance function for gene expression measurements. A hierarchical clustering and an associated statistical measure are computed to arrive at a reasonable number of clusters. Our method is validated using expression data of the yeast diauxic shift. The resulting clusters are easily interpretable in terms of the biochemical network and the gene expression data and suggest that our method is able to automatically identify processes that are relevant under the measured conditions.

Web [BibTex]

Confidence measures for protein fold recognition

Sommer, I., Zien, A., von Ohsen, N., Zimmer, R., Lengauer, T.

Bioinformatics, 18(6):802-812, June 2002 (article)

[BibTex]

The contributions of color to recognition memory for natural scenes

Wichmann, F., Sharpe, L., Gegenfurtner, K.

Journal of Experimental Psychology: Learning, Memory and Cognition, 28(3):509-520, May 2002 (article)

Abstract
The authors used a recognition memory paradigm to assess the influence of color information on visual memory for images of natural scenes. Subjects performed 5-10% better for colored than for black-and-white images independent of exposure duration. Experiment 2 indicated little influence of contrast once the images were suprathreshold, and Experiment 3 revealed that performance worsened when images were presented in color and tested in black and white, or vice versa, leading to the conclusion that the surface property color is part of the memory representation. Experiments 4 and 5 exclude the possibility that the superior recognition memory for colored images results solely from attentional factors or saliency. Finally, the recognition memory advantage disappears for falsely colored images of natural scenes: The improvement in recognition memory depends on the color congruence of presented images with learned knowledge about the color gamut found within natural scenes. The results can be accounted for within a multiple memory systems framework.

PDF Web DOI [BibTex]

Training invariant support vector machines

DeCoste, D., Schölkopf, B.

Machine Learning, 46(1-3):161-190, January 2002 (article)

Abstract
Practical experience has shown that in order to obtain the best possible performance, prior knowledge about invariances of a classification problem at hand ought to be incorporated into the training procedure. We describe and review all known methods for doing so in support vector machines, provide experimental results, and discuss their respective merits. One of the significant new results reported in this work is our recent achievement of the lowest reported test error on the well-known MNIST digit recognition benchmark task, with SVM training times that are also significantly faster than previous SVM methods.

PDF DOI [BibTex]

Model Selection for Small Sample Regression

Chapelle, O., Vapnik, V., Bengio, Y.

Machine Learning, 48(1-3):9-23, 2002 (article)

Abstract
Model selection is an important ingredient of many machine learning algorithms, in particular when the sample size is small, in order to strike the right trade-off between overfitting and underfitting. Previous classical results for linear regression are based on an asymptotic analysis. We present a new penalization method for performing model selection for regression that is appropriate even for small samples. Our penalization is based on an accurate estimator of the ratio of the expected training error and the expected generalization error, in terms of the expected eigenvalues of the input covariance matrix.

PostScript [BibTex]

Contrast discrimination with sinusoidal gratings of different spatial frequency

Bird, C., Henning, G., Wichmann, F.

Journal of the Optical Society of America A, 19(7), pages: 1267-1273, 2002 (article)

Abstract
The detectability of contrast increments was measured as a function of the contrast of a masking or “pedestal” grating at a number of different spatial frequencies ranging from 2 to 16 cycles per degree of visual angle. The pedestal grating always had the same orientation, spatial frequency and phase as the signal. The shape of the contrast increment threshold versus pedestal contrast (TvC) functions depends on the performance level used to define the “threshold,” but when both axes are normalized by the contrast corresponding to 75% correct detection at each frequency, the TvC functions at a given performance level are identical. Confidence intervals on the slope of the rising part of the TvC functions are so wide that it is not possible with our data to reject Weber’s Law.

PDF [BibTex]

A Bennett Concentration Inequality and Its Application to Suprema of Empirical Processes

Bousquet, O.

C. R. Acad. Sci. Paris, Ser. I, 334, pages: 495-500, 2002 (article)

Abstract
We introduce new concentration inequalities for functions on product spaces. They allow one to obtain a Bennett-type deviation bound for suprema of empirical processes indexed by upper bounded functions. The result is an improvement on Rio's version [Rio01b] of Talagrand's inequality [Talagrand96] for equidistributed variables.

PDF PostScript [BibTex]


Numerical evolution of axisymmetric, isolated systems in general relativity

Frauendiener, J., Hein, M.

Physical Review D, 66, pages: 124004-124004, 2002 (article)

Abstract
We describe in this article a new code for evolving axisymmetric isolated systems in general relativity. Such systems are described by asymptotically flat space-times, which have the property that they admit a conformal extension. We work directly in the extended conformal manifold and numerically solve Friedrich's conformal field equations, which state that Einstein's equations hold in the physical space-time. Because of the compactness of the conformal space-time, the entire space-time can be calculated on a finite numerical grid. We describe in detail the numerical scheme, especially the treatment of the axisymmetry and the boundary.

GZIP [BibTex]

Marginalized kernels for biological sequences

Tsuda, K., Kin, T., Asai, K.

Bioinformatics, 18(Suppl 1):268-275, 2002 (article)

PDF [BibTex]

Stability and Generalization

Bousquet, O., Elisseeff, A.

Journal of Machine Learning Research, 2, pages: 499-526, 2002 (article)

Abstract
We define notions of stability for learning algorithms and show how to use these notions to derive generalization error bounds based on the empirical error and the leave-one-out error. The methods we use can be applied in the regression framework as well as in the classification one when the classifier is obtained by thresholding a real-valued function. We study the stability properties of large classes of learning algorithms such as regularization based algorithms. In particular we focus on Hilbert space regularization and Kullback-Leibler regularization. We demonstrate how to apply the results to SVM for regression and classification.

PDF PostScript [BibTex]

Subspace information criterion for non-quadratic regularizers – model selection for sparse regressors

Tsuda, K., Sugiyama, M., Müller, K.

IEEE Trans Neural Networks, 13(1):70-80, 2002 (article)

PDF [BibTex]

Modeling splicing sites with pairwise correlations

Arita, M., Tsuda, K., Asai, K.

Bioinformatics, 18(Suppl 2):27-34, 2002 (article)

PDF [BibTex]
