

2006


A Kernel Method for the Two-Sample-Problem

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

20th Annual Conference on Neural Information Processing Systems (NIPS), December 2006 (talk)

Abstract
We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. We show that the test statistic can be computed in $O(m^2)$ time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.
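As a rough illustration of the statistic described in this abstract, the squared RKHS distance between the two sample means can be written purely in terms of pairwise kernel evaluations, which is where the $O(m^2)$ cost comes from. The sketch below computes the biased estimate in plain Python. The Gaussian kernel, the bandwidth `sigma`, and the function names are illustrative assumptions rather than details taken from the paper, and no acceptance threshold for the test is computed.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared_biased(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared distance between the RKHS sample means.

    Uses all pairs within and across the two samples, so the cost is
    quadratic in the sample sizes.
    """
    m, n = len(xs), len(ys)
    kxx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    kyy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    kxy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return kxx + kyy - 2.0 * kxy


# Hypothetical one-dimensional samples for illustration only.
sample = [(0.0,), (1.0,), (2.0,)]
shifted = [(5.0,), (6.0,), (7.0,)]
print(mmd_squared_biased(sample, sample))   # zero up to rounding
print(mmd_squared_biased(sample, shifted))  # strictly positive
```

A large value suggests the samples come from different distributions; turning that into a test requires a threshold, which the abstract derives from a large deviation bound or from the asymptotic distribution.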

PDF [BibTex]

Ab-initio gene finding using machine learning

Schweikert, G., Zeller, G., Zien, A., Ong, C., de Bona, F., Sonnenburg, S., Phillips, P., Rätsch, G.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Web [BibTex]

Reinforcement Learning by Reward-Weighted Regression

Peters, J.

NIPS Workshop: Towards a New Reinforcement Learning?, December 2006 (talk)

Web [BibTex]

Graph boosting for molecular QSAR analysis

Saigo, H., Kadowaki, T., Kudo, T., Tsuda, K.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Abstract
We propose a new boosting method that systematically combines graph mining and mathematical programming-based machine learning. Informative and interpretable subgraph features are greedily found by a series of graph mining calls. Due to our mathematical programming formulation, subgraph features and pre-calculated real-valued features are seamlessly integrated. We tested our algorithm on a quantitative structure-activity relationship (QSAR) problem, which is essentially a regression problem over a given set of chemical compounds. In benchmark experiments, the prediction accuracy of our method compared favorably with the best results reported on each dataset.

Web [BibTex]

Inferring Causal Directions by Evaluating the Complexity of Conditional Distributions

Sun, X., Janzing, D., Schölkopf, B.

NIPS Workshop on Causality and Feature Selection, December 2006 (talk)

Abstract
We propose a new approach to infer the causal structure that has generated the observed statistical dependences among n random variables. The idea is that the factorization of the joint measure of cause and effect into P(cause)P(effect|cause) leads typically to simpler conditionals than non-causal factorizations. To evaluate the complexity of the conditionals we have tried two methods. First, we have compared them to those which maximize the conditional entropy subject to the observed first and second moments since we consider the latter as the simplest conditionals. Second, we have fitted the data with conditional probability measures being exponents of functions in an RKHS space and defined the complexity by a Hilbert-space semi-norm. Such a complexity measure has several properties that are useful for our purpose. We describe some encouraging results with both methods applied to real-world data. Moreover, we have combined constraint-based approaches to causal discovery (i.e., methods using only information on conditional statistical dependences) with our method in order to distinguish between causal hypotheses which are equivalent with respect to the imposed independences. Furthermore, we compare the performance to Bayesian approaches to causal inference.

Web [BibTex]


Structure validation of the Josephin domain of ataxin-3: Conclusive evidence for an open conformation

Nicastro, G., Habeck, M., Masino, L., Svergun, DI., Pastore, A.

Journal of Biomolecular NMR, 36(4):267-277, December 2006 (article)

Abstract
The availability of new and fast tools in structure determination has led to a more than exponential growth of the number of structures solved per year. It is therefore increasingly essential to assess the accuracy of the new structures by reliable approaches able to assist validation. Here, we discuss a specific example in which the use of different complementary techniques, which include Bayesian methods and small angle scattering, proved essential for validating the two currently available structures of the Josephin domain of ataxin-3, a protein involved in the ubiquitin/proteasome pathway and responsible for neurodegenerative spinocerebellar ataxia of type 3. Taken together, our results demonstrate that only one of the two structures is compatible with the experimental information. Based on the high precision of our refined structure, we show that Josephin contains an open cleft which could be directly implicated in the interaction with polyubiquitin chains and other partners.

Web DOI [BibTex]

A Unifying View of Wiener and Volterra Theory and Polynomial Kernel Regression

Franz, M., Schölkopf, B.

Neural Computation, 18(12):3097-3118, December 2006 (article)

Abstract
Volterra and Wiener series are perhaps the best understood nonlinear system representations in signal processing. Although both approaches have enjoyed a certain popularity in the past, their application has been limited to rather low-dimensional and weakly nonlinear systems due to the exponential growth of the number of terms that have to be estimated. We show that Volterra and Wiener series can be represented implicitly as elements of a reproducing kernel Hilbert space by utilizing polynomial kernels. The estimation complexity of the implicit representation is linear in the input dimensionality and independent of the degree of nonlinearity. Experiments show performance advantages in terms of convergence, interpretability, and system sizes that can be handled.
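The claim that the implicit representation avoids the growth in the number of terms can be illustrated with a small check: a polynomial kernel evaluates the inner product of all monomial features through a single dot product of the inputs. The sketch below assumes the inhomogeneous polynomial kernel $(1 + x \cdot y)^2$ on two-dimensional inputs, which is one standard choice and not necessarily the exact kernel used in the paper; the function names are hypothetical.

```python
import math


def poly_kernel(x, y, degree=2):
    """Inhomogeneous polynomial kernel (1 + <x, y>)^degree; cost is one dot product."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** degree


def explicit_features_deg2(x):
    """Explicit monomial features matching the degree-2 kernel for 2-D input."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]


x, y = (0.5, -1.0), (2.0, 3.0)
implicit = poly_kernel(x, y)
explicit = sum(a * b for a, b in zip(explicit_features_deg2(x),
                                     explicit_features_deg2(y)))
print(implicit, explicit)  # the two values agree up to rounding
```

The explicit feature vector grows quickly with input dimension and degree, while the kernel evaluation stays a single dot product followed by a power.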

PDF Web DOI [BibTex]

Learning Optimal EEG Features Across Time, Frequency and Space

Farquhar, J., Hill, J., Schölkopf, B.

NIPS Workshop on Current Trends in Brain-Computer Interfacing, December 2006 (talk)

PDF Web [BibTex]

Semi-Supervised Learning

Zien, A.

Advanced Methods in Sequence Analysis Lectures, November 2006 (talk)

Web [BibTex]

Statistical Analysis of Slow Crack Growth Experiments

Pfingsten, T., Glien, K.

Journal of the European Ceramic Society, 26(15):3061-3065, November 2006 (article)

Abstract
Common approaches for the determination of Slow Crack Growth (SCG) parameters are the static and dynamic loading methods. Since materials with a small Weibull modulus show a large variability in strength, a correct statistical analysis of the data is indispensable. In this work we propose the use of the Maximum Likelihood method and a Bayesian analysis, which, in contrast to the standard procedures, take into account that failure strengths are Weibull distributed. The analysis provides estimates for the SCG parameters, the Weibull modulus, and the corresponding confidence intervals, and removes the need to separate inert and fatigue strength data manually. We compare the methods to a Least Squares approach, which can be considered the standard procedure. The results for dynamic loading data from the glass sealing of MEMS devices show that the assumptions inherent to the standard approach lead to significantly different estimates.

PDF PDF DOI [BibTex]

An Improved Adaptive Power Line Interference Canceller for Electrocardiography

Martens, SMM., Mischi, M., Oei, SG., Bergmans, JWM.

IEEE Transactions on Biomedical Engineering, 53(11):2220-2231, November 2006 (article)

Abstract
Power line interference may severely corrupt a biomedical recording. Notch filters and adaptive cancellers have been suggested to suppress this interference. We propose an improved adaptive canceller for the reduction of the fundamental power line interference component and harmonics in electrocardiogram (ECG) recordings. The method tracks the amplitude, phase, and frequency of all the interference components for power line frequency deviations up to about 4 Hz. A comparison is made between the performance of our method, former adaptive cancellers, and a narrow and a wide notch filter in suppressing the fundamental power line interference component. For this purpose a real ECG signal is corrupted by an artificial power line interference signal. The cleaned signal after applying all methods is compared with the original ECG signal. Our improved adaptive canceller shows a signal-to-power-line-interference ratio for the fundamental component up to 30 dB higher than that produced by the other methods. Moreover, our method is also effective for the suppression of the harmonics of the power line interference.

Web DOI [BibTex]

Donagi-Markman cubic for Hitchin systems

Balduzzi, D.

Mathematical Research Letters, 13(6):923-933, November 2006 (article)

Abstract
The Donagi-Markman cubic is the differential of the period map for algebraic completely integrable systems. Here we prove a formula for the cubic in the case of Hitchin’s system for arbitrary semisimple $\mathfrak{g}$. This was originally stated (without proof) by Pantev for $\mathfrak{sl}_n$.

Web [BibTex]

A Machine Learning Approach for Determining the PET Attenuation Map from Magnetic Resonance Images

Hofmann, M., Steinke, F., Judenhofer, M., Claussen, C., Schölkopf, B., Pichler, B.

IEEE Medical Imaging Conference, November 2006 (talk)

Abstract
A promising new combination in multimodality imaging is MR-PET, where the high soft tissue contrast of Magnetic Resonance Imaging (MRI) and the functional information of Positron Emission Tomography (PET) are combined. Although many technical problems have recently been solved, it is still an open problem to determine the attenuation map from the available MR scan, as the MR intensities are not directly related to the attenuation values. One standard approach is an atlas registration where the atlas MR image is aligned with the patient MR thus also yielding an attenuation image for the patient. We also propose another approach, which to our knowledge has not been tried before: Using Support Vector Machines we predict the attenuation value directly from the local image information. We train this well-established machine learning algorithm using small image patches. Although both approaches sometimes yielded acceptable results, they also showed their specific shortcomings: The registration often fails with large deformations whereas the prediction approach is problematic when the local image structure is not characteristic enough. However, the failures often do not coincide and integration of both information sources is promising. We therefore developed a combination method extending Support Vector Machines to use not only local image structure but also atlas registered coordinates. We demonstrate the strength of this combination approach on a number of examples.

[BibTex]

Mining frequent stem patterns from unaligned RNA sequences

Hamada, M., Tsuda, K., Kudo, T., Kin, T., Asai, K.

Bioinformatics, 22(20):2480-2487, October 2006 (article)

Abstract
Motivation: In detection of non-coding RNAs, it is often necessary to identify the secondary structure motifs from a set of putative RNA sequences. Most of the existing algorithms aim to provide the best motif or a few good motifs, but biologists often need to inspect all the possible motifs thoroughly. Results: Our method RNAmine employs a graph theoretic representation of RNA sequences, and detects all the possible motifs exhaustively using a graph mining algorithm. The motif detection problem boils down to finding frequently appearing patterns in a set of directed and labeled graphs. In the tasks of common secondary structure prediction and local motif detection from long sequences, our method compared favorably, in both accuracy and efficiency, with state-of-the-art methods such as CMFinder.

PDF Web DOI [BibTex]

Large-Scale Gene Expression Profiling Reveals Major Pathogenetic Pathways of Cartilage Degeneration in Osteoarthritis

Aigner, T., Fundel, K., Saas, J., Gebhard, P., Haag, J., Weiss, T., Zien, A., Obermayr, F., Zimmer, R., Bartnik, E.

Arthritis and Rheumatism, 54(11):3533-3544, October 2006 (article)

Abstract
Objective. Despite many research efforts in recent decades, the major pathogenetic mechanisms of osteoarthritis (OA), including gene alterations occurring during OA cartilage degeneration, are poorly understood, and there is no disease-modifying treatment approach. The present study was therefore initiated in order to identify differentially expressed disease-related genes and potential therapeutic targets. Methods. This investigation consisted of a large gene expression profiling study performed based on 78 normal and disease samples, using a custom-made complementary DNA array covering >4,000 genes. Results. Many differentially expressed genes were identified, including the expected up-regulation of anabolic and catabolic matrix genes. In particular, the down-regulation of important oxidative defense genes, i.e., the genes for superoxide dismutases 2 and 3 and glutathione peroxidase 3, was prominent. This indicates that continuous oxidative stress to the cells and the matrix is one major underlying pathogenetic mechanism in OA. Also, genes that are involved in the phenotypic stability of cells, a feature that is greatly reduced in OA cartilage, appeared to be suppressed. Conclusion. Our findings provide a reference data set on gene alterations in OA cartilage and, importantly, indicate major mechanisms underlying central cell biologic alterations that occur during the OA disease process. These results identify molecular targets that can be further investigated in the search for therapeutic interventions.

Web DOI [BibTex]

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Zien, A.

ECML Discovery Challenge Workshop, September 2006 (talk)

Abstract
After introducing the semi-supervised support vector machine (aka TSVM for "transductive SVM"), a few popular training strategies are briefly presented. Then the assumptions underlying semi-supervised learning are reviewed. Finally, two modern TSVM optimization techniques are applied to the spam filtering data sets of the workshop; it is shown that they can achieve excellent results, if the problem of the data being non-iid can be handled properly.

PDF Web [BibTex]


Implicit Surface Modelling with a Globally Regularised Basis of Compact Support

Walder, C., Schölkopf, B., Chapelle, O.

Computer Graphics Forum, 25(3):635-644, September 2006 (article)

Abstract
We consider the problem of constructing a globally smooth analytic function that represents a surface implicitly by way of its zero set, given sample points with surface normal vectors. The contributions of the paper include a novel means of regularising multi-scale compactly supported basis functions that leads to the desirable interpolation properties previously only associated with fully supported bases. We also provide a regularisation framework for simpler and more direct treatment of surface normals, along with a corresponding generalisation of the representer theorem lying at the core of kernel-based machine learning methods. We demonstrate the techniques on 3D problems of up to 14 million data points, as well as 4D time series data and four-dimensional interpolation between three-dimensional shapes.

PDF GZIP DOI [BibTex]


Inferential Structure Determination: Probabilistic determination and validation of NMR structures

Habeck, M.

Gordon Research Conference on Computational Aspects of Biomolecular NMR, September 2006 (talk)

Web [BibTex]

From outliers to prototypes: Ordering data

Harmeling, S., Dornhege, G., Tax, D., Meinecke, F., Müller, K.

Neurocomputing, 69(13-15):1608-1618, August 2006 (article)

Abstract
We propose simple and fast methods based on nearest neighbors that order objects from high-dimensional data sets from typical points to untypical points. On the one hand, we show that these easy-to-compute orderings allow us to detect outliers (i.e. very untypical points) with a performance comparable to or better than other often much more sophisticated methods. On the other hand, we show how to use these orderings to detect prototypes (very typical points) which facilitate exploratory data analysis algorithms such as noisy nonlinear dimensionality reduction and clustering. Comprehensive experiments demonstrate the validity of our approach.
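To make the idea of a nearest-neighbor ordering concrete, the sketch below scores each point by its mean distance to its `k` nearest neighbors and sorts from typical (small score) to untypical (large score). The abstract does not specify the scoring rule, so this particular score, the value of `k`, and the function names are assumptions chosen for illustration, not the paper's definitions.

```python
import math


def knn_score(points, index, k):
    """Mean Euclidean distance from points[index] to its k nearest neighbors."""
    p = points[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return sum(dists[:k]) / k


def order_typical_to_untypical(points, k=2):
    """Return point indices sorted from most typical to most untypical."""
    scores = [knn_score(points, i, k) for i in range(len(points))]
    return sorted(range(len(points)), key=lambda i: scores[i])


# Hypothetical data: a tight cluster plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
ordering = order_typical_to_untypical(points)
print(ordering[-1])  # index of the isolated point (5.0, 5.0)
```

Under this score, the end of the ordering holds outlier candidates and the start holds prototype candidates.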

PDF PDF DOI [BibTex]

An Online Support Vector Machine for Abnormal Events Detection

Davy, M., Desobry, F., Gretton, A., Doncarli, C.

Signal Processing, 86(8):2009-2025, August 2006 (article)

Abstract
The ability to detect abnormal events online in signals is essential in many real-world signal processing applications. Previous algorithms require an explicit statistical model of the signal, and interpret abnormal events as abrupt changes in that model. The corresponding implementations rely on maximum likelihood or on Bayes estimation theory, with generally excellent performance. However, there are numerous cases where a robust and tractable model cannot be obtained, and model-free approaches need to be considered. In this paper, we investigate a machine learning, descriptor-based approach that does not require an explicit statistical model of the descriptors, based on Support Vector novelty detection. A sequential optimization algorithm is introduced. Theoretical considerations as well as simulations on real signals demonstrate its practical efficiency.

PDF PostScript PDF DOI [BibTex]

Machine Learning Algorithms for Polymorphism Detection

Schweikert, G., Zeller, G., Clark, R., Ossowski, S., Warthmann, N., Shinn, P., Frazer, K., Ecker, J., Huson, D., Weigel, D., Schölkopf, B., Rätsch, G.

2nd ISCB Student Council Symposium, August 2006 (talk)

Abstract
Analyzing resequencing array data using machine learning, we obtain a genome-wide inventory of polymorphisms in 20 wild strains of Arabidopsis thaliana, including 750,000 single nucleotide polymorphisms (SNPs) and thousands of highly polymorphic regions and deletions. We thus provide an unprecedented resource for the study of natural variation in plants.

Web [BibTex]

Integrating Structured Biological data by Kernel Maximum Mean Discrepancy

Borgwardt, K., Gretton, A., Rasch, M., Kriegel, H., Schölkopf, B., Smola, A.

Bioinformatics, 22(4: ISMB 2006 Conference Proceedings):e49-e57, August 2006 (article)

Abstract
Motivation: Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a kernel-based statistical test for this problem, based on the fact that two distributions are different if and only if there exists at least one function having different expectation on the two distributions. Consequently we use the maximum discrepancy between function means as the basis of a test statistic. The Maximum Mean Discrepancy (MMD) can take advantage of the kernel trick, which allows us to apply it not only to vectors, but strings, sequences, graphs, and other common structured data types arising in molecular biology. Results: We study the practical feasibility of an MMD-based test on three central data integration tasks: Testing cross-platform comparability of microarray data, cancer diagnosis, and data-content based schema matching for two different protein function classification schemas. In all of these experiments, including high-dimensional ones, MMD is very accurate in finding samples that were generated from the same distribution, and outperforms its best competitors. Conclusions: We have defined a novel statistical test of whether two samples are from the same distribution, compatible with both multivariate and structured data, that is fast, easy to implement, and works well, as confirmed by our experiments.

Web DOI [BibTex]

Large Scale Transductive SVMs

Collobert, R., Sinz, F., Weston, J., Bottou, L.

Journal of Machine Learning Research, 7, pages: 1687-1712, August 2006 (article)

Abstract
We show how the Concave-Convex Procedure can be applied to the optimization of Transductive SVMs, which traditionally requires solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach.

PostScript PDF PDF [BibTex]

Building Support Vector Machines with Reduced Classifier Complexity

Keerthi, S., Chapelle, O., DeCoste, D.

Journal of Machine Learning Research, 7, pages: 1493-1515, July 2006 (article)

Abstract
Support vector machines (SVMs), though accurate, are not preferred in applications requiring great classification speed, due to the number of support vectors being large. To overcome this problem we devise a primal method with the following properties: (1) it decouples the idea of basis functions from the concept of support vectors; (2) it greedily finds a set of kernel basis functions of a specified maximum size ($d_{\max}$) to approximate the SVM primal cost function well; (3) it is efficient and roughly scales as $O(n d_{\max}^2)$ where $n$ is the number of training examples; and, (4) the number of basis functions it requires to achieve an accuracy close to the SVM accuracy is usually far less than the number of SVM support vectors.

PDF [BibTex]

Inferential structure determination: Overview and new developments

Habeck, M.

Sixth CCPN Annual Conference: Efficient and Rapid Structure Determination by NMR, July 2006 (talk)

Web [BibTex]

ARTS: Accurate Recognition of Transcription Starts in Human

Sonnenburg, S., Zien, A., Rätsch, G.

Bioinformatics, 22(14):e472-e480, July 2006 (article)

Abstract
Motivation: Among the most important features of genomic DNA are the protein-coding genes. While it is of great value to identify those genes and the encoded proteins, it is also crucial to understand how their transcription is regulated. To this end one has to identify the corresponding promoters and the contained transcription factor binding sites. TSS finders can be used to locate potential promoters. They may also be used in combination with other signal and content detectors to resolve entire gene structures. Results: We have developed a novel kernel-based method, called ARTS, that accurately recognizes transcription start sites in human. The application of otherwise too computationally expensive Support Vector Machines was made possible by efficient training and evaluation techniques using suffix tries. In a carefully designed experimental study, we compare our TSS finder to state-of-the-art methods from the literature: McPromoter, Eponine and FirstEF. For given false positive rates within a reasonable range, we consistently achieve considerably higher true positive rates. For instance, ARTS finds about 24% true positives at a false positive rate of 1/1000, where the other methods find less than half (10.5%). Availability: Datasets, model selection results, whole genome predictions, and additional experimental results are available at http://www.fml.tuebingen.mpg.de/raetsch/projects/arts

Web DOI [BibTex]

Large Scale Multiple Kernel Learning

Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.

Journal of Machine Learning Research, 7, pages: 1531-1565, July 2006 (article)

Abstract
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm works for hundreds of thousands of examples or hundreds of kernels to be combined, and helps with automatic model selection, improving the interpretability of the learning result. In a second part we discuss general speed-up mechanisms for SVMs, especially when used with sparse feature maps such as those that arise for string kernels, allowing us to train a string kernel SVM on a real-world splice data set of 10 million examples from computational biology. We integrated multiple kernel learning in our machine learning toolbox SHOGUN for which the source code is publicly available at http://www.fml.tuebingen.mpg.de/raetsch/projects/shogun.

PDF [BibTex]

Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?

Bethge, M.

Journal of the Optical Society of America A, 23(6):1253-1268, June 2006 (article)

Abstract
The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective.

PDF Web [BibTex]


MCMC inference in (Conditionally) Conjugate Dirichlet Process Gaussian Mixture Models

Rasmussen, C., Görür, D.

ICML Workshop on Learning with Nonparametric Bayesian Methods, June 2006 (talk)

Abstract
We compare the predictive accuracy of the Dirichlet Process Gaussian mixture models using conjugate and conditionally conjugate priors and show that better density models result from using the wider class of priors. We explore several MCMC schemes exploiting conditional conjugacy and show their computational merits on several multidimensional density estimation problems.

Web [BibTex]

Sampling for non-conjugate infinite latent feature models

Görür, D., Rasmussen, C.

(Editors: Bernardo, J. M.), 8th Valencia International Meeting on Bayesian Statistics (ISBA), June 2006 (talk)

Abstract
Latent variable models are powerful tools to model the underlying structure in data. Infinite latent variable models can be defined using Bayesian nonparametrics. Dirichlet process (DP) models constitute an example of infinite latent class models in which each object is assumed to belong to one of infinitely many mutually exclusive classes. Recently, the Indian buffet process (IBP) has been defined as an extension of the DP. IBP is a distribution over sparse binary matrices with infinitely many columns which can be used as a distribution for non-exclusive features. Inference using Markov chain Monte Carlo (MCMC) in conjugate IBP models has been previously described; however, requiring conjugacy restricts the use of IBP. We describe an MCMC algorithm for non-conjugate IBP models. Modelling choice behaviour is an important topic in psychology, economics and related fields. Elimination by Aspects (EBA) is a choice model that assumes each alternative has latent features with associated weights that lead to the observed choice outcomes. We formulate a non-parametric version of EBA by using IBP as the prior over the latent binary features. We infer the features of objects that lead to the choice data by using our sampling scheme for inference.

PDF [BibTex]

Classifying EEG and ECoG Signals without Subject Training for Fast BCI Implementation: Comparison of Non-Paralysed and Completely Paralysed Subjects

Hill, N., Lal, T., Schröder, M., Hinterberger, T., Wilhelm, B., Nijboer, F., Mochty, U., Widman, G., Elger, C., Schölkopf, B., Kübler, A., Birbaumer, N.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):183-186, June 2006 (article)

Abstract
We summarize results from a series of related studies that aim to develop a motor-imagery-based brain-computer interface using a single recording session of EEG or ECoG signals for each subject. We apply the same experimental and analytical methods to 11 non-paralysed subjects (8 EEG, 3 ECoG), and to 5 paralysed subjects (4 EEG, 1 ECoG) who had been unable to communicate for some time. While it was relatively easy to obtain classifiable signals quickly from most of the non-paralysed subjects, it proved impossible to classify the signals obtained from the paralysed patients by the same methods. This highlights the fact that though certain BCI paradigms may work well with healthy subjects, this does not necessarily indicate success with the target user group. We outline possible reasons for this failure to transfer.

PDF PDF DOI [BibTex]

SCARNA: Fast and Accurate Structural Alignment of RNA Sequences by Matching Fixed-Length Stem Fragments

Tabei, Y., Tsuda, K., Kin, T., Asai, K.

Bioinformatics, 22(14):1723-1729, May 2006 (article)

Abstract
The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Consequently, the sequence comparison in searching similar RNAs should consider not only their sequence similarities but also their potential secondary structures. Sankoff's algorithm predicts the common secondary structures of the sequences, but it is computationally too expensive to apply to large-scale analyses. Because we often want to compare a large number of cDNA sequences or to search similar RNAs in the whole genome sequences, much faster algorithms are required. We propose a new method of comparing RNA sequences based on the structural alignments of the fixed-length fragments of the stem candidates. The implemented software, SCARNA (Stem Candidate Aligner for RNAs), is fast enough to apply to the long sequences in the large-scale analyses. The accuracy of the alignments is better than or comparable to that of the much slower existing algorithms.

PDF Web DOI [BibTex]


Response Modeling with Support Vector Machines

Shin, H., Cho, S.

Expert Systems with Applications, 30(4):746-760, May 2006 (article)

Abstract
The Support Vector Machine (SVM) employs the Structural Risk Minimization (SRM) principle to generalize better than conventional machine learning methods employing the traditional Empirical Risk Minimization (ERM) principle. When applying SVM to response modeling in direct marketing, however, one has to deal with several practical difficulties: large training data, class imbalance and binary SVM output. This paper proposes ways to alleviate or solve these difficulties through informative sampling, use of different costs for different classes, and use of distance to decision boundary. This paper also provides various evaluation measures for response models in terms of accuracies, lift chart analysis and computational efficiency.

PDF Web DOI [BibTex]



Optimizing amino acid substitution matrices with a local alignment kernel

Saigo, H., Vert, J., Akutsu, T.

BMC Bioinformatics, 7(246):1-12, May 2006 (article)

Abstract
Background: Detecting remote homologies by direct comparison of protein sequences remains a challenging task. We had previously developed a similarity score between sequences, called a local alignment kernel, that exhibits good performance for this task in combination with a support vector machine. The local alignment kernel depends on an amino acid substitution matrix. Since commonly used BLOSUM or PAM matrices for scoring amino acid matches have been optimized to be used in combination with the Smith-Waterman algorithm, the matrices optimal for the local alignment kernel can be different. Results: Contrary to the local alignment score computed by the Smith-Waterman algorithm, the local alignment kernel is differentiable with respect to the amino acid substitution and its derivative can be computed efficiently by dynamic programming. We optimized the substitution matrix by classical gradient descent by setting an objective function that measures how well the local alignment kernel discriminates homologs from non-homologs in the COG database. The local alignment kernel exhibits better performance when it uses the matrices and gap parameters optimized by this procedure than when it uses the matrices optimized for the Smith-Waterman algorithm. Furthermore, the matrices and gap parameters optimized for the local alignment kernel can also be used successfully by the Smith-Waterman algorithm. Conclusion: This optimization procedure leads to useful substitution matrices, both for the local alignment kernel and the Smith-Waterman algorithm. The best performance for homology detection is obtained by the local alignment kernel.

Web DOI [BibTex]



Advances in Neural Information Processing Systems 18: Proceedings of the 2005 Conference

Weiss, Y., Schölkopf, B., Platt, J.

Proceedings of the 19th Annual Conference on Neural Information Processing Systems (NIPS 2005), pages: 1676, MIT Press, Cambridge, MA, USA, 19th Annual Conference on Neural Information Processing Systems (NIPS), May 2006 (proceedings)

Abstract
The annual Neural Information Processing Systems (NIPS) conference is the flagship meeting on neural computation. It draws a diverse group of attendees--physicists, neuroscientists, mathematicians, statisticians, and computer scientists. The presentations are interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, brain imaging, vision, speech and signal processing, reinforcement learning and control, emerging technologies, and applications. Only twenty-five percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. This volume contains the papers presented at the December 2005 meeting, held in Vancouver.

Web [BibTex]



The Effect of Artifacts on Dependence Measurement in fMRI

Gretton, A., Belitski, A., Murayama, Y., Schölkopf, B., Logothetis, N.

Magnetic Resonance Imaging, 24(4):401-409, April 2006 (article)

PDF Web DOI [BibTex]



Einer für viele: Ein Linux-PC bedient mehrere Arbeitsplätze [One for many: a Linux PC serves multiple workstations]

Renner, M., Stark, S.

c't, 2006(10):228-235, April 2006 (article)

Abstract
A modern PC is powerful enough to serve several users at once, and Linux, as a multi-user system, is inherently prepared to provide each of several simultaneously logged-in users with their own graphical desktop. With a kernel patch and a little tinkering, several independent monitors, keyboards, and mice can even be attached to a single Linux PC.

Web [BibTex]



Phase noise and the classification of natural images

Wichmann, F., Braun, D., Gegenfurtner, K.

Vision Research, 46(8-9):1520-1529, April 2006 (article)

Abstract
We measured the effect of global phase manipulations on a rapid animal categorization task. The Fourier spectra of our images of natural scenes were manipulated by adding zero-mean random phase noise at all spatial frequencies. The phase noise was the independent variable, uniformly and symmetrically distributed between 0 degrees and ±180 degrees. Subjects were remarkably resistant to phase noise. Even with ±120 degree phase noise, subjects were still performing at 75% correct. The high resistance of the subjects’ animal categorization rate to phase noise suggests that the visual system is highly robust to such random image changes. The proportion of correct answers closely followed the correlation between the original and the phase noise-distorted images. Animal detection rate was higher when the same task was performed with contrast-reduced versions of the same natural images, at contrasts where the contrast reduction mimicked that resulting from our phase randomization. Since the subjects’ categorization rate was better in the contrast experiment, reduction of local contrast alone cannot explain the performance in the phase noise experiment. This result obtained with natural images differs from those obtained for simple sinusoidal stimuli, where performance changes due to phase changes are attributed to local contrast changes only. Thus the global phase change accompanying disruption of image structure such as edges and object boundaries at different spatial scales reduces object classification over and above the performance deficit resulting from reducing contrast. Additional colour information improves the categorization performance by 2%.

PDF Web DOI [BibTex]



Functional census of mutation sequence spaces: The example of p53 cancer rescue mutants

Danziger, S., Swamidass, S., Zeng, J., Dearth, L., Lu, Q., Cheng, J., Cheng, J., Hoang, V., Saigo, H., Luo, R., Baldi, P., Brachmann, R., Lathrop, R.

IEEE Transactions on Computational Biology and Bioinformatics, 3(2):114-125, April 2006 (article)

Abstract
Many biomedical problems relate to mutant functional properties across a sequence space of interest, e.g., flu, cancer, and HIV. Detailed knowledge of mutant properties and function improves medical treatment and prevention. A functional census of p53 cancer rescue mutants would aid the search for cancer treatments from p53 mutant rescue. We devised a general methodology for conducting a functional census of a mutation sequence space by choosing informative mutants early. The methodology was tested in a double-blind predictive test on the functional rescue property of 71 novel putative p53 cancer rescue mutants iteratively predicted in sets of three (24 iterations). The first double-blind 15-point moving accuracy was 47 percent and the last was 86 percent; r = 0.01 before an epiphanic 16th iteration and r = 0.92 afterward. Useful mutants were chosen early (overall r = 0.80). Code and data are freely available (http://www.igb.uci.edu/research/research.html, corresponding authors: R.H.L. for computation and R.K.B. for biology).
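The "15-point moving accuracy" quoted above can be read as the fraction of correct predictions within a sliding window of 15 consecutive predictions. The abstract does not spell out the exact definition, so the Python sketch below is written under that assumption. The function name `moving_accuracy` and the toy outcomes are hypothetical.

```python
def moving_accuracy(correct, window=15):
    """Fraction of correct predictions in each sliding window.

    correct -- sequence of 1 (prediction was right) or 0 (wrong),
               in the order the predictions were made
    window  -- number of consecutive predictions per window
    Returns one accuracy per window position (empty if too few points).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(correct) < window:
        return []
    running = sum(correct[:window])
    accuracies = [running / window]
    for i in range(window, len(correct)):
        # Slide the window by one: add the newest point, drop the oldest.
        running += correct[i] - correct[i - window]
        accuracies.append(running / window)
    return accuracies


# Hypothetical outcomes with a small window for readability.
print(moving_accuracy([1, 0, 1, 1], window=2))
```

Comparing the first and last entries of the returned list corresponds to the kind of "first" versus "last" moving accuracy the abstract reports.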

PDF DOI [BibTex]



A Direct Method for Building Sparse Kernel Learning Algorithms

Wu, M., Schölkopf, B., Bakır, G.

Journal of Machine Learning Research, 7, pages: 603-624, April 2006 (article)

Abstract
Many Kernel Learning Algorithms (KLA), including the Support Vector Machine (SVM), result in a Kernel Machine (KM), such as a kernel classifier, whose key component is a weight vector in a feature space implicitly introduced by a positive definite kernel function. This weight vector is usually obtained by solving a convex optimization problem. Based on this fact, we present a direct method to build Sparse Kernel Learning Algorithms (SKLA) by adding one more constraint to the original convex optimization problem, such that the sparseness of the resulting KM is explicitly controlled while the performance of the resulting KM is kept as high as possible. A gradient-based approach is provided to solve this modified optimization problem. Applying this method to the SVM results in a concrete algorithm for building Sparse Large Margin Classifiers (SLMC). Further analysis of the SLMC algorithm indicates that it essentially finds a discriminating subspace that can be spanned by a small number of vectors, and in this subspace the different classes of data are linearly well separated. Experimental results over several classification benchmarks demonstrate the effectiveness of our approach.

PDF PDF [BibTex]



An Inventory of Sequence Polymorphisms For Arabidopsis

Clark, R., Ossowski, S., Schweikert, G., Rätsch, G., Shinn, P., Zeller, G., Warthmann, N., Fu, G., Hinds, D., Chen, H., Frazer, K., Huson, D., Schölkopf, B., Nordborg, M., Ecker, J., Weigel, D.

17th International Conference on Arabidopsis Research, April 2006 (talk)

Abstract
We have used high-density oligonucleotide arrays to characterize common sequence variation in 20 wild strains of Arabidopsis thaliana that were chosen for maximal genetic diversity. Both strands of each possible SNP of the 119 Mb reference genome were represented on the arrays, which were hybridized with whole genome, isothermally amplified DNA to minimize ascertainment biases. Using two complementary approaches, a model-based algorithm and a newly developed machine learning method, we identified over 550,000 SNPs with a false discovery rate of ~0.03 (average of 1 SNP for every 216 bp of the genome). A heuristic algorithm additionally predicted ~700 highly polymorphic or deleted regions per accession. Over 700 predicted polymorphisms with major functional effects (e.g., premature stop codons, or deletions of coding sequence) were validated by dideoxy sequencing. Using this data set, we provide the first systematic description of the types of genes that harbor major effect polymorphisms in natural populations at moderate allele frequencies. The data also provide an unprecedented resource for the study of genetic variation in an experimentally tractable, multicellular model organism.

[BibTex]



Machine Learning and Applications in Biology

Shin, H.

6th Course in Bioinformatics for Molecular Biologist, March 2006 (talk)

Abstract
The emergence of the fields of computational biology and bioinformatics has alleviated the burden of solving many biological problems, saving the time and cost required for experiments and also providing predictions that guide new experiments. Within computational biology, machine learning algorithms have played a central role in dealing with the flood of biological data. The goal of this tutorial is to raise awareness and comprehension of machine learning so that biologists can properly match the task at hand to the corresponding analytical approach. We start by categorizing biological problem settings and introduce the general machine learning schemes that best fit each of these categories. We then explore representative models in further detail, from traditional statistical models to recent kernel models, presenting several up-to-date research projects in bioinformatics to exemplify how biological questions can benefit from a machine learning approach. Finally, we discuss how cooperation between biologists and machine learners might be made smoother.

PDF [BibTex]



Kernel extrapolation

Vishwanathan, SVN., Borgwardt, KM., Guttman, O., Smola, AJ.

Neurocomputing, 69(7-9):721-729, March 2006 (article)

Abstract
We present a framework for efficient extrapolation of reduced rank approximations, graph kernels, and locally linear embeddings (LLE) to unseen data. We also present a principled method to combine many of these kernels and then extrapolate them. Central to our method is a theorem for matrix approximation, and an extension of the representer theorem to handle multiple joint regularization constraints. Experiments in protein classification demonstrate the feasibility of our approach.

Web DOI [BibTex]



Statistical Properties of Kernel Principal Component Analysis

Blanchard, G., Bousquet, O., Zwald, L.

Machine Learning, 66(2-3):259-294, March 2006 (article)

Abstract
We study the properties of the eigenvalues of Gram matrices in a non-asymptotic setting. Using local Rademacher averages, we provide data-dependent and tight bounds for their convergence towards eigenvalues of the corresponding kernel operator. We perform these computations in a functional analytic framework which allows us to deal implicitly with reproducing kernel Hilbert spaces of infinite dimension. This can have applications to various kernel algorithms, such as Support Vector Machines (SVM). We focus on Kernel Principal Component Analysis (KPCA) and, using such techniques, we obtain sharp excess risk bounds for the reconstruction error. In these bounds, the dependence on the decay of the spectrum and on the closeness of successive eigenvalues is made explicit.

PDF PDF DOI [BibTex]



Network-based de-noising improves prediction from microarray data

Kato, T., Murata, Y., Miura, K., Asai, K., Horton, P., Tsuda, K., Fujibuchi, W.

BMC Bioinformatics, 7(Suppl. 1):S4-S4, March 2006 (article)

Abstract
Prediction of human cell response to anti-cancer drugs (compounds) from microarray data is a challenging problem, due to the noise properties of microarrays as well as the high variance of living cell responses to drugs. Hence there is a strong need for methods that are more practical and robust than the standard methods for real-value prediction. We devised an extended version of the off-subspace noise-reduction (de-noising) method to incorporate heterogeneous network data such as sequence similarity or protein-protein interactions into a single framework. Using that method, we first de-noise the gene expression data for training and test data and also the drug-response data for training data. Then we predict the unknown responses of each drug from the de-noised input data. To ascertain whether de-noising improves prediction, we carry out 12-fold cross-validation to assess the prediction performance. We use Pearson's correlation coefficient between the true and predicted response values as the measure of prediction performance. De-noising improves the prediction performance for 65% of drugs. Furthermore, we found that this noise reduction method is robust and effective even when a large amount of artificial noise is added to the input data. We found that our extended off-subspace noise-reduction method combining heterogeneous biological data is successful and quite useful for improving prediction of human cell cancer drug responses from microarray data.
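The evaluation described above (Pearson's correlation between true and predicted responses, assessed by cross-validation) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `pearson_r` and `kfold_indices`, the interleaved fold assignment, and the toy values are all assumptions made here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def kfold_indices(n, k):
    """Split indices 0..n-1 into k interleaved test folds (an assumed scheme)."""
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    return [list(range(i, n, k)) for i in range(k)]


# Hypothetical true and predicted drug-response values.
true_values = [0.1, 0.4, 0.35, 0.8]
predicted = [0.2, 0.5, 0.30, 0.7]
print(pearson_r(true_values, predicted))
print(kfold_indices(5, 2))
```

In a cross-validated setting, each fold returned by `kfold_indices` would serve once as the held-out test set, and `pearson_r` would be applied to the held-out true and predicted values.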

PDF PDF DOI [BibTex]



Data mining problems and solutions for response modeling in CRM

Cho, S., Shin, H., Yu, E., Ha, K., MacLachlan, D.

Entrue Journal of Information Technology, 5(1):55-64, March 2006 (article)

Abstract
We present three data mining problems that are often encountered in building a response model: robust modeling, variable selection, and data selection. The respective algorithmic solutions are a bagging-based ensemble, a genetic-algorithm-based wrapper approach, and nearest-neighbor-based data selection. A real-world data set from the Direct Marketing Educational Foundation, DMEF4, is used to show their effectiveness. The proposed methods were found to solve the problems in a practical way.

PDF [BibTex]



Model-based Design Analysis and Yield Optimization

Pfingsten, T., Herrmann, D., Rasmussen, C.

IEEE Transactions on Semiconductor Manufacturing, 19(4):475-486, February 2006 (article)

Abstract
Fluctuations are inherent to any fabrication process. Integrated circuits and micro-electro-mechanical systems are particularly affected by these variations, and due to high quality requirements the effect on the devices’ performance has to be understood quantitatively. In recent years it has become possible to model the performance of such complex systems on the basis of design specifications, and model-based Sensitivity Analysis has made its way into industrial engineering. We show how an efficient Bayesian approach, using a Gaussian process prior, can replace the commonly used brute-force Monte Carlo scheme, making it possible to apply the analysis to computationally costly models. We introduce a number of global, statistically justified sensitivity measures for design analysis and optimization. Two models of integrated systems serve as case studies to introduce the analysis and to assess its convergence properties. We show that the Bayesian Monte Carlo scheme can save costly simulation runs and can ensure reliable accuracy of the analysis.

PDF Web DOI [BibTex]



Prenatal development of ocular dominance and orientation maps in a self-organizing model of V1

Jegelka, S., Bednar, J., Miikkulainen, R.

Neurocomputing, 69(10-12):1291-1296, February 2006 (article)

Abstract
How orientation (OR) and ocular-dominance (OD) maps develop before visual experience begins is controversial. Possible influences include molecular signals and spontaneous activity, but their contributions remain unclear. This paper presents LISSOM simulations suggesting that previsual spontaneous activity alone is sufficient for realistic OR and OD maps to develop. Individual maps develop robustly with various previsual patterns, and are aided by background noise. However, joint OR/OD maps depend crucially on how correlated the patterns are between eyes, even over brief initial periods. Therefore, future biological experiments should account for multiple activity sources, and should measure map interactions rather than maps of single features.

PDF DOI [BibTex]
