Header logo is ei


2015


no image
Causal Inference for Empirical Time Series Based on the Postulate of Independence of Cause and Mechanism

Besserve, M.

53rd Annual Allerton Conference on Communication, Control, and Computing, September 2015 (talk)

[BibTex]

2015

[BibTex]


no image
Kernel methods in medical imaging

Charpiat, G., Hofmann, M., Schölkopf, B.

In Handbook of Biomedical Imaging, pages: 63-81, 4, (Editors: Paragios, N., Duncan, J. and Ayache, N.), Springer, Berlin, Germany, June 2015 (inbook)

Web link (url) [BibTex]

Web link (url) [BibTex]


no image
Independence of cause and mechanism in brain networks

Besserve, M.

DALI workshop on Networks: Processes and Causality, April 2015 (talk)

[BibTex]

[BibTex]


no image
Information-Theoretic Implications of Classical and Quantum Causal Structures

Chaves, R., Majenz, C., Luft, L., Maciel, T., Janzing, D., Schölkopf, B., Gross, D.

18th Conference on Quantum Information Processing (QIP), 2015 (talk)

Web link (url) [BibTex]

Web link (url) [BibTex]


no image
Assessment of brain tissue damage in the Sub-Acute Stroke Region by Multiparametric Imaging using [89-Zr]-Desferal-EPO-PET/MRI

Castaneda, S. G., Katiyar, P., Russo, F., Disselhorst, J. A., Calaminus, C., Poli, S., Maurer, A., Ziemann, U., Pichler, B. J.

World Molecular Imaging Conference, 2015 (talk)

[BibTex]

[BibTex]


no image
Statistical and Machine Learning Methods for Neuroimaging: Examples, Challenges, and Extensions to Diffusion Imaging Data

O’Donnell, L. J., Schultz, T.

In Visualization and Processing of Higher Order Descriptors for Multi-Valued Data, pages: 299-319, (Editors: Hotz, I. and Schultz, T.), Springer, 2015 (inbook)

[BibTex]

[BibTex]


no image
Early time point in vivo PET/MR is a promising biomarker for determining efficacy of a novel Db(\alphaEGFR)-scTRAIL fusion protein therapy in a colon cancer model

Divine, M. R., Harant, M., Katiyar, P., Disselhorst, J. A., Bukala, D., Aidone, S., Siegemund, M., Pfizenmaier, K., Kontermann, R., Pichler, B. J.

World Molecular Imaging Conference, 2015 (talk)

[BibTex]

[BibTex]


no image
Justifying Information-Geometric Causal Inference

Janzing, D., Steudel, B., Shajarisales, N., Schölkopf, B.

In Measures of Complexity: Festschrift for Alexey Chervonenkis, pages: 253-265, 18, (Editors: Vovk, V., Papadopoulos, H. and Gammerman, A.), Springer, 2015 (inbook)

DOI [BibTex]

DOI [BibTex]


no image
The search for single exoplanet transits in the Kepler light curves

Foreman-Mackey, D., Hogg, D. W., Schölkopf, B.

IAU General Assembly, 22, pages: 2258352, 2015 (talk)

link (url) [BibTex]

link (url) [BibTex]

2007


no image
Reaction graph kernels for discovering missing enzymes in the plant secondary metabolism

Saigo, H., Hattori, M., Tsuda, K.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Secondary metabolic pathway in plant is important for finding druggable candidate enzymes. However, there are many enzymes whose functions are still undiscovered especially in organism-specific metabolic pathways. We propose reaction graph kernels for automatically assigning the EC numbers to unknown enzymatic reactions in a metabolic network. Experiments are carried out on KEGG/REACTION database and our method successfully predicted the first three digits of the EC number with 83% accuracy.We also exhaustively predicted missing enzymatic functions in the plant secondary metabolism pathways, and evaluated our results in biochemical validity.

Web [BibTex]

2007

Web [BibTex]


no image
Positional Oligomer Importance Matrices

Sonnenburg, S., Zien, A., Philips, P., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences, above all of DNA and proteins. In many cases, the most accurate classifiers are obtained by training SVMs with complex sequence kernels, for instance for transcription starts or splice sites. However, an often criticized downside of SVMs with complex kernels is that it is very hard for humans to understand the learned decision rules and to derive biological insights from them. To close this gap, we introduce the concept of positional oligomer importance matrices (POIMs) and develop an efficient algorithm for their computation. We demonstrate how they overcome the limitations of sequence logos, and how they can be used to find relevant motifs for different biological phenomena in a straight-forward way. Note that the concept of POIMs is not limited to interpreting SVMs, but is applicable to general k−mer based scoring systems.

Web [BibTex]

Web [BibTex]


no image
Machine Learning Algorithms for Polymorphism Detection

Schweikert, G., Zeller, G., Weigel, D., Schölkopf, B., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Web [BibTex]

Web [BibTex]


no image
An Automated Combination of Kernels for Predicting Protein Subcellular Localization

Zien, A., Ong, C.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions.We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We utilize an extension of the multiclass support vector machine (SVM)method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets, and show that we perform better than the current state of the art. Furthermore, our method provides some insights as to which features are most useful for determining subcellular localization, which are in agreement with biological reasoning.

Web [BibTex]

Web [BibTex]


no image
Challenges in Brain-Computer Interface Development: Induction, Measurement, Decoding, Integration

Hill, NJ.

Invited keynote talk at the launch of BrainGain, the Dutch BCI research consortium, November 2007 (talk)

Abstract
I‘ll present a perspective on Brain-Computer Interface development from T{\"u}bingen. Some of the benefits promised by BCI technology lie in the near foreseeable future, and some further away. Our motivation is to make BCI technology feasible for the people who could benefit from what it has to offer soon: namely, people in the "completely locked-in" state. I‘ll mention some of the challenges of working with this user group, and explain the specific directions they have motivated us to take in developing experimental methods, algorithms, and software.

[BibTex]

[BibTex]


no image
Policy Learning for Robotics

Peters, J.

14th International Conference on Neural Information Processing (ICONIP), November 2007 (talk)

Web [BibTex]

Web [BibTex]


no image
Hilbert Space Representations of Probability Distributions

Gretton, A.

2nd Workshop on Machine Learning and Optimization at the ISM, October 2007 (talk)

Abstract
Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each - this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem, by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique: thus comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike. A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, as opposed to French text on the same topic but otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Regression with Intervals

Kashima, H., Yamazaki, K., Saigo, H., Inokuchi, A.

International Workshop on Data-Mining and Statistical Science (DMSS2007), October 2007, JSAI Incentive Award. Talk was given by Hisashi Kashima. (talk)

Web [BibTex]

Web [BibTex]


no image
Support Vector Machine Learning for Interdependent and Structured Output Spaces

Altun, Y., Hofmann, T., Tsochantaridis, I.

In Predicting Structured Data, pages: 85-104, Advances in neural information processing systems, (Editors: Bakir, G. H. , T. Hofmann, B. Schölkopf, A. J. Smola, B. Taskar, S. V. N. Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Web [BibTex]

Web [BibTex]


no image
Brisk Kernel ICA

Jegelka, S., Gretton, A.

In Large Scale Kernel Machines, pages: 225-250, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
Recent approaches to independent component analysis have used kernel independence measures to obtain very good performance in ICA, particularly in areas where classical methods experience difficulty (for instance, sources with near-zero kurtosis). In this chapter, we compare two efficient extensions of these methods for large-scale problems: random subsampling of entries in the Gram matrices used in defining the independence measures, and incomplete Cholesky decomposition of these matrices. We derive closed-form, efficiently computable approximations for the gradients of these measures, and compare their performance on ICA using both artificial and music data. We show that kernel ICA can scale up to much larger problems than yet attempted, and that incomplete Cholesky decomposition performs better than random sampling.

PDF Web [BibTex]

PDF Web [BibTex]


no image
MR-Based PET Attenuation Correction: Method and Validation

Hofmann, M., Steinke, F., Scheel, V., Brady, M., Schölkopf, B., Pichler, B.

Joint Molecular Imaging Conference, September 2007 (talk)

Abstract
PET/MR combines the high soft tissue contrast of Magnetic Resonance Imaging (MRI) and the functional information of Positron Emission Tomography (PET). For quantitative PET information, correction of tissue photon attenuation is mandatory. Usually in conventional PET, the attenuation map is obtained from a transmission scan, which uses a rotating source, or from the CT scan in case of combined PET/CT. In the case of a PET/MR scanner, there is insufficient space for the rotating source and ideally one would want to calculate the attenuation map from the MR image instead. Since MR images provide information about proton density of the different tissue types, it is not trivial to use this data for PET attenuation correction. We present a method for predicting the PET attenuation map from a given the MR image, using a combination of atlas-registration and recognition of local patterns. Using "leave one out cross validation" we show on a database of 16 MR-CT image pairs that our method reliably allows estimating the CT image from the MR image. Subsequently, as in PET/CT, the PET attenuation map can be predicted from the CT image. On an additional dataset of MR/CT/PET triplets we quantitatively validate that our approach allows PET quantification with an error that is smaller than what would be clinically significant. We demonstrate our approach on T1-weighted human brain scans. However, the presented methods are more general and current research focuses on applying the established methods to human whole body PET/MRI applications.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Training a Support Vector Machine in the Primal

Chapelle, O.

In Large Scale Kernel Machines, pages: 29-50, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007, This is a slightly updated version of the Neural Computation paper (inbook)

Abstract
Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and that there is no reason to ignore this possibility. On the contrary, from the primal point of view new families of algorithms for large scale SVM training can be investigated.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Approximation Methods for Gaussian Process Regression

Quiñonero-Candela, J., Rasmussen, CE., Williams, CKI.

In Large-Scale Kernel Machines, pages: 203-223, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
A wealth of computationally efficient approximation methods for Gaussian process regression have been recently proposed. We give a unifying overview of sparse approximations, following Quiñonero-Candela and Rasmussen (2005), and a brief review of approximate matrix-vector multiplication methods.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Density Estimation of Structured Outputs in Reproducing Kernel Hilbert Spaces

Altun, Y., Smola, A.

In Predicting Structured Data, pages: 283-300, Advances in neural information processing systems, (Editors: BakIr, G. H., T. Hofmann, B. Schölkopf, A. J. Smola, B. Taskar, S. V.N. Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
In this paper we study the problem of estimating conditional probability distributions for structured output prediction tasks in Reproducing Kernel Hilbert Spaces. More specically, we prove decomposition results for undirected graphical models, give constructions for kernels, and show connections to Gaussian Process classi- cation. Finally we present ecient means of solving the optimization problem and apply this to label sequence learning. Experiments on named entity recognition and pitch accent prediction tasks demonstrate the competitiveness of our approach.

Web [BibTex]

Web [BibTex]


no image
Trading Convexity for Scalability

Collobert, R., Sinz, F., Weston, J., Bottou, L.

In Large Scale Kernel Machines, pages: 275-300, Neural Information Processing, (Editors: Bottou, L. , O. Chapelle, D. DeCoste, J. Weston), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Abstract
Convex learning algorithms, such as Support Vector Machines (SVMs), are often seen as highly desirable because they offer strong practical properties and are amenable to theoretical analysis. However, in this work we show how nonconvexity can provide scalability advantages over convexity. We show how concave-convex programming can be applied to produce (i) faster SVMs where training errors are no longer support vectors, and (ii) much faster Transductive SVMs.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Bayesian methods for NMR structure determination

Habeck, M.

29th Annual Discussion Meeting: Magnetic Resonance in Biophysical Chemistry, September 2007 (talk)

Web [BibTex]

Web [BibTex]


no image
Classifying Event-Related Desynchronization in EEG, ECoG and MEG signals

Hill, N., Lal, T., Tangermann, M., Hinterberger, T., Widman, G., Elger, C., Schölkopf, B., Birbaumer, N.

In Toward Brain-Computer Interfacing, pages: 235-260, Neural Information Processing, (Editors: G Dornhege and J del R Millán and T Hinterberger and DJ McFarland and K-R Müller), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Joint Kernel Maps

Weston, J., Bakir, G., Bousquet, O., Mann, T., Noble, W., Schölkopf, B.

In Predicting Structured Data, pages: 67-84, Advances in neural information processing systems, (Editors: GH Bakir and T Hofmann and B Schölkopf and AJ Smola and B Taskar and SVN Vishwanathan), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

Web [BibTex]

Web [BibTex]


no image
Brain-Computer Interfaces for Communication in Paralysis: A Clinical Experimental Approach

Hinterberger, T., Nijboer, F., Kübler, A., Matuz, T., Furdea, A., Mochty, U., Jordan, M., Lal, T., Hill, J., Mellinger, J., Bensch, M., Tangermann, M., Widman, G., Elger, C., Rosenstiel, W., Schölkopf, B., Birbaumer, N.

In Toward Brain-Computer Interfacing, pages: 43-64, Neural Information Processing, (Editors: G. Dornhege and J del R Millán and T Hinterberger and DJ McFarland and K-R Müller), MIT Press, Cambridge, MA, USA, September 2007 (inbook)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Thinking Out Loud: Research and Development of Brain Computer Interfaces

Hill, NJ.

Invited keynote talk at the Max Planck Society‘s PhDNet Workshop., July 2007 (talk)

Abstract
My principal interest is in applying machine-learning methods to the development of Brain-Computer Interfaces (BCI). This involves the classification of a user‘s intentions or mental states, or regression against some continuous intentional control signal, using brain signals obtained for example by EEG, ECoG or MEG. The long-term aim is to develop systems that a completely paralysed person (such as someone suffering from advanced Amyotrophic Lateral Sclerosis) could use to communicate. Such systems have the potential to improve the lives of many people who would be otherwise completely unable to communicate, but they are still very much in the research and development stages.

PDF [BibTex]

PDF [BibTex]


no image
Dirichlet Process Mixtures of Factor Analysers

Görür, D., Rasmussen, C.

Fifth Workshop on Bayesian Inference in Stochastic Processes (BSP5), June 2007 (talk)

Abstract
Mixture of factor analysers (MFA) is a well-known model that combines the dimensionality reduction technique of Factor Analysis (FA) with mixture modeling. The key issue in MFA is deciding on the latent dimension and the number of mixture components to be used. The Bayesian treatment of MFA has been considered by Beal and Ghahramani (2000) using variational approximation and by Fokoué and Titterington (2003) using birth-and –death Markov chain Monte Carlo (MCMC). Here, we present the nonparametric MFA model utilizing a Dirichlet process (DP) prior on the component parameters (that is, the factor loading matrix and the mean vector of each component) and describe an MCMC scheme for inference. The clustering property of the DP provides automatic selection of the number of mixture components. The latent dimensionality of each component is inferred by automatic relevance determination (ARD). Identifying the action potentials of individual neurons from extracellular recordings, known as spike sorting, is a challenging clustering problem. We apply our model for clustering the waveforms recorded from the cortex of a macaque monkey.

Web [BibTex]

Web [BibTex]


no image
New BCI approaches: Selective Attention to Auditory and Tactile Stimulus Streams

Hill, N., Raths, C.

Invited talk at the PASCAL Workshop on Methods of Data Analysis in Computational Neuroscience and Brain Computer Interfaces, June 2007 (talk)

Abstract
When considering Brain-Computer Interface (BCI) development for patients in the most severely paralysed states, there is considerable motivation to move away from BCI systems based on either motor cortex activity, or on visual stimuli. Together these account for most of current BCI research. I present the results of our recent exploration of new auditory- and tactile-stimulus-driven BCIs. The talk includes a tutorial on the construction and interpretation of classifiers which extract spatio-temporal features from event-related potential data. The effects and implications of whitening are discussed, and preliminary results on the effectiveness of a low-rank constraint (Tomioka and Aihara 2007) are shown.

PDF Web [BibTex]

PDF Web [BibTex]


no image
Towards Motor Skill Learning in Robotics

Peters, J.

Interactive Robot Learning - RSS workshop, June 2007 (talk)

Web [BibTex]

Web [BibTex]


no image
Transductive Support Vector Machines for Structured Variables

Zien, A., Brefeld, U., Scheffer, T.

International Conference on Machine Learning (ICML), June 2007 (talk)

Abstract
We study the problem of learning kernel machines transductively for structured output variables. Transductive learning can be reduced to combinatorial optimization problems over all possible labelings of the unlabeled data. In order to scale transductive learning to structured variables, we transform the corresponding non-convex, combinatorial, constrained optimization problems into continuous, unconstrained optimization problems. The discrete optimization parameters are eliminated and the resulting differentiable problems can be optimized efficiently. We study the effectiveness of the generalized TSVM on multiclass classification and label-sequence learning problems empirically.

PDF PDF Web [BibTex]

PDF PDF Web [BibTex]


no image
Impact of target-to-target interval on classification performance in the P300 speller

Martens, S., Hill, J., Farquhar, J., Schölkopf, B.

Scientific Meeting "Applied Neuroscience for Healthy Brain Function", May 2007 (talk)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Benchmarking of Policy Gradient Methods

Peters, J.

ADPRL Workshop, April 2007 (talk)

[BibTex]

[BibTex]


no image
Probabilistic Structure Calculation

Rieping, W., Habeck, M., Nilges, M.

In Structure and Biophysics: New Technologies for Current Challenges in Biology and Beyond, pages: 81-98, NATO Security through Science Series, (Editors: Puglisi, J. D.), Springer, Berlin, Germany, March 2007 (inbook)

Web DOI [BibTex]

Web DOI [BibTex]


no image
New Margin- and Evidence-Based Approaches for EEG Signal Classification

Hill, N., Farquhar, J.

Invited talk at the FaSor Jahressymposium, February 2007 (talk)

PDF [BibTex]

PDF [BibTex]


no image
On the Pre-Image Problem in Kernel Methods

BakIr, G., Schölkopf, B., Weston, J.

In Kernel Methods in Bioengineering, Signal and Image Processing, pages: 284-302, (Editors: G Camps-Valls and JL Rojo-Álvarez and M Martínez-Ramón), Idea Group Publishing, Hershey, PA, USA, January 2007 (inbook)

Abstract
In this chapter we are concerned with the problem of reconstructing patterns from their representation in feature space, known as the pre-image problem. We review existing algorithms and propose a learning based approach. All algorithms are discussed regarding their usability and complexity and evaluated on an image denoising application.

DOI [BibTex]

DOI [BibTex]


no image
Some comments on ν-SVM

Dinuzzo, F., De Nicolao, G.

In A tribute to Antonio Lepschy, pages: -, (Editors: Picci, G. , M. E. Valcher), Edizione Libreria Progetto, Padova, Italy, 2007 (inbook)

[BibTex]

[BibTex]

2006


no image
A Kernel Method for the Two-Sample-Problem

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

20th Annual Conference on Neural Information Processing Systems (NIPS), December 2006 (talk)

Abstract
We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. We show that the test statistic can be computed in $O(m^2)$ time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.

PDF [BibTex]

2006

PDF [BibTex]


no image
Ab-initio gene finding using machine learning

Schweikert, G., Zeller, G., Zien, A., Ong, C., de Bona, F., Sonnenburg, S., Phillips, P., Rätsch, G.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Web [BibTex]

Web [BibTex]


no image
Reinforcement Learning by Reward-Weighted Regression

Peters, J.

NIPS Workshop: Towards a New Reinforcement Learning? , December 2006 (talk)

Web [BibTex]

Web [BibTex]


no image
Graph boosting for molecular QSAR analysis

Saigo, H., Kadowaki, T., Kudo, T., Tsuda, K.

NIPS Workshop on New Problems and Methods in Computational Biology, December 2006 (talk)

Abstract
We propose a new boosting method that systematically combines graph mining and mathematical programming-based machine learning. Informative and interpretable subgraph features are greedily found by a series of graph mining calls. Due to our mathematical programming formulation, subgraph features and pre-calculated real-valued features are seemlessly integrated. We tested our algorithm on a quantitative structure-activity relationship (QSAR) problem, which is basically a regression problem when given a set of chemical compounds. In benchmark experiments, the prediction accuracy of our method favorably compared with the best results reported on each dataset.

Web [BibTex]

Web [BibTex]


no image
Inferring Causal Directions by Evaluating the Complexity of Conditional Distributions

Sun, X., Janzing, D., Schölkopf, B.

NIPS Workshop on Causality and Feature Selection, December 2006 (talk)

Abstract
We propose a new approach to infer the causal structure that has generated the observed statistical dependences among n random variables. The idea is that the factorization of the joint measure of cause and effect into P(cause)P(effect|cause) leads typically to simpler conditionals than non-causal factorizations. To evaluate the complexity of the conditionals we have tried two methods. First, we have compared them to those which maximize the conditional entropy subject to the observed first and second moments since we consider the latter as the simplest conditionals. Second, we have fitted the data with conditional probability measures being exponents of functions in an RKHS space and defined the complexity by a Hilbert-space semi-norm. Such a complexity measure has several properties that are useful for our purpose. We describe some encouraging results with both methods applied to real-world data. Moreover, we have combined constraint-based approaches to causal discovery (i.e., methods using only information on conditional statistical dependences) with our method in order to distinguish between causal hypotheses which are equivalent with respect to the imposed independences. Furthermore, we compare the performance to Bayesian approaches to causal inference.

Web [BibTex]


no image
Learning Optimal EEG Features Across Time, Frequency and Space

Farquhar, J., Hill, J., Schölkopf, B.

NIPS Workshop on Current Trends in Brain-Computer Interfacing, December 2006 (talk)

PDF Web [BibTex]

PDF Web [BibTex]


no image
Semi-Supervised Learning

Zien, A.

Advanced Methods in Sequence Analysis Lectures, November 2006 (talk)

Web [BibTex]

Web [BibTex]


no image
Prediction of Protein Function from Networks

Shin, H., Tsuda, K.

In Semi-Supervised Learning, pages: 361-376, Adaptive Computation and Machine Learning, (Editors: Chapelle, O. , B. Schölkopf, A. Zien), MIT Press, Cambridge, MA, USA, November 2006 (inbook)

Abstract
In computational biology, it is common to represent domain knowledge using graphs. Frequently there exist multiple graphs for the same set of nodes, representing information from different sources, and no single graph is sufficient to predict class labels of unlabelled nodes reliably. One way to enhance reliability is to integrate multiple graphs, since individual graphs are partly independent and partly complementary to each other for prediction. In this chapter, we describe an algorithm to assign weights to multiple graphs within graph-based semi-supervised learning. Both predicting class labels and searching for weights for combining multiple graphs are formulated into one convex optimization problem. The graph-combining method is applied to functional class prediction of yeast proteins.When compared with individual graphs, the combined graph with optimized weights performs significantly better than any single graph.When compared with the semidefinite programming-based support vector machine (SDP/SVM), it shows comparable accuracy in a remarkably short time. Compared with a combined graph with equal-valued weights, our method could select important graphs without loss of accuracy, which implies the desirable property of integration with selectivity.

Web [BibTex]

Web [BibTex]


no image
Discrete Regularization

Zhou, D., Schölkopf, B.

In Semi-supervised Learning, pages: 237-250, Adaptive computation and machine learning, (Editors: O Chapelle and B Schölkopf and A Zien), MIT Press, Cambridge, MA, USA, November 2006 (inbook)

Abstract
Many real-world machine learning problems are situated on finite discrete sets, including dimensionality reduction, clustering, and transductive inference. A variety of approaches for learning from finite sets has been proposed from different motivations and for different problems. In most of those approaches, a finite set is modeled as a graph, in which the edges encode pairwise relationships among the objects in the set. Consequently many concepts and methods from graph theory are adopted. In particular, the graph Laplacian is widely used. In this chapter we present a systemic framework for learning from a finite set represented as a graph. We develop discrete analogues of a number of differential operators, and then construct a discrete analogue of classical regularization theory based on those discrete differential operators. The graph Laplacian based approaches are special cases of this general discrete regularization framework. An important thing implied in this framework is that we have a wide choices of regularization on graph in addition to the widely-used graph Laplacian based one.

PDF Web [BibTex]

PDF Web [BibTex]