2009


Toward a Theory of Consciousness

Tononi, G., Balduzzi, D.

In The Cognitive Neurosciences, pages: 1201-1220, (Editors: Gazzaniga, M.S.), MIT Press, Cambridge, MA, USA, October 2009 (inbook)

Web [BibTex]


Text Clustering with Mixture of von Mises-Fisher Distributions

Sra, S., Banerjee, A., Ghosh, J., Dhillon, I.

In Text mining: classification, clustering, and applications, pages: 121-161, Chapman & Hall/CRC data mining and knowledge discovery series, (Editors: Srivastava, A. N. and Sahami, M.), CRC Press, Boca Raton, FL, USA, June 2009 (inbook)

Web DOI [BibTex]


Data Mining for Biologists

Tsuda, K.

In Biological Data Mining in Protein Interaction Networks, pages: 14-27, (Editors: Li, X. and Ng, S.-K.), Medical Information Science Reference, Hershey, PA, USA, May 2009 (inbook)

Abstract
In this tutorial chapter, we review the basics of frequent pattern mining algorithms, including itemset mining, association rule mining and graph mining. These algorithms find frequently appearing substructures in discrete data and can discover structural motifs, for example, in mutation data, protein structures and chemical compounds. Because they have primarily been used on business data, biological applications are not yet common, but their potential impact is large. Recent advances in computers, including multicore machines and ever increasing memory capacity, support the application of such methods to larger datasets. We explain the technical aspects of the algorithms without going into full detail. Current biological applications are summarized and possible future directions are given.
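
As a rough illustration of the itemset mining idea summarized above (a sketch, not code from the chapter), the following Python snippet enumerates every small itemset and keeps those occurring in at least min_support transactions; the toy "mutation" data and the threshold are invented for the example, and a real miner would prune candidates Apriori-style rather than enumerate exhaustively.

from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support, max_size=3):
    # Naive frequent-itemset miner: return every itemset of size <= max_size
    # that occurs in at least min_support transactions.
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for subset in combinations(items, k):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Toy data: each "transaction" lists the mutated sites observed in one sample.
samples = [
    {"A12V", "G45D", "T77I"},
    {"A12V", "G45D"},
    {"G45D", "T77I"},
    {"A12V", "G45D", "T77I"},
]
print(frequent_itemsets(samples, min_support=3))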

Web [BibTex]


Large Margin Methods for Part of Speech Tagging

Altun, Y.

In Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, pages: 141-160, (Editors: Keshet, J. and Bengio, S.), Wiley, Hoboken, NJ, USA, January 2009 (inbook)

Web [BibTex]


Covariate shift and local learning by distribution matching

Gretton, A., Smola, A., Huang, J., Schmittfull, M., Borgwardt, K., Schölkopf, B.

In Dataset Shift in Machine Learning, pages: 131-160, (Editors: Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A. and Lawrence, N. D.), MIT Press, Cambridge, MA, USA, 2009 (inbook)

Abstract
Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We provide a uniform convergence bound on the distance between the reweighted training feature mean and the test feature mean, a transductive bound on the expected loss of an algorithm trained on the reweighted data, and a connection to single class SVMs. While our method is designed to deal with the case of simple covariate shift (in the sense of Chapter ??), we have also found benefits for sample selection bias on the labels. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is "simpler" than the data might suggest.
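
A minimal sketch of the reweighting idea described above, assuming a Gaussian kernel and using a generic bound-constrained solver in place of the chapter's quadratic program (the sum constraint on the weights is dropped for brevity); the function names and toy data below are illustrative, not the authors' code.

import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, Y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_mean_matching(X_tr, X_te, sigma=1.0, B=10.0):
    # Choose weights b so the weighted training feature mean approaches the
    # test feature mean: minimize 0.5 b'Kb - kappa'b subject to 0 <= b_i <= B.
    n_tr, n_te = len(X_tr), len(X_te)
    K = gaussian_kernel(X_tr, X_tr, sigma)
    kappa = (n_tr / n_te) * gaussian_kernel(X_tr, X_te, sigma).sum(axis=1)
    obj = lambda b: 0.5 * b @ K @ b - kappa @ b
    grad = lambda b: K @ b - kappa
    res = minimize(obj, x0=np.ones(n_tr), jac=grad,
                   bounds=[(0.0, B)] * n_tr, method="L-BFGS-B")
    return res.x  # importance weights for the training points

# Toy covariate shift: training centered at 0, test shifted to the right.
rng = np.random.default_rng(0)
X_tr = rng.normal(0.0, 1.0, size=(50, 1))
X_te = rng.normal(1.0, 1.0, size=(30, 1))
print(kernel_mean_matching(X_tr, X_te)[:5])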

PDF Web [BibTex]


An introduction to Kernel Learning Algorithms

Gehler, P., Schölkopf, B.

In Kernel Methods for Remote Sensing Data Analysis, pages: 25-48, 2, (Editors: Camps-Valls, G. and Bruzzone, L.), Wiley, New York, NY, USA, 2009 (inbook)

Abstract
Kernel learning algorithms are becoming a standard tool in the area of machine learning and pattern recognition. In this chapter we review the fundamental theory of kernel learning. As the basic building block we introduce the kernel function, which provides an elegant and general way to compare possibly very complex objects. We then review the concept of a reproducing kernel Hilbert space and state the representer theorem. Finally we give an overview of the most prominent algorithms, which are support vector classification and regression, Gaussian processes and kernel principal component analysis. With multiple kernel learning and structured output prediction we also introduce some more recent advancements in the field.
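
To make the representer theorem mentioned above concrete, here is a small kernel ridge regression sketch (a stand-in example, not taken from the chapter): the learned function is a kernel expansion over the training points, f(x) = sum_i alpha_i k(x_i, x), with coefficients obtained from a regularized linear system; the Gaussian kernel, the regularization constant and the toy data are assumptions made only for illustration.

import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_ridge_fit(X, y, lam=1e-2, sigma=1.0):
    # Representer theorem: the minimizer lies in the span of the training data,
    # so it suffices to solve (K + lam * n * I) alpha = y for the coefficients.
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def kernel_ridge_predict(alpha, X_train, X_new, sigma=1.0):
    # f(x) = sum_i alpha_i k(x_i, x)
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy regression problem: recover sin(x) from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
alpha = kernel_ridge_fit(X, y)
print(kernel_ridge_predict(alpha, X, np.array([[0.0], [1.5]])))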

link (url) DOI [BibTex]

1998


Support-Vektor-Lernen

Schölkopf, B.

In Ausgezeichnete Informatikdissertationen 1997, pages: 135-150, (Editors: Hotz, G., Fiedler, H., Gorny, P., Grass, W., Hölldobler, S., Kerner, I. O. and Reischuk, R.), Teubner Verlag, Stuttgart, 1998 (inbook)

[BibTex]