Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms

Bousquet, O.

Biologische Kybernetik, Ecole Polytechnique, 2002 (phdthesis) Accepted

New classification algorithms based on the notion of 'margin' (e.g. Support Vector Machines, Boosting) have recently been developed. The goal of this thesis is to better understand how they work by studying their theoretical performance. To this end, a general framework for real-valued classification is proposed. Within this framework, the natural tools turn out to be Concentration Inequalities and Empirical Processes Theory. By adapting these tools, a new measure of the size of a class of functions is introduced, which can be computed from the data. This makes it possible, on the one hand, to better understand the role of the eigenvalues of the kernel matrix in Support Vector Machines and, on the other hand, to obtain empirical model selection criteria.


Support Vector Machines: Induction Principle, Adaptive Tuning and Prior Knowledge

Chapelle, O.

Biologische Kybernetik, 2002 (phdthesis)

This thesis presents a theoretical and practical study of Support Vector Machines (SVMs) and related learning algorithms. In the first part, we introduce a new induction principle from which SVMs can be derived; some new algorithms are also presented within this framework. In the second part, after studying how to estimate the generalization error of an SVM, we propose choosing the kernel parameters of an SVM by minimizing this estimate. Several applications, such as feature selection, are presented. Finally, the third part deals with the incorporation of prior knowledge into a learning algorithm; more specifically, we study the case of known invariant transformations and the use of unlabeled data.
