Del Favero, S., Varagnolo, D., Dinuzzo, F., Schenato, L., Pillonetto, G.
3210-3215, December, 2011
We analyze the problem of data set reduction for support vector classification. The work is also motivated by distributed problems, where sensors collect binary measurements at different locations while moving inside an environment that needs to be divided into a collection of regions labeled in two different ways. The aim is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide the exact conditions and an efficient algorithm to determine whether an element of the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection scheme for classification. Numerical experiments on the distributed problem show that the proposed procedure allows the agents to exchange a small amount of the collected data to obtain a highly predictive decision boundary.
IEEE Transactions on Neural Networks, 22(2):290-303, February, 2011
A client-server architecture to simultaneously solve multiple learning tasks from distributed datasets is described. In this architecture, each client corresponds to an individual learning task and the associated dataset of examples. The goal of the architecture is to perform information fusion from multiple datasets while preserving the privacy of individual data. The role of the server is to collect data in real time from the clients and codify the information in a common database. Such information can be used by all the clients to solve their individual learning tasks, so that each client can exploit the information content of all the datasets without actually having access to the private data of others. The proposed algorithmic framework, based on regularization and kernel methods, uses a suitable class of “mixed effect” kernels. The methodology is illustrated through a simulated recommendation system, as well as an experiment involving pharmacological data coming from a multicentric clinical trial.
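As a rough illustration of what a "mixed effect" kernel can look like, the sketch below combines a part shared by all tasks with a part that is active only for pairs of examples from the same task. The convex-combination form, the Gaussian base kernel, and the names `rbf`, `mixed_effect_kernel`, `alpha`, and `gamma` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def rbf(X1, X2, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X1 and X2."""
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def mixed_effect_kernel(X1, t1, X2, t2, alpha=0.5, gamma=1.0):
    """alpha * common part + (1 - alpha) * task-specific part.

    The task-specific part is nonzero only for pairs of examples
    whose task labels (entries of t1 and t2) coincide.
    """
    base = rbf(X1, X2, gamma)
    same_task = (t1[:, None] == t2[None, :]).astype(float)
    return alpha * base + (1.0 - alpha) * same_task * base

# Toy data: 6 examples spread over 3 tasks (all values are made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
tasks = np.array([0, 0, 1, 1, 2, 2])
K = mixed_effect_kernel(X, tasks, X, tasks, alpha=0.3)
```

With `alpha=1` all tasks share a single function, while `alpha=0` decouples them completely; intermediate values let each task borrow strength from the others.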
JMLR Workshop and Conference Proceedings Volume 20:181-196, November, 2011
Output kernel learning techniques make it possible to simultaneously learn a vector-valued function and a positive semidefinite matrix which describes the relationships between the outputs. In this paper, we introduce a new formulation that imposes a low-rank constraint on the output kernel and operates directly on a factor of the kernel matrix. First, we investigate the connection between output kernel learning and a regularization problem for an architecture with two layers. Then, we show that a variety of methods such as nuclear norm regularized regression, reduced-rank regression, principal component analysis, and low-rank matrix approximation can be seen as special cases of the output kernel learning framework. Finally, we introduce a block coordinate descent strategy for learning low-rank output kernels.
IEEE Transactions on Neural Networks, 22(10):1576-1587, October, 2011
In this paper, we analyze the convergence of two general classes of optimization algorithms for regularized kernel methods with convex loss function and quadratic norm regularization. The first methodology is a new class of algorithms based on fixed-point iterations that are well-suited for a parallel implementation and can be used with any convex loss function. The second methodology is based on coordinate descent, and generalizes some techniques previously proposed for linear support vector machines. It exploits the structure of additively separable loss functions to compute solutions of line searches in closed form. The two methodologies are both very easy to implement. In this paper, we also show how to remove non-differentiability of the objective functional by exactly reformulating a convex regularization problem as an unconstrained differentiable stabilization problem.
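To make the idea of a closed-form line search concrete, here is a minimal sketch for the squared-loss special case only. It runs cyclic coordinate descent on the strictly convex quadratic 0.5 c'(K + lam I)c - y'c, whose minimizer is the usual regularization network coefficient vector. The function name `kernel_ridge_cd`, the toy data, and the choice of loss are assumptions for illustration; the paper's algorithms for general convex losses are not reproduced here.

```python
import numpy as np

def kernel_ridge_cd(K, y, lam, n_sweeps=500):
    """Cyclic coordinate descent on 0.5*c'(K + lam*I)c - y'c."""
    n = len(y)
    c = np.zeros(n)
    Kc = np.zeros(n)  # running value of K @ c, updated incrementally
    for _ in range(n_sweeps):
        for i in range(n):
            grad_i = Kc[i] + lam * c[i] - y[i]
            # Exact minimizer along coordinate i (closed-form line search).
            step = -grad_i / (K[i, i] + lam)
            c[i] += step
            Kc += step * K[:, i]
    return c

# Toy problem with a Gaussian kernel (all values are made up).
rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq)
y = rng.normal(size=15)

c_cd = kernel_ridge_cd(K, y, lam=1.0)
c_direct = np.linalg.solve(K + 1.0 * np.eye(15), y)
```

Each coordinate update costs O(n) because the product `K @ c` is maintained incrementally rather than recomputed.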
In this paper, we review some recent research directions regarding the synthesis of functions from data using kernel methods. We start by highlighting the central role of the representer theorem and then outline some recent advances in large scale optimization, learning the kernel, and multi-task learning.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2):193-205, February, 2010
Standard single-task kernel methods have recently been extended to the case of multitask learning in the context of regularization theory. There are experimental results, especially in biomedicine, showing the benefit of the multitask approach compared to the single-task one. However, a possible drawback is computational complexity. For instance, when regularization networks are used, complexity scales as the cube of the overall number of training data, which may be large when several tasks are involved. The aim of this paper is to derive an efficient computational scheme for an important class of multitask kernels. More precisely, a quadratic loss is assumed and each task consists of the sum of a common term and a task-specific one. Within a Bayesian setting, a recursive online algorithm is obtained, which updates both estimates and confidence intervals as new data become available. The algorithm is tested on two simulated problems and a real data set relative to xenobiotics administration in human patients.
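The flavor of a recursive Bayesian update under a quadratic loss can be sketched with a much simpler stand-in: a Gaussian linear model whose posterior mean and covariance are refreshed one observation at a time. This is not the multitask kernel algorithm of the paper; the linear model, the prior, and the names `recursive_update`, `noise_var`, and `half_width` are assumptions chosen to keep the example short.

```python
import numpy as np

def recursive_update(mean, cov, x, y, noise_var):
    """Update a Gaussian posterior N(mean, cov) with one observation (x, y)."""
    Px = cov @ x
    gain = Px / (noise_var + x @ Px)
    mean = mean + gain * (y - x @ mean)   # correct the estimate
    cov = cov - np.outer(gain, Px)        # shrink the uncertainty
    return mean, cov

# Synthetic data from a linear model (all values are made up).
rng = np.random.default_rng(2)
d, n, noise_var = 3, 50, 0.25
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + np.sqrt(noise_var) * rng.normal(size=n)

mean, cov = np.zeros(d), np.eye(d)        # prior N(0, I)
for x_t, y_t in zip(X, y):
    mean, cov = recursive_update(mean, cov, x_t, y_t, noise_var)

# Approximate 95% credible half-widths for each coefficient.
half_width = 1.96 * np.sqrt(np.diag(cov))
```

Each update costs O(d^2), and after all observations the result coincides with the batch posterior, which is what makes the recursive form attractive when data arrive sequentially.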
In this note, a class of discontinuous feedback laws that switch over branches of parabolas in the auxiliary state plane is analyzed. Conditions are provided under which controllers belonging to this class are second order sliding-mode algorithms: they ensure uniform global finite-time output stability for uncertain systems of relative degree two.
A discontinuous controller is presented that ensures uniform finite-time zero stabilization of the output for uncertain SISO systems of relative degree two, while keeping the auxiliary system state within a prescribed convex polygon. The proposed method extends the applicability of second-order sliding-mode controllers to uncertain dynamical systems with constraints.
The representer theorem for kernel methods states that the solution of the associated variational problem can be expressed as the linear combination of a finite number of kernel functions. However, for non-smooth loss functions, the analytic characterization of the coefficients poses nontrivial problems. Standard approaches resort to constrained optimization reformulations which, in general, lack a closed-form solution. Herein, by a proper change of variable, it is shown that, for any convex loss function, the coefficients satisfy a system of algebraic equations in a fixed-point form, which may be directly obtained from the primal formulation. The algebraic characterization is specialized to regression and classification methods and the fixed-point equations are explicitly characterized for many loss functions of practical interest. The consequences of the main result are then investigated along two directions. First, the existence of an unconstrained smooth reformulation of the original non-smooth problem is proven. Second, in the context of SURE (Stein’s Unbiased Risk Estimation), a general formula for the degrees of freedom of kernel regression methods is derived.
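For the squared loss the picture is classical: with a suitable scaling of the regularization parameter, the fitted values are linear in the outputs through the matrix K(K + lam I)^{-1}, and the degrees of freedom equal its trace. The sketch below computes that trace from the eigenvalues of K. It covers only this special case; the general formula derived in the paper is not reproduced, and the name `kernel_ridge_dof` and the toy data are assumptions.

```python
import numpy as np

def kernel_ridge_dof(K, lam):
    """Trace of the hat matrix H = K (K + lam*I)^{-1}."""
    eig = np.linalg.eigvalsh(K)
    eig = np.clip(eig, 0.0, None)  # guard against tiny negative round-off
    return float(np.sum(eig / (eig + lam)))

# Toy Gaussian kernel matrix on 30 scalar inputs (values are made up).
rng = np.random.default_rng(3)
x = rng.normal(size=30)
K = np.exp(-(x[:, None] - x[None, :]) ** 2)

lams = (0.01, 0.1, 1.0, 10.0)
dofs = [kernel_ridge_dof(K, lam) for lam in lams]
```

The degrees of freedom lie between 0 and the number of examples and shrink as the regularization parameter grows, which matches the intuition that stronger regularization yields a less flexible fit.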