Header logo is ei

ei Thumb sm thumb suvrit sra
Suvrit Sra
Alumni
ei Thumb sm dsc03356
Rohit Babbar
Research Scientist
21 results

2015


no image
Positive definite matrices and the S-divergence

Sra, S.

Proceedings of the American Mathematical Society, 2015, Published electronically: October 22, 2015 (article)

DOI Project Page [BibTex]

2015

DOI Project Page [BibTex]

2014


no image
Efficient nearest neighbors via robust sparse hashing

Cherian, A., Sra, S., Morellas, V., Papanikolopoulos, N.

IEEE Transactions on Image Processing, 23(8):3646-3655, 2014 (article)

DOI Project Page [BibTex]

2014

DOI Project Page [BibTex]


no image
Fast Newton methods for the group fused lasso

Wytock, M., Sra, S., Kolter, J. Z.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 888-897, (Editors: Zhang, N. L. and Tian, J.), AUAI Press, UAI, 2014 (inproceedings)

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]

2012


no image
A non-monotonic method for large-scale non-negative least squares

Kim, D., Sra, S., Dhillon, I. S.

Optimization Methods and Software, 28(5):1012-1039, Febuary 2012 (article)

DOI Project Page [BibTex]

2012

DOI Project Page [BibTex]


no image
Fast projection onto mixed-norm balls with applications

Sra, S.

Minining and Knowledge Discovery (DMKD), 25(2):358-377, September 2012 (article)

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
Accelerating Nearest Neighbor Search on Manycore Systems

Cayton, L.

In Parallel Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International, pages: 402-413, IPDPS, May 2012 (inproceedings)

Abstract
We develop methods for accelerating metric similarity search that are effective on modern hardware. Our algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of our algorithms, their search performance is provably sublinear in the size of the database, with a factor dependent only on its intrinsic dimensionality. We demonstrate that our methods provide substantial speedups on a range of datasets and hardware platforms. In particular, we present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.

Web DOI Project Page [BibTex]

Web DOI Project Page [BibTex]


no image
Scalable nonconvex inexact proximal splitting

Sra, S.

In Advances of Neural Information Processing Systems 25, pages: 539-547, (Editors: P Bartlett and FCN Pereira and CJC. Burges and L Bottou and KQ Weinberger), Curran Associates Inc., 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012 (inproceedings)

PDF Project Page [BibTex]

PDF Project Page [BibTex]

2011


no image
Optimization for Machine Learning

Sra, S., Nowozin, S., Wright, S.

pages: 494, Neural information processing series, MIT Press, Cambridge, MA, USA, December 2011 (book)

Abstract
The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields. Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.

Web Project Page [BibTex]

2011

Web Project Page [BibTex]


no image
Projected Newton-type methods in machine learning

Schmidt, M., Kim, D., Sra, S.

In Optimization for Machine Learning, pages: 305-330, (Editors: Sra, S., Nowozin, S. and Wright, S. J.), MIT Press, Cambridge, MA, USA, December 2011 (inbook)

Abstract
We consider projected Newton-type methods for solving large-scale optimization problems arising in machine learning and related fields. We first introduce an algorithmic framework for projected Newton-type methods by reviewing a canonical projected (quasi-)Newton method. This method, while conceptually pleasing, has a high computation cost per iteration. Thus, we discuss two variants that are more scalable, namely, two-metric projection and inexact projection methods. Finally, we show how to apply the Newton-type framework to handle non-smooth objectives. Examples are provided throughout the chapter to illustrate machine learning applications of our framework.

PDF Web Project Page [BibTex]

PDF Web Project Page [BibTex]


no image
Analysis of Fixed-Point and Coordinate Descent Algorithms for Regularized Kernel Methods

Dinuzzo, F.

IEEE Transactions on Neural Networks, 22(10):1576-1587, October 2011 (article)

Abstract
In this paper, we analyze the convergence of two general classes of optimization algorithms for regularized kernel methods with convex loss function and quadratic norm regularization. The first methodology is a new class of algorithms based on fixed-point iterations that are well-suited for a parallel implementation and can be used with any convex loss function. The second methodology is based on coordinate descent, and generalizes some techniques previously proposed for linear support vector machines. It exploits the structure of additively separable loss functions to compute solutions of line searches in closed form. The two methodologies are both very easy to implement. In this paper, we also show how to remove non-differentiability of the objective functional by exactly reformulating a convex regularization problem as an unconstrained differentiable stabilization problem.

Web DOI Project Page [BibTex]


no image
Fast Newton-type Methods for Total-Variation with Applications

Barbero, A., Sra, S.

In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, pages: 313-320, (Editors: L Getoor and T Scheffer), Omnipress, 28th International Conference on Machine Learning (ICML), 2011 (inproceedings)

Project Page [BibTex]

Project Page [BibTex]


no image
Efficient Similarity Search for Covariance Matrices via the Jensen-Bregman LogDet Divergence

Cherian, A., Sra, S., Banerjee, A., Papanikolopoulos, N.

In IEEE International Conference on Computer Vision, ICCV 2011, pages: 2399-2406, (Editors: DN Metaxas and L Quan and A Sanfeliu and LJ Van Gool), IEEE, 13th International Conference on Computer Vision (ICCV), 2011 (inproceedings)

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
Approximation Bounds for Inference using Cooperative Cut

Jegelka, S., Bilmes, J.

In pages: 577-584, (Editors: Getoor, L. , T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
We analyze a family of probability distributions that are characterized by an embedded combinatorial structure. This family includes models having arbitrary treewidth and arbitrary sized factors. Unlike general models with such freedom, where the “most probable explanation” (MPE) problem is inapproximable, the combinatorial structure within our model, in particular the indirect use of submodularity, leads to several MPE algorithms that all have approximation guarantees.

PDF Web Project Page [BibTex]

PDF Web Project Page [BibTex]


no image
Online submodular minimization for combinatorial structures

Jegelka, S., Bilmes, J.

In pages: 345-352, (Editors: Getoor, L. , T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonseparable. In particular, we show regret bounds for three Hannan-consistent strategies that capture different settings. Our results also tighten a regret bound for unconstrained online submodular minimization.

PDF PDF Web Project Page [BibTex]

PDF PDF Web Project Page [BibTex]


no image
Submodularity beyond submodular energies: coupling edges in graph cuts

Jegelka, S., Bilmes, J.

In pages: 1897-1904, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011 (inproceedings)

Abstract
We propose a new family of non-submodular global energy functions that still use submodularity internally to couple edges in a graph cut. We show it is possible to develop an efficient approximation algorithm that, thanks to the internal submodularity, can use standard graph cuts as a subroutine. We demonstrate the advantages of edge coupling in a natural setting, namely image segmentation. In particular, for finestructured objects and objects with shading variation, our structured edge coupling leads to significant improvements over standard approaches.

PDF Web DOI Project Page [BibTex]

PDF Web DOI Project Page [BibTex]

2010


no image
Tackling Box-Constrained Optimization via a New Projected Quasi-Newton Approach

Kim, D., Sra, S., Dhillon, I.

SIAM Journal on Scientific Computing, 32(6):3548-3563 , December 2010 (article)

Abstract
Numerous scientific applications across a variety of fields depend on box-constrained convex optimization. Box-constrained problems therefore continue to attract research interest. We address box-constrained (strictly convex) problems by deriving two new quasi-Newton algorithms. Our algorithms are positioned between the projected-gradient [J. B. Rosen, J. SIAM, 8 (1960), pp. 181–217] and projected-Newton [D. P. Bertsekas, SIAM J. Control Optim., 20 (1982), pp. 221–246] methods. We also prove their convergence under a simple Armijo step-size rule. We provide experimental results for two particular box-constrained problems: nonnegative least squares (NNLS), and nonnegative Kullback–Leibler (NNKL) minimization. For both NNLS and NNKL our algorithms perform competitively as compared to well-established methods on medium-sized problems; for larger problems our approach frequently outperforms the competition.

Web DOI Project Page [BibTex]

2010

Web DOI Project Page [BibTex]


no image
A scalable trust-region algorithm with application to mixed-norm regression

Kim, D., Sra, S., Dhillon, I.

In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pages: 519-526, (Editors: Fürnkranz, J. , T. Joachims), International Machine Learning Society, Madison, WI, USA, 27th International Conference on Machine Learning (ICML), June 2010 (inproceedings)

Abstract
We present a new algorithm for minimizing a convex loss-function subject to regularization. Our framework applies to numerous problems in machine learning and statistics; notably, for sparsity-promoting regularizers such as ℓ1 or ℓ1, ∞ norms, it enables efficient computation of sparse solutions. Our approach is based on the trust-region framework with nonsmooth objectives, which allows us to build on known results to provide convergence analysis. We avoid the computational overheads associated with the conventional Hessian approximation used by trust-region methods by instead using a simple separable quadratic approximation. This approximation also enables use of proximity operators for tackling nonsmooth regularizers. We illustrate the versatility of our resulting algorithm by specializing it to three mixed-norm regression problems: group lasso [36], group logistic regression [21], and multi-task lasso [19]. We experiment with both synthetic and real-world large-scale data—our method is seen to be competitive, robust, and scalable.

PDF Web Project Page [BibTex]

PDF Web Project Page [BibTex]

2009


no image
Efficient Bregman Range Search

Cayton, L.

In Advances in Neural Information Processing Systems 22, pages: 243-251, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
We develop an algorithm for efficient range search when the notion of dissimilarity is given by a Bregman divergence. The range search task is to return all points in a potentially large database that are within some specified distance of a query. It arises in many learning algorithms such as locally-weighted regression, kernel density estimation, neighborhood graph-based algorithms, and in tasks like outlier detection and information retrieval. In metric spaces, efficient range search-like algorithms based on spatial data structures have been deployed on a variety of statistical tasks. Here we describe an algorithm for range search for an arbitrary Bregman divergence. This broad class of dissimilarity measures includes the relative entropy, Mahalanobis distance, Itakura-Saito divergence, and a variety of matrix divergences. Metric methods cannot be directly applied since Bregman divergences do not in general satisfy the triangle inequality. We derive geometric properties of Bregman divergences that yield an efficient algorithm for range search based on a recently proposed space decomposition for Bregman divergences.

PDF Web Project Page [BibTex]

2009

PDF Web Project Page [BibTex]


no image
Convex Perturbations for Scalable Semidefinite Programming

Kulis, B., Sra, S., Dhillon, I.

In JMLR Workshop and Conference Proceedings Volume 5: AISTATS 2009, pages: 296-303, (Editors: van Dyk, D. , M. Welling), MIT Press, Cambridge, MA, USA, Twelfth International Conference on Artificial Intelligence and Statistics, April 2009 (inproceedings)

Abstract
Many important machine learning problems are modeled and solved via semidefinite programs; examples include metric learning, nonlinear embedding, and certain clustering problems. Often, off-the-shelf software is invoked for the associated optimization, which can be inappropriate due to excessive computational and storage requirements. In this paper, we introduce the use of convex perturbations for solving semidefinite programs (SDPs), and for a specific perturbation we derive an algorithm that has several advantages over existing techniques: a) it is simple, requiring only a few lines of Matlab, b) it is a first-order method, and thereby scalable, and c) it can easily exploit the structure of a given SDP (e.g., when the constraint matrices are low-rank, a situation common to several machine learning SDPs). A pleasant byproduct of our method is a fast, kernelized version of the large-margin nearest neighbor metric learning algorithm. We demonstrate that our algorithm is effective in finding fast approximations to large-scale SDPs arising in some machine learning applications.

PDF Web Project Page [BibTex]

PDF Web Project Page [BibTex]