
Machine Learning approaches to protein ranking: discriminative, semi-supervised, scalable algorithms

2003

Technical Report


A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST, which uses heuristic alignment of a profile built using a large unlabeled database. It has been shown that such use of global information via unlabeled data improves over a local measure derived from a basic pairwise alignment, such as that performed by PSI-BLAST's predecessor, BLAST. In this article we look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.

Author(s): Weston, J. and Leslie, C. and Elisseeff, A. and Noble, WS.
Number (issue): 111
Year: 2003
Month: June

Department(s): Empirical Inference
Bibtex Type: Technical Report (techreport)

Institution: Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Links: PDF

BibTeX

@techreport{2300,
  title = {Machine Learning approaches to protein ranking: discriminative, semi-supervised, scalable algorithms},
  author = {Weston, J. and Leslie, C. and Elisseeff, A. and Noble, WS.},
  number = {111},
  institution = {Max Planck Institute for Biological Cybernetics, T{\"u}bingen, Germany},
  month = jun,
  year = {2003},
  month_numeric = {6}
}