PALMA: Perfect Alignments using Large Margin Algorithms

2006

Conference Paper

Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to genomic DNA is still a challenging task. We present a novel approach based on large margin learning that combines kernel-based splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm, called PALMA, tunes the parameters of the model such that the true alignment scores higher than all other alignments. In an experimental study on the alignments of mRNAs containing artificially generated micro-exons, we show that our algorithm drastically outperforms all other methods: it perfectly aligns all 4358 sequences on a hold-out set, while the best other method misaligns at least 90 of them. Moreover, our algorithm is very robust against noise in the query sequence: when deleting, inserting, or mutating up to 50% of the query sequence, it still aligns 95% of all sequences correctly, while other methods achieve less than 36% accuracy. For datasets, additional results and a stand-alone alignment tool, see http://www.fml.mpg.de/raetsch/projects/palma.

Author(s): Rätsch, G. and Hepp, B. and Schulze, U. and Ong, CS.
Book Title: GCB 2006
Journal: Proceedings of the German Conference on Bioinformatics 2006 (GCB 2006)
Pages: 104-113
Year: 2006
Month: September
Day: 0
Editors: Huson, D., O. Kohlbacher, A. Lupas, K. Nieselt, A. Zell
Publisher: Gesellschaft für Informatik

Department(s): Empirical Inference
Bibtex Type: Conference Paper (inproceedings)

Event Name: German Conference on Bioinformatics 2006
Event Place: Tübingen, Germany

Address: Bonn, Germany
Digital: 0
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

Links: PDF
Web

BibTex

@inproceedings{4157,
  title = {PALMA: Perfect Alignments using Large Margin Algorithms},
  author = {R{\"a}tsch, G. and Hepp, B. and Schulze, U. and Ong, CS.},
  journal = {Proceedings of the German Conference on Bioinformatics 2006 (GCB 2006)},
  booktitle = {GCB 2006},
  pages = {104-113},
  editor = {Huson, D. and Kohlbacher, O. and Lupas, A. and Nieselt, K. and Zell, A.},
  publisher = {Gesellschaft f{\"u}r Informatik},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Bonn, Germany},
  month = sep,
  year = {2006},
  month_numeric = {9}
}