2000


Knowledge Discovery in Databases: An Information Retrieval Perspective

Ong, CS.

Malaysian Journal of Computer Science, 13(2):54-63, December 2000 (article)

Abstract
The current trend of increasing capabilities in data generation and collection has resulted in an urgent need for data mining applications, also called knowledge discovery in databases. This paper identifies and examines the issues involved in extracting useful grains of knowledge from large amounts of data. It describes a framework to categorise data mining systems. The author also gives an overview of the issues pertaining to data preprocessing, as well as various information gathering methodologies and techniques. The paper covers some popular tools such as classification, clustering, and generalisation. A summary of the statistical and machine learning techniques currently in use is also provided.

PDF [BibTex]



A Simple Iterative Approach to Parameter Optimization

Zien, A., Zimmer, R., Lengauer, T.

Journal of Computational Biology, 7(3,4):483-501, November 2000 (article)

Abstract
Various bioinformatics problems require optimizing several different properties simultaneously. For example, in the protein threading problem, a scoring function combines the values for different parameters of possible sequence-to-structure alignments into a single score to allow for unambiguous optimization. In this context, an essential question is how each property should be weighted. As the native structures are known for some sequences, a partial ordering on optimal alignments to other structures, e.g., derived from structural comparisons, may be used to adjust the weights. To resolve the arising interdependence of weights and computed solutions, we propose a heuristic approach: iterating the computation of solutions (here, threading alignments) given the weights and the estimation of optimal weights of the scoring function given these solutions via systematic calibration methods. For our application (i.e., threading), this iterative approach results in structurally meaningful weights that significantly improve performance on both the training and the test data sets. In addition, the optimized parameters show significant improvements on the recognition rate for a grossly enlarged comprehensive benchmark, a modified recognition protocol as well as modified alignment types (local instead of global and profiles instead of single sequences). These results show the general validity of the optimized weights for the given threading program and the associated scoring contributions.

Web [BibTex]



Identification of Drug Target Proteins

Zien, A., Küffner, R., Mevissen, T., Zimmer, R., Lengauer, T.

ERCIM News, 43, pages: 16-17, October 2000 (article)

Web [BibTex]



Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites

Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lengauer, T., Müller, K.

Bioinformatics, 16(9):799-807, September 2000 (article)

Abstract
Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called translation initiation sites (TIS). Results: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition.

Web DOI [BibTex]



A Meanfield Approach to the Thermodynamics of a Protein-Solvent System with Application to the Oligomerization of the Tumour Suppressor p53.

Noolandi, J., Davison, TS., Vokel, A., Nie, F., Kay, C., Arrowsmith, C.

Proceedings of the National Academy of Sciences of the United States of America, 97(18):9955-9960, August 2000 (article)

Web [BibTex]



New Support Vector Algorithms

Schölkopf, B., Smola, A., Williamson, R., Bartlett, P.

Neural Computation, 12(5):1207-1245, May 2000 (article)

Abstract
We propose a new class of support vector algorithms for regression and classification. In these algorithms, a parameter ν lets one effectively control the number of support vectors. While this can be useful in its own right, the parameterization has the additional benefit of enabling us to eliminate one of the other free parameters of the algorithm: the accuracy parameter ε in the regression case, and the regularization constant C in the classification case. We describe the algorithms, give some theoretical results concerning the meaning and the choice of ν, and report experimental results.

Web DOI [BibTex]



Bounds on Error Expectation for Support Vector Machines

Vapnik, V., Chapelle, O.

Neural Computation, 12(9):2013-2036, 2000 (article)

Abstract
We introduce the concept of span of support vectors (SV) and show that the generalization ability of support vector machines (SVM) depends on this new geometrical concept. We prove that the value of the span is always smaller (and can be much smaller) than the diameter of the smallest sphere containing the support vectors, used in previous bounds. We also demonstrate experimentally that the prediction of the test error given by the span is very accurate and has direct application in model selection (choice of the optimal parameters of the SVM).

GZIP [BibTex]


1993


Presynaptic and Postsynaptic Competition in Models for the Development of Neuromuscular Connections

Rasmussen, CE., Willshaw, DJ.

Biological Cybernetics, 68, pages: 409-419, 1993 (article)

Abstract
The development of the nervous system involves in many cases interactions on a local scale rather than the execution of a fully specified genetic blueprint. The problem is to discover the nature of these interactions and the factors on which they depend. The withdrawal of polyinnervation in developing muscle is an example where such competitive interactions play an important role. We examine the possible types of competition in formal models that have plausible biological implementations. By relating the behaviour of the models to the anatomical and physiological findings we show that a model that incorporates two types of competition is superior to others. Analysis suggests that the phenomenon of intrinsic withdrawal is a side effect of the competitive mechanisms rather than a separate non-competitive feature. Full scale computer simulations have been used to confirm the capabilities of this model.

PostScript [BibTex]



Cartesian Dynamics of Simple Molecules: X Linear Quadratomics (C∞v Symmetry).

Anderson, A., Davison, T., Nagi, N., Schlueter, S.

Spectroscopy Letters, 26, pages: 509-522, 1993 (article)

[BibTex]
