Header logo is ei


2008


no image
Poisson Geometry of Parabolic Bundles on Elliptic Curves

Balduzzi, D.

International Journal of Mathematics , 19(3):339-367, March 2008 (article)

Abstract
The moduli space of G-bundles on an elliptic curve with additional flag structure admits a Poisson structure. The bivector can be defined using double loop group, loop group and sheaf cohomology constructions. We investigate the links between these methods and for the case SL2 perform explicit computations, describing the bracket and its leaves in detail.

Web DOI [BibTex]

2008

Web DOI [BibTex]


no image
Inferring Spike Trains From Local Field Potentials

Rasch, M., Gretton, A., Murayama, Y., Maass, W., Logothetis, N.

Journal of Neurophysiology, 99(3):1461-1476, March 2008 (article)

Abstract
We investigated whether it is possible to infer spike trains solely on the basis of the underlying local field potentials (LFPs). Using support vector machines and linear regression models, we found that in the primary visual cortex (V1) of monkeys, spikes can indeed be inferred from LFPs, at least with moderate success. Although there is a considerable degree of variation across electrodes, the low-frequency structure in spike trains (in the 100-ms range) can be inferred with reasonable accuracy, whereas exact spike positions are not reliably predicted. Two kinds of features of the LFP are exploited for prediction: the frequency power of bands in the high gamma-range (40–90 Hz) and information contained in lowfrequency oscillations ( 10 Hz), where both phase and power modulations are informative. Information analysis revealed that both features code (mainly) independent aspects of the spike-to-LFP relationship, with the low-frequency LFP phase coding for temporally clustered spiking activity. Although both features and prediction quality are similar during seminatural movie stimuli and spontaneous activity, prediction performance during spontaneous activity degrades much more slowly with increasing electrode distance. The general trend of data obtained with anesthetized animals is qualitatively mirrored in that of a more limited data set recorded in V1 of non-anesthetized monkeys. In contrast to the cortical field potentials, thalamic LFPs (e.g., LFPs derived from recordings in the dorsal lateral geniculate nucleus) hold no useful information for predicting spiking activity.

Web DOI [BibTex]

Web DOI [BibTex]


no image
ISD: A Software Package for Bayesian NMR Structure Calculation

Rieping, W., Nilges, M., Habeck, M.

Bioinformatics, 24(8):1104-1105, February 2008 (article)

Abstract
SUMMARY: The conventional approach to calculating biomolecular structures from nuclear magnetic resonance (NMR) data is often viewed as subjective due to its dependence on rules of thumb for deriving geometric constraints and suitable values for theory parameters from noisy experimental data. As a result, it can be difficult to judge the precision of an NMR structure in an objective manner. The Inferential Structure Determination (ISD) framework, which has been introduced recently, addresses this problem by using Bayesian inference to derive a probability distribution that represents both the unknown structure and its uncertainty. It also determines additional unknowns, such as theory parameters, that normally need be chosen empirically. Here we give an overview of the ISD software package, which implements this methodology. AVAILABILITY: The program is available at http://www.bioc.cam.ac.uk/isd

Web DOI [BibTex]

Web DOI [BibTex]


no image
Probabilistic Structure Calculation

Nilges, M., Habeck, M., Rieping, W.

Comptes Rendus Chimie, 11(4-5):356-369, February 2008 (article)

Abstract
Molecular structures are usually calculated from experimental data with some method of energy minimisation or non-linear optimisation. Key aims of a structure calculation are to estimate the coordinate uncertainty, and to provide a meaningful measure of the quality of the fit to the data. We discuss approaches to optimally combine prior information and experimental data and the connection to probability theory. We analyse the appropriate statistics for NOEs and NOE-derived distances, and the related question of restraint potentials. Finally, we will discuss approaches to determine the appropriate weight on the experimental evidence and to obtain in this way an estimate of the data quality from the structure calculation. Whereas objective estimates of coordinates and their uncertainties can only be obtained by a full Bayesian treatment of the problem, standard structure calculation methods continue to play an important role. To obtain the full benefit of these methods, they should be founded on a rigorous Baye sian analysis.

Web DOI [BibTex]

Web DOI [BibTex]


no image
Optimization Techniques for Semi-Supervised Support Vector Machines

Chapelle, O., Sindhwani, V., Keerthi, S.

Journal of Machine Learning Research, 9, pages: 203-233, February 2008 (article)

Abstract
Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S3VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms have recently been proposed for solving S3VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3VMs algorithms is studied together, under a common experimental setting.

PDF [BibTex]

PDF [BibTex]


no image
Fast Projection-based Methods for the Least Squares Nonnegative Matrix Approximation Problem

Kim, D., Sra, S., Dhillon, I.

Statistical Analysis and Data Mining, 1(1):38-51, February 2008 (article)

Abstract
Nonnegative matrix approximation (NNMA) is a popular matrix decomposition technique that has proven to be useful across a diverse variety of fields with applications ranging from document analysis and image processing to bioinformatics and signal processing. Over the years, several algorithms for NNMA have been proposed, e.g. Lee and Seung‘s multiplicative updates, alternating least squares (ALS), and gradient descent-based procedures. However, most of these procedures suffer from either slow convergence, numerical instability, or at worst, serious theoretical drawbacks. In this paper, we develop a new and improved algorithmic framework for the least-squares NNMA problem, which is not only theoretically well-founded, but also overcomes many deficiencies of other methods. Our framework readily admits powerful optimization techniques and as concrete realizations we present implementations based on the Newton, BFGS and conjugate gradient methods. Our algorithms provide numerical resu lts supe rior to both Lee and Seung‘s method as well as to the alternating least squares heuristic, which was reported to work well in some situations but has no theoretical guarantees[1]. Our approach extends naturally to include regularization and box-constraints without sacrificing convergence guarantees. We present experimental results on both synthetic and real-world datasets that demonstrate the superiority of our methods, both in terms of better approximations as well as computational efficiency.

Web DOI [BibTex]

Web DOI [BibTex]


no image
A Unifying Probabilistic Framework for Analyzing Residual Dipolar Couplings

Habeck, M., Nilges, M., Rieping, W.

Journal of Biomolecular NMR, 40(2):135-144, February 2008 (article)

Abstract
Residual dipolar couplings provide complementary information to the nuclear Overhauser effect measurements that are traditionally used in biomolecular structure determination by NMR. In a de novo structure determination, however, lack of knowledge about the degree and orientation of molecular alignment complicates the analysis of dipolar coupling data. We present a probabilistic framework for analyzing residual dipolar couplings and demonstrate that it is possible to estimate the atomic coordinates, the complete molecular alignment tensor, and the error of the couplings simultaneously. As a by-product, we also obtain estimates of the uncertainty in the coordinates and the alignment tensor. We show that our approach encompasses existing methods for determining the alignment tensor as special cases, including least squares estimation, histogram fitting, and elimination of an explicit alignment tensor in the restraint energy.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Contour-propagation Algorithms for Semi-automated Reconstruction of Neural Processes

Macke, J., Maack, N., Gupta, R., Denk, W., Schölkopf, B., Borst, A.

Journal of Neuroscience Methods, 167(2):349-357, January 2008 (article)

Abstract
A new technique, ”Serial Block Face Scanning Electron Microscopy” (SBFSEM), allows for automatic sectioning and imaging of biological tissue with a scanning electron microscope. Image stacks generated with this technology have a resolution sufficient to distinguish different cellular compartments, including synaptic structures, which should make it possible to obtain detailed anatomical knowledge of complete neuronal circuits. Such an image stack contains several thousands of images and is recorded with a minimal voxel size of 10-20nm in the x and y- and 30nm in z-direction. Consequently, a tissue block of 1mm3 (the approximate volume of the Calliphora vicina brain) will produce several hundred terabytes of data. Therefore, highly automated 3D reconstruction algorithms are needed. As a first step in this direction we have developed semiautomated segmentation algorithms for a precise contour tracing of cell membranes. These algorithms were embedded into an easy-to-operate user interface, which allows direct 3D observation of the extracted objects during the segmentation of image stacks. Compared to purely manual tracing, processing time is greatly accelerated.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A Quantum-Statistical-Mechanical Extension of Gaussian Mixture Model

Tanaka, K., Tsuda, K.

Journal of Physics: Conference Series, 95(012023):1-9, January 2008 (article)

Abstract
We propose an extension of Gaussian mixture models in the statistical-mechanical point of view. The conventional Gaussian mixture models are formulated to divide all points in given data to some kinds of classes. We introduce some quantum states constructed by superposing conventional classes in linear combinations. Our extension can provide a new algorithm in classifications of data by means of linear response formulas in the statistical mechanics.

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Minimal Nonlinear Distortion Principle for Nonlinear Independent Component Analysis

Zhang, K., Chan, L.

Journal of Machine Learning Research, 9, pages: 2455-2487, 2008 (article)

Abstract
It is well known that solutions to the nonlinear independent component analysis (ICA) problem are highly non-unique. In this paper we propose the "minimal nonlinear distortion" (MND) principle for tackling the ill-posedness of nonlinear ICA problems. MND prefers the nonlinear ICA solution with the estimated mixing procedure as close as possible to linear, among all possible solutions. It also helps to avoid local optima in the solutions. To achieve MND, we exploit a regularization term to minimize the mean square error between the nonlinear mixing mapping and the best-fitting linear one. The effect of MND on the inherent trivial and non-trivial indeterminacies in nonlinear ICA solutions is investigated. Moreover, we show that local MND is closely related to the smoothness regularizer penalizing large curvature, which provides another useful regularization condition for nonlinear ICA. Experiments on synthetic data show the usefulness of the MND principle for separating various nonlinear mixtures. Finally, as an application, we use nonlinear ICA with MND to separate daily returns of a set of stocks in Hong Kong, and the linear causal relations among them are successfully discovered. The resulting causal relations give some interesting insights into the stock market. Such a result can not be achieved by linear ICA. Simulation studies also verify that when doing causality discovery, sometimes one should not ignore the nonlinear distortion in the data generation procedure, even if it is weak.

Web [BibTex]

Web [BibTex]


no image
Transport processes in networks with scattering ramification nodes

Radl, A.

Journal of Applied Functional Analysis, 3, pages: 461-483, 2008 (article)

Web [BibTex]

Web [BibTex]


no image
Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in com- plex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for opertional space control as a direct inverse model learning problem. A first important insight for this paper is that a physically cor- rect solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function as- sociated with this optimal control problem allows us to formulate a learn- ing algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corre- sponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The applica- tion to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on real, physical Mitsubishi PA-10 medical robotics arm.

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2002


no image
Optimized Support Vector Machines for Nonstationary Signal Classification

Davy, M., Gretton, A., Doucet, A., Rayner, P.

IEEE Signal Processing Letters, 9(12):442-445, December 2002 (article)

Abstract
This letter describes an efficient method to perform nonstationary signal classification. A support vector machine (SVM) algorithm is introduced and its parameters optimised in a principled way. Simulations demonstrate that our low complexity method outperforms state-of-the-art nonstationary signal classification techniques.

PostScript Web DOI [BibTex]

2002

PostScript Web DOI [BibTex]


no image
Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

Schölkopf, B., Smola, A.

pages: 644, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, December 2002, Parts of this book, including an introduction to kernel methods, can be downloaded here. (book)

Abstract
In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs-kernels—for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics. Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.

Web [BibTex]

Web [BibTex]


no image
A New Discriminative Kernel from Probabilistic Models

Tsuda, K., Kawanabe, M., Rätsch, G., Sonnenburg, S., Müller, K.

Neural Computation, 14(10):2397-2414, October 2002 (article)

PDF [BibTex]

PDF [BibTex]


no image
Functional Genomics of Osteoarthritis

Aigner, T., Bartnik, E., Zien, A., Zimmer, R.

Pharmacogenomics, 3(5):635-650, September 2002 (article)

Web [BibTex]

Web [BibTex]


no image
Constructing Boosting algorithms from SVMs: an application to one-class classification.

Rätsch, G., Mika, S., Schölkopf, B., Müller, K.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1184-1199, September 2002 (article)

Abstract
We show via an equivalence of mathematical programs that a support vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation procedure for a new algorithm—one-class leveraging—starting from the one-class support vector machine (1-SVM). This is a first step toward unsupervised learning in a boosting framework. Building on so-called barrier methods known from the theory of constrained optimization, it returns a function, written as a convex combination of base hypotheses, that characterizes whether a given test point is likely to have been generated from the distribution underlying the training data. Simulations on one-class classification problems demonstrate the usefulness of our approach.

DOI [BibTex]

DOI [BibTex]


no image
Co-Clustering of Biological Networks and Gene Expression Data

Hanisch, D., Zien, A., Zimmer, R., Lengauer, T.

Bioinformatics, (Suppl 1):145S-154S, 18, July 2002 (article)

Abstract
Motivation: Large scale gene expression data are often analysed by clustering genes based on gene expression data alone, though a priori knowledge in the form of biological networks is available. The use of this additional information promises to improve exploratory analysis considerably. Results: We propose constructing a distance function which combines information from expression data and biological networks. Based on this function, we compute a joint clustering of genes and vertices of the network. This general approach is elaborated for metabolic networks. We define a graph distance function on such networks and combine it with a correlation-based distance function for gene expression measurements. A hierarchical clustering and an associated statistical measure is computed to arrive at a reasonable number of clusters. Our method is validated using expression data of the yeast diauxic shift. The resulting clusters are easily interpretable in terms of the biochemical network and the gene expression data and suggest that our method is able to automatically identify processes that are relevant under the measured conditions.

Web [BibTex]

Web [BibTex]


no image
Confidence measures for protein fold recognition

Sommer, I., Zien, A., von Ohsen, N., Zimmer, R., Lengauer, T.

Bioinformatics, 18(6):802-812, June 2002 (article)

[BibTex]

[BibTex]


no image
The contributions of color to recognition memory for natural scenes

Wichmann, F., Sharpe, L., Gegenfurtner, K.

Journal of Experimental Psychology: Learning, Memory and Cognition, 28(3):509-520, May 2002 (article)

Abstract
The authors used a recognition memory paradigm to assess the influence of color information on visual memory for images of natural scenes. Subjects performed 5-10% better for colored than for black-and-white images independent of exposure duration. Experiment 2 indicated little influence of contrast once the images were suprathreshold, and Experiment 3 revealed that performance worsened when images were presented in color and tested in black and white, or vice versa, leading to the conclusion that the surface property color is part of the memory representation. Experiments 4 and 5 exclude the possibility that the superior recognition memory for colored images results solely from attentional factors or saliency. Finally, the recognition memory advantage disappears for falsely colored images of natural scenes: The improvement in recognition memory depends on the color congruence of presented images with learned knowledge about the color gamut found within natural scenes. The results can be accounted for within a multiple memory systems framework.

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Training invariant support vector machines

DeCoste, D., Schölkopf, B.

Machine Learning, 46(1-3):161-190, January 2002 (article)

Abstract
Practical experience has shown that in order to obtain the best possible performance, prior knowledge about invariances of a classification problem at hand ought to be incorporated into the training procedure. We describe and review all known methods for doing so in support vector machines, provide experimental results, and discuss their respective merits. One of the significant new results reported in this work is our recent achievement of the lowest reported test error on the well-known MNIST digit recognition benchmark task, with SVM training times that are also significantly faster than previous SVM methods.

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Model Selection for Small Sample Regression

Chapelle, O., Vapnik, V., Bengio, Y.

Machine Learning, 48(1-3):9-23, 2002 (article)

Abstract
Model selection is an important ingredient of many machine learning algorithms, in particular when the sample size in small, in order to strike the right trade-off between overfitting and underfitting. Previous classical results for linear regression are based on an asymptotic analysis. We present a new penalization method for performing model selection for regression that is appropriate even for small samples. Our penalization is based on an accurate estimator of the ratio of the expected training error and the expected generalization error, in terms of the expected eigenvalues of the input covariance matrix.

PostScript [BibTex]

PostScript [BibTex]


no image
Contrast discrimination with sinusoidal gratings of different spatial frequency

Bird, C., Henning, G., Wichmann, F.

Journal of the Optical Society of America A, 19(7), pages: 1267-1273, 2002 (article)

Abstract
The detectability of contrast increments was measured as a function of the contrast of a masking or “pedestal” grating at a number of different spatial frequencies ranging from 2 to 16 cycles per degree of visual angle. The pedestal grating always had the same orientation, spatial frequency and phase as the signal. The shape of the contrast increment threshold versus pedestal contrast (TvC) functions depend of the performance level used to define the “threshold,” but when both axes are normalized by the contrast corresponding to 75% correct detection at each frequency, the (TvC) functions at a given performance level are identical. Confidence intervals on the slope of the rising part of the TvC functions are so wide that it is not possible with our data to reject Weber’s Law.

PDF [BibTex]

PDF [BibTex]


no image
A Bennett Concentration Inequality and Its Application to Suprema of Empirical Processes

Bousquet, O.

C. R. Acad. Sci. Paris, Ser. I, 334, pages: 495-500, 2002 (article)

Abstract
We introduce new concentration inequalities for functions on product spaces. They allow to obtain a Bennett type deviation bound for suprema of empirical processes indexed by upper bounded functions. The result is an improvement on Rio's version \cite{Rio01b} of Talagrand's inequality \cite{Talagrand96} for equidistributed variables.

PDF PostScript [BibTex]


no image
Numerical evolution of axisymmetric, isolated systems in general relativity

Frauendiener, J., Hein, M.

Physical Review D, 66, pages: 124004-124004, 2002 (article)

Abstract
We describe in this article a new code for evolving axisymmetric isolated systems in general relativity. Such systems are described by asymptotically flat space-times, which have the property that they admit a conformal extension. We are working directly in the extended conformal manifold and solve numerically Friedrich's conformal field equations, which state that Einstein's equations hold in the physical space-time. Because of the compactness of the conformal space-time the entire space-time can be calculated on a finite numerical grid. We describe in detail the numerical scheme, especially the treatment of the axisymmetry and the boundary.

GZIP [BibTex]

GZIP [BibTex]


no image
Marginalized kernels for biological sequences

Tsuda, K., Kin, T., Asai, K.

Bioinformatics, 18(Suppl 1):268-275, 2002 (article)

PDF [BibTex]

PDF [BibTex]


no image
Stability and Generalization

Bousquet, O., Elisseeff, A.

Journal of Machine Learning Research, 2, pages: 499-526, 2002 (article)

Abstract
We define notions of stability for learning algorithms and show how to use these notions to derive generalization error bounds based on the empirical error and the leave-one-out error. The methods we use can be applied in the regression framework as well as in the classification one when the classifier is obtained by thresholding a real-valued function. We study the stability properties of large classes of learning algorithms such as regularization based algorithms. In particular we focus on Hilbert space regularization and Kullback-Leibler regularization. We demonstrate how to apply the results to SVM for regression and classification.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Subspace information criterion for non-quadratic regularizers – model selection for sparse regressors

Tsuda, K., Sugiyama, M., Müller, K.

IEEE Trans Neural Networks, 13(1):70-80, 2002 (article)

PDF [BibTex]

PDF [BibTex]


no image
Modeling splicing sites with pairwise correlations

Arita, M., Tsuda, K., Asai, K.

Bioinformatics, 18(Suppl 2):27-34, 2002 (article)

PDF [BibTex]

PDF [BibTex]


no image
Perfusion Quantification using Gaussian Process Deconvolution

Andersen, IK., Szymkowiak, A., Rasmussen, CE., Hanson, LG., Marstrand, JR., Larsson, HBW., Hansen, LK.

Magnetic Resonance in Medicine, (48):351-361, 2002 (article)

Abstract
The quantification of perfusion using dynamic susceptibility contrast MR imaging requires deconvolution to obtain the residual impulse-response function (IRF). Here, a method using a Gaussian process for deconvolution, GPD, is proposed. The fact that the IRF is smooth is incorporated as a constraint in the method. The GPD method, which automatically estimates the noise level in each voxel, has the advantage that model parameters are optimized automatically. The GPD is compared to singular value decomposition (SVD) using a common threshold for the singular values and to SVD using a threshold optimized according to the noise level in each voxel. The comparison is carried out using artificial data as well as using data from healthy volunteers. It is shown that GPD is comparable to SVD variable optimized threshold when determining the maximum of the IRF, which is directly related to the perfusion. GPD provides a better estimate of the entire IRF. As the signal to noise ratio increases or the time resolution of the measurements increases, GPD is shown to be superior to SVD. This is also found for large distribution volumes.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
Tracking a Small Set of Experts by Mixing Past Posteriors

Bousquet, O., Warmuth, M.

Journal of Machine Learning Research, 3, pages: 363-396, (Editors: Long, P.), 2002 (article)

Abstract
In this paper, we examine on-line learning problems in which the target concept is allowed to change over time. In each trial a master algorithm receives predictions from a large set of n experts. Its goal is to predict almost as well as the best sequence of such experts chosen off-line by partitioning the training sequence into k+1 sections and then choosing the best expert for each section. We build on methods developed by Herbster and Warmuth and consider an open problem posed by Freund where the experts in the best partition are from a small pool of size m. Since k >> m, the best expert shifts back and forth between the experts of the small pool. We propose algorithms that solve this open problem by mixing the past posteriors maintained by the master algorithm. We relate the number of bits needed for encoding the best partition to the loss bounds of the algorithms. Instead of paying log n for choosing the best expert in each section we first pay log (n choose m) bits in the bounds for identifying the pool of m experts and then log m bits per new section. In the bounds we also pay twice for encoding the boundaries of the sections.

PDF PostScript [BibTex]

PDF PostScript [BibTex]


no image
A femoral arteriovenous shunt facilitates arterial whole blood sampling in animals

Weber, B., Burger, C., Biro, P., Buck, A.

Eur J Nucl Med Mol Imaging, 29, pages: 319-323, 2002 (article)

[BibTex]

[BibTex]


no image
Contrast discrimination with pulse-trains in pink noise

Henning, G., Bird, C., Wichmann, F.

Journal of the Optical Society of America A, 19(7), pages: 1259-1266, 2002 (article)

Abstract
Detection performance was measured with sinusoidal and pulse-train gratings. Although the 2.09-c/deg pulse-train, or line gratings, contained at least 8 harmonics all at equal contrast, they were no more detectable than their most detectable component. The addition of broadband pink noise designed to equalize the detectability of the components of the pulse train made the pulse train about a factor of four more detectable than any of its components. However, in contrast-discrimination experiments, with a pedestal or masking grating of the same form and phase as the signal and 15% contrast, the noise did not affect the discrimination performance of the pulse train relative to that obtained with its sinusoidal components. We discuss the implications of these observations for models of early vision in particular the implications for possible sources of internal noise.

PDF [BibTex]

PDF [BibTex]


no image
Choosing Multiple Parameters for Support Vector Machines

Chapelle, O., Vapnik, V., Bousquet, O., Mukherjee, S.

Machine Learning, 46(1):131-159, 2002 (article)

Abstract
The problem of automatically tuning multiple parameters for pattern recognition Support Vector Machines (SVM) is considered. This is done by minimizing some estimates of the generalization error of SVMs using a gradient descent algorithm over the set of parameters. Usual methods for choosing parameters, based on exhaustive search become intractable as soon as the number of parameters exceeds two. Some experimental results assess the feasibility of our approach for a large number of parameters (more than 100) and demonstrate an improvement of generalization performance.

PDF PostScript [BibTex]

PDF PostScript [BibTex]

2001


no image
Anabolic and Catabolic Gene Expression Pattern Analysis in Normal Versus Osteoarthritic Cartilage Using Complementary DNA-Array Technology

Aigner, T., Zien, A., Gehrsitz, A., Gebhard, P., McKenna, L.

Arthritis and Rheumatism, 44(12):2777-2789, December 2001 (article)

Web [BibTex]

2001

Web [BibTex]


no image
Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators

Williamson, R., Smola, A., Schölkopf, B.

IEEE Transactions on Information Theory, 47(6):2516-2532, September 2001 (article)

Abstract
We derive new bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks by obtaining new bounds on their covering numbers. The proofs make use of a viewpoint that is apparently novel in the field of statistical learning theory. The hypothesis class is described in terms of a linear operator mapping from a possibly infinite-dimensional unit ball in feature space into a finite-dimensional space. The covering numbers of the class are then determined via the entropy numbers of the operator. These numbers, which characterize the degree of compactness of the operator can be bounded in terms of the eigenvalues of an integral operator induced by the kernel function used by the machine. As a consequence, we are able to theoretically explain the effect of the choice of kernel function on the generalization performance of support vector machines.

DOI [BibTex]

DOI [BibTex]


no image
Centralization: A new method for the normalization of gene expression data

Zien, A., Aigner, T., Zimmer, R., Lengauer, T.

Bioinformatics, 17, pages: S323-S331, June 2001, Mathematical supplement available at http://citeseer.ist.psu.edu/574280.html (article)

Abstract
Microarrays measure values that are approximately proportional to the numbers of copies of different mRNA molecules in samples. Due to technical difficulties, the constant of proportionality between the measured intensities and the numbers of mRNA copies per cell is unknown and may vary for different arrays. Usually, the data are normalized (i.e., array-wise multiplied by appropriate factors) in order to compensate for this effect and to enable informative comparisons between different experiments. Centralization is a new two-step method for the computation of such normalization factors that is both biologically better motivated and more robust than standard approaches. First, for each pair of arrays the quotient of the constants of proportionality is estimated. Second, from the resulting matrix of pairwise quotients an optimally consistent scaling of the samples is computed.

PDF PostScript Web [BibTex]

PDF PostScript Web [BibTex]


no image
Regularized principal manifolds

Smola, A., Mika, S., Schölkopf, B., Williamson, R.

Journal of Machine Learning Research, 1, pages: 179-209, June 2001 (article)

Abstract
Many settings of unsupervised learning can be viewed as quantization problems - the minimization of the expected quantization error subject to some restrictions. This allows the use of tools such as regularization from the theory of (supervised) risk minimization for unsupervised learning. This setting turns out to be closely related to principal curves, the generative topographic map, and robust coding. We explore this connection in two ways: (1) we propose an algorithm for finding principal manifolds that can be regularized in a variety of ways; and (2) we derive uniform convergence bounds and hence bounds on the learning rates of the algorithm. In particular, we give bounds on the covering numbers which allows us to obtain nearly optimal learning rates for certain types of regularization operators. Experimental results demonstrate the feasibility of the approach.

PDF [BibTex]

PDF [BibTex]


no image
Failure Diagnosis of Discrete Event Systems

Son, HI., Kim, KW., Lee, S.

Journal of Control, Automation and Systems Engineering, 7(5):375-383, May 2001, In Korean (article)

[BibTex]

[BibTex]


no image
Pattern Selection Using the Bias and Variance of Ensemble

Shin, H., Cho, S.

Journal of the Korean Institute of Industrial Engineers, 28(1):112-127, March 2001 (article)

Abstract
[Abstract]: A useful pattern is a pattern that contributes much to learning. For a classification problem those patterns near the class boundary surfaces carry more information to the classifier. For a regression problem the ones near the estimated surface carry more information. In both cases, the usefulness is defined only for those patterns either without error or with negligible error. Using only the useful patterns gives several benefits. First, computational complexity in memory and time for learning is decreased. Second, overfitting is avoided even when the learner is over-sized. Third, learning results in more stable learners. In this paper, we propose a pattern “utility index” that measures the utility of an individual pattern. The utility index is based on the bias and variance of a pattern trained by a network ensemble. In classification, the pattern with a low bias and a high variance gets a high score. In regression, on the other hand, the one with a low bias and a low variance gets a high score. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group. Only the high-score group is then used for training. The proposed method is tested on synthetic and real-world benchmark datasets. The proposed approach gives a better or at least similar performance.

[BibTex]

[BibTex]


no image
Structure and Functionality of a Designed p53 Dimer.

Davison, TS., Nie, X., Ma, W., Lin, Y., Kay, C., Benchimol, S., Arrowsmith, C.

Journal of Molecular Biology, 307(2):605-617, March 2001 (article)

Abstract
P53 is a homotetrameric tumor suppressor protein involved in transcriptional control of genes that regulate cell proliferation and death. In order to probe the role that oligomerization plays in this capacity, we have previously designed and characterized a series of p53 proteins with altered oligomeric states through hydrophilc substitution of residues Met340 or Leu344 in the normally tetrameric oligomerization domain. Although such mutations have little effect on the overall secondary structural content of the oligomerization domain, both solubility and the resistance to thermal denaturation are substantially reduced relative to that of the wild-type domain. Here, we report the design and characterization of a double-mutant p53 with alterations of residues at positions Met340 and Leu344. The double-mutations Met340Glu/Leu344Lys and Met340Gln/Leu344Arg resulted in distinct dimeric forms of the protein. Furthermore, we have verified by NMR structure determination that the double-mutant Met340Gln/Leu344Arg is essentially a "half-tetramer". Analysis of the in vivo activities of full-length p53 oligomeric mutants reveals that while cell-cycle arrest requires tetrameric p53, transcriptional transactivation activity of monomers and dimers retain roughly background and half of the wild-type activity, respectively.

Web [BibTex]

Web [BibTex]


no image
An Introduction to Kernel-Based Learning Algorithms

Müller, K., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.

IEEE Transactions on Neural Networks, 12(2):181-201, March 2001 (article)

Abstract
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel based learning in supervised and unsupervised scenarios including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis

DOI [BibTex]

DOI [BibTex]


no image
Estimating the support of a high-dimensional distribution.

Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A., Williamson, R.

Neural Computation, 13(7):1443-1471, March 2001 (article)

Abstract
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.

Web DOI [BibTex]

Web DOI [BibTex]


no image
The psychometric function: II. Bootstrap-based confidence intervals and sampling

Wichmann, F., Hill, N.

Perception and Psychophysics, 63 (8), pages: 1314-1329, 2001 (article)

PDF [BibTex]

PDF [BibTex]


no image
The psychometric function: I. Fitting, sampling and goodness-of-fit

Wichmann, F., Hill, N.

Perception and Psychophysics, 63 (8), pages: 1293-1313, 2001 (article)

Abstract
The psychometric function relates an observer'sperformance to an independent variable, usually some physical quantity of a stimulus in a psychophysical task. This paper, together with its companion paper (Wichmann & Hill, 2001), describes an integrated approach to (1) fitting psychometric functions, (2) assessing the goodness of fit, and (3) providing confidence intervals for the function'sparameters and other estimates derived from them, for the purposes of hypothesis testing. The present paper deals with the first two topics, describing a constrained maximum-likelihood method of parameter estimation and developing several goodness-of-fit tests. Using Monte Carlo simulations, we deal with two specific difficulties that arise when fitting functions to psychophysical data. First, we note that human observers are prone to stimulus-independent errors (or lapses ). We show that failure to account for this can lead to serious biases in estimates of the psychometric function'sparameters and illustrate how the problem may be overcome. Second, we note that psychophysical data sets are usually rather small by the standards required by most of the commonly applied statistical tests. We demonstrate the potential errors of applying traditional X^2 methods to psychophysical data and advocate use of Monte Carlo resampling techniques that do not rely on asymptotic theory. We have made available the software to implement our methods

PDF [BibTex]

PDF [BibTex]


no image
The control structure of artificial creatures

Zhou, D., Dai, R.

Artificial Life and Robotics, 5(3), 2001, invited article (article)

Web [BibTex]

Web [BibTex]


no image
Markovian domain fingerprinting: statistical segmentation of protein sequences

Bejerano, G., Seldin, Y., Margalit, H., Tishby, N.

Bioinformatics, 17(10):927-934, 2001 (article)

PDF Web [BibTex]

PDF Web [BibTex]