Publications catalog - books



Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics: 5th European Conference, EvoBIO 2007, Valencia, Spain, April 11-13, 2007. Proceedings

Elena Marchiori ; Jason H. Moore ; Jagath C. Rajapakse (eds.)

In conference: 5th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO). Valencia, Spain. April 11, 2007 - April 13, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Programming Techniques; Computation by Abstract Devices; Algorithm Analysis and Problem Complexity; Computational Biology/Bioinformatics; Pattern Recognition

Availability
Detected institution Year of publication Browse Download Request
Not detected 2007 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-71782-9

Electronic ISBN

978-3-540-71783-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Evolutionary Search for Improved Path Diagrams

Kim Laurio; Thomas Svensson; Mats Jirstrand; Patric Nilsson; Jonas Gamalielsson; Björn Olsson

A path diagram relates observed, pairwise, variable correlations to a functional structure which describes the hypothesized causal relations between the variables. Here we combine path diagrams, heuristics and evolutionary search into a system which seeks to improve existing gene regulatory models. Our evaluation shows that once a correct model has been identified it receives a lower prediction error compared to incorrect models, indicating the overall feasibility of this approach. However, with smaller samples the observed correlations gradually become more misleading, and the evolutionary search increasingly converges on suboptimal models. Future work will incorporate publicly available sources of experimentally verified biological facts to computationally suggest model modifications which might improve the model’s fitness.

Pp. 114-121

Simplifying Amino Acid Alphabets Using a Genetic Algorithm and Sequence Alignment

Jacek Lenckowski; Krzysztof Walczak

In some areas of bioinformatics (such as protein folding or sequence alignment) the full alphabet of amino acid symbols is not necessary. Often, better results are obtained with simplified alphabets. In general, simplified alphabets are designed to be as universal as possible. In this paper we show that this concept may not be optimal. We present a genetic algorithm for alphabet simplification and use it in a method based on global sequence alignment. We demonstrate that our algorithm is much faster and produces better results than the previously presented genetic algorithm. We also compare alphabets constructed on the basis of universal substitution matrices such as BLOSUM with our alphabets built through sequence alignment, and propose a new coefficient describing the value of alphabets in the sequence alignment context. Finally, we show that our simplified alphabets give better results in sequence classification (using a k-NN classifier) than most previously presented simplified alphabets and than the full 20-letter alphabet.

Pp. 122-131
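
To illustrate what a simplified amino acid alphabet means in practice, the Python sketch below rewrites a protein sequence with a reduced alphabet. It is not the authors' genetic algorithm: the grouping in `GROUPS` is a hypothetical example chosen only for illustration, and the names `GROUPS`, `LETTER_TO_GROUP` and `simplify` are assumptions.

```python
# Hypothetical 6-letter grouping of the 20 standard amino acids.
# This is an assumption for illustration; the paper derives its own alphabets.
GROUPS = {
    "A": "AVLIMC",  # aliphatic and sulfur-containing
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "GP",      # glycine and proline
}

# Invert the grouping: each amino acid maps to its group representative.
LETTER_TO_GROUP = {aa: rep for rep, members in GROUPS.items() for aa in members}


def simplify(sequence: str) -> str:
    """Rewrite a sequence in the reduced alphabet; unknown symbols are kept."""
    return "".join(LETTER_TO_GROUP.get(aa, aa) for aa in sequence.upper())
```

A sequence simplified this way can then be passed to any alignment or classification step in place of the 20-letter original.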

Towards Evolutionary Network Reconstruction Tools for Systems Biology

Thorsten Lenser; Thomas Hinze; Bashar Ibrahim; Peter Dittrich

Systems biology is the ever-growing field of integrating molecular knowledge about biological organisms into an understanding at the systems level. For this endeavour, automatic network reconstruction tools are urgently needed. In the present contribution, we show how the applicability of evolutionary algorithms to systems biology can be improved by a domain-specific representation and algorithmic extensions, especially a separation of network structure evolution from evolution of kinetic parameters. In a case study, our presented tool is applied to a model of the mitotic spindle checkpoint in the human cell cycle.

Pp. 132-142

A Gaussian Evolutionary Method for Predicting Protein-Protein Interaction Sites

Kang-Ping Liu; Jinn-Moon Yang

Protein-protein interactions play a pivotal role in modern molecular biology. Identifying protein-protein interaction sites is of great scientific and practical interest for predicting protein-protein interactions. In this study, we propose a Gaussian Evolutionary Method (GEM) to optimize 18 features, including ten atomic solvent and eight protein secondary structure features, for predicting protein-protein interaction sites. The training set consists of 104 unbound proteins selected from the PDB, and the prediction success rate is 65.4% (68/104 proteins) on the training dataset. These 18 parameters were then applied to a test set of 50 unbound proteins. Based on the threshold obtained from the training set, our method is able to predict the binding sites for 98% (49/50) of the proteins and yields 46% successful predictions and 42.3% average specificity. Here, a binding-site prediction is considered successful if more than 50% of the predicted area is indeed located in the protein-protein interface (i.e., the specificity is more than 0.5). We believe that the optimized parameters of our method are useful for analyzing protein-protein interfaces, for interface prediction methods, and for protein-protein docking methods.

Pp. 143-154
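
The success criterion quoted in the abstract above (specificity greater than 0.5) can be sketched in Python as follows. The abstract measures predicted area; this sketch uses sets of residue identifiers as a simplifying assumption, and the function names are hypothetical.

```python
def specificity(predicted: set, interface: set) -> float:
    """Fraction of the predicted site that lies in the true interface.

    Assumption: sites are sets of residue identifiers, standing in for the
    surface area used in the abstract.
    """
    if not predicted:
        return 0.0
    return len(predicted & interface) / len(predicted)


def is_successful(predicted: set, interface: set, threshold: float = 0.5) -> bool:
    """A prediction counts as successful when specificity exceeds the threshold."""
    return specificity(predicted, interface) > threshold


# Training success rate reported in the abstract: 68 of 104 proteins.
training_rate = 68 / 104
```

Note that a prediction with exactly half of its residues in the interface does not count as successful under the strict inequality.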

Bio-mimetic Evolutionary Reverse Engineering of Genetic Regulatory Networks

Daniel Marbach; Claudio Mattiussi; Dario Floreano

The effective reverse engineering of biochemical networks is one of the great challenges of systems biology. The contribution of this paper is two-fold: 1) We introduce a new method for reverse engineering genetic regulatory networks from gene expression data; 2) We demonstrate how nonlinear gene networks can be inferred from steady-state data alone. The reverse engineering method is based on an evolutionary algorithm that employs a novel representation called Analog Genetic Encoding (AGE), which is inspired by the natural encoding of genetic regulatory networks. AGE can be used with biologically plausible, nonlinear gene models where analytical approaches or local gradient-based optimisation methods often fail. Recently there has been increasing interest in reverse engineering gene networks from steady-state data. Here we demonstrate how more accurate nonlinear models can also be inferred from steady-state data alone.

Pp. 155-165

Tuning ReliefF for Genome-Wide Genetic Analysis

Jason H. Moore; Bill C. White

An important goal of human genetics is the identification of DNA sequence variations that are predictive of who is at risk for various common diseases. The focus of the present study is on the challenge of detecting and characterizing nonlinear attribute interactions or dependencies in the context of a genome-wide genetic study. The first question we address is whether the ReliefF algorithm is suitable for attribute selection in this domain. The second question we address is whether we can improve ReliefF for selecting important genetic attributes. Using simulated genetic datasets, we show that ReliefF is significantly better than a naïve chi-square test of independence for selecting two interacting attributes out of 10 candidates. In addition, we show that ReliefF can be improved in this domain by systematically removing the worst attributes and re-estimating ReliefF weights. Our simulation studies demonstrate that this new Tuned ReliefF (TuRF) algorithm is significantly better than ReliefF.

Pp. 166-175
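
The idea described in the abstract above, removing the worst attributes and then re-estimating the weights on the survivors, can be sketched as a generic loop. This is a minimal sketch, not the authors' TuRF implementation: the scoring function is left as a pluggable parameter (a real use would pass a ReliefF estimator), and the names `iterative_removal`, `score_fn`, `remove_per_round` and `n_keep` are assumptions.

```python
from typing import Callable, Dict, List, Sequence


def iterative_removal(
    attributes: Sequence[str],
    score_fn: Callable[[List[str]], Dict[str, float]],
    remove_per_round: int,
    n_keep: int,
) -> List[str]:
    """Repeatedly score the remaining attributes, drop the worst, and re-score."""
    if remove_per_round < 1:
        raise ValueError("remove_per_round must be at least 1")
    remaining = list(attributes)
    while len(remaining) > n_keep:
        # Re-estimate weights using only the attributes still in play.
        scores = score_fn(remaining)
        # Never drop below n_keep attributes.
        n_drop = min(remove_per_round, len(remaining) - n_keep)
        remaining.sort(key=lambda a: scores[a], reverse=True)  # best first
        remaining = remaining[: len(remaining) - n_drop]
    return remaining
```

Re-scoring inside the loop is the point of the design: weights are recomputed after each removal round rather than taken from a single initial ranking.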

Dinucleotide Step Parameterization of Pre-miRNAs Using Multi-objective Evolutionary Algorithms

Jin-Wu Nam; In-Hee Lee; Kyu-Baek Hwang; Seong-Bae Park; Byoung-Tak Zhang

MicroRNAs (miRNAs) form a large functional family of small noncoding RNAs and play an important role as posttranscriptional regulators by repressing the translation of mRNAs. Recently, the processing mechanism of miRNAs has been reported to involve the Drosha/DGCR8 complex and Dicer; however, the exact mechanism and molecular principle are still unknown. We have thus tried to understand the related phenomena in terms of the tertiary structure of pre-miRNA. Unfortunately, the tertiary structure of the RNA double helix has not been studied sufficiently compared to that of the DNA double helix. The tertiary structure of the pre-miRNA double helix is determined by 15 types of dinucleotide step (d-step) parameters for three classes of angles, i.e., twist, roll, and tilt. In this study, we estimate the 45 d-step parameters (15 types by 3 classes) using an evolutionary algorithm, under several assumptions inferred from the literature. Considering the trade-off among the four objective functions in our study, we applied a multi-objective evolutionary algorithm, NSGA-II, to search for a nondominated set of parameters. The performance of our method was evaluated on a separate test dataset. Our study provides a novel approach to understanding the processing mechanism of pre-miRNAs with respect to their tertiary structure and would be helpful for developing a comprehensible prediction method for pre-miRNA and mature miRNA structures.

Pp. 176-186
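
The notion of a nondominated set used in multi-objective search can be sketched in Python as below. This shows only the Pareto dominance filter, not the full NSGA-II algorithm, and it assumes every objective is minimized; the function names are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better
    on at least one (assumption: all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Size of the search space in the abstract: 15 d-step types by 3 angle classes.
N_PARAMETERS = 15 * 3
```

With four objectives, as in the abstract, each point would be a 4-tuple of objective values for one candidate parameter set.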

Amino Acid Features for Prediction of Protein-Protein Interface Residues with Support Vector Machines

Minh N. Nguyen; Jagath C. Rajapakse; Kai-Bo Duan

Knowledge of protein-protein interaction sites is vital to determine proteins’ function and involvement in different pathways. Support Vector Machines (SVM) have been proposed in recent years to predict protein-protein interface residues, primarily based on single amino acid sequence inputs. We investigate the features of amino acids that can best be used with SVM for predicting residues at protein-protein interfaces. The optimal feature set was derived from an investigation of features such as amino acid composition, hydrophobic character of amino acids, secondary structure propensity of amino acids, accessible surface areas, and evolutionary information generated by PSI-BLAST profiles. Using a backward elimination procedure, amino acid composition, accessible surface areas, and evolutionary information generated by PSI-BLAST profiles gave the best performance. The present approach achieved an overall prediction accuracy of 74.2% for 77 individual proteins collected from the Protein Data Bank, which is better than previously reported accuracies.

Pp. 187-196

Predicting HIV Protease-Cleavable Peptides by Discrete Support Vector Machines

Carlotta Orsenigo; Carlo Vercellis

The Human Immunodeficiency Virus (HIV) encodes an enzyme, called HIV protease, which is responsible for the generation of infectious viral particles by cleaving the virus polypeptides. Many efforts have been devoted to accurately predicting the HIV-protease cleavability of peptides, in order to design efficient inhibitor drugs. Over the last decade, linear and nonlinear supervised learning methods have been extensively used to discriminate between protease-cleavable and non-cleavable peptides. In this paper we consider four different protein encoding schemes and apply a discrete variant of linear support vector machines to predict their HIV protease-cleavable status. Empirical results indicate the effectiveness of the proposed method, which is able to classify with the highest accuracy the cleavable and non-cleavable peptides contained in two publicly available benchmark datasets. Moreover, the optimal classification rules generated are characterized by a strong generalization capability, as shown by their accuracy in predicting the HIV protease-cleavable status of peptides in out-of-sample datasets.

Pp. 197-206

Inverse Protein Folding on 2D Off-Lattice Model: Initial Results and Perspectives

David Pelta; Alberto Carrascal

Inverse protein folding, or protein design, is the search for an amino acid sequence whose native structure (fold) matches a prespecified target.

The problem of finding the corresponding folded structure of a particular sequence is a hard computational problem.

We use a genetic algorithm for searching the space of potential sequences, and the fitness of each individual is measured with the output of a second GA performing a minimization process in the space of structures.

Using an off-lattice protein-like 2D model, we show how the implemented techniques are able to obtain a variety of sequences attaining the target structures proposed.

Pp. 207-216