Catálogo de publicaciones - libros

Compartir en
redes sociales


Algorithms in Bioinformatics: 5th International Workshop, WABI 2005, Mallorca, Spain, October 3-6, 2005, Proceedings

Rita Casadio ; Gene Myers (eds.)

En conferencia: 5º International Workshop on Algorithms in Bioinformatics (WABI) . Mallorca, Spain . October 3, 2005 - October 6, 2005

Resumen/Descripción – provisto por la editorial

No disponible.

Palabras clave – provistas por la editorial

No disponibles.

Disponibilidad
Institución detectada Año de publicación Navegá Descargá Solicitá
No detectada 2005 SpringerLink

Información

Tipo de recurso:

libros

ISBN impreso

978-3-540-29008-7

ISBN electrónico

978-3-540-31812-5

Editor responsable

Springer Nature

País de edición

Reino Unido

Fecha de publicación

Información sobre derechos de publicación

© Springer-Verlag Berlin Heidelberg 2005

Tabla de contenidos

On the Complexity of Several Haplotyping Problems

Rudi Cilibrasi; Leo van Iersel; Steven Kelk; John Tromp

We present several new results pertaining to haplotyping. The first set of results concerns the combinatorial problem of reconstructing haplotypes from incomplete and/or imperfectly sequenced haplotype data. More specifically, we show that an interesting, restricted case of Minimum Error Correction (MEC) is NP-hard, question earlier claims about a related problem, and present a polynomial-time algorithm for the ungapped case of Longest Haplotype Reconstruction (LHR). Secondly, we present a polynomial time algorithm for the problem of resolving genotype data using as few haplotypes as possible (the Pure Parsimony Haplotyping Problem , PPH) where each genotype has at most two ambiguous positions, thus solving an open problem posed by Lancia et al in [15].

Palabras clave: Bipartite Graph; Input Matrix; Maximum Match; Ambiguous Position; Parity Haplotype.

Pp. 128-139

A Hidden Markov Technique for Haplotype Reconstruction

Pasi Rastas; Mikko Koivisto; Heikki Mannila; Esko Ukkonen

We give a new algorithm for the genotype phasing problem. Our solution is based on a hidden Markov model for haplotypes. The model has a uniform structure, unlike most solutions proposed so far that model recombinations using haplotype blocks. In our model, the haplotypes can be seen as a result of iterated recombinations applied on a few founder haplotypes. We find maximum likelihood model of this type by using the EM algorithm. We show how to solve the subtleties of the EM algorithm that arise when genotypes are generated using a haplotype model. We compare our method to the well-known currently available algorithms ( phase, hap, gerbil ) using some standard and new datasets. Our algorithm is relatively fast and gives results that are always best or second best among the methods compared.

Palabras clave: Hide Markov Model; Haplotype Block; Emission Probability; Hide Data; Haplotype Inference.

Pp. 140-151

Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event

Yun S. Song; Yufeng Wu; Dan Gusfield

The haplotype inference (HI) problem is the problem of inferring 2 n haplotype pairs from n observed genotype vectors. This is a key problem that arises in studying genetic variation in populations, for example in the ongoing HapMap project [5]. In order to have a hope of finding the haplotypes that actually generated the observed genotypes, we must use some (implicit or explicit) genetic model of the evolution of the underlying haplotypes. The Perfect Phylogeny Haplotyping (PPH) model was introduced in 2002 [9] to reflect the “neutral coalescent” or “perfect phylogeny” model of haplotype evolution. The PPH problem (which can be solved in polynomial time) is to determine whether there is an HI solution where the inferred haplotypes can be derived on a perfect phylogeny (tree). Since the introduction of the PPH model, several extensions and modifications of the PPH model have been examined. The most important modification, to model biological reality better, is to allow a limited number of biological events that violate the perfect phylogeny model. This was accomplished implicitly in [7,12] with the inclusion of several heuristics into an algorithm for the PPH problem [8]. Those heuristics are invoked when the genotype data cannot be explained with haplotypes that fit the perfect phylogeny model. In this paper, we address the issue explicitly , by allowing one recombination or homoplasy event in the model of haplotype evolution. We formalize the problems and provide a polynomial time solution for one problem, using an additional, empirically-supported assumption. We present a related framework for the second problem which gives a practical algorithm. We believe the second problem can be solved in polynomial time.

Palabras clave: Phylogenetic Network; Haplotype Inference; Ancestral Recombination Graph; Coalescent Model; Leaf Label.

Pp. 152-164

A Faster Algorithm for Detecting Network Motifs

Sebastian Wernicke

Motifs in a network are small connected subnetworks that occur in significantly higher frequencies than in random networks. They have recently gathered much attention as a useful concept to uncover structural design principles of complex networks. Kashtan et al. [ Bioinformatics , 2004] proposed a sampling algorithm for efficiently performing the computationally challenging task of detecting network motifs. However, among other drawbacks, this algorithm suffers from sampling bias and is only efficient when the motifs are small (3 or 4 nodes). Based on a detailed analysis of the previous algorithm, we present a new algorithm for network motif detection which overcomes these drawbacks. Experiments on a testbed of biological networks show our algorithm to be orders of magnitude faster than previous approaches. This allows for the detection of larger motifs in bigger networks than was previously possible, facilitating deeper insight into the field.

Palabras clave: Random Graph; Random Network; Input Graph; Network Motif; Degree Sequence.

Pp. 165-177

Reaction Motifs in Metabolic Networks

Vincent Lacroix; Cristina G. Fernandes; Marie-France Sagot

The classic view of metabolism as a collection of metabolic pathways is being questioned with the currently available possibility of studying whole networks. Novel ways of decomposing the network into modules and motifs that could be considered as the building blocks of a network are being suggested. In this work, we introduce a new definition of motif in the context of metabolic networks. Unlike in previous works on (other) biochemical networks, this definition is not based only on topological features. We propose instead to use an alternative definition based on the functional nature of the components that form the motif. After introducing a formal framework motivated by biological considerations, we present complexity results on the problem of searching for all occurrences of a reaction motif in a network, and introduce an algorithm that is fast in practice in most situations. We then show an initial application to the study of pathway evolution.

Palabras clave: Bipartite Graph; Metabolic Network; Biochemical Network; Subgraph Isomorphism; Reaction Motif.

Pp. 178-191

Reconstructing Metabolic Networks Using Interval Analysis

Warwick Tucker; Vincent Moulton

Recently, there has been growing interest in the modelling and simulation of biological systems. Such systems are often modelled in terms of coupled ordinary differential equations that involve parameters whose (often unknown) values correspond to certain fundamental properties of the system. For example, in metabolic modelling, concentrations of metabolites can be described by such equations, where parameters correspond to the kinetic rates of the underlying chemical reactions. Within this framework, the increasing availability of time series data opens up the attractive possibility of reconstructing approximate parameter values, thus enabling the in silico exploration of the behaviour of complex dynamical systems. The parameter reconstruction problem, however, is very challenging – a fact that has resulted in a plethora of heuristics methods designed to fit parameters to the given data. In this paper we propose a completely deterministic method for parameter reconstruction that is based on interval analysis. We illustrate its utility by applying it to reconstruct metabolic networks using S-systems. Our method not only estimates the parameters very precisely, it also determines the appropriate network topologies. A major strength of the proposed method is that it proves that large portions of parameter space can be disregarded, thereby avoiding spurious solutions.

Palabras clave: Network Topology; Boolean Function; Metabolic Network; Interval Analysis; Metabolic Modelling.

Pp. 192-203

A 1.375-Approximation Algorithm for Sorting by Transpositions

Isaac Elias; Tzvika Hartman

Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a ten-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group, and determine the exact transposition diameter of 2-permutations and simple permutations.

Palabras clave: Symmetric Group; Genome Rearrangement; Algorithm Sort; Open Gate; Circular Permutation.

Pp. 204-215

A New Tight Upper Bound on the Transposition Distance

Anthony Labarre

We study the problem of computing the minimal number of adjacent, non-intersecting block interchanges required to transform a permutation into the identity permutation. In particular, we use the graph of a permutation to compute that number for a particular class of permutations in linear time and space, and derive a new tight upper bound on the so-called transposition distance.

Palabras clave: Circular Permutation; Identity Permutation; Oriented Cycle; Cycle Graph; Black Edge.

Pp. 216-227

Perfect Sorting by Reversals Is Not Always Difficult

Sèverine Bérard; Anne Bergeron; Cedric Chauve; Christophe Paul

This paper investigates the problem of conservation of combinatorial structures in genome rearrangement scenarios. We characterize a class of signed permutations for which one can compute in polynomial time a reversal scenario that conserves all common intervals, and that is parsimonious among such scenarios. Figeac and Varré (WABI 2004) announced that the general problem is NP-hard. We show that there exists a class of permutations for which this computation can be done in linear time with a very simple algorithm, and, for a larger class of signed permutations, the computation can be achieved in subquadratic time. We apply these methods to permutations obtained from the X chromosomes of the human, mouse and rat.

Palabras clave: Prime Node; Permutation Graph; Identity Permutation; Common Interval; Reversal Scenario.

Pp. 228-238

Minimum Recombination Histories by Branch and Bound

Rune B. Lyngsø; Yun S. Song; Jotun Hein

Recombination plays an important role in creating genetic diversity within species, and inferring past recombination events is central to many problems in genetics. Given a set M of sampled sequences, finding an evolutionary history for M with the minimum number of recombination events is a computationally very challenging problem. In this paper, we present a novel branch and bound algorithm for tackling that problem. Our method is shown to be far more efficient than the only preexisting exact method, described in [1]. Our software implementing the algorithm discussed in this paper is publicly available.

Palabras clave: Recombination Event; Hash Table; Ancestral State; Recent Common Ancestor; Ancestral Recombination Graph.

Pp. 239-250