Catálogo de publicaciones - libros

Compartir en
redes sociales


Algorithms in Bioinformatics: 6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006, Proceedings

Philipp Bücher ; Bernard M. E. Moret (eds.)

En conferencia: 6º International Workshop on Algorithms in Bioinformatics (WABI) . Zurich, Switzerland . September 11, 2006 - September 13, 2006

Resumen/Descripción – provisto por la editorial

No disponible.

Palabras clave – provistas por la editorial

No disponibles.

Disponibilidad
Institución detectada Año de publicación Navegá Descargá Solicitá
No detectada 2006 SpringerLink

Información

Tipo de recurso:

libros

ISBN impreso

978-3-540-39583-6

ISBN electrónico

978-3-540-39584-3

Editor responsable

Springer Nature

País de edición

Reino Unido

Fecha de publicación

Información sobre derechos de publicación

© Springer-Verlag Berlin Heidelberg 2006

Tabla de contenidos

Measures of Codon Bias in Yeast, the tRNA Pairing Index and Possible DNA Repair Mechanisms

Markus T. Friberg; Pedro Gonnet; Yves Barral; Nicol N. Schraudolph; Gaston H. Gonnet

Protein translation is a rapid and accurate process, which has been optimized by evolution. Recently, it has been shown that tRNA reusage influences translation speed. We present the tRNA Pairing Index (TPI), a novel index to measure the degree of tRNA reusage in any gene. We describe two variants of the index, how to combine various such indices to a single one and an efficient algorithm for their computation. A statistical analysis of gene expression groups indicate that cell cycle genes have high TPI. This result is independent of other biases like GC content and codon bias. Furthermore, we find an additional unexpected codon bias that seems related to a context sensitive DNA repair.

Pp. 1-11

Decomposing Metabolomic Isotope Patterns

Sebastian Böcker; Matthias C. Letzel; Zsuzsanna Lipták; Anton Pervukhin

We present a method for determining the sum formula of metabolites solely from their mass and isotope pattern. Metabolites, such as sugars or lipids, participate in almost all cellular processes, but the majority still remains uncharacterized. Our input is a measured isotope pattern from a high resolution mass spectrometer, and we want to find those molecules that best match this pattern.

Determination of the sum formula is a crucial step in the identification of an unknown metabolite, as it reduces its possible structures to a hopefully manageable set. Our method is computationally efficient, and first results on experimental data indicate good identification rates for chemical compounds up to 700 Dalton.

Above 1000 Dalton, the number of molecules with a certain mass increases rapidly. To efficiently analyze mass spectra of such molecules, we define several additive invariants extracted from the input and then propose to solve a joint decomposition problem.

Pp. 12-23

A Method to Design Standard HMMs with Desired Length Distribution for Biological Sequence Analysis

Hongmei Zhu; Jiaxin Wang; Zehong Yang; Yixu Song

Hidden Markov Models (HMMs) have been widely used for biological sequence analysis. When modeling a phenomenon where for instance the nucleotide distribution does not change for various length of DNA, there are two popular approaches to achieve a desired length distribution: explicit or implicit modeling. The implicit modeling requires an elaborately designed model structure. So far there is no general procedure available for designing such a model structure from the training data automatically.

We present an iterative algorithm to design standard HMMs structure with length distribution from the training data. The basic idea behind this algorithm is to use multiple shifted negative binomial distributions to model empirical length distribution. The negative binomial distribution is obtained by an array of states, each with the same transition probability to itself. We shift this negative binomial distribution by using a serial of states linearly connected before the binomial model.

Pp. 24-31

Efficient Model-Based Clustering for LC-MS Data

Marta Łuksza; Bogusław Kluge; Jerzy Ostrowski; Jakub Karczmarski; Anna Gambin

Proteomic mass spectrometry is gaining an increasing role in diagnostics and in studies on protein complexes and biological systems. The issue of high-throughput data processing is therefore becoming more and more significant. The problems of data imperfectness, presence of noise and of various errors introduced during experiments arise.

In this paper we focus on the peak alignment problem. As an alternative to heuristic based approaches to aligning peaks from different mass spectra we propose a mathematically sound method which exploits the model-based approach. In this framework experiment errors are modeled as deviations from real values and mass spectra are regarded as finite Gaussian mixtures. The advantage of such an approach is that it provides convenient techniques for adjusting parameters and selecting solutions of best quality. The method can be parameterized by assuming various constraints. In this paper we investigate different classes of models and select the most suitable one. We analyze the results in terms of statistically significant biomarkers that can be identified after alignment of spectra.

Pp. 32-43

A Bayesian Algorithm for Reconstructing Two-Component Signaling Networks

Lukas Burger; Erik van Nimwegen

We present an algorithm, based on a Bayesian network model, for prediction of signaling interactions in bacterial two-component systems. The algorithm uses a large training set of known interacting kinase/receiver pairs to build a probabilistic model of dependency between the amino acid sequences of the two proteins and uses this model to predict which pairs interact. We show that the algorithm can accurately reconstruct cognate kinase/receiver pairs across all sequenced bacteria. We also present predictions of interacting orphan kinase/receiver pairs in the bacterium and show that these significantly overlap with experimentally observed interactions.

Pp. 44-55

Linear-Time Haplotype Inference on Pedigrees Without Recombinations

M. Y. Chan; Wun-Tat Chan; Francis Y. L. Chin; Stanley P. Y. Fung; Ming-Yang Kao

In this paper, a linear-time algorithm, which is optimal, is presented to solve the haplotype inference problem for pedigree data when there are no recombinations and the pedigree has no mating loops. The approach is based on the use of graphs to capture SNP, Mendelian and parity constraints of the given pedigree.

Pp. 56-67

Phylogenetic Network Inferences Through Efficient Haplotyping

Yinglei Song; Chunmei Liu; Russell L. Malmberg; Liming Cai

The genotype phasing problem is to determine the haplotypes of diploid individuals from their genotypes where linkage relationships are not known. Based on the model of perfect phylogeny, the genotype phasing problem can be solved in linear time. However, recombinations may occur and the perfect phylogeny model thus cannot interpret genotype data with recombinations. This paper develops a graph theoretical approach that can reduce the problem to finding a subgraph pattern contained in a given graph. Based on ordered graph tree decomposition, this problem can be solved efficiently with a parameterized algorithm. Our tests on biological genotype data showed that this algorithm is extremely efficient and its interpretation accuracy is better than or comparable with that of other approaches.

Pp. 68-79

Beaches of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems

Leo van Iersel; Judith Keijsper; Steven Kelk; Leen Stougie

The problem () asks for the smallest set of haplotypes which can explain a given set of genotypes, and the problem () asks for the smallest such set which also allows the haplotypes to be embedded in a evolutionary tree, a well-known biologically-motivated data structure. For we extend recent work of [17] by further mapping the interface between “easy” and “hard” instances, within the framework of (,)- By exploring, in the same way, the tractability frontier of we provide the first concrete, positive results for this problem, and the algorithms underpinning these results offer new insights about how might be further tackled in the future. In both and intriguing open problems remain.

Pp. 80-91

On the Complexity of SNP Block Partitioning Under the Perfect Phylogeny Model

Jens Gramm; Tzvika Hartman; Till Nierhoff; Roded Sharan; Till Tantau

Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect phylogeny haplotyping, is solvable in polynomial time.

Pp. 92-102

How Many Transcripts Does It Take to Reconstruct the Splice Graph?

Paul Jenkins; Rune Lyngsø; Jotun Hein

Alternative splicing has emerged as an important biological process which increases the number of transcripts obtainable from a gene. Given a sample of transcripts, the alternative splicing graph (ASG) can be constructed—a mathematical object minimally explaining these transcripts. Most research has so far been devoted to the reconstruction of ASGs from a sample of transcripts, but little has been done on the confidence we can have in these ASGs providing the full picture of alternative splicing. We address this problem by proposing probabilistic models of transcript generation, under which growth of the inferred ASG is investigated. These models are used in novel methods to test the nature of the collection of real transcripts from which the ASG was derived, which we illustrate on example genes. Statistical comparisons of the proposed models were also performed, showing evidence for variation in the pattern of dependencies between donor and acceptor sites.

Pp. 103-114