Catálogo de publicaciones - libros
Algorithms in Bioinformatics: 6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006, Proceedings
Philipp Bücher ; Bernard M. E. Moret (eds.)
En conferencia: 6º International Workshop on Algorithms in Bioinformatics (WABI) . Zurich, Switzerland . September 11, 2006 - September 13, 2006
Resumen/Descripción – provisto por la editorial
No disponible.
Palabras clave – provistas por la editorial
No disponibles.
Disponibilidad
Institución detectada | Año de publicación | Navegá | Descargá | Solicitá |
---|---|---|---|---|
No detectada | 2006 | SpringerLink |
Información
Tipo de recurso:
libros
ISBN impreso
978-3-540-39583-6
ISBN electrónico
978-3-540-39584-3
Editor responsable
Springer Nature
País de edición
Reino Unido
Fecha de publicación
2006
Información sobre derechos de publicación
© Springer-Verlag Berlin Heidelberg 2006
Cobertura temática
Tabla de contenidos
doi: 10.1007/11851561_1
Measures of Codon Bias in Yeast, the tRNA Pairing Index and Possible DNA Repair Mechanisms
Markus T. Friberg; Pedro Gonnet; Yves Barral; Nicol N. Schraudolph; Gaston H. Gonnet
Protein translation is a rapid and accurate process, which has been optimized by evolution. Recently, it has been shown that tRNA reusage influences translation speed. We present the tRNA Pairing Index (TPI), a novel index to measure the degree of tRNA reusage in any gene. We describe two variants of the index, how to combine various such indices to a single one and an efficient algorithm for their computation. A statistical analysis of gene expression groups indicate that cell cycle genes have high TPI. This result is independent of other biases like GC content and codon bias. Furthermore, we find an additional unexpected codon bias that seems related to a context sensitive DNA repair.
Pp. 1-11
doi: 10.1007/11851561_2
Decomposing Metabolomic Isotope Patterns
Sebastian Böcker; Matthias C. Letzel; Zsuzsanna Lipták; Anton Pervukhin
We present a method for determining the sum formula of metabolites solely from their mass and isotope pattern. Metabolites, such as sugars or lipids, participate in almost all cellular processes, but the majority still remains uncharacterized. Our input is a measured isotope pattern from a high resolution mass spectrometer, and we want to find those molecules that best match this pattern.
Determination of the sum formula is a crucial step in the identification of an unknown metabolite, as it reduces its possible structures to a hopefully manageable set. Our method is computationally efficient, and first results on experimental data indicate good identification rates for chemical compounds up to 700 Dalton.
Above 1000 Dalton, the number of molecules with a certain mass increases rapidly. To efficiently analyze mass spectra of such molecules, we define several additive invariants extracted from the input and then propose to solve a joint decomposition problem.
Pp. 12-23
doi: 10.1007/11851561_3
A Method to Design Standard HMMs with Desired Length Distribution for Biological Sequence Analysis
Hongmei Zhu; Jiaxin Wang; Zehong Yang; Yixu Song
Hidden Markov Models (HMMs) have been widely used for biological sequence analysis. When modeling a phenomenon where for instance the nucleotide distribution does not change for various length of DNA, there are two popular approaches to achieve a desired length distribution: explicit or implicit modeling. The implicit modeling requires an elaborately designed model structure. So far there is no general procedure available for designing such a model structure from the training data automatically.
We present an iterative algorithm to design standard HMMs structure with length distribution from the training data. The basic idea behind this algorithm is to use multiple shifted negative binomial distributions to model empirical length distribution. The negative binomial distribution is obtained by an array of states, each with the same transition probability to itself. We shift this negative binomial distribution by using a serial of states linearly connected before the binomial model.
Pp. 24-31
doi: 10.1007/11851561_4
Efficient Model-Based Clustering for LC-MS Data
Marta Łuksza; Bogusław Kluge; Jerzy Ostrowski; Jakub Karczmarski; Anna Gambin
Proteomic mass spectrometry is gaining an increasing role in diagnostics and in studies on protein complexes and biological systems. The issue of high-throughput data processing is therefore becoming more and more significant. The problems of data imperfectness, presence of noise and of various errors introduced during experiments arise.
In this paper we focus on the peak alignment problem. As an alternative to heuristic based approaches to aligning peaks from different mass spectra we propose a mathematically sound method which exploits the model-based approach. In this framework experiment errors are modeled as deviations from real values and mass spectra are regarded as finite Gaussian mixtures. The advantage of such an approach is that it provides convenient techniques for adjusting parameters and selecting solutions of best quality. The method can be parameterized by assuming various constraints. In this paper we investigate different classes of models and select the most suitable one. We analyze the results in terms of statistically significant biomarkers that can be identified after alignment of spectra.
Pp. 32-43
doi: 10.1007/11851561_5
A Bayesian Algorithm for Reconstructing Two-Component Signaling Networks
Lukas Burger; Erik van Nimwegen
We present an algorithm, based on a Bayesian network model, for prediction of signaling interactions in bacterial two-component systems. The algorithm uses a large training set of known interacting kinase/receiver pairs to build a probabilistic model of dependency between the amino acid sequences of the two proteins and uses this model to predict which pairs interact. We show that the algorithm can accurately reconstruct cognate kinase/receiver pairs across all sequenced bacteria. We also present predictions of interacting orphan kinase/receiver pairs in the bacterium and show that these significantly overlap with experimentally observed interactions.
Pp. 44-55
doi: 10.1007/11851561_6
Linear-Time Haplotype Inference on Pedigrees Without Recombinations
M. Y. Chan; Wun-Tat Chan; Francis Y. L. Chin; Stanley P. Y. Fung; Ming-Yang Kao
In this paper, a linear-time algorithm, which is optimal, is presented to solve the haplotype inference problem for pedigree data when there are no recombinations and the pedigree has no mating loops. The approach is based on the use of graphs to capture SNP, Mendelian and parity constraints of the given pedigree.
Pp. 56-67
doi: 10.1007/11851561_7
Phylogenetic Network Inferences Through Efficient Haplotyping
Yinglei Song; Chunmei Liu; Russell L. Malmberg; Liming Cai
The genotype phasing problem is to determine the haplotypes of diploid individuals from their genotypes where linkage relationships are not known. Based on the model of perfect phylogeny, the genotype phasing problem can be solved in linear time. However, recombinations may occur and the perfect phylogeny model thus cannot interpret genotype data with recombinations. This paper develops a graph theoretical approach that can reduce the problem to finding a subgraph pattern contained in a given graph. Based on ordered graph tree decomposition, this problem can be solved efficiently with a parameterized algorithm. Our tests on biological genotype data showed that this algorithm is extremely efficient and its interpretation accuracy is better than or comparable with that of other approaches.
Pp. 68-79
doi: 10.1007/11851561_8
Beaches of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems
Leo van Iersel; Judith Keijsper; Steven Kelk; Leen Stougie
The problem () asks for the smallest set of haplotypes which can explain a given set of genotypes, and the problem () asks for the smallest such set which also allows the haplotypes to be embedded in a evolutionary tree, a well-known biologically-motivated data structure. For we extend recent work of [17] by further mapping the interface between “easy” and “hard” instances, within the framework of (,)- By exploring, in the same way, the tractability frontier of we provide the first concrete, positive results for this problem, and the algorithms underpinning these results offer new insights about how might be further tackled in the future. In both and intriguing open problems remain.
Pp. 80-91
doi: 10.1007/11851561_9
On the Complexity of SNP Block Partitioning Under the Perfect Phylogeny Model
Jens Gramm; Tzvika Hartman; Till Nierhoff; Roded Sharan; Till Tantau
Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect phylogeny haplotyping, is solvable in polynomial time.
Pp. 92-102
doi: 10.1007/11851561_10
How Many Transcripts Does It Take to Reconstruct the Splice Graph?
Paul Jenkins; Rune Lyngsø; Jotun Hein
Alternative splicing has emerged as an important biological process which increases the number of transcripts obtainable from a gene. Given a sample of transcripts, the alternative splicing graph (ASG) can be constructed—a mathematical object minimally explaining these transcripts. Most research has so far been devoted to the reconstruction of ASGs from a sample of transcripts, but little has been done on the confidence we can have in these ASGs providing the full picture of alternative splicing. We address this problem by proposing probabilistic models of transcript generation, under which growth of the inferred ASG is investigated. These models are used in novel methods to test the nature of the collection of real transcripts from which the ASG was derived, which we illustrate on example genes. Statistical comparisons of the proposed models were also performed, showing evidence for variation in the pattern of dependencies between donor and acceptor sites.
Pp. 103-114