Publications catalog - books


Research in Computational Molecular Biology: 11th Annual International Conference, RECOMB 2007, Oakland, CA, USA, April 21-25, 2007. Proceedings

Terry Speed ; Haiyan Huang (eds.)

In conference: 11th Annual International Conference on Research in Computational Molecular Biology (RECOMB). Oakland, CA, USA. April 21, 2007 - April 25, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Detected institution  Publication year  Browse  Download  Request
Not detected  2007  SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-71680-8

Electronic ISBN

978-3-540-71681-5

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Improved Ranking Functions for Protein and Modification-Site Identifications

Marshall Bern; David Goldberg

There are a number of computational tools for assigning identifications to peptide tandem mass spectra, but many fewer tools for the crucial next step of integrating spectral identifications into higher-level identifications, such as proteins or modification sites. Here we describe a new program called ComByne for scoring and ranking higher-level identifications. We compare ComByne to existing algorithms on several complex biological samples, including a sample of mouse blood plasma spiked with known concentrations of human proteins. A Web interface to our software is at .

Pp. 444-458

Peptide Retention Time Prediction Yields Improved Tandem Mass Spectrum Identification for Diverse Chromatography Conditions

Aaron A. Klammer; Xianhua Yi; Michael J. MacCoss; William Stafford Noble

Most tandem mass spectrum identification algorithms use information only from the final spectrum, ignoring precursor information such as peptide retention time (RT). Efforts to exploit peptide RT for peptide identification can be frustrated by its variability across liquid chromatography analyses. We show that peptide RT can be reliably predicted by training a support vector regressor on a chromatography run. This dynamically trained model outperforms a published statically trained model of peptide RT across diverse chromatography conditions. In addition, the model can be used to filter peptide identifications that produce large discrepancies between observed and predicted RT. After filtering, estimated true positive peptide identifications increase by as much as 50% at a false discovery rate of 3%, with the largest increase for non-specific cleavage with elastase.

Pp. 459-472
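
The retention-time filter described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a support vector regressor, whereas this sketch swaps in a one-feature ordinary least-squares fit so that it stays self-contained. The single numeric feature per peptide, the peptide names, the data values, and the `max_error` threshold are all hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Stand-in for the support vector regressor named in the abstract.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def filter_by_rt(candidates, slope, intercept, max_error):
    """Keep peptides whose observed RT is within max_error of the prediction."""
    kept = []
    for peptide, feature, observed_rt in candidates:
        predicted_rt = slope * feature + intercept
        if abs(observed_rt - predicted_rt) <= max_error:
            kept.append(peptide)
    return kept

# Hypothetical training data from one chromatography run:
# one numeric feature per peptide and its observed retention time.
train_features = [1.0, 2.0, 3.0, 4.0]
train_rts = [12.0, 22.0, 32.0, 42.0]
slope, intercept = fit_line(train_features, train_rts)

# Hypothetical candidate identifications: (peptide, feature, observed RT).
candidates = [("PEPTIDEA", 2.5, 27.5), ("PEPTIDEB", 1.5, 40.0)]
kept = filter_by_rt(candidates, slope, intercept, max_error=5.0)
print(kept)
```

Training on the same run that is being filtered is what makes the model "dynamic" in the abstract's sense; the regressor itself could be replaced without changing the filtering step.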

A Fast and Accurate Algorithm for the Quantification of Peptides from Mass Spectrometry Data

Ole Schulz-Trieglaff; Rene Hussong; Clemens Gröpl; Andreas Hildebrandt; Knut Reinert

Liquid chromatography combined with mass spectrometry (LC-MS) has become the prevalent technology in high-throughput proteomics research. One of the aims of this discipline is to obtain accurate quantitative information about all proteins and peptides in a biological sample. Due to the size and complexity of the data generated in these experiments, this problem remains a challenging task requiring sophisticated and efficient computational tools.

We propose an algorithm that can quantify even low abundance peptides from LC-MS data. Our approach is flexible and can be applied to preprocessed and raw instrument data. It is based on a combination of the sweep line paradigm with a novel wavelet function tailored to detect isotopic patterns. We evaluate our technique on several data sets of varying complexity and show that we are able to rapidly quantify peptides with high accuracy in a sound algorithmic framework.

Pp. 473-487

Association Mapping of Complex Diseases with Ancestral Recombination Graphs: Models and Efficient Algorithms

Yufeng Wu

Association, or LD (linkage disequilibrium), mapping is an intensely studied approach to gene mapping (genome-wide or in candidate regions) that is widely hoped to be able to efficiently locate genes influencing both complex and Mendelian traits. The logic underlying association mapping implies that the best possible mapping results would be obtained if the genealogical history of the sampled individuals were explicitly known. Such a history would be in the form of an “ancestral recombination graph (ARG)”. But despite the conceptual importance of genealogical histories to association mapping, few practical association mapping methods have explicitly used derived genealogical aspects of ARGs. Two notable exceptions are [35] and [23].

In this paper we develop an association mapping method that explicitly constructs and samples minARGs (ARGs that minimize the number of recombinations). We develop an ARG sampling method that provably samples minARGs at random, and that is practical for moderate-sized datasets. We also develop a different, faster ARG sampling method that still samples from a well-defined subspace of ARGs, and that is practical for larger datasets. We present novel efficient algorithms on extensions of the “phenotype likelihood” problem, a key step in the method in [35]. We also prove that computing the phenotype likelihood for a different natural extension of the penetrance model in [35] is NP-hard, answering a question unresolved in that paper. Finally, we put all of these results into practice, and examine how well the implemented methods perform, compared to the results in [35]. The empirical results show great speed-ups and definite, but sometimes small, improvements in mapping accuracy. Speed is particularly important in doing genome-wide scans for causative mutations.

Pp. 488-502

An Efficient and Accurate Graph-Based Approach to Detect Population Substructure

Srinath Sridhar; Satish Rao; Eran Halperin

Currently, large-scale projects are underway to perform whole genome disease association studies. Such studies involve the genotyping of hundreds of thousands of SNP markers. One of the main obstacles in performing such studies is that the underlying population substructure could artificially inflate the p-values, thereby generating many false positives. Although existing tools cope well with very distinct sub-populations, closely related population groups remain a major cause of concern.

In this work, we present a graph-based approach to detect population substructure. Our method is based on a distance measure between individuals. We show analytically that when the allele frequency differences between the two populations are large enough (in the -norm sense), our algorithm is guaranteed to find the correct classification of individuals to sub-populations.

We demonstrate the empirical performance of our algorithms on simulated and real data and compare it against existing methods, namely the widely used software method and the recent method . Our new technique is highly efficient (in particular it is hundreds of times faster than ), and overall it is more accurate than the two other methods in classifying individuals into sub-populations. We demonstrate empirically that unlike the other two methods, the accuracy of our algorithm consistently increases with the number of SNPs genotyped. Finally, we demonstrate that the efficiency of our method can be used to assess the significance of the resulting clusters. Surprisingly, we find that the different methods find population sub-structure in each of the homogeneous populations of the HapMap project. We use our significance score to demonstrate that these substructures are probably due to over-fitting.

Pp. 503-517

RB-Finder: An Improved Distance-Based Sliding Window Method to Detect Recombination Breakpoints

Wah-Heng Lee; Wing-Kin Sung

Recombination detection is important before inferring phylogenetic relationships, and it ultimately contributes to a better understanding of pathogen evolution, more accurate genotyping, and advances in vaccine development. In this paper, we introduce RB-Finder, a fast and accurate distance-based window method to detect recombination in a multiple sequence alignment. Our method introduces a more informative distance measure and a novel weighting strategy to reduce the window size sensitivity problem and hence improve the accuracy of breakpoint detection. Furthermore, our method is faster than existing phylogeny-based methods since we do not need to construct and compare complex phylogenetic trees. Compared with the current best method, Pruned-PDM, our method is a few hundred times more efficient. Experimental evaluation of RB-Finder using synthetic and biological datasets showed that our method is more accurate than existing phylogeny-based methods. We also show how our method has potential use in other related applications such as genotyping.

Pp. 518-532
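
The general idea of a distance-based sliding window scan, as mentioned in the abstract above, can be sketched as follows. This is only a generic illustration under simple assumptions: it uses plain Hamming distance and a fixed window, and does not reproduce RB-Finder's distance measure or weighting strategy, which the abstract does not specify. The function names and the toy sequences are hypothetical.

```python
def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_reference_per_window(query, references, window, step=1):
    """For each window start, name the reference closest to the query."""
    calls = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        best = min(
            references,
            key=lambda name: hamming(query[start:end], references[name][start:end]),
        )
        calls.append((start, best))
    return calls

def candidate_breakpoints(calls):
    """Window starts at which the closest reference changes."""
    return [
        start
        for (start, name), (_, previous) in zip(calls[1:], calls[:-1])
        if name != previous
    ]

# Toy alignment (hypothetical): the query matches refA on the left half
# and refB on the right half.
references = {"refA": "AAAAAAAAAAAA", "refB": "CCCCCCCCCCCC"}
query = "AAAAAACCCCCC"

calls = nearest_reference_per_window(query, references, window=3)
breakpoints = candidate_breakpoints(calls)
print(breakpoints)
```

A scan like this is sensitive to the window size, which is the problem the abstract says RB-Finder's weighting strategy is designed to reduce.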

Comparative Analysis of Spatial Patterns of Gene Expression in Imaginal Discs

Cyrus L. Harmon; Parvez Ahammad; Ann Hammonds; Richard Weiszmann; Susan E. Celniker; S. Shankar Sastry; Gerald M. Rubin

Determining the precise spatial extent of expression of genes across different tissues, along with knowledge of the biochemical function of the genes, is critical for understanding the roles of various genes in the development of metazoan organisms. To address this problem, we have developed high-throughput methods for generating images of gene expression in imaginal discs and for the automated analysis of these images. Our method automatically learns tissue shapes from a small number of manually segmented training examples and automatically aligns, extracts and scores new images, which are analyzed to generate gene expression maps for each gene. We have developed a reverse lookup procedure that enables us to identify genes that have spatial expression patterns most similar to a given gene of interest. Our methods enable us to cluster both the genes and the pixels of the maps, thereby identifying sets of genes that have similar patterns, and regions of the tissues of interest that have similar gene expression profiles across a large number of genes.

Keywords: Genomic imaging, Gene expression analysis, Clustering, Microarray data analysis, Imaginal discs.

Pp. 533-547
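
The reverse lookup described in the abstract above (finding genes whose spatial expression pattern is most similar to a gene of interest) can be sketched as a ranking over expression maps. The abstract does not state which similarity measure is used, so Pearson correlation is an assumption here, as are the flattened-pixel representation, the gene names, and the toy values.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant pixel vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def reverse_lookup(query_gene, maps, top_k=3):
    """Rank the other genes by similarity of their maps to the query map."""
    query = maps[query_gene]
    scored = [
        (pearson(query, pixels), gene)
        for gene, pixels in maps.items()
        if gene != query_gene
    ]
    scored.sort(reverse=True)  # highest correlation first
    return [gene for _, gene in scored[:top_k]]

# Hypothetical expression maps, each flattened to four aligned pixels.
maps = {
    "geneA": [0.0, 0.0, 1.0, 1.0],
    "geneB": [0.0, 0.1, 0.9, 1.0],
    "geneC": [1.0, 1.0, 0.0, 0.0],
    "geneD": [0.0, 1.0, 0.0, 1.0],
}

most_similar = reverse_lookup("geneA", maps, top_k=2)
print(most_similar)
```

A comparison of this kind is only meaningful once the images have been aligned to a common tissue shape, which is the role of the alignment step the abstract describes.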