Publications catalog - books


Bioinformatics Research and Development: First International Conference, BIRD 2007, Berlin, Germany, March 12-14, 2007. Proceedings

Sepp Hochreiter ; Roland Wagner (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Detected institution Publication year Browse Download Request
Not detected 2007 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-71232-9

Electronic ISBN

978-3-540-71233-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Enhancing Protein Disorder Detection by Refined Secondary Structure Prediction

Chung-Tsai Su; Tong-Ming Hsu; Chien-Yu Chen; Yu-Yen Ou; Yen-Jen Oyang

More and more proteins have been observed to display functions through intrinsic disorder. Such structurally flexible regions are shown to play important roles in biological processes and are estimated to be abundant in eukaryotic proteomes. Previous studies largely use evolutionary information and combinations of physicochemical properties of amino acids to detect disordered regions from primary sequences. In our recent work, DisPSSMP, we demonstrated that the accuracy of protein disorder prediction is greatly improved if the disorder propensity of amino acids is considered when generating the condensed PSSM features. This work investigates how secondary structure information can be incorporated into DisPSSMP to enhance its predictive power. We propose a new representation of secondary structure information and compare it with three naïve representations that have been discussed or employed in related works. The experimental results reveal that the refined information from secondary structure prediction benefits this problem.

- Session 10: Proteomics III (Structure) | Pp. 395-409

Joining Softassign and Dynamic Programming for the Contact Map Overlap Problem

Brijnesh J. Jain; Michael Lappe

Comparison of 3-dimensional protein folds is a core problem in molecular biology. The Contact Map Overlap (CMO) scheme provides one of the most common measures for protein structure similarity. Maximizing CMO is, however, NP-hard. To approximately solve CMO, we combine softassign and dynamic programming. Softassign approximately solves the maximum common subgraph (MCS) problem. Dynamic programming converts the MCS solution to a solution of the CMO problem. We present and discuss experiments using proteins with up to 1500 residues. The results indicate that the proposed method is extremely fast compared to other methods, scales well with increasing problem size, and is useful for comparing similar protein structures.

- Session 10: Proteomics III (Structure) | Pp. 410-423
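The Contact Map Overlap measure named in the abstract above can be illustrated with a small sketch. This is not the authors' softassign and dynamic programming method; it only scores a given alignment. It assumes a contact map is a set of residue index pairs and that the alignment is an order-preserving mapping from residues of one protein to residues of the other. The function names, the distance threshold, and the toy data are hypothetical.

```python
from itertools import combinations


def contact_map(coords, threshold=8.0):
    """Return the set of residue pairs (i, j), i < j, within `threshold` distance.

    `coords` is a list of (x, y, z) tuples, one per residue. The default
    threshold is an illustrative assumption, not a value from the paper.
    """
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        dist = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j])) ** 0.5
        if dist <= threshold:
            contacts.add((i, j))
    return contacts


def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that the alignment maps onto contacts of B.

    `alignment` is a dict from residue indices of A to residue indices of B.
    It must be strictly order-preserving, which is checked below.
    """
    items = sorted(alignment.items())
    if any(b1 >= b2 for (_, b1), (_, b2) in zip(items, items[1:])):
        raise ValueError("alignment must be strictly order-preserving")
    overlap = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            if (alignment[i], alignment[j]) in contacts_b:
                overlap += 1
    return overlap


# Toy example with hand-written contact sets (hypothetical data).
contacts_a = {(0, 2), (1, 3), (2, 4)}
contacts_b = {(0, 2), (1, 4), (3, 5)}
alignment = {0: 0, 1: 1, 2: 2, 3: 4, 4: 5}
print(contact_map_overlap(contacts_a, contacts_b, alignment))  # prints 2
```

Scoring one alignment is cheap; the NP-hard part mentioned in the abstract is searching over all order-preserving alignments for the one that maximizes this count.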

An Evaluation of Text Retrieval Methods for Similarity Search of Multi-dimensional NMR-Spectra

Alexander Hinneburg; Andrea Porzel; Karina Wolfram

Searching and mining nuclear magnetic resonance (NMR) spectra of naturally occurring substances is an important task in the investigation of new, potentially useful chemical compounds. Multi-dimensional NMR spectra are relational objects like documents, but consist of continuous multi-dimensional points called peaks instead of words. We develop several mappings from continuous NMR spectra to discrete text-like data. With the help of those mappings, any text retrieval method can be applied. We evaluate the performance of two retrieval methods, namely the standard vector space model and probabilistic latent semantic indexing (PLSI). PLSI learns hidden topics in the data, which, in the case of 2D-NMR data, is interesting in its own right. Additionally, we develop and evaluate a simple direct similarity function, which can detect duplicates of NMR spectra. Our experiments show that the vector space model as well as PLSI, which are both designed for text data created by humans, can effectively handle the mapped NMR data originating from natural products. Additionally, PLSI is able to find meaningful "topics" in the NMR data.

- Session 11: Databases, Web and Text Analysis | Pp. 424-438
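To make the idea of mapping continuous peaks to text-like data concrete, here is a minimal sketch. The abstract above does not specify the authors' mappings, so the fixed grid used here is an assumption, as are the cell sizes, token format, function names, and toy spectra. Each 2D peak becomes a token naming its grid cell, and two spectra are compared with cosine similarity over raw term frequencies, a basic form of the vector space model.

```python
import math
from collections import Counter


def peaks_to_tokens(peaks, cell_x=0.5, cell_y=5.0):
    """Map each 2D peak (x, y) to a discrete token naming its grid cell.

    The grid cell sizes are illustrative assumptions.
    """
    return [
        f"x{math.floor(x / cell_x)}_y{math.floor(y / cell_y)}" for x, y in peaks
    ]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two toy spectra (hypothetical peak positions).
spectrum_a = [(1.2, 20.0), (3.4, 55.0), (7.1, 128.0)]
spectrum_b = [(1.3, 21.0), (3.4, 56.0), (6.0, 110.0)]

similarity = cosine_similarity(
    peaks_to_tokens(spectrum_a), peaks_to_tokens(spectrum_b)
)
print(round(similarity, 3))  # prints 0.667
```

A hard grid like this splits nearby peaks that fall on opposite sides of a cell boundary, which is one reason a study would compare several mappings rather than rely on a single one.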

Ontology-Based MEDLINE Document Classification

Fabrice Camous; Stephen Blott; Alan F. Smeaton

An increasing and overwhelming amount of biomedical information is available in the research literature, mainly in the form of free text. Biologists need tools that automate their information search and deal with the high volume and ambiguity of free text. Ontologies can help automatic information processing by providing standard concepts and information about the relationships between concepts. The Medical Subject Headings (MeSH) ontology is already available and used by MEDLINE indexers to annotate the conceptual content of biomedical articles. This paper presents a domain-independent method that uses the MeSH ontology inter-concept relationships to extend the existing MeSH-based representation of MEDLINE documents. The extension method is evaluated within a document triage task organized by the Genomics track of the 2005 Text REtrieval Conference (TREC). Our method for extending the representation of documents leads to an improvement of 17% over a non-extended baseline in terms of normalized utility, the metric defined for the task. The software is used to classify documents.

- Session 11: Databases, Web and Text Analysis | Pp. 439-452
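One simple way to picture extending a concept-based document representation through inter-concept relationships is to add ancestor concepts to each document's annotation set. The sketch below assumes exactly that; the abstract above does not say which relationships or weights the authors use. The hierarchy is a simplified, hypothetical toy and not the actual MeSH tree, and the function name and `max_depth` parameter are illustrative.

```python
def extend_concepts(concepts, parents, max_depth=1):
    """Return `concepts` plus their ancestors up to `max_depth` levels up.

    `parents` maps a concept name to a list of its parent concept names.
    """
    extended = set(concepts)
    frontier = set(concepts)
    for _ in range(max_depth):
        # Move one level up, keeping only concepts not seen before.
        frontier = {p for c in frontier for p in parents.get(c, ())} - extended
        extended |= frontier
    return extended


# Hypothetical toy hierarchy (not the real MeSH structure).
toy_parents = {
    "Breast Neoplasms": ["Neoplasms"],
    "Neoplasms": ["Diseases"],
    "Tumor Suppressor Protein p53": ["Proteins"],
}

document_concepts = {"Breast Neoplasms", "Tumor Suppressor Protein p53"}
print(sorted(extend_concepts(document_concepts, toy_parents, max_depth=1)))
```

With the extended sets, two documents annotated with different but related concepts can share terms such as a common parent, which a classifier working on the original annotations alone would not see.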

Integrating Mutations Data of the TP53 Human Gene in the Bioinformatics Network Environment

Domenico Marra; Paolo Romano

In this paper we present new network tools that can improve the accessibility of information on mutations of the TP53 human gene. Our aims are to allow this data to be integrated into the growing bioinformatics network environment and to demonstrate a possible methodology for biological data integration. We implemented the IARC TP53 Mutations Database and related subsets in an SRS site, set up Web Services that allow software-oriented, programmatic access to this data, created demo workflows that illustrate how to interact with the Web Services, and implemented these workflows in the biowep (Workflow Enactment Portal for Bioinformatics) system. In conclusion, we discuss a new flexible and adaptable methodology for data integration in the biomedical research application domain.

- Session 11: Databases, Web and Text Analysis | Pp. 453-463

Efficient and Scalable Indexing Techniques for Biological Sequence Data

Mihail Halachev; Nematollaah Shiri; Anand Thamildurai

We investigate indexing techniques for sequence data, crucial in a wide variety of applications, where efficient, scalable, and versatile search algorithms are required. Recent research has focused on suffix trees (ST) and suffix arrays (SA) as desirable index representations. Existing solutions for very long sequences, however, provide either efficient index construction or efficient search, but not both. We propose a new ST representation, STTD64, which has reasonable construction time and storage requirements, and is efficient in search. We have implemented the construction and search algorithms for the proposed technique and conducted numerous experiments to evaluate its performance on various types of real sequence data. Our results show that while the construction time for STTD64 is comparable with current ST-based techniques, it outperforms them in search. Compared to ESA, the best known SA technique, STTD64 is slower to construct, but has a similar space requirement and comparable search time. Unlike ESA, which is memory based, STTD64 is scalable and can handle very long sequences.

- Session 11: Databases, Web and Text Analysis | Pp. 464-479
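For readers unfamiliar with the suffix array index mentioned in the abstract above, the following is a minimal textbook sketch. It is neither STTD64 nor ESA: construction here is a naive sort of all suffixes, suitable only for short sequences, and search is a plain binary search. The function names and the example sequence are illustrative.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted by the suffixes they begin.

    Naive construction: sorting materializes each suffix, so this is only
    practical for short inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions of `pattern` in `text`.

    Uses two binary searches over the suffix array to find the block of
    suffixes that start with `pattern`.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


sequence = "ACGTACGA"
sa = build_suffix_array(sequence)
print(sa)                                     # prints [7, 4, 0, 5, 1, 6, 2, 3]
print(find_occurrences(sequence, sa, "ACG"))  # prints [0, 4]
```

The trade-off discussed in the abstract shows up even here: the index makes each query a logarithmic number of comparisons, but building it is the expensive step, and the whole array must fit in memory.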