Publications catalogue - books



Data Management. Data, Data Everywhere: 24th British National Conference on Databases, BNCOD 24, Glasgow, UK, July 3-5, 2007. Proceedings

Richard Cooper; Jessie Kennedy (eds.)

Conference: 24th British National Conference on Databases (BNCOD). Glasgow, UK. July 3, 2007 - July 5, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Institution detected   Publication year   Browse   Download   Request
None detected   2007   SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-73389-8

Electronic ISBN

978-3-540-73390-4

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Design Abstractions for Innovative Web Applications

Stefano Ceri

Web Modelling Language (WebML) [1-2] was defined, about 8 years ago, as a conceptual model for data-intensive Web applications. Early deployment technologies were very unstable and immature; as a reaction, WebML was conceived as a high-level, implementation-independent conceptual model, and the associated design support environment, called WebRatio [7], has always been platform-independent, so as to adapt to frequent technological changes. WebML is based upon orthogonal separation of concerns: content, interface logic, and presentation logic are defined as separate components. The main innovation in WebML comes from the interface logic, which enables the computation of Web pages made up of logical components (units) interconnected by logical links (i.e., not only the units but also the links have a formal semantics); the computation is associated with powerful defaults, so that simple diagrams carry all the semantics required for a full deployment through code generators.

- Invited Papers | Pp. 1-2

Automation Everywhere: Autonomics and Data Management

Norman W. Paton

Traditionally, database management systems (DBMSs) have been associated with high-cost, high-quality functionalities. That is, powerful capabilities are provided, but only in response to careful design, procurement, deployment and administration. This has been very successful in many contexts, but in an environment in which data is available in increasing quantities under the management of a growing collection of applications, and where effective use of available data often provides a competitive edge, there is a requirement for various of the benefits of a comprehensive data management infrastructure to be made available with rather fewer of the costs. If this requirement is to be met, automation will need to be deployed much more widely and systematically in data management platforms. This paper reviews recent results on autonomic data management, makes a case that current practice presents significant opportunities for further development, and argues that comprehensive support for automation should be central to future data management infrastructures.

- Invited Papers | Pp. 3-12

Exhaustive Peptide Searching Using Relations

Ela Hunt

We present a new robust solution to short peptide searching, tested on a relational platform, with a set of biological queries. Our algorithm is appropriate for large scale scientific data analysis, and has been tested with 1.4 GB of amino-acids. Protein sequences are indexed as short overlapping string windows, and stored in a relation. To find approximate matches, we use a neighbourhood generation algorithm. The words in the neighbourhood are then fetched and stored in a relation. We measure execution time and compare the matches found to those delivered by BLAST. We report some performance gains in exact matching and searching within edit distance 1, and very significant quality improvements over heuristics, as we guarantee to deliver all relevant matches.

- Data Applications | Pp. 13-24
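
The abstract of "Exhaustive Peptide Searching Using Relations" describes indexing sequences as short overlapping windows and looking up every word in a generated neighbourhood. The following is a minimal in-memory sketch of that idea, not the authors' relational implementation: the function names are hypothetical, a Python dictionary stands in for the relation, and the neighbourhood is restricted to single substitutions (a simplification of edit distance 1 that ignores insertions and deletions).

```python
from collections import defaultdict

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_window_index(proteins, w):
    """Map every overlapping window of length w to its (protein_id, offset) rows."""
    index = defaultdict(list)
    for pid, seq in proteins.items():
        for i in range(len(seq) - w + 1):
            index[seq[i:i + w]].append((pid, i))
    return index


def substitution_neighbourhood(word):
    """Return the word itself plus all words that differ in exactly one position.

    Assumption: substitutions only, so every neighbour has the same length
    as the query and can be looked up directly in the window index.
    """
    words = {word}
    for i in range(len(word)):
        for aa in AMINO_ACIDS:
            words.add(word[:i] + aa + word[i + 1:])
    return words


def search(index, query):
    """Fetch every indexed window that lies in the query's neighbourhood."""
    hits = []
    for word in substitution_neighbourhood(query):
        for pid, offset in index.get(word, ()):
            hits.append((pid, offset, word))
    return sorted(hits)


# Toy data, invented for illustration only.
proteins = {"P1": "MKTAYIAK", "P2": "MKSAYLAK"}
index = build_window_index(proteins, 4)
hits = search(index, "KTAY")
```

Because every neighbour is looked up exactly, the search cannot miss a window within the chosen distance, which is the property the abstract contrasts with heuristic tools.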

Data Lineage Tracing in Data Warehousing Environments

Hao Fan

Data lineage tracing (DLT) finds the derivations of integrated data in integrated database systems, where the data sources might be autonomous, distributed and heterogeneous. In previous work, we presented a DLT approach using partial schema transformation pathways. In this paper, we extend our DLT approach to use full schema transformation pathways and discuss the problem of lineage data ambiguities. Our DLT approach is not limited to one specific data model and query language, and would be useful in general data warehousing environments.

- Data Applications | Pp. 25-36

Fast Recognition of Asian Characters Based on Database Methodologies

Woong-Kee Loh; Young-Ho Park; Yong-Ik Yoon

Character recognition has been an active research area in the field of pattern recognition. Existing character recognition algorithms focus mainly on increasing the recognition rate. However, as in the recent Google Library Project, the need to speed up recognition of enormous numbers of documents is growing. Moreover, existing algorithms do not pay enough attention to Asian characters. In this paper, we propose an algorithm for fast recognition of Asian characters based on database methodologies. Since the number of Asian characters is very large and their shapes are complicated, Asian characters require much more recognition time than numeric and Roman characters. The proposed algorithm extracts features from each Asian character through the Discrete Fourier Transform (DFT) and optimizes the recognition speed by storing and retrieving the features using a multidimensional index. We further improve the recognition speed of the proposed algorithm using the association rule technique, which is a widely adopted data mining technique. The proposed algorithm has the advantage that it can be applied regardless of the language, size, and font of the characters to be recognized.

- Data Applications | Pp. 37-48

SPDBSW: A Service Prototype of SPDBS on the Web

Tae-Sung Jung; Wan-Sup Cho

As the amount of pathway information for various organisms is increasing very rapidly, performing various analyses on the full network of pathways, even across multiple organisms, is becoming possible, and developing an integrated database for storing and analyzing pathway information is therefore becoming a critical issue. Until now, analyzing these networks has not been easy because of the nature of the existing pathway databases, which are often heterogeneous, incomplete, and/or inconsistent. We previously presented a database system called SPDBS to solve this problem. However, application-oriented systems like SPDBS have some limitations on the extension and integration of heterogeneous databases.

In this paper, we extend the previous SPDBS into a web service prototype (SPDBSW) in which all functions are offered as services on the web. The web services include pathway database integration/search, import/export of SBML documents, and pathway reconstruction/visualization. SPDBSW has been implemented by combining SPDBS with external web services such as OLS, KEGG and NCBI, so users can obtain more reliable and detailed information from KEGG or NCBI through their web services. The system can be extended or modified immediately by replacing its component web services. We provide SPDBSW at the website

- Data Applications | Pp. 49-57

Indexing and Searching XML Documents Based on Content and Structure Synopses

Weimin He; Leonidas Fegaras; David Levine

We present a novel framework for indexing and searching schema-less XML documents based on concise summaries of their structural and textual content. Our search query language is XPath extended with full-text search. We introduce two novel data synopsis structures that correlate textual with positional information in an XML document and improve query precision. In addition, we present a two-phase containment filtering algorithm based on these synopses that improves the searching process. Our experimental evaluation shows that our data synopses indexing scheme outperforms the standard XML indexing scheme based on inverted lists; that query evaluation based on our data synopses is more accurate than related approximate approaches that do not consider positional information; and that our two-phase containment filtering algorithm is more efficient than a single-phase brute force algorithm.

- Searching XML Documents | Pp. 58-69

PosFilter: An Efficient Filtering Technique of XML Documents Based on Postfix Sharing

Jaehoon Kim; Youngsoo Kim; Seog Park

XML message filtering evaluates the path matching of a large number of registered path queries over a continuous stream of XML messages in real time. For this purpose, the YFilter system has been proposed to exploit the prefix commonalities that exist among path expressions. Sharing such commonality improves filtering performance through a tremendous reduction in filtering machine size. However, postfix sharing can also be useful in an XML filtering situation. For example, if a stream of XML messages does not have any defined DTD (or XML schema), XPath queries beginning with the ancestor-descendant axis (’//’) are likely to be used often, e.g., ’//buyer/name’, ’//seller/name’, and ’//name’, and this type of query is most likely to exhibit postfix sharing. Therefore, in this paper, we propose a bottom-up filtering approach exploiting postfix sharing, in contrast to the top-down approach of YFilter, which exploits prefix sharing. Experimental results show that our method has better performance in the postfix-shared scenario.

- Searching XML Documents | Pp. 70-81
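
To make the postfix-sharing idea in the PosFilter abstract concrete, the sketch below stores queries of the form ’//a/b/c’ in a trie keyed on their steps in reverse order, so that ’//buyer/name’, ’//seller/name’, and ’//name’ all share the node for ’name’. This is only an illustrative data structure under stated assumptions, not the PosFilter machine itself: the class and method names are hypothetical, and only a leading descendant axis followed by child steps is supported.

```python
class PostfixTrie:
    """Trie over query steps in reverse order, so common postfixes share nodes."""

    def __init__(self):
        self.root = {"children": {}, "queries": []}

    def add(self, query):
        # Assumption: queries look like //step1/step2/... with no predicates.
        if not query.startswith("//"):
            raise ValueError("only queries starting with // are supported")
        node = self.root
        for step in reversed(query[2:].split("/")):
            node = node["children"].setdefault(
                step, {"children": {}, "queries": []}
            )
        node["queries"].append(query)

    def match(self, path):
        """Return the queries matched by an element, given its root-to-element tag path.

        Matching proceeds bottom-up: from the element itself towards the root.
        """
        matched = []
        node = self.root
        for tag in reversed(path):
            node = node["children"].get(tag)
            if node is None:
                break
            matched.extend(node["queries"])
        return matched


trie = PostfixTrie()
for q in ("//buyer/name", "//seller/name", "//name"):
    trie.add(q)

buyer_hits = trie.match(["order", "buyer", "name"])
seller_hits = trie.match(["order", "seller", "name"])
no_hits = trie.match(["order", "buyer"])
```

In a streaming setting, `match` would be called on each start-element event with the current stack of open tags; the three example queries need only three trie nodes in total because they share the ’name’ postfix.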

OOXSearch: A Search Engine for Answering Loosely Structured XML Queries Using OO Programming

Kamal Taha; Ramez Elmasri

There has been extensive research in XML keyword-based and loosely structured querying. Some frameworks work well for certain types of XML data models and fail in others. The reason is that the proposed techniques are based on finding relationships solely between individual nodes while overlooking the context of these nodes. The context of a leaf node is determined by its parent node, because the leaf specifies one of the characteristics of its parent node. Building relationships between individual leaf nodes without consideration of their parents may result in relationships that are semantically disconnected. Since leaf nodes are nothing but characteristics of their parents, we observe that we could treat each parent-children set of nodes as one unified entity. We then find semantic relationships between the different unified entities. Based on those observations, we propose an XML semantic search engine called OOXSearch, which answers loosely structured queries. The recall and precision of the engine were evaluated experimentally and compared with two recently proposed systems [1, 2], and the results showed marked improvement.

- Searching XML Documents | Pp. 82-100

Evaluating XPath Queries on XML Data Streams

Stefan Böttcher; Rita Steinmetz

Whenever queries have to be evaluated on XML data streams - or when the memory that is available to evaluate the XML data is relatively small compared to the document - DOM based approaches that have to load and store large parts of the document in main memory will fail. In comparison, we present an approach to evaluate XPath queries on SAX streams that supports all axes of core XPath, including the sibling axes. Starting from the XPath query, our approach generates a stack of automata that uses the SAX stream as input and generates the result of the query as an output SAX stream. An evaluation of our implementation shows that in general our approach needs less main memory, but at the same time is faster than both Saxon and YFilter.

- Querying XML Documents | Pp. 101-113
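
As a rough illustration of evaluating a path query over SAX events without building a DOM, the sketch below keeps one set of automaton states per open element on a stack. It is a deliberately reduced example and not the authors' algorithm: it handles only the child (’/’) and descendant (’//’) axes with name tests, returns matched element paths instead of an output SAX stream, and the names `parse_steps`, `PathMatcher`, and `evaluate` are hypothetical. Only Python's standard `xml.sax` module is used.

```python
import re
import xml.sax


def parse_steps(query):
    """Split a query such as '/a//b' into [('child', 'a'), ('descendant', 'b')]."""
    return [
        ("descendant" if sep == "//" else "child", name)
        for sep, name in re.findall(r"(//|/)([^/]+)", query)
    ]


class PathMatcher(xml.sax.ContentHandler):
    """State s means 'the first s steps of the query have been matched'."""

    def __init__(self, query):
        super().__init__()
        self.steps = parse_steps(query)
        self.states = [{0}]   # one set of active states per open element
        self.path = []        # tags of the currently open elements
        self.matches = []

    def startElement(self, name, attrs):
        self.path.append(name)
        active = set()
        for s in self.states[-1]:
            if s == len(self.steps):
                continue  # query already fully matched by an ancestor
            axis, tag = self.steps[s]
            if tag in (name, "*"):
                active.add(s + 1)  # this element satisfies the next step
            if axis == "descendant":
                active.add(s)      # the step may still match deeper down
        if len(self.steps) in active:
            self.matches.append("/" + "/".join(self.path))
        self.states.append(active)

    def endElement(self, name):
        self.path.pop()
        self.states.pop()


def evaluate(query, xml_bytes):
    handler = PathMatcher(query)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


doc = b"<a><b>x</b><c><b>y</b></c></a>"
all_b = evaluate("//b", doc)
child_b = evaluate("/a/b", doc)
```

Memory use in this sketch grows with the depth of the document rather than its size, since a state set is pushed on each start tag and popped on the matching end tag.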