Publications catalog - books



Web Information Systems Engineering: WISE 2005: 6th International Conference on Web Information Systems Engineering, New York, NY, USA, November 20-22, 2005, Proceedings

Anne H. H. Ngu ; Masaru Kitsuregawa ; Erich J. Neuhold ; Jen-Yao Chung ; Quan Z. Sheng (eds.)

In conference: 6th International Conference on Web Information Systems Engineering (WISE). New York, NY, USA. November 20, 2005 - November 22, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Popular Computer Science; Information Systems Applications (incl. Internet); Information Storage and Retrieval; Database Management; Artificial Intelligence (incl. Robotics); Computers and Society

Availability
Detected institution Publication year Browse Download Request
Not detected 2005 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-30017-5

Electronic ISBN

978-3-540-32286-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Table of contents

Automatic Keyword Extraction by Server Log Analysis

Chen Ding; Jin Zhou; Chi-Hung Chi

Traditionally, keywords are extracted from the full text of a document. In the web environment, however, there are more sources we can use to provide a more complete view of a web page’s contents. In this paper, we propose to analyze web server logs to extract keywords of entry pages from anchor texts and query terms, and to propagate these terms along user access paths to other linked pages. The major benefit of this method is that temporal changes can be reflected in the extracted terms, and that the terms reflect a user’s viewpoint on a page’s contents rather than the author’s.

- Poster Flash Session 2 | Pp. 605-606

Approximate Intensional Representation of Web Search Results

Yasunori Matsuike; Satoshi Oyama; Katsumi Tanaka

In this paper, we propose the notion of the “Approximate Intensional Representation” (abbreviated as AIR) for Web search results. Intuitively, an AIR for a user query q is another query q’ such that the search result (Web pages) is approximately represented by the query expression q’. The purpose of the AIR is to help users understand the outline of the retrieved Web pages in the form of a query.

- Poster Flash Session 2 | Pp. 607-608

A Unique Design for High-Performance Decentralized Resources Locating: A Topological Perspective

Xinli Huang; Fanyuan Ma; Wenju Zhang

In this paper, we propose a unique protocol for high-performance decentralized resources locating, focusing on building overlays with good topological properties. Our protocol operates with only local knowledge, yet results in enlarged search scope and reduced network traffic, by better matching the heterogeneity and the physical topology, which is also justified by simulations.

Keywords: Topological Property; Node Degree; Unique Design; Search Radius; Underlying Network.

- Poster Flash Session 2 | Pp. 609-610

Searching the Web Through User Information Spaces

Athanasios Papagelis; Christos Zaroliagis

In recent years, web search engines have moved from simple but inefficient syntactic analysis (first generation) to the more robust and usable web graph analysis (second generation). Much current research focuses on so-called third-generation search engines that, in principle, inject “human characteristics” into how results are obtained and presented to the end user. Approaches in this direction include, among others, an alteration of PageRank [1] that takes user-specific characteristics into account and biases the page ordering using the user's preferences (an approach, though, that does not scale well with the number of users). The approach is further exploited in [3], where several PageRanks are computed for a given number of distinct search topics. A similar idea is used in [6], where the PageRank computation takes into account the content of the pages and the query terms the surfer is looking for. In [4], a decomposition of PageRank into basic components is suggested that may be able to scale the different PageRank computations to a larger number of topics or even distinct users. Another approach to web search is presented in [2], which describes a rich extension of the web, called the semantic web, and the application of searching in this new setting.

- Poster Flash Session 2 | Pp. 611-612

REBIEX: Record Boundary Identification and Extraction Through Pattern Mining

Parashuram Kulkarni

Information on the web is often placed in a structure having a particular alignment and order. For example, Web pages produced by Web search engines, CGI scripts, etc., generally have multiple records of information, with each record representing one unit of information and sharing a distinct visual pattern. The pattern formed by these records may lie in the structure of the documents or in the repetitive nature of their content. For effective information extraction, it becomes essential to identify record boundaries for these units of information and apply extraction rules to individual record elements. In this paper I present REBIEX, a system that automatically identifies and extracts repeated patterns formed by the data records in a fuzzy way, allowing for slight inconsistencies. It uses the structural elements of web documents as well as the content and categories of text elements in the documents, without the need for any training data or human intervention. This technique, unlike current ones, makes use of the fact that it is not only the HTML structure that repeats, but also the content matter of the document, which repeats consistently. The system also employs a novel algorithm to mine repeating patterns in a fuzzy way with high accuracy.

Keywords: Pattern Mining; Text Element; Repetitive Nature; Multiple Record; Domain Specific Information.

- Poster Flash Session 2 | Pp. 613-615

Discovering the Biomedical Deep Web

Rajesh Ramanand; King-Ip Lin

The rapid growth of biomedical information in the Deep Web has produced unprecedented challenges for traditional search engines. This paper describes a new Deep Web resource discovery system for biomedical information. We designed two hypertext mining applications: a Focused Crawler that selectively seeks out relevant pages using a classifier that evaluates the relevance of a document with respect to biomedical information, and a Query Interface Extractor that extracts information from a page to detect the presence of a Deep Web database. Our anecdotes suggest that combining focused crawling with query interface extraction is very effective for building high-quality collections of Deep Web resources on biomedical topics.

Keywords: Biomedical Data; Query Interface; Decision Tree Classifier; Biomedical Informa; Trained Classifier.

- Poster Flash Session 2 | Pp. 616-617

A Potential IRI Based Phishing Strategy

Anthony Y. Fu; Xiaotie Deng; Wenyin Liu

We anticipate a potential phishing strategy based on obfuscating Web links using Internationalized Resource Identifiers (IRIs). In the IRI scheme, the glyphs of many characters look very similar while their Unicode code points are different. Hence, certain different IRIs may look highly similar. Phishing attacks based on this strategy are very likely to happen in the near future as the use of IRIs grows. We report this potential phishing strategy to prompt further analysis of related countermeasures.

Keywords: Internet security; Anti-phishing; Internationalized Resource Identifier (IRI).

- Poster Flash Session 2 | Pp. 618-619
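
The observation in the IRI phishing abstract above, that look-alike glyphs can have different Unicode code points, can be illustrated with a minimal Python sketch. It is not taken from the paper: the host name `example.com`, the helper names `describe` and `scripts_used`, and the rough script check based on Unicode character names are all assumptions made for illustration.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def scripts_used(text):
    """Rough script check (an assumption, not a standard algorithm):
    take the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


# Hypothetical host names: the second one replaces Latin 'a' (U+0061)
# with Cyrillic 'а' (U+0430), which renders almost identically.
latin = "example.com"
spoof = "ex\u0430mple.com"

print(latin == spoof)        # the strings differ despite looking alike
print(describe(spoof)[2])    # the substituted character and its name
print(scripts_used(latin))   # a single script
print(scripts_used(spoof))   # two scripts mixed in one host name
```

Flagging host names that mix scripts is only one simple heuristic; the abstract itself does not prescribe a particular countermeasure.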

Multiway Iceberg Cubing on Trees

Pauline LienHua Chou; Xiuzhen Zhang

The Star-cubing algorithm performs multiway aggregation on trees but incurs huge memory consumption. We propose a new algorithm MG-cubing that achieves maximal multiway aggregation. Our experiments show that MG-cubing achieves similar and very often better time and memory efficiency than Star-cubing.

- Poster Flash Session 2 | Pp. 620-622

Building a Semantic-Rich Service-Oriented Manufacturing Environment

Zhonghua Yang; Jing-Bing Zhang; Robert Gay; Liqun Zhuang; Hui Mien Lee

Service orientation has emerged as a promising new paradigm for enterprise integration in the manufacturing sector. In this paper, we focus on the approach and technologies for constructing a service-oriented manufacturing environment. Service orientation is achieved via virtualization, in which everything, including machines, equipment, devices, various data sources, applications, and processes, is virtualized as standards-based Web services. The virtualization approach is based on the emerging Web Services Resource Framework (WS-RF). A case study of virtualizing an AGV system using WS-RF is described. The use of Semantic Web Services technologies to enhance manufacturing Web services for a semantic-rich environment is discussed, focusing on OWL-S for semantic markup of manufacturing Web services and OWL for the development of ontologies in the manufacturing domain. An enterprise integration architecture enabled by Semantic Web service composition is also discussed.

Keywords: Business Process; Service Composition; Domain Ontology; Automate Guide Vehicle; Enterprise Integration.

- Industry-1: Semantic Web | Pp. 623-632

Building a Semantic Web System for Scientific Applications: An Engineering Approach

Renato Fileto; Claudia Bauzer Medeiros; Calton Pu; Ling Liu; Eduardo Delgado Assad

This paper presents an engineering experience for building a Semantic Web compliant system for a scientific application – agricultural zoning. First, we define the concept of ontological cover and a set of relationships between such covers. These definitions, based on domain ontologies, can be used, for example, to support the discovery of services on the Web. Second, we propose a semantic acyclic restriction on ontologies which enables the efficient comparison of ontological covers. Third, we present different engineering solutions to build ontology views satisfying the acyclic restriction in a prototype. Our experimental results unveil some limitations of the current Semantic Web technology to handle large data volumes, and show that the combination of such technology with traditional data management techniques is an effective way to achieve highly functional and scalable solutions.

- Industry-1: Semantic Web | Pp. 633-642