Publications catalog - books

Progress in Artificial Intelligence: 12th Portuguese Conference on Artificial Intelligence, EPIA 2005, Covilha, Portugal, December 5-8, 2005, Proceedings

Carlos Bento ; Amílcar Cardoso ; Gaël Dias (eds.)

Conference: 12th Portuguese Conference on Artificial Intelligence (EPIA). Covilha, Portugal. December 5, 2005 - December 8, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Computation by Abstract Devices; Database Management; Information Storage and Retrieval; Programming Techniques

Availability
Detected institution Year of publication Browse Download Request
None detected 2005 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-30737-2

Electronic ISBN

978-3-540-31646-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2005

Table of contents

Constrained Atomic Term: Widening the Reach of Rule Templates in Transformation Based Learning

Cícero Nogueira dos Santos; Claudia Oliveira

Within the framework of Transformation Based Learning (TBL), the rule template is one of the most important elements in the learning process. This paper presents a new model for TBL templates, in which the basic unit, referred to here as an atomic term (AT), encodes a variable-sized window and a test that precedes the capture of a feature’s value. A case study of Portuguese NP identification is described and the experimental results obtained are presented.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 622-633

Improving Passage Retrieval in Question Answering Using NLP

Jörg Tiedemann

This paper describes an approach for the integration of linguistic information in passage retrieval in an open-source question answering system for Dutch. Annotation produced by the wide-coverage dependency parser Alpino is stored in multiple index layers to be matched with natural language questions that have been analyzed by the same parser. We present a genetic algorithm to select features to be included in retrieval queries and for optimizing keyword weights. The system is trained on questions annotated with their answers from the competition on Dutch question answering within the Cross-Language Evaluation Forum (CLEF). The optimization yielded a significant improvement of about 19% in mean reciprocal rank scores on unseen evaluation data compared to the baseline using traditional information retrieval with plain text keywords.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 634-646
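
The abstract above reports its improvement in mean reciprocal rank (MRR). As a reminder of what that metric measures, here is a minimal sketch in Python: each question contributes the reciprocal of the rank of its first relevant passage, or 0 if none is retrieved. The function names, passage identifiers, and sample data are hypothetical illustrations and are not taken from the paper or its system.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for no questions)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical data: (ranked passage ids, set of passages containing the answer).
runs = [
    (["p3", "p1", "p7"], {"p1"}),  # first hit at rank 2 -> 0.5
    (["p2", "p9"], {"p2"}),        # first hit at rank 1 -> 1.0
    (["p4", "p5"], {"p8"}),        # no hit             -> 0.0
]
score = mean_reciprocal_rank(runs)
print(score)  # 0.5
```

Because only the first relevant passage counts, MRR rewards systems that place an answer-bearing passage near the top of the ranking.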

Mining the Semantics of Text Via Counter-Training

Roman Yangarber

We report on a set of experiments in text mining, specifically, finding semantic patterns given only a few keywords. The experiments employ the Counter-training framework for discovery of semantic knowledge from raw text in a weakly supervised fashion. The experiments indicate that the framework is suitable for efficient acquisition of semantic word classes and collocation patterns, which may be used for Information Extraction.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 647-657

Minimum Redundancy Cut in Ontologies for Semantic Indexing

Florian Seydoux; Jean-Cédric Chappelier

This paper presents a new method that aims at improving semantic indexing while reducing the number of indexing terms. Indexing terms are determined using a minimum redundancy cut in a hierarchy of conceptual hypernyms provided by an ontology. The results of some information retrieval experiments carried out on several standard document collections using the ontology are presented, illustrating the benefit of the method.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 658-668

Unsupervised Learning of Multiword Units from Part-of-Speech Tagged Corpora: Does Quantity Mean Quality?

Gaël Dias; Špela Vintar

This paper describes an original hybrid system that extracts multiword unit candidates from part-of-speech tagged corpora. While classical hybrid systems manually define local part-of-speech patterns that lead to the identification of well-known multiword units (mainly compound nouns), we automatically identify relevant syntactical patterns from the corpus. Word statistics are then combined with the endogenously acquired linguistic information in order to extract the most relevant sequences of words. As a result, (1) human intervention is avoided, providing total flexibility of use of the system, and (2) different multiword units like phrasal verbs, adverbial locutions and prepositional locutions may be identified. Finally, we propose an exhaustive evaluation of our architecture based on the multi-domain, bilingual Slovene-English IJS-ELAN corpus, on which surprising results are observed. To our knowledge, this challenge has never been attempted before.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 669-679

Lappin and Leass’ Algorithm for Pronoun Resolution in Portuguese

Thiago Thomes Coelho; Ariadne Maria Brito Rizzoni Carvalho

This paper presents a variant of Lappin and Leass’ Algorithm for pronoun resolution in Portuguese texts; the algorithm resolves third person pronominal anaphora, as well as reflexive and reciprocal pronouns. It relies on salience measures, derived from the syntactic structure of the sentence, and on a simple discourse representation model. The algorithm and its evaluation on legal and literary corpora are presented.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 680-692

STEMBR: A Stemming Algorithm for the Brazilian Portuguese Language

Reinaldo Viana Alvares; Ana Cristina Bicharra Garcia; Inhaúma Ferraz

Stemming algorithms have traditionally been utilized in information retrieval systems as they generate a more concise word representation. However, the efficiency of these algorithms varies according to the language they are used with. This paper presents STEMBR, a stemmer for Brazilian Portuguese whereby the suffix treatment is based on a statistical study of the frequency of the last letter for words found in Brazilian web pages. The proposed stemmer is compared with another algorithm specifically developed for Portuguese. The results show the efficiency of our stemmer.

- Chapter 9 – Text Mining and Applications (TEMA 2005) | Pp. 693-701
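
The STEMBR abstract above mentions a statistical study of the frequency of the last letter of words. A tally of that kind could look like the following minimal sketch. It is only an illustration of counting final letters: the function name and the tiny word list are hypothetical, and nothing here reproduces STEMBR's actual data, suffix rules, or algorithm.

```python
from collections import Counter
from typing import Dict, Iterable


def last_letter_frequencies(words: Iterable[str]) -> Dict[str, float]:
    """Return the relative frequency of each final letter, most common first."""
    counts = Counter(w[-1].lower() for w in words if w)  # skip empty strings
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.most_common()}


# Hypothetical sample; a real study would use a large corpus of web pages.
sample = ["casa", "casas", "falar", "falando", "meninos", "menina"]
freqs = last_letter_frequencies(sample)
for letter, share in freqs.items():
    print(f"{letter}: {share:.2f}")
```

A distribution like this indicates which word endings are most frequent, which is one plausible way to decide which suffixes deserve attention first.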