Publications catalog - books

MICAI 2006: Advances in Artificial Intelligence: 5th Mexican International Conference on Artificial Intelligence, Apizaco, Mexico, November 13-17, 2006, Proceedings

Alexander Gelbukh; Carlos Alberto Reyes-Garcia (eds.)

In conference: 5th Mexican International Conference on Artificial Intelligence (MICAI). Apizaco, Mexico. November 13-17, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Computation by Abstract Devices; Mathematical Logic and Formal Languages; Image Processing and Computer Vision

Availability

Detected institution: Not detected
Year of publication: 2006
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-49026-5

Electronic ISBN

978-3-540-49058-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Speeding Up Target-Language Driven Part-of-Speech Tagger Training for Machine Translation

Felipe Sánchez-Martínez; Juan Antonio Pérez-Ortiz; Mikel L. Forcada

When training hidden-Markov-model-based part-of-speech (PoS) taggers involved in machine translation systems in an unsupervised manner, the use of target-language information has proven to give better results than the standard Baum-Welch algorithm. The target-language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-language text segment into the target language, and using a target-language model to estimate the likelihood of the translation of each possible disambiguation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, translation being the most time-consuming task. In this paper, we present a method that uses knowledge obtained in an unsupervised manner to prune unlikely disambiguations in each text segment, so that the number of translations to be performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the time complexity of the algorithm) without degrading the tagging accuracy achieved.

- Natural Language Processing | Pp. 844-854
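
A minimal Python sketch of the pruning idea described in the abstract above: enumerate the possible PoS tag sequences for one segment, score them with a coarse tag-sequence model, and only send the most plausible ones to translation. The tag sets, bigram scores and pruning fraction are invented for illustration and are not the paper's actual model.

```python
from itertools import product

# Possible tags per word in one source-language segment (toy ambiguity classes).
segment_tags = [("DET",), ("NOUN", "VERB"), ("NOUN", "VERB", "ADJ")]

# Toy bigram scores standing in for knowledge obtained without supervision.
bigram_score = {("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
                ("NOUN", "VERB"): 0.7, ("NOUN", "NOUN"): 0.2,
                ("NOUN", "ADJ"): 0.1, ("VERB", "NOUN"): 0.6,
                ("VERB", "ADJ"): 0.3, ("VERB", "VERB"): 0.1}

def sequence_score(tags):
    """Product of bigram scores; a stand-in for the pruning model."""
    score = 1.0
    for a, b in zip(tags, tags[1:]):
        score *= bigram_score.get((a, b), 0.01)
    return score

# All disambiguations: their number grows exponentially with segment length.
all_sequences = list(product(*segment_tags))

# Keep only the most plausible ones, so far fewer translations are needed.
keep = max(1, len(all_sequences) // 3)
pruned = sorted(all_sequences, key=sequence_score, reverse=True)[:keep]

print(f"{len(all_sequences)} disambiguations, {len(pruned)} sent to translation")
for seq in pruned:
    print(seq, round(sequence_score(seq), 3))
```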

Defining Classifier Regions for WSD Ensembles Using Word Space Features

Harri M. T. Saarikoski; Steve Legrand; Alexander Gelbukh

Based on a recent evaluation of word sense disambiguation (WSD) systems [10], disambiguation methods have reached a standstill. In [10] we showed that it is possible to predict the best system for a target word using word features, and that with this 'optimal ensembling method' more accurate WSD ensembles can be built (3-5% over Senseval state-of-the-art systems, with the same amount of potential still remaining). In the interest of developing more accurate ensembles, we here define the strong regions for three popular and effective classifiers used for the WSD task (Naive Bayes – NB, Support Vector Machine – SVM, Decision Rules – D) using word features (word grain, number of positive and negative training examples, dominant sense ratio). We also discuss the effect of the remaining, feature-based factors.

- Natural Language Processing | Pp. 855-867
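
A rough sketch of the optimal-ensembling idea from the abstract above: train a small meta-classifier that predicts, from word features, which base classifier tends to win for a target word, and route disambiguation accordingly. The feature values and "best classifier" labels below are invented toy data, not the paper's measurements.

```python
from sklearn.tree import DecisionTreeClassifier

# Features per word: [sense count ("grain"), training examples, dominant-sense ratio]
word_features = [
    [2, 150, 0.90],
    [6,  40, 0.35],
    [3, 300, 0.60],
    [8,  25, 0.20],
    [2,  80, 0.85],
    [5, 200, 0.40],
]
# Which classifier performed best on each word (toy labels).
best_classifier = ["NB", "SVM", "SVM", "D", "NB", "SVM"]

meta = DecisionTreeClassifier(max_depth=3, random_state=0)
meta.fit(word_features, best_classifier)

# For a new target word, pick the predicted strongest classifier.
new_word = [[4, 120, 0.55]]
print("Use classifier:", meta.predict(new_word)[0])
```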

Impact of Feature Selection for Corpus-Based WSD in Turkish

Zeynep Orhan; Zeynep Altan

Word sense disambiguation (WSD) is an important intermediate stage for many natural language processing applications. The senses of an ambiguous word are the classification of usages for that word. WSD is basically a mapping function from a context to a set of applicable senses, depending on various parameters. Resource selection, determination of senses for ambiguous words, choice of effective features, algorithms, and evaluation criteria are the major issues in a WSD system. This paper deals with feature selection strategies for the word sense disambiguation task in the Turkish language. There are many different features that can contribute to the meaning of a word. These features can vary according to metaphorical usages, the POS of the word, pragmatics, etc. The observations indicated that detecting the critical features can contribute more than the learning methodologies.

- Natural Language Processing | Pp. 868-878
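
As a small illustration of corpus-based feature selection for WSD (not the paper's Turkish setup): score candidate context features against the sense labels and keep the most discriminative ones. The English toy contexts, senses, and the chi-square criterion are assumptions made for the sketch.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

contexts = [
    "open the bank account today",
    "sat on the river bank fishing",
    "the bank approved the loan",
    "grassy bank beside the stream",
]
senses = ["finance", "river", "finance", "river"]

vec = CountVectorizer()
X = vec.fit_transform(contexts)          # bag-of-words context features

selector = SelectKBest(chi2, k=4)        # rank features by chi-square with the sense
selector.fit(X, senses)

names = vec.get_feature_names_out()
chosen = [names[i] for i in selector.get_support(indices=True)]
print("Most discriminative context features:", chosen)
```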

Spanish All-Words Semantic Class Disambiguation Using Cast3LB Corpus

Rubén Izquierdo-Beviá; Lorenza Moreno-Monteagudo; Borja Navarro; Armando Suárez

In this paper, an approach to semantic disambiguation based on machine learning and semantic classes for Spanish is presented. A critical issue in a corpus-based approach to Word Sense Disambiguation (WSD) is the lack of wide-coverage resources from which to automatically learn the linguistic information. In particular, all-words sense-annotated corpora such as SemCor do not have enough examples for many senses when used in a machine learning method. Using semantic classes instead of senses makes it possible to collect a larger number of examples for each class while polysemy is reduced, improving the accuracy of semantic disambiguation. Cast3LB, a SemCor-like corpus manually annotated with Spanish WordNet 1.5 senses, has been used in this paper to perform semantic disambiguation based on several sets of classes: the lexicographer files of WordNet, WordNet Domains, and the SUMO ontology.

- Natural Language Processing | Pp. 879-888
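
A minimal sketch of the core idea in the abstract above: replace fine-grained sense labels with coarse semantic classes (for example, WordNet lexicographer files), so each class pools the examples of several senses and polysemy is reduced. The sense identifiers, the sense-to-class mapping, and the counts are hypothetical.

```python
from collections import Counter

# Hypothetical annotated examples: (word, fine-grained sense).
annotations = [
    ("banco", "banco%1"), ("banco", "banco%2"), ("banco", "banco%1"),
    ("banco", "banco%3"), ("banco", "banco%2"), ("banco", "banco%4"),
]

# Hypothetical mapping from senses to coarse classes (lexicographer files).
sense_to_class = {
    "banco%1": "noun.group",      # financial institution
    "banco%2": "noun.artifact",   # bench
    "banco%3": "noun.group",      # bank as organisation
    "banco%4": "noun.artifact",   # work bench
}

per_sense = Counter(sense for _, sense in annotations)
per_class = Counter(sense_to_class[sense] for _, sense in annotations)

print("examples per sense:", dict(per_sense))   # sparse: 1-2 examples each
print("examples per class:", dict(per_class))   # pooled: more examples, fewer labels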

An Approach for Textual Entailment Recognition Based on Stacking and Voting

Zornitsa Kozareva; Andrés Montoyo

This paper presents a machine-learning approach to the recognition of textual entailment. For our approach we model lexical and semantic features. We study the effect of the stacking and voting joint classifier combination techniques, which boost the final performance of the system. In an exhaustive experimental evaluation, the performance of the developed approach is measured. The obtained results demonstrate that an ensemble of classifiers achieves higher accuracy than an individual classifier and results comparable to already existing textual entailment systems.

- Natural Language Processing | Pp. 889-899
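
A hedged sketch of combining classifiers by voting and stacking for an entailment-style binary decision, as described in the abstract above. The two features (lexical overlap, semantic similarity), the toy labels, and the choice of base learners are assumptions for illustration only.

```python
from sklearn.ensemble import VotingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Toy text-hypothesis pairs described by [lexical overlap, semantic similarity].
X = [[0.9, 0.8], [0.2, 0.3], [0.7, 0.9], [0.1, 0.2],
     [0.8, 0.6], [0.3, 0.1], [0.6, 0.7], [0.2, 0.4]]
y = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = entailment, 0 = no entailment

base = [("nb", GaussianNB()),
        ("tree", DecisionTreeClassifier(max_depth=2, random_state=0)),
        ("lr", LogisticRegression())]

# Voting: each base classifier casts a vote on the pair.
voter = VotingClassifier(estimators=base, voting="hard").fit(X, y)
# Stacking: a meta-learner combines the base classifiers' predictions.
stacker = StackingClassifier(estimators=base,
                             final_estimator=LogisticRegression(),
                             cv=2).fit(X, y)

pair = [[0.75, 0.65]]
print("voting:", voter.predict(pair)[0], " stacking:", stacker.predict(pair)[0])
```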

Textual Entailment Beyond Semantic Similarity Information

Sonia Vázquez; Zornitsa Kozareva; Andrés Montoyo

The variability of semantic expression is a special characteristic of natural language. This variability is challenging for many natural language processing applications that try to infer the same meaning from different text variants. In order to address this problem a generic task has been proposed: Textual Entailment Recognition. In this paper, we present a new Textual Entailment approach based on Latent Semantic Indexing (LSI) and the cosine measure. This proposed approach extracts semantic knowledge from different corpora and resources. Our main purpose is to study how the acquired information can be combined with an already developed and tested Machine Learning Entailment system (MLEnt). The experiments show that the combination of MLEnt, LSI and the cosine measure improves the results of the initial approach.

- Natural Language Processing | Pp. 900-910
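
A minimal sketch of the LSI-plus-cosine component mentioned in the abstract above: project a text and a hypothesis into a latent semantic space learned from a corpus and score the pair by cosine similarity. The corpus, the number of latent dimensions, and the 0.5 decision threshold are illustrative assumptions, not the paper's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the company acquired a small startup last year",
    "the firm bought a young technology company",
    "the cat slept on the warm windowsill",
    "stock prices rose after the acquisition was announced",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)

lsi = TruncatedSVD(n_components=2, random_state=0)
lsi.fit(tfidf)                                   # LSI space learned from the corpus

text = "the company acquired a small startup last year"
hypothesis = "the firm bought a young technology company"
vecs = lsi.transform(vectorizer.transform([text, hypothesis]))

score = cosine_similarity(vecs[:1], vecs[1:2])[0, 0]
print("cosine in LSI space:", round(score, 3), "-> entailment?", score > 0.5)
```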

On the Identification of Temporal Clauses

Georgiana Puşcaşu; Patricio Martínez Barco; Estela Saquete Boró

This paper describes a machine learning approach to the identification of temporal clauses by disambiguating the subordinating conjunctions used to introduce them. Temporal clauses are regularly marked by subordinators, many of which are ambiguous, being able to introduce clauses with different semantic roles. The paper also describes our work on generating an annotated corpus of sentences embedding clauses introduced by ambiguous subordinators that might have temporal value. Each such clause is annotated as temporal or non-temporal by testing whether it answers a temporal question with respect to the action of its superordinate clause. Using this corpus, we then train and evaluate personalised classifiers for each ambiguous subordinator, in order to set apart temporal usages. Several classifiers are evaluated, and the best performing ones achieve an average accuracy of 89.23% across the set of ambiguous connectives.

- Natural Language Processing | Pp. 911-921
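
A hedged sketch of the "personalised classifier per subordinator" idea from the abstract above: keep one classifier per ambiguous conjunction and use it to label the clause it introduces as temporal or non-temporal. The sentences, labels, and the bag-of-words Naive Bayes pipeline are toy assumptions, not the paper's features or learners.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

training = {
    "since": [
        ("since she moved here she has been happy", "temporal"),
        ("since you asked, I will explain the plan", "non-temporal"),
        ("since the war ended prices have fallen", "temporal"),
        ("since we lack funding the project stops", "non-temporal"),
    ],
    "while": [
        ("while the kettle boiled he read the paper", "temporal"),
        ("while I agree in principle, details matter", "non-temporal"),
        ("while she slept the storm passed", "temporal"),
        ("while this is cheap, it is not reliable", "non-temporal"),
    ],
}

# One personalised classifier per ambiguous subordinator.
classifiers = {}
for subordinator, examples in training.items():
    texts, labels = zip(*examples)
    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    classifiers[subordinator] = clf.fit(texts, labels)

clause = "since the meeting started nobody has left"
print(classifiers["since"].predict([clause])[0])
```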

Issues in Translating from Natural Language to SQL in a Domain-Independent Natural Language Interface to Databases

B. Juan J. González; Rodolfo A. Pazos Rangel; I. Cristina Cruz C.; H. Héctor J. Fraire; L. de Santos Aguilar; O. Joaquín Pérez

This paper deals with a domain-independent natural language interface to databases (NLIDB) for the Spanish language. This NLIDB had previously been tested on the Northwind and Pubs domains and had attained good performance (an 86% success rate). However, domain independence complicates the task of achieving high translation success, and to this end the ATIS (Air Travel Information System) database, which has been used by several natural language interfaces, was selected to conduct a new evaluation. The purpose of this evaluation was to assess the efficiency of the interface after reconfiguration for another domain and to detect the problems that affect translation success. For the tests a corpus of queries was gathered, and the results obtained showed that the interface can easily be reconfigured and that it attained a 50% success rate. When the problems found concerning query translation were analyzed, wording deficiencies in some user queries and several errors in the synonym dictionary were discovered. After correcting these problems a second test was conducted, in which the interface attained a 61.4% success rate. These experiments showed that user training is necessary, as well as a dialogue system that permits clarifying a query when it is poorly formulated.

- Natural Language Processing | Pp. 922-931

Interlinguas: A Classical Approach for the Semantic Web. A Practical Case

Jesús Cardeñosa; Carolina Gallardo; Luis Iraola

An efficient use of the web will imply the ability to find not only documents but also specific pieces of information according to the user's query. At present, this possibility is not addressed by current information extraction or question answering systems, since it requires both a deeper semantic understanding of queries and contents and deductive capabilities. In this paper, the authors propose the use of Interlinguas as a plausible approach to searching for and extracting specific pieces of information from a document, given the semantic nature of Interlinguas and their support for deduction. More concretely, the authors describe the UNL Interlingua from the representational point of view and illustrate its deductive capabilities by means of an example.

- Information Retrieval and Text Classification | Pp. 932-942

A Fuzzy Embedded GA for Information Retrieving from Related Data Set

Yang Yi; JinFeng Mei; ZhiJiao Xiao

The aim of this work is to provide a formal model and an effective way of retrieving information from a large related data set. Based upon fuzzy logic operations, a fuzzy mathematical model of 0-1 mixture programming is addressed. Meanwhile, a density function indicating the overall status of the effectively mined data is introduced. Then, a soft computing (SC) approach, a genetic algorithm (GA) with embedded fuzzy deduction, is presented. During the SC process, fuzzy logic decisions are used to determine the genes' length, calculate the fitness function, and choose feasible solutions. Simulated experiments and comparison tests show that the method can match the user's most desired information from massive data exactly and efficiently. The approach can be extended to practical applications in solving general web mining problems.

- Information Retrieval and Text Classification | Pp. 943-951
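
A rough, self-contained sketch of the kind of model the abstract above describes: a 0-1 chromosome marks which records to retrieve, and a fitness function mixes fuzzy relevance degrees with a density-style term rewarding compact, highly relevant selections. The relevance values, GA parameters, and the exact fitness formula are invented for illustration and do not reproduce the paper's model.

```python
import random

random.seed(0)

# Fuzzy relevance degree of each record to the user's query (hypothetical).
relevance = [0.9, 0.2, 0.7, 0.1, 0.8, 0.4, 0.6, 0.3]
N = len(relevance)

def fitness(chromosome):
    """Sum of selected memberships, weighted by their average (a density-like term)."""
    selected = [r for r, bit in zip(relevance, chromosome) if bit]
    if not selected:
        return 0.0
    density = sum(selected) / len(selected)   # average membership of the picks
    return sum(selected) * density            # reward relevant, compact subsets

def mutate(chromosome, rate=0.1):
    return [bit ^ (random.random() < rate) for bit in chromosome]

def crossover(a, b):
    cut = random.randrange(1, N)
    return a[:cut] + b[cut:]

# Simple generational GA over 0-1 chromosomes.
population = [[random.randint(0, 1) for _ in range(N)] for _ in range(20)]
for _ in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    population = parents + children

best = max(population, key=fitness)
print("selected records:", [i for i, bit in enumerate(best) if bit])
```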