Publications catalog - books



Text, Speech and Dialogue: 10th International Conference, TSD 2007, Pilsen, Czech Republic, September 3-7, 2007. Proceedings

Václav Matoušek ; Pavel Mautner (eds.)

In conference: 10th International Conference on Text, Speech and Dialogue (TSD). Pilsen, Czech Republic. September 3, 2007 - September 7, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Language Translation and Linguistics; Artificial Intelligence (incl. Robotics); Data Mining and Knowledge Discovery; Information Storage and Retrieval; Information Systems Applications (incl. Internet)

Availability
Detected institution: Not detected
Publication year: 2007
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-74627-0

Electronic ISBN

978-3-540-74628-7

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Using Query-Relevant Documents Pairs for Cross-Lingual Information Retrieval

David Pinto; Alfons Juan; Paolo Rosso

The World Wide Web is a natural setting for cross-lingual information retrieval. The European Union is a typical example of a multilingual scenario, where multiple users have to deal with information published in at least 20 languages. Given queries in some source language and a target corpus in another language, the typical approach consists of translating either the query or the target dataset into the other language. Other approaches use parallel corpora to obtain a statistical dictionary of words among the different languages. In this work, we propose to use a training corpus made up of a set of Query-Relevant Document Pairs (QRDP) in a probabilistic cross-lingual information retrieval approach which is based on the IBM alignment model 1 for statistical machine translation. Our approach has two main advantages over those that use direct translation and parallel corpora: we will not obtain a translation of the query, but a set of associated words which share their meaning in some way and, therefore, the obtained dictionary is, in a broad sense, more semantic than a translation one. Besides, since the queries are supervised, we are working in a more restricted domain than when using a general parallel corpus (it is well known that in this context results are better than those obtained in a general context). In order to determine the quality of our experiments, we compared the results with those obtained by a direct translation of the queries with a query translation system, observing promising results.

- Dialog | Pp. 630-637
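
The abstract above relies on IBM alignment model 1 trained on query/relevant-document pairs. As a rough illustration only, and not the authors' implementation, the following sketch runs the standard Model 1 EM updates (without a NULL word) on a toy set of pairs. The function names `train_ibm_model1` and `top_associations`, the toy data, and the iteration count are all assumptions made for this example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1, no NULL word).

    pairs: list of (source_tokens, target_tokens). In the QRDP setting the
    source would be a query and the target the tokens of a relevant document.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # keyed by (target_word, source_word)

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-occurrence counts
        total = defaultdict(float)  # expected counts per source word
        # E-step: distribute each target word over the source words of its pair.
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalise so that the probabilities for each source word sum to 1.
        t = defaultdict(float, {(w, s): c / total[s] for (w, s), c in count.items()})
    return t


def top_associations(t, source_word, k=3):
    """Return the k target words most strongly associated with source_word."""
    scored = [(w, p) for (w, s), p in t.items() if s == source_word]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy (hypothetical) training pairs: Spanish "queries" and English "documents".
pairs = [
    (["casa"], ["house"]),
    (["casa", "verde"], ["green", "house"]),
    (["verde"], ["green"]),
]
t = train_ibm_model1(pairs)
print(top_associations(t, "casa"))
```

With real QRDP data the target side would be much longer than the query, so the learned table behaves more like a list of associated words than a translation dictionary, which is the property the abstract highlights.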

Detection of Dialogue Acts Using Perplexity-Based Word Clustering

Iosif Mporas; Dimitrios P. Lyras; Kyriakos N. Sgarbas; Nikos Fakotakis

In the present work we used a word clustering algorithm based on the perplexity criterion in a Dialogue Act detection framework, in order to model the structure of a user's speech in a dialogue system. Specifically, we constructed an n-gram based model for each target Dialogue Act, computed over the word classes. Then we evaluated the performance of our dialogue system on ten different types of dialogue acts, using an annotated database which contains 1,403,985 unique words. The results were very promising since we achieved about 70% accuracy using trigram-based models.

- Dialog | Pp. 638-643
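
The idea of one n-gram model per dialogue act, computed over word classes, can be sketched as follows. This is a simplified illustration under several assumptions: the word-to-class map `WORD_CLASS` is hand-written here rather than learned by perplexity-based clustering, the models are bigrams with add-one smoothing rather than the trigram models reported in the abstract, and the training utterances and act labels are invented.

```python
import math
from collections import Counter

# Hypothetical word-to-class map; the paper derives classes by clustering.
WORD_CLASS = {
    "what": "WH", "where": "WH", "is": "VERB", "are": "VERB", "want": "VERB",
    "the": "DET", "a": "DET", "train": "NOUN", "ticket": "NOUN",
    "please": "POLITE", "i": "PRON",
}
# Symbols that can be predicted: all classes, the unknown class, end of utterance.
VOCAB_SIZE = len(set(WORD_CLASS.values()) | {"UNK", "</s>"})


def to_classes(utterance):
    """Map an utterance to its class sequence with boundary markers."""
    classes = [WORD_CLASS.get(w, "UNK") for w in utterance.lower().split()]
    return ["<s>"] + classes + ["</s>"]


def train_bigram(utterances):
    """Count class bigrams and their left contexts for one dialogue act."""
    bigrams, contexts = Counter(), Counter()
    for utterance in utterances:
        seq = to_classes(utterance)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def log_prob(model, utterance):
    """Add-one smoothed log-probability of the utterance under one act model."""
    bigrams, contexts = model
    seq = to_classes(utterance)
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + VOCAB_SIZE))
        for prev, cur in zip(seq, seq[1:])
    )


def detect_act(models, utterance):
    """Pick the dialogue act whose model scores the utterance highest."""
    return max(models, key=lambda act: log_prob(models[act], utterance))


# Invented training data, one list of utterances per dialogue act.
TRAIN = {
    "question": ["where is the train", "what is a ticket"],
    "request": ["i want a ticket please", "i want the train please"],
}
models = {act: train_bigram(utts) for act, utts in TRAIN.items()}
print(detect_act(models, "where are the ticket"))
```

The sketch compares likelihoods only; a prior over dialogue acts could be added to the score if act frequencies are known.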

Dialogue Management for Intelligent TV Based on Statistical Learning Method

Hyo-Jung Oh; Chung-Hee Lee; Yi-Gyu Hwang; Myung-Gil Jang

In this paper, we introduce a practical spoken dialogue interface for intelligent TV based on goal-oriented dialogue modeling. It uses a frame structure for representing the user intention and determining the next action. To analyze discourse context, we employ several statistical learning techniques and devise an incremental dialogue strategy learning method from a training corpus. By empirical experiments, we demonstrated the efficiency of the proposed system. In the subjective evaluation, we obtained a 73% user satisfaction ratio, while the objective evaluation result was over 90% in a restricted situation for commercialization.

- Dialog | Pp. 644-652

Multiple-Taxonomy Question Classification for Category Search on Faceted Information

David Tomás; José L. Vicedo

In this paper we present a novel multiple-taxonomy question classification system, facing the challenge of assigning categories in multiple taxonomies to natural language questions. We applied our system to category search on faceted information. The system provides a natural language interface to faceted information, detecting the categories requested by the user and narrowing down the document search space to those documents pertaining to the facet values identified. The system was developed in the framework of language modeling, and the models to detect categories are inferred directly from the corpus of documents.

- Dialog | Pp. 653-660

Indexing and Retrieval Scheme for Content-Based Multimedia Applications

Martynov Dmitry; Eugenij Bovbel

In the past, Maximum Entropy based language models were constrained by training data n-gram counts, topic estimates, and triggers. We will investigate the obtainable gains from imposing additional constraints related to linguistic clusters, such as parts of speech, semantic/syntactic word clusters, and semantic labels. It will be shown that substantial gains are available, provided the estimates use Gaussian a priori statistics.

- Dialog | Pp. 661-661