Publications catalog - books
Text, Speech and Dialogue: 10th International Conference, TSD 2007, Pilsen, Czech Republic, September 3-7, 2007. Proceedings
Václav Matoušek ; Pavel Mautner (eds.)
Conference: 10th International Conference on Text, Speech and Dialogue (TSD). Pilsen, Czech Republic. September 3-7, 2007
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Language Translation and Linguistics; Artificial Intelligence (incl. Robotics); Data Mining and Knowledge Discovery; Information Storage and Retrieval; Information Systems Applications (incl. Internet)
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-74627-0
Electronic ISBN
978-3-540-74628-7
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Copyright information
© Springer-Verlag Berlin Heidelberg 2007
Table of contents
Using Query-Relevant Documents Pairs for Cross-Lingual Information Retrieval
David Pinto; Alfons Juan; Paolo Rosso
The World Wide Web is a natural setting for cross-lingual information retrieval. The European Union is a typical example of a multilingual scenario, where multiple users have to deal with information published in at least 20 languages. Given queries in a source language and a target corpus in another language, the typical approach is to translate either the query or the target dataset into the other language. Other approaches use parallel corpora to obtain a statistical dictionary of words across the different languages. In this work, we propose to use a training corpus made up of a set of Query-Relevant Document Pairs (QRDP) in a probabilistic cross-lingual information retrieval approach based on IBM alignment Model 1 for statistical machine translation. Our approach has two main advantages over those that use direct translation and parallel corpora. First, we do not obtain a translation of the query but a set of associated words that share their meaning in some way; the resulting dictionary is therefore, in a broad sense, more semantic than a translation dictionary. Second, since the queries are supervised, we work in a more restricted domain than when using a general parallel corpus (it is well known that results in a restricted domain are better than those obtained in a general context). To assess the quality of our approach, we compared its results with those obtained by directly translating the queries with a query translation system, and observed promising results.
- Dialog | Pp. 630-637
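The abstract above names IBM alignment Model 1 trained on query-relevant document pairs but gives no implementation details. The following is a minimal textbook sketch of Model 1 expectation-maximization, not the authors' code. It assumes each pair is a tokenised query and a tokenised relevant document, omits the NULL word, and uses a hypothetical function name (`train_ibm_model1`) and hypothetical toy data.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(doc_word | query_word) with EM, as in textbook IBM Model 1.

    pairs: list of (query_tokens, document_tokens). Hypothetical input format.
    Returns a dict keyed by (doc_word, query_word).
    """
    doc_vocab = {w for _, doc in pairs for w in doc}
    # Uniform initialisation over the document-side vocabulary.
    t = defaultdict(lambda: 1.0 / len(doc_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for query, doc in pairs:
            for d in doc:
                # E-step: share each document word among the query words.
                norm = sum(t[(d, q)] for q in query)
                for q in query:
                    frac = t[(d, q)] / norm
                    count[(d, q)] += frac
                    total[q] += frac
        # M-step: renormalise the expected counts per query word.
        for (d, q), c in count.items():
            t[(d, q)] = c / total[q]
    return t


# Hypothetical toy pairs: Spanish queries with English "relevant documents".
toy_pairs = [
    (["casa", "verde"], ["green", "house"]),
    (["casa"], ["house"]),
]
table = train_ibm_model1(toy_pairs)
```

On such data the table associates each query word with the document words it tends to co-occur with, which is the sense in which the abstract calls the resulting dictionary "more semantic" than a translation dictionary.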
Detection of Dialogue Acts Using Perplexity-Based Word Clustering
Iosif Mporas; Dimitrios P. Lyras; Kyriakos N. Sgarbas; Nikos Fakotakis
In the present work we used a word clustering algorithm based on the perplexity criterion in a Dialogue Act detection framework, in order to model the structure of the speech of a user of a dialogue system. Specifically, we constructed an n-gram based model for each target Dialogue Act, computed over the word classes. Then we evaluated the performance of our dialogue system on ten different types of dialogue acts, using an annotated database which contains 1,403,985 unique words. The results were very promising, since we achieved about 70% accuracy using trigram-based models.
- Dialog | Pp. 638-643
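The idea of one class-based n-gram model per Dialogue Act can be sketched as follows. This is an illustration under stated assumptions, not the system from the abstract. The perplexity-based clustering step is replaced by a fixed, hypothetical word-to-class map, the models use add-one smoothing, and all names (`ClassNgramModel`, `detect_dialogue_act`) are invented for the example.

```python
import math
from collections import Counter


def to_classes(tokens, word2class):
    """Map words to class labels; unknown words fall into a single UNK class."""
    return [word2class.get(w, "UNK") for w in tokens]


def ngrams(seq, n):
    """Return the n-grams of seq with start padding and an end marker."""
    padded = ["<s>"] * (n - 1) + list(seq) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class ClassNgramModel:
    """Add-one smoothed n-gram model over word classes for one dialogue act."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = {"</s>"}

    def train(self, class_sequences):
        for seq in class_sequences:
            self.vocab.update(seq)
            for g in ngrams(seq, self.n):
                self.ngram_counts[g] += 1
                self.context_counts[g[:-1]] += 1

    def log_prob(self, seq):
        v = len(self.vocab) + 1  # one extra slot for unseen classes
        return sum(
            math.log((self.ngram_counts[g] + 1) / (self.context_counts[g[:-1]] + v))
            for g in ngrams(seq, self.n)
        )


def detect_dialogue_act(tokens, models, word2class):
    """Pick the dialogue act whose model gives the utterance the highest score."""
    seq = to_classes(tokens, word2class)
    return max(models, key=lambda act: models[act].log_prob(seq))
```

A usage pattern would be to train one `ClassNgramModel` per annotated act on class sequences from that act's utterances and then call `detect_dialogue_act` on each new utterance.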
Dialogue Management for Intelligent TV Based on Statistical Learning Method
Hyo-Jung Oh; Chung-Hee Lee; Yi-Gyu Hwang; Myung-Gil Jang
In this paper, we introduce a practical spoken dialogue interface for intelligent TV based on goal-oriented dialogue modeling. It uses a frame structure for representing the user intention and determining the next action. To analyze discourse context, we employ several statistical learning techniques and devise an incremental dialogue strategy learning method from a training corpus. Through empirical experiments, we demonstrated the efficiency of the proposed system. In the subjective evaluation, we obtained a 73% user satisfaction ratio, while the objective evaluation result was over 90% in a restricted situation for commercialization.
- Dialog | Pp. 644-652
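A frame that represents the user intention and drives the next action could look roughly like the sketch below. The abstract does not describe the frame layout, and the system it reports learns its dialogue strategy statistically; here a hand-written rule (ask for the first missing slot) stands in for that learned strategy. The class name, slot names, and action labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical goal frame: an intent plus the slots it needs filled."""
    intent: str
    required: tuple
    slots: dict = field(default_factory=dict)

    def update(self, **values):
        # Merge slot values extracted from the latest user utterance.
        self.slots.update({k: v for k, v in values.items() if v is not None})

    def next_action(self):
        # Rule-based stand-in for a learned strategy:
        # ask for the first missing slot, otherwise carry out the intent.
        for name in self.required:
            if name not in self.slots:
                return ("ask", name)
        return ("execute", self.intent)


frame = Frame(intent="record_program", required=("channel", "time"))
frame.update(channel="7")
action = frame.next_action()  # the "time" slot is still missing
```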
Multiple-Taxonomy Question Classification for Category Search on Faceted Information
David Tomás; José L. Vicedo
In this paper we present a novel multiple-taxonomy question classification system, which addresses the challenge of assigning categories in multiple taxonomies to natural language questions. We applied our system to category search on faceted information. The system provides a natural language interface to faceted information, detecting the categories requested by the user and narrowing down the document search space to those documents pertaining to the facet values identified. The system was developed in the framework of language modeling, and the models to detect categories are inferred directly from the corpus of documents.
- Dialog | Pp. 653-660
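One simple way to read "models inferred directly from the corpus of documents" is a unigram language model per category, built from the documents tagged with that category, with the question assigned to the best-scoring category in each taxonomy. The sketch below follows that reading. The add-one smoothing, the function names, and the toy taxonomies are assumptions for illustration, not details from the paper.

```python
import math
from collections import Counter


def train_category_models(docs_by_category):
    """Build one unigram count table per category from its tokenised documents."""
    return {
        cat: Counter(w for doc in docs for w in doc)
        for cat, docs in docs_by_category.items()
    }


def score(question, counts, vocab_size):
    """Add-one smoothed unigram log-likelihood of the question under a category."""
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in question)


def classify_question(question, taxonomies):
    """Return the best-scoring category in each taxonomy for a tokenised question."""
    result = {}
    for name, docs_by_category in taxonomies.items():
        models = train_category_models(docs_by_category)
        vocab = {w for counts in models.values() for w in counts}
        v = len(vocab) + 1  # one extra slot for unseen words
        result[name] = max(models, key=lambda c: score(question, models[c], v))
    return result


# Hypothetical faceted corpus: two taxonomies, each with tagged documents.
toy_taxonomies = {
    "cuisine": {
        "italian": [["pasta", "pizza", "risotto"]],
        "japanese": [["sushi", "ramen", "tempura"]],
    },
    "city": {
        "rome": [["rome", "colosseum", "pasta"]],
        "tokyo": [["tokyo", "shibuya", "sushi"]],
    },
}
facets = classify_question("where can i eat sushi in tokyo".split(), toy_taxonomies)
```

The returned facet values can then be used to restrict the document search space to documents tagged with those values.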
Indexing and Retrieval Scheme for Content-Based Multimedia Applications
Martynov Dmitry; Eugenij Bovbel
In the past, Maximum Entropy based language models were constrained by training data n-gram counts, topic estimates, and triggers. We will investigate the obtainable gains from imposing additional constraints related to linguistic clusters, such as parts of speech, semantic/syntactic word clusters, and semantic labels. It will be shown that substantial gains are available, provided the estimates use Gaussian a priori statistics.
- Dialog | Pp. 661-661
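For reference, a standard formulation of a Maximum Entropy language model with a Gaussian prior is given below. It assumes that "Gaussian a priori statistics" refers to a zero-mean Gaussian prior on the feature weights, which the abstract does not state explicitly. The symbols (history h, word w, features f_i, weights λ_i, variances σ_i²) are the usual textbook notation, not taken from the paper.

```latex
% Conditional maximum entropy model over words w given history h
p_\lambda(w \mid h) =
  \frac{\exp\bigl(\sum_i \lambda_i f_i(h, w)\bigr)}
       {\sum_{w'} \exp\bigl(\sum_i \lambda_i f_i(h, w')\bigr)}

% MAP estimate under a zero-mean Gaussian prior on each weight
\hat{\lambda} = \arg\max_{\lambda}
  \sum_{(h, w) \in \text{train}} \log p_\lambda(w \mid h)
  \;-\; \sum_i \frac{\lambda_i^2}{2\sigma_i^2}
```

Under this reading, the additional constraints mentioned in the abstract (parts of speech, word clusters, semantic labels) would enter as extra feature functions f_i, and the prior term keeps their weights from overfitting sparse counts.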