Publications catalog - books


Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Selected Papers

Carol Peters ; Fredric C. Gey ; Julio Gonzalo ; Henning Müller ; Gareth J. F. Jones ; Michael Kluck ; Bernardo Magnini ; Maarten de Rijke (eds.)

Conference: 6th Workshop of the Cross-Language Evaluation Forum for European Languages (CLEF). Vienna, Austria. September 21, 2005 - September 23, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Information Storage and Retrieval; Artificial Intelligence (incl. Robotics); Information Systems Applications (incl. Internet); Language Translation and Linguistics

Availability
Detected institution | Publication year | Browse | Download | Request
Not detected | 2006 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-45697-1

Electronic ISBN

978-3-540-45700-8

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

What Happened in CLEF 2005

Carol Peters

The organization of the CLEF 2005 evaluation campaign is described and details are provided concerning the tracks, test collections, evaluation infrastructure and participation.

- What Happened in CLEF 2005 | Pp. 1-10

CLEF 2005: Ad Hoc Track Overview

Giorgio M. Di Nunzio; Nicola Ferro; Gareth J. F. Jones; Carol Peters

We describe the objectives and organization of the CLEF 2005 ad hoc track and discuss the main characteristics of the tasks offered to test monolingual, bilingual, and multilingual textual document retrieval. The performance achieved for each task is presented and a statistical analysis of results is given. The mono- and bilingual tasks followed the pattern of previous years but included target collections for two new-to-CLEF languages: Bulgarian and Hungarian. The multilingual tasks concentrated on exploring the reuse of existing test collections from an earlier CLEF campaign. The objectives were to attempt to measure progress in multilingual information retrieval by comparing the results for CLEF 2005 submissions with those of participants in earlier workshops, and also to encourage participants to explore multilingual list merging techniques.

- Part I. Multilingual Textual Document Retrieval (Ad Hoc) | Pp. 11-36

Ad-Hoc Mono- and Bilingual Retrieval Experiments at the University of Hildesheim

René Hackl; Thomas Mandl; Christa Womser-Hacker

This paper reports information retrieval experiments carried out within the CLEF 2005 ad-hoc multi-lingual track. The experiments focus on the two new languages Bulgarian and Hungarian. No relevance assessments are available for these collections yet. Optimization was mainly based on French data from CLEF 2004. Based on experience from last year, one of our main objectives was to improve and refine the n-gram-based indexing and retrieval algorithms within our system.

- Cross-Language and More | Pp. 37-43

MIRACLE at Ad-Hoc CLEF 2005: Merging and Combining Without Using a Single Approach

José M. Goñi-Menoyo; José C. González-Cristóbal; Julio Villena-Román

This paper presents the MIRACLE team's 2005 approach to the Ad-Hoc Information Retrieval tasks. The goal for the experiments this year was twofold: to continue testing the effect of combination approaches on information retrieval tasks, and to improve our basic processing and indexing tools, adapting them to new languages with strange encoding schemes. The starting point was a set of basic components: stemming, transforming, filtering, proper noun extraction, paragraph extraction, and pseudo-relevance feedback. Some of these basic components were used in different combinations and orders of application for document indexing and for query processing. Second-order combinations were also tested, by averaging or selective combination of the documents retrieved by different approaches for a particular query. In the multilingual track, we concentrated our work on the merging process of the results of monolingual runs to get the overall multilingual result, relying on available translations. In both cross-lingual tracks, we have used available translation resources, and in some cases we have used a combination approach.

- Cross-Language and More | Pp. 44-53

The XLDB Group at the CLEF 2005 Ad-Hoc Task

Nuno Cardoso; Leonardo Andrade; Alberto Simões; Mário J. Silva

This paper presents the participation of the XLDB Group in the CLEF 2005 ad-hoc monolingual and bilingual subtasks for Portuguese. We participated with an improved and extended configuration of the tumba! search engine software. We detail the new features and evaluate their performance.

- Cross-Language and More | Pp. 54-60

Thomson Legal and Regulatory Experiments at CLEF-2005

Isabelle Moulinier; Ken Williams

For the 2005 Cross-Language Evaluation Forum, Thomson Legal and Regulatory participated in the Hungarian, French, and Portuguese monolingual search tasks as well as French-to-Portuguese bilingual retrieval. Our Hungarian participation focused on comparing the effectiveness of different approaches toward morphological stemming. Our French and Portuguese monolingual efforts focused on different approaches to Pseudo-Relevance Feedback (PRF), in particular the evaluation of a scheme for selectively applying PRF only in the cases most likely to produce positive results. Our French-to-Portuguese bilingual effort applies our previous work in query translation to a new pair of languages and uses corpus-based language modeling to support term-by-term translation. We compare our approach to an off-the-shelf machine translation system that translates the query as a whole and find the latter approach to be more performant. All experiments were performed using our proprietary search engine. We remain encouraged by the overall success of our efforts, with our main submissions for each of the four tasks performing above the overall CLEF median. However, none of the specific enhancement techniques we attempted in this year’s forum showed significant improvements over our initial result.

- Cross-Language and More | Pp. 61-68

Using the X-IOTA System in Mono- and Bilingual Experiments at CLEF 2005

Loïc Maisonnasse; Gilles Sérasset; Jean-Pierre Chevallet

This document describes the CLIPS experiments in the CLEF 2005 campaign. We used a surface-syntactic parser in order to extract new indexing terms. These terms are considered syntactic dependencies. Our goal was to evaluate their relevance for an information retrieval task. We used them in different forms in different information retrieval models, in particular in a language model. For the bilingual task, we tried two simple tests of Spanish and German to French retrieval; for the translation we used a lemmatizer and a dictionary.

- Cross-Language and More | Pp. 69-78

Bilingual and Multilingual Experiments with the IR-n System

Elisa Noguera; Fernando Llopis; Rafael Muñoz; Rafael M. Terol; Miguel A. García-Cumbreras; Fernando Martínez-Santiago; Arturo Montejo-Raez

Our paper describes the participation of the IR-n system at CLEF-2005. This year, we participated in the bilingual task (English-French and English-Portuguese) and the multilingual task (English, French, Italian, German, Dutch, Finnish and Swedish). We introduced the method of combined passages for the bilingual task. Furthermore, we have applied the method of logic forms in the same task. For the multilingual task we had a joint participation with the University of Alicante and University of Jaén. We want to emphasize the good score achieved in the bilingual task, improving around 45% in terms of average precision.

- Cross-Language and More | Pp. 79-82

Dictionary-Based Amharic-French Information Retrieval

Atelach Alemu Argaw; Lars Asker; Rickard Cöster; Jussi Karlgren; Magnus Sahlgren

We present four approaches to the Amharic-French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French bags of words, but while one approach uses word sense discrimination on the translated side of the queries, the other one includes all senses of a translated word in the query for searching. We used two search engines, the SICS experimental engine and Lucene, hence four runs with the two approaches. Non-content-bearing words were removed both before and after the dictionary lookup. TF/IDF values supplemented by a heuristic function were used to remove the stop words from the Amharic queries, and two French stopword lists were used to remove them from the French translations. In our experiments, we found that the SICS search engine performs better than Lucene and that using the word-sense-discriminated keywords produces a slightly better result than the full set of non-discriminated keywords.

- Cross-Language and More | Pp. 83-92

A Hybrid Approach to Query and Document Translation Using a Pivot Language for Cross-Language Information Retrieval

Kazuaki Kishida; Noriko Kando

This paper reports experimental results for cross-language information retrieval (CLIR) from German to French, in which a hybrid approach to query and document translation was attempted, i.e., combining the results of query translation (German to French) and of document translation (French to German). In order to reduce the complexity of computation when translating a large amount of text, we performed pseudo-translation, i.e., a simple replacement of terms by a bilingual dictionary (for query translation, a machine translation system was used). In particular, since English was used as an intermediary language for both translation directions between German and French, English translations at the middle stage were employed as document representations in order to reduce the number of translation steps. By omitting a translation step (English to German), the performance was improved. Unfortunately, our hybrid approach did not show better performance than a simple query translation. This may be due to the low performance of document translation, which was carried out by a simple replacement of terms using a bilingual dictionary with no term disambiguation.

- Cross-Language and More | Pp. 93-101