Publications catalog - books
Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, Revised Selected Papers
Carol Peters ; Fredric C. Gey ; Julio Gonzalo ; Henning Müller ; Gareth J. F. Jones ; Michael Kluck ; Bernardo Magnini ; Maarten de Rijke (eds.)
Conference: 6th Workshop of the Cross-Language Evaluation Forum for European Languages (CLEF). Vienna, Austria. September 21, 2005 - September 23, 2005
Abstract/Description - provided by the publisher
Not available.
Keywords - provided by the publisher
Information Storage and Retrieval; Artificial Intelligence (incl. Robotics); Information Systems Applications (incl. Internet); Language Translation and Linguistics
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2006 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-45697-1
Electronic ISBN
978-3-540-45700-8
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Copyright information
© Springer-Verlag Berlin Heidelberg 2006
Table of contents
doi: 10.1007/11878773_21
ENSM-SE at CLEF 2005: Using a Fuzzy Proximity Matching Function
Annabelle Mercier; Amélie Imafouo; Michel Beigbeder
Starting from the idea that the closer the query terms in a document are to each other, the more relevant the document, we propose an information retrieval method that uses the degree of fuzzy proximity of key terms in a document to compute the relevance of the document to the query. Our model handles Boolean queries but, contrary to the traditional extensions of the basic Boolean information retrieval model, does not use a proximity operator explicitly. A single parameter makes it possible to control the proximity degree required. We explain how we construct the queries and report the results of our experiments in the ad-hoc monolingual French task of the CLEF 2005 evaluation campaign.
- Monolingual Experiments | Pp. 187-193
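The abstract above does not give the matching function itself, so the following is only a minimal sketch of one way a fuzzy proximity score could work. It assumes a triangular influence function around each term occurrence, `min` for a Boolean AND, and a sum over document positions; the names `term_influence`, `and_score`, and the parameter `k` (the single proximity parameter) are illustrative, not taken from the paper.

```python
from typing import Dict, List


def term_influence(positions: List[int], x: int, k: int) -> float:
    """Fuzzy proximity of position x to the nearest occurrence of a term.

    Assumed shape: 1.0 at an occurrence, decreasing linearly to 0.0 at
    distance k (a triangular function); the paper may use another shape.
    """
    if not positions:
        return 0.0
    return max(max(0.0, (k - abs(x - p)) / k) for p in positions)


def and_score(index: Dict[str, List[int]], terms: List[str],
              doc_len: int, k: int) -> float:
    """Score a conjunctive query: sum over positions of the minimum influence."""
    total = 0.0
    for x in range(doc_len):
        total += min(term_influence(index.get(t, []), x, k) for t in terms)
    return total


# Two toy documents of length 4: terms adjacent vs. far apart.
near = {"a": [0], "b": [1]}
far = {"a": [0], "b": [3]}
near_score = and_score(near, ["a", "b"], doc_len=4, k=2)
far_score = and_score(far, ["a", "b"], doc_len=4, k=2)
```

With `k=2` the adjacent terms score 1.0 and the distant terms score 0.0; raising `k` to 4 lets the distant pair score as well, which is how a single parameter can relax the required proximity in this sketch.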
doi: 10.1007/11878773_22
Bulgarian and Hungarian Experiments with Hummingbird SearchServer at CLEF 2005
Stephen Tomlinson
Hummingbird participated in the Bulgarian and Hungarian monolingual information retrieval tasks of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2005. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant documents (with high precision) in a particular document set. We conducted diagnostic experiments with different techniques for matching word variations and handling stopwords. We found that the experimental stemmers significantly increased mean average precision for both languages. Analysis of individual topics found that the algorithmic Bulgarian and Hungarian stemmers encountered some unanticipated stopword collisions. A comparison to an experimental 4-gram technique suggested that Hungarian stemming would further benefit from decompounding.
- Monolingual Experiments | Pp. 194-203
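To illustrate why a character 4-gram technique can hint at the value of decompounding, here is a small sketch of overlapping n-gram extraction. It is not Hummingbird SearchServer's implementation, which the abstract does not describe; the function name `char_ngrams` and the Hungarian example words are assumptions chosen for illustration.

```python
from typing import List


def char_ngrams(word: str, n: int = 4) -> List[str]:
    """Return the overlapping character n-grams of a word.

    Words no longer than n are returned whole so they are still indexed.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


# A compound ("vasútállomás", railway station) and one of its parts
# ("állomás", station) share every 4-gram of the part.
compound = set(char_ngrams("vasútállomás"))
part = set(char_ngrams("állomás"))
shared = compound & part
```

Because the 4-grams of the component all occur inside the compound, an n-gram index can match the two words even when a stemmer treats them as unrelated.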
doi: 10.1007/11878773_23
Combining Passages in the Monolingual Task with the IR-n System
Fernando Llopis; Elisa Noguera
The paper describes our participation in the monolingual tasks at CLEF 2005. We submitted results for the following languages: French, Portuguese, Bulgarian and Hungarian, using a passage retrieval system. We focused on a version of this system that combines passages of different sizes to improve retrieval performance. After an analysis of our experiments and of the official results at CLEF, we find that our passage retrieval combination model achieves considerably improved scores.
- Monolingual Experiments | Pp. 204-207
doi: 10.1007/11878773_24
Weighting Query Terms Based on Distributional Statistics
Jussi Karlgren; Magnus Sahlgren; Rickard Cöster
This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial due to dependencies between compound elements. This year, we have investigated topical dependencies between query terms: if a query term happens to be non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality, its weight should be boosted. The two experiments described here are based on the analysis of the distributional character of query terms: one using similarity of occurrence context between query terms globally across the entire collection; the other using the likelihood of individual terms to appear topically in individual texts. The two boosting schemes are complementary, and both delivered improved results.
- Monolingual Experiments | Pp. 208-211
doi: 10.1007/11878773_25
Domain-Specific Track CLEF 2005: Overview of Results and Approaches, Remarks on the Assessment Analysis
Michael Kluck; Maximilian Stempfhuber
The challenge of the CLEF domain-specific track is to map user queries in one language to documents in different languages, adapting the systems used to the vocabulary and wording of the social science domain. In addition to a general overview of this track and its tasks, some details on the approaches of the participating groups and their results are reported. One of the outcomes is the considerable improvement in results if the retrieval systems make use of the thesauri provided or the intellectually assigned descriptors. Other findings for IR in a domain-specific context are also given. Finally, considerations on the topic creation and assessment processes are made on the basis of empirical data mainly from the GIRT corpus.
- Part II. Domain-Specific Information Retrieval (Domain-Specific) | Pp. 212-221
doi: 10.1007/11878773_26
A Baseline for NLP in Domain-Specific IR
Johannes Leveling
The information retrieval (IR) methods employed for the third participation of the University of Hagen in the domain-specific task of the Cross Language Evaluation Campaign (CLEF 2005) provide a better baseline for experiments with natural language processing (NLP) methods in domain-specific IR than the methods employed in our previous participations. The baseline consists of a combination of state-of-the-art IR methods with NLP methods for document and query processing.
Our monolingual experiments with German documents combine several methods to achieve better performance, including an entry vocabulary module (EVM), query expansion with semantically related concepts, and a blind feedback technique. The monolingual experiments focus on comparing two techniques for constructing database queries: creating a and creating a semantic network by means of deep linguistic analysis of the query.
For the bilingual experiments, the English topics are translated into German queries with several machine translation (MT) services publicly available. Each set of translated topics is processed separately with the same techniques as in the monolingual experiments. Evaluation results for official experiments with a staged logistic regression and additional experiments with BM25 are presented.
- Part II. Domain-Specific Information Retrieval (Domain-Specific) | Pp. 222-225
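The abstract above mentions additional experiments with BM25 without stating the variant or parameters used. As a rough illustration only, the sketch below implements a common textbook form of BM25 over tokenized documents; the defaults `k1=1.2` and `b=0.75`, the non-negative IDF variant, and the function name `bm25_scores` are assumptions, not the settings of the Hagen system.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Return one BM25 score per document for a bag-of-words query."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            # Assumed IDF variant with +1 inside the log, so it stays positive.
            idf = math.log((n_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [["social", "science", "survey"], ["science", "policy"], ["weather", "report"]]
scores = bm25_scores(["social", "science"], docs)
```

In this toy collection the first document matches both query terms and ranks highest, the second matches one, and the third scores zero.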
doi: 10.1007/11878773_27
Domain-Specific CLIR of English, German and Russian Using Fusion and Subject Metadata for Query Expansion
Vivien Petras; Fredric Gey; Ray R. Larson
This paper describes the combined submissions of the Berkeley group for the domain-specific track at CLEF 2005. The data fusion technique being tested is the fusion of multiple probabilistic searches against different XML components using both Logistic Regression (LR) algorithms and a version of the Okapi BM-25 algorithm. We also combine multiple translations of queries in cross-language searching. The second technique analyzed is query enhancement with domain-specific metadata (thesaurus terms). We describe our technique of Entry Vocabulary Modules, which associates query words with thesaurus terms, and suggest its use for monolingual as well as bilingual retrieval. Different weighting and merging schemes for adding keywords to queries, as well as translation techniques, are described.
- Part II. Domain-Specific Information Retrieval (Domain-Specific) | Pp. 226-237
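The abstract above does not specify how the result lists are merged. As one possible reading, the sketch below shows CombSUM-style fusion: each run's scores are min-max normalized so that logistic-regression probabilities and BM25 scores become comparable, then summed per document. The technique choice, the function names `min_max` and `comb_sum`, and the sample scores are all assumptions for illustration.

```python
from typing import Dict, List


def min_max(run: Dict[str, float]) -> Dict[str, float]:
    """Rescale one run's scores to the range [0, 1]."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (s - lo) / (hi - lo) for doc, s in run.items()}


def comb_sum(runs: List[Dict[str, float]]) -> List[str]:
    """Fuse runs by summing normalized scores; return doc ids, best first."""
    fused: Dict[str, float] = {}
    for run in runs:
        for doc, s in min_max(run).items():
            fused[doc] = fused.get(doc, 0.0) + s
    # Ties are broken by document id to keep the ordering deterministic.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical scores from two searches over the same collection.
lr_run = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
bm25_run = {"d2": 12.0, "d3": 7.0, "d4": 2.0}
ranking = comb_sum([lr_run, bm25_run])
```

Here `d2` moves to the top because both runs retrieve it, which is the usual motivation for fusing complementary searches.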
doi: 10.1007/11878773_28
Evaluating a Conceptual Indexing Method by Utilizing WordNet
Mustapha Baziz; Mohand Boughanem; Nathalie Aussenac-Gilles
This paper describes our participation in the English GIRT task of the CLEF 2005 campaign. A method for conceptual indexing based on WordNet is used. Both documents and queries are mapped onto WordNet. Identified concepts belonging to WordNet synsets are extracted from documents and queries, and those having a single sense are expanded. All runs are carried out using a conceptual indexing approach. Results show that queries built from the title field of the topics perform best, and that stemming yields a slight gain over the non-stemmed runs.
H.3.3: Information Search and Retrieval; H.3.1 - Algorithms, Experimentation.
- Part II. Domain-Specific Information Retrieval (Domain-Specific) | Pp. 238-246
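The rule that only single-sense concepts are expanded can be sketched as follows. To stay self-contained, the example uses a tiny hand-made lexicon instead of WordNet; `TOY_LEXICON`, its entries, and the function `expand_monosemous` are hypothetical and do not reproduce the authors' pipeline.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon: term -> list of senses, each a set of synonyms.
TOY_LEXICON: Dict[str, List[Set[str]]] = {
    "unemployment": [{"unemployment", "joblessness"}],
    "bank": [{"bank", "depository"}, {"bank", "riverside"}],
}


def expand_monosemous(terms: List[str],
                      lexicon: Dict[str, List[Set[str]]]) -> List[str]:
    """Add synonyms only for terms that have exactly one sense."""
    expanded: List[str] = []
    for term in terms:
        expanded.append(term)
        senses = lexicon.get(term, [])
        if len(senses) == 1:
            # Unambiguous term: its synonyms are safe to add.
            expanded.extend(sorted(senses[0] - {term}))
    return expanded


query = expand_monosemous(["bank", "unemployment"], TOY_LEXICON)
```

The ambiguous term "bank" is left untouched, while "unemployment" gains its single-sense synonym, which avoids expansion with the wrong sense.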
doi: 10.1007/11878773_29
Domain Specific Mono- and Bilingual English to German Retrieval Experiments with a Social Science Document Corpus
René Hackl; Thomas Mandl
This paper reports experiments in CLEF 2005’s domain-specific retrieval track carried out at the University of Hildesheim. The experiments were based on previous experience with the GIRT document corpus and were run in parallel to the multilingual experiments for CLEF 2005. We optimized the parameters of the system with one corpus from 2004 and applied these settings to the domain-specific task. In that manner, the robustness of our approach across different document collections was assessed.
- Part II. Domain-Specific Information Retrieval (Domain-Specific) | Pp. 247-250
doi: 10.1007/11878773_30
Overview of the CLEF 2005 Interactive Track
Julio Gonzalo; Paul Clough; Alessandro Vallin
The CLEF Interactive Track (iCLEF) is devoted to the comparative study of user-inclusive cross-language search strategies. In 2005, we studied two cross-language search tasks: retrieval of answers and retrieval of annotated images. In both tasks, no further translation or post-processing is needed after performing the tasks to fulfill the information need.
In the interactive Question Answering task, users are asked to find the answer to a number of questions in a foreign-language document collection, and write the answers in their own native language. In the interactive image retrieval task, a picture is shown to the user, and then the user is asked to find the picture in the collection.
This paper summarizes the task design, experimental methodology, and the results obtained by the research groups participating in the track.
- Part III. Interactive Cross-Language Information Retrieval (iCLEF) | Pp. 251-262