Publications catalogue - books



Open Access title

Governance for Drought Resilience: Land and Water Drought Management in Europe

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Climate Change Management and Policy; Water Policy/Water Governance/Water Management

Availability

Detected institution | Publication year | Access
Not required         | 2013             | Directory of Open Access Books (open access)
Not required         | 2013             | SpringerLink (open access)

Information

Resource type:

books

Print ISBN

978-3-642-30909-0

Electronic ISBN

978-3-642-30910-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Table of contents

Dutch Parallel Corpus: A Balanced Parallel Corpus for Dutch-English and Dutch-French

Hans Paulussen; Lieve Macken; Willy Vandeweghe; Piet Desmet

Parallel corpora are a valuable resource for researchers across a wide range of disciplines, e.g. machine translation, computer-assisted translation, terminology extraction, computer-assisted language learning, contrastive linguistics and translation studies. Since the development of a high-quality parallel corpus is a time-consuming and costly process, the DPC project aimed to create a multifunctional resource that satisfies the needs of this diverse group of disciplines.

Part II - HLT Resource-Project Related Papers | Pp. 185-199

Identification and Lexical Representation of Multiword Expressions

Jan Odijk

The central problems that this paper addresses are (i) the lack of large and rich formalised lexicons of multi-word expressions for use in Natural Language Processing (NLP); (ii) the lack of proper methods and tools to extend the lexicon of an NLP system with multi-word expressions, given a text corpus, in a maximally automated manner. The paper describes innovative methods and tools for the automatic identification and lexical representation of multi-word expressions. In addition, it describes a 5,000-entry corpus-based multi-word expression lexical database for Dutch developed using these methods. The database has been externally validated, and its usability has been evaluated in NLP systems for Dutch. The MWE database thus developed fills a gap in existing lexical resources for Dutch. The generic methods and tools for MWE identification and lexical representation focus on Dutch, but they are largely language-independent and can also be used for other languages, new domains, and beyond this project. The research results and data described in this paper contribute directly to strengthening the digital infrastructure for Dutch.
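
The chapter itself gives no implementation details; purely as an illustration of corpus-based MWE identification of the kind it describes, the sketch below ranks bigram candidates by pointwise mutual information. The corpus file name, frequency cut-off and PMI threshold are hypothetical choices, not values from the project.

# Minimal sketch of corpus-based MWE candidate identification via
# pointwise mutual information (PMI) over bigrams. Corpus file,
# threshold and frequency cut-off are illustrative assumptions.
import math
from collections import Counter
from itertools import tee

def bigrams(tokens):
    # Yield consecutive token pairs.
    a, b = tee(tokens)
    next(b, None)
    return zip(a, b)

def mwe_candidates(tokens, min_freq=5, pmi_threshold=3.0):
    # PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) )
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(bigrams(tokens))
    n = len(tokens)
    candidates = []
    for (w1, w2), freq in bigram_counts.items():
        if freq < min_freq:
            continue
        pmi = math.log2((freq / n) /
                        ((unigram_counts[w1] / n) * (unigram_counts[w2] / n)))
        if pmi >= pmi_threshold:
            candidates.append((w1, w2, freq, pmi))
    return sorted(candidates, key=lambda c: -c[3])

if __name__ == "__main__":
    with open("corpus.txt", encoding="utf-8") as fh:  # hypothetical corpus file
        tokens = fh.read().lower().split()
    for w1, w2, freq, pmi in mwe_candidates(tokens)[:20]:
        print(f"{w1} {w2}\tfreq={freq}\tPMI={pmi:.2f}")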

Part II - HLT Resource-Project Related Papers | Pp. 201-217

The Construction of a 500-Million-Word Reference Corpus of Contemporary Written Dutch

Nelleke Oostdijk; Martin Reynaert; Véronique Hoste; Ineke Schuurman

The construction of a large and richly annotated corpus of written Dutch was identified as one of the priorities of the STEVIN programme. Such a corpus, sampling texts from conventional and new media, is invaluable for scientific research and application development. The present chapter describes how the Dutch reference corpus was developed in two consecutive STEVIN-funded projects, viz. D-Coi and SoNaR. The construction of the corpus has been guided by (inter)national standards and best practices. At the same time, the achievements and experiences gained in the D-Coi and SoNaR projects contributed to the further advancement and dissemination of those standards and practices.

Part II - HLT Resource-Project Related Papers | Pp. 219-247

Lexical Modeling for Proper Name Recognition in Autonomata Too

Bert Réveil; Jean-Pierre Martens; Henk van den Heuvel; Gerrit Bloothooft; Marijn Schraagen

The research in Autonomata Too aimed at the development of new pronunciation modeling techniques that can bring the speech recognition component of a Dutch/Flemish POI (Points of Interest) information-providing business service to the required level of accuracy. The automatic recognition of spoken POI names is extremely difficult because multiple pronunciations are frequently used for the same POI and because important cross-lingual effects have to be accounted for. In fact, the ASR (Automatic Speech Recognition) engine must be able to cope with pronunciations of (partly) foreign POI names spoken by native speakers and pronunciations of native POI names uttered by non-native speakers. In order to deal adequately with such pronunciations, one must model them at the level of the acoustic models as well as at the level of the recognition lexicon. This paper describes a novel lexical modeling approach that was developed and tested in the Autonomata Too project. The new method employs a G2P-P2P (grapheme-to-phoneme, phoneme-to-phoneme) tandem to generate suitable lexical pronunciation variants. It was shown to yield a significant improvement over a baseline system already embedding state-of-the-art acoustic and lexical models.
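
As a rough illustration of the G2P-P2P tandem idea (and nothing more: the mappings below are toy assumptions, not the Autonomata Too models), a canonical transcription is first produced by grapheme-to-phoneme conversion and then expanded into pronunciation variants by phoneme-to-phoneme rewrite rules.

# Illustrative sketch of a G2P-P2P tandem for generating lexical
# pronunciation variants. The grapheme-to-phoneme mapping and the
# phoneme-to-phoneme rewrite rules are toy assumptions.

TOY_G2P = {"aa": "a:", "a": "A", "n": "n", "d": "d", "m": "m", "e": "@"}

# Each P2P rule rewrites a canonical phoneme into possible variants,
# modelling e.g. non-native or regional realisations.
TOY_P2P_RULES = {
    "a:": ["a:", "A"],   # vowel shortening
    "@": ["@", "E"],     # schwa realised as a full vowel
}

def g2p(word):
    """Greedy longest-match grapheme-to-phoneme conversion (toy)."""
    phones, i = [], 0
    while i < len(word):
        for length in (2, 1):
            chunk = word[i:i + length]
            if chunk in TOY_G2P:
                phones.append(TOY_G2P[chunk])
                i += length
                break
        else:
            i += 1  # skip graphemes not covered by the toy mapping
    return phones

def p2p_variants(phones):
    """Expand a canonical transcription into pronunciation variants."""
    variants = [[]]
    for p in phones:
        options = TOY_P2P_RULES.get(p, [p])
        variants = [v + [o] for v in variants for o in options]
    return [" ".join(v) for v in variants]

if __name__ == "__main__":
    word = "edam"  # hypothetical POI name
    canonical = g2p(word)
    for variant in p2p_variants(canonical):
        print(f"{word}\t{variant}")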

Part III - HLT-Technology Related Papers | Pp. 251-270

N-Best 2008: A Benchmark Evaluation for Large Vocabulary Speech Recognition in Dutch

David A. van Leeuwen

In 2008 an evaluation of large vocabulary continuous speech recognition systems for the Dutch language was conducted. The tasks consisted of transcription of Broadcast News and Conversational Telephone Speech in the Northern and Southern regional language variants (Dutch and Flemish). The evaluation was modeled after the well-known ARPA/NIST evaluations and the French Technolangue Evalda campaigns. This paper reviews the tasks and evaluation methodology used, presents the official results and discusses some additional analyses. Acoustic and textual training material was specified and provided in a primary evaluation condition. Seven academic sites from four European countries submitted results to this evaluation in four primary transcription tasks. The best result reported is a word error rate of 15.9% for Southern Dutch Broadcast News. Text normalisation, vocabulary and pronunciation modeling are common among the important system development efforts.
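
For reference, the word error rate quoted above is the minimum number of word substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. A minimal, generic implementation (not the evaluation's own scoring tool) looks like this:

# Minimal word error rate (WER) computation: minimum edit distance over
# words divided by the number of reference words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

if __name__ == "__main__":
    ref = "het nieuws van vandaag uit brussel"
    hyp = "het nieuws vandaag uit brussels"
    # 1 deletion + 1 substitution over 6 reference words -> 33.3%
    print(f"WER = {wer(ref, hyp):.1%}")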

Part III - HLT-Technology Related Papers | Pp. 271-288

Missing Data Solutions for Robust Speech Recognition

Yujun Wang; Jort F. Gemmeke; Kris Demuynck; Hugo Van hamme

Current automatic speech recognisers rely to a great extent on statistical models learned from training data. When they are deployed in conditions that differ from those observed in the training data, the generative models are unable to explain the incoming data and poor accuracy results. A very noticeable effect is deterioration due to background noise. In the MIDAS project, the state of the art in noise robustness was advanced on two fronts, both making use of the missing data approach. First, novel sparse exemplar-based representations of speech were proposed. Compressed sensing techniques were used to impute noise-corrupted data from exemplars. Second, a missing data approach was adopted in the context of a large vocabulary speech recogniser, resulting in increased robustness at high noise levels without compromising on accuracy at low noise levels. The performance of the missing data recogniser was compared with that of the Nuance VOCON-3200 recogniser in a variety of noise conditions observed in field data.
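
A minimal sketch of the exemplar-based imputation idea follows, under simplifying assumptions: ordinary non-negative least squares stands in for the sparse/compressed-sensing formulation used in MIDAS, and the data are synthetic.

# Sketch of missing-data (exemplar-based) imputation for one noisy
# spectral frame: entries flagged unreliable by a binary mask are
# reconstructed from a non-negative combination of clean speech
# exemplars fitted on the reliable entries only.
import numpy as np
from scipy.optimize import nnls

def impute_frame(noisy_frame, reliable_mask, exemplars):
    """
    noisy_frame   : (d,)   observed (noise-corrupted) spectral frame
    reliable_mask : (d,)   boolean, True where the observation is reliable
    exemplars     : (d, k) dictionary of clean speech exemplar frames
    returns       : (d,)   frame with unreliable entries imputed
    """
    # Fit non-negative weights using only the reliable coefficients.
    weights, _ = nnls(exemplars[reliable_mask], noisy_frame[reliable_mask])
    reconstruction = exemplars @ weights
    imputed = noisy_frame.copy()
    imputed[~reliable_mask] = reconstruction[~reliable_mask]
    return imputed

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    exemplars = rng.random((20, 50))            # 50 clean exemplars, 20 bands
    clean = exemplars @ rng.random(50) * 0.1    # synthetic "clean" frame
    mask = rng.random(20) > 0.3                 # ~70% of bands reliable
    noisy = clean.copy()
    noisy[~mask] += 5.0                         # corrupt the unreliable bands
    restored = impute_frame(noisy, mask, exemplars)
    print("reconstruction error on masked bands:",
          np.max(np.abs(restored[~mask] - clean[~mask])))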

Part III - HLT-Technology Related Papers | Pp. 289-304

Parse and Corpus-Based Machine Translation

Vincent Vandeghinste; Scott Martens; Gideon Kotzé; Jörg Tiedemann; Joachim Van den Bogaert; Koen De Smet; Frank Van Eynde; Gertjan van Noord

This paper describes the PaCo-MT project, which investigated a data-driven approach to stochastic syntactic rule-based machine translation. In contrast to phrase-based statistical machine translation (PB-SMT) systems, which do not use any linguistic knowledge, an MT engine in a different paradigm was built: a tree-based data-driven system that automatically induces its translation rules from a large syntactically analysed parallel corpus. The architecture is presented in detail, as well as an evaluation in comparison with our previous work and with the current state-of-the-art PB-SMT system Moses.
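
Purely as an illustration of inducing correspondences from a syntactically analysed, word-aligned parallel corpus (a toy sketch, not the PaCo-MT architecture), the function below turns source constituents whose alignment links map onto a consistent target span into candidate transfer rules.

# Toy induction of translation correspondences from one word-aligned,
# parsed sentence pair: a source constituent yields a rule if its
# aligned target span is consistent with the word alignment.

def span_pairs(constituents, alignment):
    """
    constituents : list of (label, start, end) source spans, end exclusive
    alignment    : set of (src_index, tgt_index) word alignment links
    returns      : list of (label, (src_start, src_end), (tgt_start, tgt_end))
    """
    rules = []
    for label, s_start, s_end in constituents:
        tgt = [j for (i, j) in alignment if s_start <= i < s_end]
        if not tgt:
            continue
        t_start, t_end = min(tgt), max(tgt) + 1
        # Consistency: no word inside the target span may be aligned to a
        # source word outside the constituent.
        consistent = all(not (t_start <= j < t_end and not (s_start <= i < s_end))
                         for (i, j) in alignment)
        if consistent:
            rules.append((label, (s_start, s_end), (t_start, t_end)))
    return rules

if __name__ == "__main__":
    # Toy example: "de kleine hond blaft" <-> "the small dog barks"
    src = "de kleine hond blaft".split()
    tgt = "the small dog barks".split()
    constituents = [("NP", 0, 3), ("S", 0, 4)]
    alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
    for label, (ss, se), (ts, te) in span_pairs(constituents, alignment):
        print(f"{label}: '{' '.join(src[ss:se])}' -> '{' '.join(tgt[ts:te])}'")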

Part III - HLT-Technology Related Papers | Pp. 305-319

Development and Integration of Speech Technology into COurseware for Language Learning: The DISCO Project

Helmer Strik; Joost van Doremalen; Jozef Colpaert; Catia Cucchiarini

Language learners seem to learn best in one-on-one interactive learning situations in which they receive optimal corrective feedback. However, providing this type of tutoring by trained language instructors is time-consuming and costly, and therefore not feasible for the majority of language learners. This particularly applies to oral proficiency, where corrective feedback has to be provided immediately after the utterance has been spoken, thus making it even more difficult to provide sufficient practice in the classroom. The recent appearance of Computer Assisted Language Learning (CALL) systems that make use of Automatic Speech Recognition (ASR) and other advanced automatic techniques offers new perspectives for practicing oral proficiency in a second language (L2). In the DISCO project a prototype of an ASR-based CALL application for practicing oral proficiency for Dutch as a second language (DL2) was developed. The application optimises learning through interaction in realistic communication situations and provides intelligent feedback on various aspects of DL2 speaking, viz. pronunciation, morphology and syntax. In this chapter we discuss the results of the DISCO project, consider how DISCO has contributed to the state of the art, and present some future perspectives.
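
One ingredient of such feedback can be pictured as aligning the phones recognised in a learner's utterance with the canonical transcription and reporting the differences. The sketch below is only a hypothetical illustration: the phone sequences and feedback wording are invented, and the real system builds on a full ASR component and also covers morphology and syntax.

# Toy pronunciation feedback: align recognised phones against the
# canonical (target) phones and report mismatches.
from difflib import SequenceMatcher

def pronunciation_feedback(target_phones, recognised_phones):
    feedback = []
    matcher = SequenceMatcher(None, target_phones, recognised_phones)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            feedback.append(f"expected {target_phones[i1:i2]}, heard {recognised_phones[j1:j2]}")
        elif tag == "delete":
            feedback.append(f"missing {target_phones[i1:i2]}")
        elif tag == "insert":
            feedback.append(f"extra {recognised_phones[j1:j2]}")
    return feedback or ["pronunciation matches the target"]

if __name__ == "__main__":
    target = ["x", "u", "d", "@", "m", "O", "r", "G", "@", "n"]  # simplified "goedemorgen"
    heard = ["g", "u", "d", "m", "O", "r", "g", "@", "n"]        # hypothetical ASR output
    for line in pronunciation_feedback(target, heard):
        print(line)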

Part IV - HLT Application Related Papers | Pp. 323-338

Question Answering of Informative Web Pages: How Summarisation Technology Helps

Jan De Belder; Daniël de Kok; Gertjan van Noord; Fabrice Nauze; Leonoor van der Beek; Marie-Francine Moens

During the DAISY project we developed essential technology for the automatic summarisation of Dutch informative web pages. The project especially focused on paraphrasing and compression of Dutch sentences, and on the rhetorical classification of content blocks and sentences in the web pages. For the paraphrasing and compression we rely on language models and syntactic constraints. In addition, the Alpino parser for Dutch was extended with a fluency component. Because the rhetorical role of a sentence depends on the role of its surrounding sentences, we improve the rhetorical classification by finding a globally optimal assignment for all the sentences in a web page. Both the sentence compression and the rhetorical classification use an Integer Linear Programming optimisation strategy.
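
The ILP strategy can be pictured along the following lines: binary variables select which words to keep, the objective maximises a relevance score, and linear constraints enforce a length budget and simple syntactic well-formedness. The sketch below (using the PuLP package, with invented scores and dependencies) only illustrates the strategy, not the DAISY models.

# Toy ILP sentence compression: keep a word only if its head is kept,
# and keep at most max_words words, maximising the total relevance score.
# Requires the PuLP package (pip install pulp).
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, PULP_CBC_CMD

def compress(words, scores, heads, max_words):
    """heads[i] is the index of word i's syntactic head, or None for the root."""
    prob = LpProblem("sentence_compression", LpMaximize)
    keep = [LpVariable(f"keep_{i}", cat=LpBinary) for i in range(len(words))]
    prob += lpSum(scores[i] * keep[i] for i in range(len(words)))  # objective
    prob += lpSum(keep) <= max_words                               # length budget
    for i, head in enumerate(heads):
        if head is not None:
            prob += keep[i] <= keep[head]   # a word may only stay with its head
    prob.solve(PULP_CBC_CMD(msg=False))
    return [w for w, k in zip(words, keep) if k.value() == 1]

if __name__ == "__main__":
    words = ["the", "committee", "will", "probably", "approve", "the", "proposal"]
    heads = [1, 4, 4, 4, None, 6, 4]
    scores = [0.2, 0.9, 0.4, 0.1, 1.0, 0.2, 0.9]
    print(" ".join(compress(words, scores, heads, max_words=4)))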

Part IV - HLT Application Related Papers | Pp. 339-357

Generating, Refining and Using Sentiment Lexicons

Maarten de Rijke; Valentin Jijkoun; Fons Laan; Wouter Weerkamp; Paul Ackermans; Gijs Geleijnse

In order to use a sentiment extraction system for a media analysis problem, a system would have to be able to determine which of the extracted sentiments are relevant, i.e., it would not only have to identify targets of extracted sentiments, but also decide which targets are relevant for the topic at hand.
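
The relevance-filtering step can be pictured as below; the extracted (target, polarity) pairs and the topic lexicon are invented for illustration, and the chapter's own approach to Dutch media analysis is considerably more involved.

# Toy relevance filter: given (target, polarity) pairs produced by some
# sentiment extraction component, keep only those whose target is
# related to the topic under analysis.

def filter_relevant(extracted, topic_terms):
    """extracted: iterable of (target, polarity); topic_terms: set of strings."""
    topic = {t.lower() for t in topic_terms}
    return [(target, polarity) for target, polarity in extracted
            if any(term in target.lower() for term in topic)]

if __name__ == "__main__":
    extracted = [
        ("battery life of the phone", "negative"),
        ("the weather", "positive"),
        ("phone camera", "positive"),
    ]
    relevant = filter_relevant(extracted, topic_terms={"phone", "camera", "battery"})
    for target, polarity in relevant:
        print(f"{polarity}: {target}")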

Part IV - HLT Application Related Papers | Pp. 359-377