Publications catalog - books

Natural Language Processing: IJCNLP 2004: First International Joint Conference, Hainan Island, China, March 22-24, 2004, Revised Selected Papers

Keh-Yih Su; Jun’ichi Tsujii; Jong-Hyeok Lee; Oi Yee Kwong (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Mathematical Logic and Formal Languages; Language Translation and Linguistics; Information Storage and Retrieval; Algorithm Analysis and Problem Complexity; Document Preparation and Text Processing

Availability
Detected institution | Publication year | Browse | Download | Request
Not detected | 2005 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-24475-2

Electronic ISBN

978-3-540-30211-7

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin/Heidelberg 2005

Table of contents

Fast Reinforcement Learning of Dialogue Policies Using Stable Function Approximation

Matthias Denecke; Kohji Dohsaka; Mikio Nakano

We propose a method to speed up reinforcement learning of policies for spoken dialogue systems. This is achieved by combining a coarse-grained abstract representation of states and actions with learning only in frequently visited states. The value of unsampled states is approximated by a linear interpolation of known states. Experiments show that the proposed method effectively optimizes dialogue strategies for frequently visited dialogue states.

- Dialogue and Discourse | Pp. 1-11

Combining Labeled and Unlabeled Data for Learning Cross-Document Structural Relationships

Zhu Zhang; Dragomir Radev

Multi-document discourse analysis has emerged with the potential of improving various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that classifies CST relationships between sentence pairs extracted from topically related documents, exploiting both labeled and unlabeled data. We investigate a binary classifier for determining existence of structural relationships and a full classifier using the full taxonomy of relationships. We show that in both cases the exploitation of unlabeled data helps improve the performance of learned classifiers.

- Dialogue and Discourse | Pp. 32-41

Parsing Mixed Constructions in a Type Feature Structure Grammar

Jong-Bok Kim; Jaehyung Yang

Because they mix nominal and verbal properties, Korean gerundive phrases (GPs) pose intriguing issues for both theoretical and computational analyses. Various theoretical approaches have been proposed to solve this puzzle, but they have all ended up abandoning or modifying fundamental theory-neutral desiderata such as endocentricity (every phrase has a head), lexicalism (no syntactic rule refers to word-internal structure), and null licensing (abstract entities are avoided if possible) (cf. Pullum 1991, Malouf 1998). This paper shows that it is possible to analyze and efficiently parse the mixed properties of Korean GPs in a way that maintains the desiderata while avoiding abstract entities. This is achieved through Korean Phrase Structure Grammar, an extension of HPSG that models human languages as systems of constraints on typed feature structures. The feasibility of the grammar is tested by implementing it in the LKB (Linguistic Knowledge Builder) system (cf. Copestake 2002).

- FSA, Parsing Algorithms | Pp. 42-51

A Novel Pattern Learning Method for Open Domain Question Answering

Yongping Du; Xuanjing Huang; Xin Li; Lide Wu

Open Domain Question Answering (QA) represents an advanced application of natural language processing. We develop a novel pattern-based method for implementing answer extraction in QA. For each type of question, the corresponding answer patterns can be learned from the Web automatically. Given a new question, these answer patterns can be applied to find the answer. Although many other QA systems have used pattern-based methods, our method is fully automatic and can handle problems on which other systems fail, and satisfactory results have been achieved. Finally, we give a performance analysis of this approach using the TREC-11 question set.

- Information Extraction and Question Answering | Pp. 81-89

Information Flow Analysis with Chinese Text

Paulo Cheong; Dawei Song; Peter Bruza; Kam-Fai Wong

This article investigates the effectiveness of an information inference mechanism on Chinese text. The information inference derives implicit associations via computation of information flow on a high-dimensional conceptual space, which is approximated by a cognitively motivated lexical semantic space model, namely Hyperspace Analogue to Language (HAL). A dictionary-based Chinese word segmentation system was used to segment words. To evaluate the Chinese-based information flow model, it is applied to query expansion, in which a set of test queries is expanded automatically via information flow computations and documents are retrieved. Standard recall-precision measures are used to measure performance. Experimental results for TREC-5 Chinese queries and the People’s Daily corpus suggest that the Chinese information flow model significantly increases average precision, though the increase is not as large as that achieved with English corpora. Nevertheless, there is justification to believe that the HAL-based information flow model, and in turn our psychologistic stance on the next generation of information processing systems, have a promising degree of language independence.

- Information Retrieval | Pp. 100-109

Phoneme-Based Transliteration of Foreign Names for OOV Problem

Wei Gao; Kam-Fai Wong; Wai Lam

A proper noun dictionary is never complete, which renders name translation from English to Chinese ineffective. One way to solve this problem is not to rely on a dictionary alone but to adopt automatic translation according to pronunciation similarities, i.e. to map the phonemes comprising an English name to the phonetic representations of the corresponding Chinese name. This process is called transliteration. We present a statistical transliteration method. An efficient algorithm for aligning phoneme chunks is described. Unlike rule-based approaches, our method is data-driven. In contrast to source-channel-based statistical approaches, we adopt a direct transliteration model, i.e. the direction of probabilistic estimation conforms to the transliteration direction. We demonstrate performance comparable to that of source-channel-based systems.

- Information Retrieval | Pp. 110-119

The Hinoki Treebank: A Treebank for Text Understanding

Francis Bond; Sanae Fujita; Chikara Hashimoto; Kaname Kasahara; Shigeko Nariyama; Eric Nichols; Akira Ohtani; Takaaki Tanaka; Shigeaki Amano

In this paper we describe the motivation for and construction of a new Japanese lexical resource: the Hinoki treebank. The treebank is built from dictionary definition sentences, and uses an HPSG grammar to encode the syntactic and semantic information. We then show how this treebank can be used to extract thesaurus information from definition sentences in a language-neutral way using minimal recursion semantics.

- Lexical Semantics, Ontology and Linguistic Resources | Pp. 158-167

Visual Semantics and Ontology of Eventive Verbs

Minhua Ma; Paul Mc Kevitt

Various English verb classifications have been analyzed in terms of their syntactic and semantic properties, and conceptual components, such as syntactic valency, lexical semantics, and semantic/syntactic correlations. Here the visual semantics of verbs, particularly their somatotopic effectors and level-of-detail, is studied. We introduce the notion of visual valency and use it as a primary criterion to recategorize eventive verbs for language visualization (animation) in our intelligent multimodal storytelling system, CONFUCIUS. The visual valency approach is a framework for modelling deeper semantics of verbs. In our ontological system we consider both language and visual modalities since CONFUCIUS is a multimodal system.

- Lexical Semantics, Ontology and Linguistic Resources | Pp. 187-196

Bilingual Sentence Alignment Based on Punctuation Statistics and Lexicon

Thomas C. Chuang; Jian-Cheng Wu; Tracy Lin; Wen-Chie Shei; Jason S. Chang

This paper presents a new method of aligning bilingual parallel texts based on punctuation statistics and lexical information. Punctuation statistics prove to be an effective means of achieving good results. The task of sentence alignment of bilingual texts written in disparate language pairs like English and Chinese is reportedly more difficult. We examine the feasibility of using punctuation for high-accuracy sentence alignment. An encouraging precision rate is demonstrated in aligning sentences in bilingual parallel corpora based solely on punctuation statistics. Improved results were obtained when both punctuation statistics and lexical information were employed. We have experimented with an implementation of the proposed method on the parallel corpora of Sinorama Magazine and Records of the Hong Kong Legislative Council with satisfactory results.

- Machine Translation and Multilinguality | Pp. 224-232

Automatic Learning of Parallel Dependency Treelet Pairs

Yuan Ding; Martha Palmer

Induction of synchronous grammars from empirical data has long been an unsolved problem, even though generative synchronous grammars theoretically suit the machine translation task very well. This is mainly due to pervasive structural divergences between languages. This paper presents a statistical approach that learns dependency structure mappings from parallel corpora. The new algorithm automatically learns parallel dependency treelet pairs from loosely matched non-isomorphic dependency trees while keeping computational complexity polynomial in the length of the sentences. A set of heuristics is introduced and specifically optimized for parallel treelet learning purposes using Minimum Error Rate training.

- Machine Translation and Multilinguality | Pp. 233-243