Publication catalog - books



SmartKom: Foundations of Multimodal Dialogue Systems

Wolfgang Wahlster (ed.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

SpringerLink (publication year: 2006)

Information

Resource type:

books

Print ISBN

978-3-540-23732-7

Electronic ISBN

978-3-540-36678-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

The Facial Expression Module

Carmen Frank; Johann Adelhardt; Anton Batliner; Elmar Nöth; Rui Ping Shi; Viktor Zeißler; Heinrich Niemann

In current dialogue systems the use of speech as an input modality is common. But speech is only one of the modalities human beings use: in human-human interaction, people also use gestures to point and facial expressions to show their moods. To give modern systems a chance to read information from all modalities used by humans, these systems must have multimodal user interfaces. The SmartKom system has such a multimodal interface, which analyzes facial expression, speech and gesture simultaneously. Here we present the module that performs facial expression analysis in order to identify the internal state of a user. In the following we first describe the state of the art in emotion and user-state recognition through the analysis of facial expressions. Next, we describe the facial expression recognition module. After that we present the experiments and results for recognition of user states. We summarize our results in the last section.

Keywords: Facial Expression; Recognition Rate; User State; Gesture Recognition; Face Region.

Part II - Multimodal Input Analysis | Pp. 167-180

Multiple Biometrics

Stephan Grashey; Matthias Schuster

Authentication is undoubtedly an important task for any system that mediates interaction between humans and computers. Traditional authentication techniques like PINs, passwords or ID cards show significant drawbacks: they might be forgotten, misplaced or lost, or even stolen, copied or forged. Biometrics use physical or behavioral characteristics to verify the identity of a person and thus overcome these problems. In this respect, biometric technology may be easier and more comfortable to use. First, a short overview of biometric technology in general is given. Then, the main part of this chapter explains the biometric technologies integrated in the SmartKom system. In particular, a new approach to combining several biometrics in a multimodal device is presented. The performance of this combination method is shown to be superior to that of the single biometric subsystems.

Keywords: Speaker Recognition; Logic Combination; Biometric System; Feature Fusion; False Acceptance Rate.

Part II - Multimodal Input Analysis | Pp. 181-193
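The chapter does not spell out its combination method here; as a hedged illustration, a generic score-level fusion of biometric subsystems might look like the following sketch, where the subsystem scores, weights, and threshold are all invented for this example.

```python
# Hypothetical score-level fusion of several biometric subsystems.
# Weights and threshold are illustrative, not SmartKom's actual values.

def fuse_scores(scores, weights, threshold=0.5):
    """Accept/reject decision from a weighted sum of per-subsystem
    match scores, each already normalized to [0, 1]."""
    assert len(scores) == len(weights)
    total = sum(w * s for s, w in zip(scores, weights))
    combined = total / sum(weights)  # normalized weighted average
    return combined >= threshold, combined

# Example: speaker verification is confident, a second modality is unsure.
accept, score = fuse_scores(scores=[0.9, 0.4], weights=[0.6, 0.4])
# accept == True, score == 0.7
```

Combining at the score level rather than the decision level lets a strong modality compensate for a weak one, which is one common reason multimodal fusion can outperform single biometric subsystems.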

Natural Language Understanding

Ralf Engel

This chapter presents SPIN, a newly developed template-based semantic parser used for the task of natural language understanding in SmartKom. The most outstanding feature of the approach is a powerful template language that makes templates easy to create and maintain and allows flexible processing. Nevertheless, to achieve fast processing, the templates are applied in a sequential order that is determined offline.

Keywords: Working Memory; Dependency Graph; Optional Condition; Dialogue System; Speech Recognizer.

Part II - Multimodal Input Analysis | Pp. 195-207
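To make the idea of template-based semantic parsing concrete, here is a deliberately minimal sketch in the spirit of (but much simpler than) SPIN: templates are tried in a fixed, precomputed order, and each matching template contributes slots to a semantic frame. The templates, slot names, and example utterance are invented for this illustration.

```python
# Minimal illustration of template-based semantic parsing.
# Templates and intent/slot names are hypothetical examples.

TEMPLATES = [  # applied in a fixed order determined ahead of time
    (("reserve", "ticket"), {"intent": "buy_ticket"}),
    (("movie",), {"domain": "cinema"}),
    (("tonight",), {"time": "evening"}),
]

def parse(utterance):
    """Build a flat semantic frame by applying each matching template."""
    words = set(utterance.lower().split())
    frame = {}
    for pattern, slots in TEMPLATES:
        if all(w in words for w in pattern):
            frame.update(slots)
    return frame

frame = parse("Please reserve a movie ticket for tonight")
# frame == {"intent": "buy_ticket", "domain": "cinema", "time": "evening"}
```

Fixing the application order offline, as the abstract describes, avoids searching over template orderings at runtime.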

The Gesture Interpretation Module

Rui Ping Shi; Johann Adelhardt; Anton Batliner; Carmen Frank; Elmar Nöth; Viktor Zeißler; Heinrich Niemann

Humans often make conscious and unconscious gestures, which reflect their mind, their thoughts and the way these are formulated. These inherently complex processes can in general not be substituted by a corresponding verbal utterance with the same semantics (McNeill, 1992). Gesture, a kind of body language, contains important information about the intention and the state of the gesture producer. It is therefore an important communication channel in human-computer interaction. In the following we first describe the state of the art in gesture recognition. The next section describes the gesture interpretation module. After that we present the experiments and results for recognition of user states. We summarize our results in the last section.

Keywords: Facial Expression; Hidden Markov Model; Speech Recognition; User State; Gesture Recognition.

Part II - Multimodal Input Analysis | Pp. 209-219

Modality Fusion

Ralf Engel; Norbert Pfleger

In this chapter we give a general overview of the modality fusion component of SmartKom. Based on a selection of prominent multimodal interaction patterns, we present our solution for synchronizing the different modes. Finally, we give, on an abstract level, a summary of our approach to modality fusion.

Keywords: Gesture Recognition; Situational Context; Speech Input; Multimodal Interaction; Speech Recognizer.

Part III - Multimodal Dialogue Processing | Pp. 223-235
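One recurring multimodal interaction pattern is the combination of a deictic word ("this", "that") with a pointing gesture. As a hedged sketch of time-based synchronization, not SmartKom's actual algorithm, the following pairs a deictic spoken word with the gesture whose time interval overlaps it; the event structures and timestamps are invented for the example.

```python
# Hypothetical time-based modality fusion: pair a deictic spoken word
# with the object of a temporally overlapping pointing gesture.

def overlaps(a, b):
    """True if half-open time intervals a=(start, end), b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def fuse(speech_events, gesture_events):
    """Attach to each deictic word the object of a co-occurring gesture."""
    fused = []
    for word, span in speech_events:
        referent = None
        if word in {"this", "that", "here"}:
            for obj, gspan in gesture_events:
                if overlaps(span, gspan):
                    referent = obj
                    break
        fused.append((word, referent))
    return fused

result = fuse([("show", (0.0, 0.4)), ("this", (0.5, 0.8))],
              [("cinema_icon", (0.45, 0.9))])
# result == [("show", None), ("this", "cinema_icon")]
```

Real systems must also handle gestures that precede or follow the word by some tolerance, which a widened interval test can accommodate.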

Discourse Modeling

Jan Alexandersson; Norbert Pfleger

We provide a description of the robust and generic discourse module that is the central repository of contextual information in SmartKom. We tackle discourse modeling with a three-tiered discourse structure enriched by partitions, together with a local and a global focus structure. For the manipulation of the discourse structures we use unification and a default unification operation, called Overlay, enriched with a metric mirroring the similarity of competing structures. We show how a wide variety of naturally occurring multimodal phenomena, in particular short utterances including elliptical and referring expressions, can be processed in a generic and robust way. Like all other modules of the SmartKom backbone, DiM relies on a domain ontology for the representation of user intentions. Finally, we show that our approach is robust against phenomena caused by imperfect recognition and analysis of user actions.

Keywords: Dialogue System; Discourse Processing; Local Focus; Focus Structure; Dialogue Management.

Part III - Multimodal Dialogue Processing | Pp. 237-253

Overlay: The Basic Operation for Discourse Processing

Jan Alexandersson; Tilman Becker; Norbert Pfleger

We provide a formal description of the fundamental nonmonotonic operation used for discourse modeling. Our algorithm, overlay, consists of a default unification algorithm together with an elaborate scoring functionality. In addition to motivating the operation and highlighting examples from the running system, we indicate some future directions.

Keywords: Dialogue System; Computational Linguistics; Discourse Processing; Discourse Context; Movie Theater.

Part III - Multimodal Dialogue Processing | Pp. 255-267
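The core intuition of default unification can be shown in a few lines. The sketch below overlays a new utterance's structure over the discourse context as nested dictionaries: values from the new structure win, the context fills the gaps, and a toy score counts overwrites. This is an illustration only; SmartKom's overlay works on typed feature structures from the domain ontology, and its scoring metric is far more elaborate.

```python
# Illustrative overlay (default unification) over nested dicts:
# `covering` (the new utterance) overrides `background` (the context),
# which supplies defaults. The conflict count is a toy stand-in for
# the chapter's scoring functionality.

def overlay(covering, background):
    """Return (result, conflicts): background defaults overridden by covering."""
    if isinstance(covering, dict) and isinstance(background, dict):
        result, conflicts = dict(background), 0
        for key, value in covering.items():
            if key in background:
                merged, c = overlay(value, background[key])
                result[key] = merged
                conflicts += c
            else:
                result[key] = value
        return result, conflicts
    # atomic values: covering wins; count a conflict if they differ
    return covering, int(covering != background)

ctx = {"show": {"city": "Saarbruecken", "genre": "comedy"}}
new = {"show": {"genre": "thriller"}}
merged, conflicts = overlay(new, ctx)
# merged == {"show": {"city": "Saarbruecken", "genre": "thriller"}}, conflicts == 1
```

The conflict count lets competing merged hypotheses be ranked: the fewer overwrites, the more similar the new input is to the existing context.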

In Context: Integrating Domain- and Situation-Specific Knowledge

Robert Porzel; Iryna Gurevych; Rainer Malaka

We describe the role of context models in natural language processing systems and their implementation and evaluation in the SmartKom system. We show that contextual knowledge is needed for an ensemble of tasks, such as lexical and pragmatic disambiguation, the decontextualization of domain and common-sense knowledge left implicit by the user, and the estimation of an overall coherence score that is used in intention recognition. As the successful evaluations show, the implemented context model enables a multicontext system such as SmartKom to respond felicitously to contextually underspecified questions. This ability constitutes an important step toward making dialogue systems more intuitively usable and conversational without losing their reliability and robustness.

Keywords: Global Position System; Context Model; Geographic Information System; Dialogue System; Human Language Technology.

Part III - Multimodal Dialogue Processing | Pp. 269-284

Intention Recognition

Jürgen te Vrugt; Thomas Portele

The intention recognition module selects, from a collection of candidate representations of the user input to the SmartKom system, the one that best represents this input. The alternative representations are generated by the recognition and analysis components and enriched with knowledge, e.g., from discourse and context. A probabilistic model combines various scores, based on features of the representations and computed by the SmartKom modules during processing, to support this selection. The parameters of the model are optimized on annotated data collected with the SmartKom system, using both a parameter study and a rank-based estimation algorithm.

Keywords: Speech Recognition; User Input; Context Model; Good Configuration; Parameter Setup.

Part III - Multimodal Dialogue Processing | Pp. 285-299
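As a hedged sketch of the kind of selection the abstract describes, the snippet below picks the hypothesis whose weighted combination of module scores is highest. The feature names, weights, and hypothesis structure are invented; the chapter's actual probabilistic model and its trained parameters are not reproduced here.

```python
# Hypothetical hypothesis selection by weighted score combination.
# Feature names and weights are illustrative stand-ins for the
# module scores and trained parameters described in the chapter.

def best_hypothesis(hypotheses, weights):
    """Return the hypothesis with the highest weighted score sum."""
    def combined(h):
        return sum(weights[name] * score for name, score in h["scores"].items())
    return max(hypotheses, key=combined)

weights = {"asr": 0.5, "parser": 0.3, "discourse": 0.2}
hyps = [
    {"id": "A", "scores": {"asr": 0.9, "parser": 0.2, "discourse": 0.1}},
    {"id": "B", "scores": {"asr": 0.7, "parser": 0.8, "discourse": 0.6}},
]
winner = best_hypothesis(hyps, weights)  # hypothesis "B"
```

In this toy example, hypothesis B wins even though its speech recognition score is lower, because the parser and discourse scores compensate; tuning the weights on annotated data plays the role of the parameter optimization mentioned above.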

Plan-Based Dialogue Management for Multiple Cooperating Applications

Markus Löckelt

The SmartKom dialogue manager implements the personality and behaviour of the system. It plans and manages the task-oriented dialogue with the user and coordinates the operations of the applications to reach the user's goals. It also supports the analysis modules of the system by providing hints about the expected future course of the dialogue.

Keywords: Plan Operator; Output Channel; Plan Language; Dialogue Game; Communicative Game.

Part III - Multimodal Dialogue Processing | Pp. 301-316