Publications catalog - books
SmartKom: Foundations of Multimodal Dialogue Systems
Wolfgang Wahlster (ed.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
Detected institution | Year of publication | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Printed ISBN
978-3-540-23732-7
Electronic ISBN
978-3-540-36678-2
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer-Verlag Berlin Heidelberg 2006
Table of contents
The Facial Expression Module
Carmen Frank; Johann Adelhardt; Anton Batliner; Elmar Nöth; Rui Ping Shi; Viktor Zeißler; Heinrich Niemann
In current dialogue systems, speech is a common input modality, but it is only one of the modalities human beings use. In human-human interaction, people also use gestures to point or facial expressions to show their moods. To give modern systems a chance to read information from all modalities humans use, these systems must have multimodal user interfaces. The SmartKom system has such a multimodal interface, which analyzes facial expression, speech and gesture simultaneously. Here we present the module that performs facial expression analysis in order to identify the internal state of a user. In the following, we first describe the state of the art in emotion and user-state recognition by analyzing facial expressions. Next, we describe the facial expression recognition module. After that we present the experiments and results for recognition of user states. We summarize our results in the last section.
Keywords: Facial Expression; Recognition Rate; User State; Gesture Recognition; Face Region.
Part II - Multimodal Input Analysis | Pp. 167-180
Multiple Biometrics
Stephan Grashey; Matthias Schuster
Authentication is undoubtedly an important task for all systems that provide a component responsible for the interaction between humans and computers. Traditional authentication techniques such as PINs, passwords or ID cards have significant drawbacks: they might be forgotten, misplaced or lost, or even stolen, copied or forged. Biometrics use physical or behavioral characteristics to verify the identity of a person and thus overcome these problems. In this respect, biometric technology may be easier and more comfortable to use. First, a short overview of biometric technology in general is given. Then, the main part of this chapter explains the biometric technologies integrated in the SmartKom system. In particular, a new approach to combining several biometrics in a multimodal device is presented. The performance of this proposed combination method is shown to be superior to that of the single biometric subsystems.
Keywords: Speaker Recognition; Logic Combination; Biometric System; Feature Fusion; False Acceptance Rate.
Part II - Multimodal Input Analysis | Pp. 181-193
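The chapter's actual combination method is not reproduced in this abstract, but score-level fusion of biometric subsystems can be sketched generically. The weights, score ranges, and decision threshold below are illustrative assumptions, not the values used in SmartKom:

```python
# Generic score-level fusion of several biometric subsystems.
# All numeric parameters here are invented for illustration.

def normalize(score, lo, hi):
    """Map a raw matcher score into [0, 1] via min-max normalization."""
    return max(0.0, min(1.0, (score - lo) / (hi - lo)))

def fuse(scores, weights, threshold=0.5):
    """Weighted-sum fusion: accept the claimed identity if the
    combined score exceeds the decision threshold."""
    combined = sum(w * s for w, s in zip(weights, scores))
    return combined, combined >= threshold

# Hypothetical normalized scores from two subsystems,
# e.g., speaker verification and signature verification.
speaker = normalize(72.0, lo=0.0, hi=100.0)    # -> 0.72
signature = normalize(0.4, lo=-1.0, hi=1.0)    # -> 0.70
combined, accepted = fuse([speaker, signature], weights=[0.6, 0.4])
```

Tuning the threshold trades off the false acceptance rate against the false rejection rate, which is the usual lever when comparing a fused system with its single-biometric subsystems.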
Natural Language Understanding
Ralf Engel
This chapter presents SPIN, a newly developed template-based semantic parser used for the task of natural language understanding in SmartKom. The most outstanding feature of the approach is a powerful template language that makes templates easy to create and maintain and allows flexible processing. Nevertheless, to achieve fast processing, the templates are applied in a sequential order that is determined offline.
Keywords: Working Memory; Dependency Graph; Optional Condition; Dialogue System; Speech Recognizer.
Part II - Multimodal Input Analysis | Pp. 195-207
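SPIN's template language is far richer than this abstract can convey; the following toy sketch, with invented trigger words and frames, only illustrates the core idea of templates applied against an utterance in a fixed, precomputed order:

```python
# Toy template-based semantic parsing. Each template pairs a set of
# trigger words with a semantic frame; templates are tried in a fixed
# sequence, so more specific templates must come first.

TEMPLATES = [
    ({"movie", "tonight"}, {"intent": "cinema_info", "time": "tonight"}),
    ({"movie"},            {"intent": "cinema_info"}),
    ({"weather"},          {"intent": "weather_info"}),
]

def parse(utterance):
    tokens = set(utterance.lower().split())
    for trigger, frame in TEMPLATES:      # sequential application
        if trigger <= tokens:             # all trigger words present
            return frame
    return {"intent": "unknown"}
```

The fixed ordering stands in for SPIN's offline-computed application order: at runtime the parser never has to search over template orderings, which keeps processing fast.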
The Gesture Interpretation Module
Rui Ping Shi; Johann Adelhardt; Anton Batliner; Carmen Frank; Elmar Nöth; Viktor Zeißler; Heinrich Niemann
Humans often make conscious and unconscious gestures, which reflect their mind, their thoughts and the way these are formulated. These inherently complex processes can, in general, not be substituted by a corresponding verbal utterance with the same semantics (McNeill, 1992). Gesture, a kind of body language, contains important information about the intention and state of the gesture producer. It is therefore an important communication channel in human-computer interaction. In the following we first describe the state of the art in gesture recognition. The next section describes the gesture interpretation module. After that we present the experiments and results for recognition of user states. We summarize our results in the last section.
Keywords: Facial Expression; Hidden Markov Model; Speech Recognition; User State; Gesture Recognition.
Part II - Multimodal Input Analysis | Pp. 209-219
Modality Fusion
Ralf Engel; Norbert Pfleger
In this chapter we give a general overview of the modality fusion component of SmartKom. Based on a selection of prominent multimodal interaction patterns, we present our solution for synchronizing the different modes. Finally, we give, on an abstract level, a summary of our approach to modality fusion.
Keywords: Gesture Recognition; Situational Context; Speech Input; Multimodal Interaction; Speech Recognizer.
Part III - Multimodal Dialogue Processing | Pp. 223-235
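One prominent interaction pattern in multimodal fusion is resolving a deictic expression ("this one") against a concurrent pointing gesture. The sketch below, with invented event formats and an assumed time window, illustrates the temporal-alignment idea, not SmartKom's actual synchronization mechanism:

```python
# Toy deixis resolution: pick the referent of the pointing gesture
# temporally closest to the deictic word, within a tolerance window.
# Event structure and the 1.5 s window are illustrative assumptions.

def resolve_deixis(word_time, gestures, max_gap=1.5):
    """Return the referent of the temporally closest gesture,
    or None if no gesture falls within max_gap seconds."""
    best = min(gestures, key=lambda g: abs(g["time"] - word_time))
    if abs(best["time"] - word_time) <= max_gap:
        return best["referent"]
    return None

gestures = [{"time": 2.1, "referent": "movie_42"},
            {"time": 7.8, "referent": "movie_17"}]
referent = resolve_deixis(word_time=2.4, gestures=gestures)
```

A deictic word spoken at 2.4 s is paired with the gesture at 2.1 s; a word with no gesture nearby yields no referent and must be resolved by other means, e.g., discourse context.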
Discourse Modeling
Jan Alexandersson; Norbert Pfleger
We provide a description of the robust and generic discourse module that is the central repository of contextual information in SmartKom. We tackle discourse modeling by using a three-tiered discourse structure enriched by partitions, together with a local and global focus structure. For the manipulation of the discourse structures we use unification and a default unification operation, called Overlay, enriched with a metric mirroring the similarity of competing structures. We show how a wide variety of naturally occurring multimodal phenomena, in particular short utterances including elliptical and referring expressions, can be processed in a generic and robust way. Like all other modules of the SmartKom backbone, DiM relies on a domain ontology for the representation of user intentions. Finally, we show that our approach is robust against phenomena caused by imperfect recognition and analysis of user actions.
Keywords: Dialogue System; Discourse Processing; Local Focus; Focus Structure; Dialogue Management.
Part III - Multimodal Dialogue Processing | Pp. 237-253
Overlay: The Basic Operation for Discourse Processing
Jan Alexandersson; Tilman Becker; Norbert Pfleger
We provide a formal description of the fundamental nonmonotonic operation used for discourse modeling. Our algorithm—overlay—consists of a default unification algorithm together with an elaborate scoring functionality. In addition to motivating the operation and highlighting examples from the running system, we give some future directions.
Keywords: Dialogue System; Computational Linguistics; Discourse Processing; Discourse Context; Movie Theater.
Part III - Multimodal Dialogue Processing | Pp. 255-267
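The flavor of overlay as default unification can be sketched on plain nested dictionaries: the new structure (the covering) wins on conflicts, while compatible information from the old one (the background) is kept as defaults. The real operation works on typed feature structures from the domain ontology and additionally computes a similarity score, which this sketch omits:

```python
# Minimal overlay sketch: default unification over nested dicts.
# covering = new (possibly partial) information; background = old context.

def overlay(covering, background):
    if isinstance(covering, dict) and isinstance(background, dict):
        result = dict(background)            # keep old values as defaults
        for key, value in covering.items():  # new values override, recursively
            result[key] = overlay(value, background.get(key))
        return result
    return covering if covering is not None else background

old = {"act": "book_ticket",
       "movie": {"title": "Vertigo", "time": "20:00"}}
new = {"movie": {"time": "22:00"}}           # "at 10 pm instead"
merged = overlay(new, old)
# merged keeps the act and title; the new time overrides the old one
```

This is what makes the operation nonmonotonic: unlike plain unification, a conflicting value does not cause failure but is overwritten, which is exactly what elliptical follow-up utterances require.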
In Context: Integrating Domain- and Situation-Specific Knowledge
Robert Porzel; Iryna Gurevych; Rainer Malaka
We describe the role of context models in natural language processing systems and their implementation and evaluation in the SmartKom system. We show that contextual knowledge is needed for an ensemble of tasks, such as lexical and pragmatic disambiguation, decontextualization of domain and common-sense knowledge left implicit by the user, and estimating an overall coherence score that is used in intention recognition. As the successful evaluations show, the implemented context model enables a multicontext system such as SmartKom to respond felicitously to contextually underspecified questions. This ability constitutes an important step toward making dialogue systems more intuitively usable and conversational without losing their reliability and robustness.
Keywords: Global Position System; Context Model; Geographic Information System; Dialogue System; Human Language Technology.
Part III - Multimodal Dialogue Processing | Pp. 269-284
Intention Recognition
Jürgen te Vrugt; Thomas Portele
The intention recognition module selects, from a collection of possible representations, the analyzed representation that best captures the user's input to the SmartKom system. The alternative representations are generated by the recognition and analysis components and enriched with knowledge, e.g., from discourse and context. To support the selection, a probabilistic model combines various scores that are based on features in the representations and computed by the SmartKom modules during processing. The parameters of the model are optimized on annotated data collected with the SmartKom system, using both a parameter study and a rank-based estimation algorithm.
Keywords: Speech Recognition; User Input; Context Model; Good Configuration; Parameter Setup.
Part III - Multimodal Dialogue Processing | Pp. 285-299
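Selecting the best hypothesis by combining per-module scores can be sketched as a simple weighted sum. The feature names and weights below are invented; in SmartKom the corresponding parameters were optimized on annotated data rather than set by hand:

```python
# Toy hypothesis selection: combine module scores with fixed weights
# and pick the highest-scoring representation. Names and weights are
# illustrative assumptions, not SmartKom's actual features.

def combined_score(hypothesis, weights):
    return sum(weights[name] * value
               for name, value in hypothesis["scores"].items())

def select(hypotheses, weights):
    return max(hypotheses, key=lambda h: combined_score(h, weights))

weights = {"asr": 0.5, "parser": 0.3, "discourse": 0.2}
hypotheses = [
    {"id": 1, "scores": {"asr": 0.9, "parser": 0.4, "discourse": 0.6}},
    {"id": 2, "scores": {"asr": 0.7, "parser": 0.9, "discourse": 0.8}},
]
best = select(hypotheses, weights)
```

Note that the hypothesis with the best speech recognition score is not necessarily chosen: here the better parse and discourse fit of hypothesis 2 outweigh its lower ASR score, which is the point of combining evidence across modules.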
Plan-Based Dialogue Management for Multiple Cooperating Applications
Markus Löckelt
The SmartKom dialogue manager implements the personality of the system and its behaviour. It plans and manages the task-oriented dialogue with the user and coordinates the operations of the applications to reach the user's goals. It also helps the analysis modules of the system by providing hints about the expected future dialogue.
Keywords: Plan Operator; Output Channel; Plan Language; Dialogue Game; Communicative Game.
Part III - Multimodal Dialogue Processing | Pp. 301-316