Publications catalog - books



SmartKom: Foundations of Multimodal Dialogue Systems

Wolfgang Wahlster (ed.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

Detected institution: not detected
Publication year: 2006
Browse: SpringerLink

Información

Resource type:

books

Print ISBN

978-3-540-23732-7

Electronic ISBN

978-3-540-36678-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Emotion Analysis and Emotion-Handling Subdialogues

Michael Streit; Anton Batliner; Thomas Portele

This chapter presents the cognitive-model-based approach to the abductive interpretation of emotions used in the multimodal dialogue system SmartKom. The approach is based on Ortony, Clore, and Collins' (OCC) model of emotions, which explains emotions by matches or mismatches between an agent's attitudes and the state of affairs in the relevant situation. We explain how eliciting conditions, i.e., abstract schemata for the explanation of emotions, can be instantiated with general or abstract concepts for attitudes and actions, and further enhanced with conditions and operators for generating reactions, which allow for abductive inference of explanations of emotional states and the determination of reactions. During this process, initially abstract concepts are made concrete. Emotions may work as a self-contained dialogue move; they show a complex relation to explicit communication. Additionally, we present our approach to evaluating indicators of emotions and user states that come from different sources.
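The idea of matching an observed emotion against eliciting-condition schemata can be sketched roughly as follows. This is an illustrative toy, not the SmartKom implementation; all schema names, situation features, and reaction labels are invented:

```python
# Illustrative sketch of abductive emotion interpretation with OCC-style
# eliciting conditions. Each schema pairs an emotion with an abstract
# explanation pattern and a reaction operator; matching a concrete dialogue
# situation against the pattern instantiates explanation and reaction.
ELICITING_CONDITIONS = [
    {"emotion": "anger",
     "pattern": {"goal_status": "blocked", "agent": "system"},
     "explanation": "system action conflicted with the user's goal",
     "reaction": "apologize_and_offer_alternative"},
    {"emotion": "joy",
     "pattern": {"goal_status": "achieved"},
     "explanation": "the user's goal was satisfied",
     "reaction": "confirm_and_continue"},
]

def abduce(emotion, situation):
    """Return (explanation, reaction) for the first schema whose abstract
    pattern is consistent with the concrete dialogue situation."""
    for cond in ELICITING_CONDITIONS:
        if cond["emotion"] == emotion and all(
                situation.get(k) == v for k, v in cond["pattern"].items()):
            return cond["explanation"], cond["reaction"]
    return None, "ask_clarification"   # emotion could not be explained

explanation, reaction = abduce(
    "anger", {"goal_status": "blocked", "agent": "system", "task": "tv_guide"})
```

The abduction step lies in choosing the schema that best *explains* the observed emotion given the situation; the fallback clarification reaction stands in for the case where no explanation can be inferred.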

Keywords: Facial Expression; Negative Emotion; User State; Semantic Content; Problematic Situation.

Part III - Multimodal Dialogue Processing | Pp. 317-332

Problematic, Indirect, Affective, and Other Nonstandard Input Processing

Michael Streit

Natural communication is accompanied by errors, misunderstandings, and emotions. Although considerable research has evolved on these topics over the last decade, human-computer dialogue applications still focus on the exchange of rational specifications of the tasks the system should perform. This chapter describes a component devoted to processing problematic input, i.e., input that lacks a clear specification of the user's request. The following topics are discussed: interpretation of emotions and verbally communicated likes and dislikes, interpretation of indirect specifications, clarification of underspecified input, help on demand, robust navigation in multimodal dialogue, and generation of error messages.

Part III - Multimodal Dialogue Processing | Pp. 333-346

Realizing Complex User Wishes with a Function Planning Module

Sunna Torge; Christian Hying

Recently, spoken dialogue systems have become much more sophisticated and allow for rather complex dialogues. Moreover, the functionality of devices, applications, and services, as well as the amount of digital content, has increased. In the network age (Internet, personal area networks, and so on), the applications one can control or wants to control change dynamically, as does the content. Therefore a dialogue system can no longer be manually designed for a single application. Instead, a layer in between is required that translates the functionalities of the devices so that they can be used via a generic dialogue system, and that also resolves complex user wishes according to the available applications. This chapter describes a module, called the function planning module, which builds a link between a dialogue system and an ensemble of applications. A planning component searches for the applications necessary to solve a user wish and the sequence of actions that has to be performed. A formal description of the function planning module and its successful integration into the SmartKom prototype system are presented as well.
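The core search idea, decomposing a complex user wish into a sequence of application actions, can be sketched as a tiny state-space planner. This is a hedged illustration only; the action names, precondition/effect sets, and the breadth-first strategy are assumptions for the example, not the chapter's formal model:

```python
from collections import deque

# Each (hypothetical) application action maps a set of preconditions to a set
# of effects; the planner searches for an action sequence that turns the
# initial state into one satisfying the user wish (the goal).
ACTIONS = {
    "query_epg":    ({"tv_on"}, {"programme_info"}),
    "switch_tv_on": (set(), {"tv_on"}),
    "record":       ({"tv_on", "programme_info"}, {"recording"}),
}

def plan(initial, goal):
    """Breadth-first search for a shortest action sequence reaching the goal."""
    queue = deque([(frozenset(initial), [])])
    seen = {frozenset(initial)}
    while queue:
        state, seq = queue.popleft()
        if goal <= state:                      # all goal facts achieved
            return seq
        for name, (pre, post) in ACTIONS.items():
            if pre <= state:                   # action is applicable
                nxt = frozenset(state | post)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, seq + [name]))
    return None                                # wish cannot be resolved

# "Record tonight's film" decomposes into switching the TV on, querying the
# electronic programme guide, then recording.
steps = plan(set(), {"recording"})   # → ['switch_tv_on', 'query_epg', 'record']
```

The point of the sketch is that the dialogue system only states the wish (the goal); which applications participate, and in what order, is derived from their declared functionalities.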

Keywords: Internal Model; Functional Model; Planning Algorithm; Planning Module; Dialogue System.

Part IV - Multimodal Output Generation | Pp. 349-362

Intelligent Integration of External Data and Services into SmartKom

Hidir Aras; Vasu Chandrasekhara; Sven Krüger; Rainer Malaka; Robert Porzel

The SmartKom multimodal dialogue system offers access to a wide range of information and planning services. A significant subset of these is constituted by external data and service providers. The work presented herein describes the challenging task of integrating such external data and service sources to make them semantically accessible to other systems and users. We present the implemented multiagent system and the corresponding knowledge-based extraction and integration approach. As a whole, these agents cooperate to provide users with topical, high-quality information via unified and intuitively usable interfaces such as the SmartKom system.
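One way to picture the cooperation of per-source agents is a broker that merges their results into a single semantically unified record. This is a minimal sketch under invented assumptions; the agent names, record fields, and the merge-by-title key are made up for illustration and do not reflect the actual agent architecture:

```python
# Hypothetical per-source agents wrap external data providers behind one
# interface; a broker merges their results so the dialogue system sees one
# unified record per item.
def epg_agent(query):
    """Pretend electronic-programme-guide lookup (hardcoded sample data)."""
    return [{"title": "Heat", "channel": "ZDF", "source": "epg"}]

def web_agent(query):
    """Pretend web extraction agent (hardcoded sample data)."""
    return [{"title": "Heat", "rating": 8.3, "source": "web"}]

def broker(query, agents):
    """Merge per-agent results on a shared key into unified records."""
    merged = {}
    for agent in agents:
        for record in agent(query):
            merged.setdefault(record["title"], {}).update(record)
    return list(merged.values())

results = broker("films tonight", [epg_agent, web_agent])
# One record combining EPG channel data with web-extracted rating data.
```

The design point: the dialogue system never talks to a provider directly, so providers can be added or swapped without touching the dialogue layer.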

Keywords: Multiagent System; External Data; Dialogue System; Service Layer; Electronic Program Guide.

Part IV - Multimodal Output Generation | Pp. 363-378

Multimodal Fission and Media Design

Peter Poller; Valentin Tschernomas

This chapter describes the output generation subsystem of SmartKom, with special focus on realizing the outstanding features of the new human-computer interaction paradigm. We start with a description and motivation of the design of the multimodal output modalities. Then we give a detailed characterization of the individual output modules, and finally we show how their collaboration is organized functionally in order to achieve coherent overall system output behaviour.
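The fission step, splitting one abstract presentation goal into coordinated per-modality outputs, can be caricatured as follows. This is a toy sketch; the element types, the rule set, and the three-way modality split are invented for illustration and are far simpler than the chapter's presentation planner:

```python
# Toy modality fission: partition abstract output elements into per-modality
# plans that renderer, gesture animation, and speech synthesis realize in sync.
def fission(elements):
    """Assign each abstract output element to speech, graphics, or gesture."""
    plan = {"speech": [], "graphics": [], "gesture": []}
    for el in elements:
        if el["type"] == "list":
            # structured content renders better visually, accompanied by a
            # deictic (pointing) gesture of the avatar
            plan["graphics"].append(el["content"])
            plan["gesture"].append(f"point_at({el['content']})")
        else:
            # short confirmations and comments are spoken
            plan["speech"].append(el["content"])
    return plan

p = fission([
    {"type": "text", "content": "Here are tonight's films"},
    {"type": "list", "content": "film_table"},
])
```

Even this caricature shows why the modules must collaborate: the spoken utterance, the displayed table, and the pointing gesture all derive from one plan and must be synchronized at realization time.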

Keywords: Graphical Element; Graphical Object; Output Element; Audiovisual Speech; Presentation Planner.

Part IV - Multimodal Output Generation | Pp. 379-400

Natural Language Generation with Fully Specified Templates

Tilman Becker

Based on the constraints of the project, the approach chosen for natural language generation (NLG) combines the advantages of a template-based system with a theory-based full representation. We also discuss the adaptation of generation to different multimodal interaction modes and the special requirements of generation for concept-to-speech synthesis.
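The template side of the approach can be sketched with plain string templates. This is a deliberate simplification: the chapter's "fully specified templates" carry full syntactic structure (note the derivation-tree and elementary-tree keywords, suggesting a tree-based grammar formalism), whereas the dialogue acts and slot names below are invented flat stand-ins:

```python
import string

# Hypothetical mapping from dialogue acts to fully specified surface templates;
# slots are filled from the semantic content of the system turn.
TEMPLATES = {
    "confirm_reservation": string.Template(
        "I have reserved $count seats for $film at $time."),
    "ask_time": string.Template(
        "At what time would you like to see $film?"),
}

def generate(act, **slots):
    """Realize a dialogue act by filling its template's slots."""
    return TEMPLATES[act].substitute(**slots)

sentence = generate("confirm_reservation", count=2, film="Heat", time="8 pm")
# → "I have reserved 2 seats for Heat at 8 pm."
```

Keeping a full structural representation alongside such templates is what makes downstream concept-to-speech synthesis possible, since the synthesizer can exploit the known syntactic structure for prosody.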

Keywords: Argument Structure; Speech Synthesis; Derivation Tree; Elementary Tree; Phrase Structure Tree.

Part IV - Multimodal Output Generation | Pp. 401-410

Multimodal Speech Synthesis

Antje Schweitzer; Norbert Braunschweiler; Grzegorz Dogil; Tanja Klankert; Bernd Möbius; Gregor Möhler; Edmilson Morais; Bettina Säuberlich; Matthias Thomae

Speech output generation in the SmartKom system is realized by a corpus-based unit selection strategy that preserves many properties of the human voice. When the system's avatar "Smartakus" is present on the screen, the synthetic speech signal is temporally synchronized with Smartakus' visible speech gestures and prosodically adjusted to his pointing gestures to enhance multimodal communication. The unit selection voice was formally evaluated and found to be very well accepted and reasonably intelligible in SmartKom-specific scenarios.
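The general unit-selection principle, choosing one corpus unit per target so that summed target and join costs are minimal, is commonly implemented as a Viterbi search, sketched below. The cost functions and the toy corpus are invented for illustration; they are not the SmartKom voice's actual costs or data:

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Viterbi search: pick one candidate unit per target phone, minimizing
    the sum of target costs plus pairwise join (concatenation) costs."""
    # best[i][u] = (cheapest cost ending in unit u at position i, backpointer)
    best = [{u: (target_cost(targets[0], u), None) for u in candidates[0]}]
    for i in range(1, len(targets)):
        layer = {}
        for u in candidates[i]:
            prev = min(candidates[i - 1],
                       key=lambda p: best[i - 1][p][0] + join_cost(p, u))
            cost = (best[i - 1][prev][0] + join_cost(prev, u)
                    + target_cost(targets[i], u))
            layer[u] = (cost, prev)
        best.append(layer)
    # backtrace the cheapest path
    u = min(best[-1], key=lambda k: best[-1][k][0])
    path = [u]
    for layer in reversed(best[1:]):
        u = layer[u][1]
        path.append(u)
    return path[::-1]

# Toy corpus: units are (phone, corpus position); the join cost prefers
# corpus-contiguous units, which preserves natural transitions.
candidates = [[("h", 0), ("h", 5)], [("e", 1), ("e", 9)], [("l", 2)]]
units = select_units(
    ["h", "e", "l"], candidates,
    target_cost=lambda phone, u: 0 if u[0] == phone else 1,
    join_cost=lambda p, u: 0 if u[1] == p[1] + 1 else 1)
# → [("h", 0), ("e", 1), ("l", 2)], a contiguous stretch of the corpus
```

Preferring long contiguous stretches of recorded speech is precisely what lets unit selection preserve the properties of the human voice that the abstract mentions.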

Keywords: Speech Synthesis; Pitch Accent; Unit Selection; Phrase Boundary; Prosodic Structure.

Part IV - Multimodal Output Generation | Pp. 411-435

Building Multimodal Dialogue Applications: System Integration in SmartKom

Gerd Herzog; Alassane Ndiaye

We report on the experience gained in building large-scale research prototypes of fully integrated multimodal dialogue systems in the context of the SmartKom project. The development of such systems requires a flexible software architecture and adequate software support to cope with the challenge of system integration. A practical result of our experimental work is an advanced integration platform that enables flexible reuse and extension of existing software modules and that is able to deal with a heterogeneous software environment. Starting from the foundations of our general framework, we give an overview of the SmartKom testbed, and we describe the practical organization of the development process within the project.
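The tuple-space idea named in the keywords, modules exchanging data through a shared space rather than direct calls, can be shown in miniature. This in-process toy is only illustrative; the real testbed is a distributed middleware, and the message names below are invented:

```python
# Toy in-process tuple space: modules communicate by writing tuples and
# retrieving them by pattern, so producers and consumers stay decoupled.
class TupleSpace:
    def __init__(self):
        self.tuples = []

    def write(self, tup):
        """Publish a tuple into the shared space."""
        self.tuples.append(tup)

    def take(self, pattern):
        """Remove and return the first tuple matching the pattern;
        None in the pattern acts as a wildcard."""
        for i, tup in enumerate(self.tuples):
            if len(tup) == len(pattern) and all(
                    p is None or p == t for p, t in zip(pattern, tup)):
                return self.tuples.pop(i)
        return None   # no matching tuple available

space = TupleSpace()
space.write(("asr.result", "show me tonight's films"))
space.write(("gesture.result", "point@screen:12,40"))
msg = space.take(("asr.result", None))   # fusion module pulls speech input
```

Because modules only agree on tuple shapes, not on each other's interfaces, existing software components can be reused or replaced without changing their peers, which is the flexibility the chapter attributes to the integration platform.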

Keywords: System Integration; Application Programming Interface; Dialogue System; Integration Platform; Tuple Space.

Part V - Scenarios and Applications | Pp. 439-452

SmartKom-English: From Robust Recognition to Felicitous Interaction

David Gelbart; John Bryant; Andreas Stolcke; Robert Porzel; Manja Baudis; Nelson Morgan

This chapter describes the English-language SmartKom -Mobile system and related research. We explain the work required to support a second language in SmartKom and the design of the English speech recognizer. We then discuss research carried out on signal processing methods for robust speech recognition and on language analysis using the Embodied Construction Grammar formalism. Finally, the results of human-subject experiments using a novel Wizard and Operator model are analyzed with an eye to creating more felicitous interaction in dialogue systems.

Keywords: Speech Recognition; Automatic Speech Recognition; Dialogue System; Speech Recognition System; Word Error Rate.

Part V - Scenarios and Applications | Pp. 453-470

SmartKom-Public

Axel Horndasch; Horst Rapp; Hans Röttger

SmartKom-Public is the result of consistently developing the traditional public telephone booth into a multimodal communications booth that offers members of a modern information society intuitive, broad-bandwidth communication.

Keywords: User Study; Potential User; Standard Application; Audio Data; Data Chunk.

Part V - Scenarios and Applications | Pp. 471-492