Publications catalog - books

Affective Computing and Intelligent Interaction: 1st International Conference, ACII 2005, Beijing, China, October 22-24, 2005, Proceedings

Jianhua Tao ; Tieniu Tan ; Rosalind W. Picard (eds.)

Conference: 1st International Conference on Affective Computing and Intelligent Interaction (ACII). Beijing, China. October 22, 2005 - October 24, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

Detected institution	Publication year	Browse	Download	Request
Not detected	2005	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-29621-8

Electronic ISBN

978-3-540-32273-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2005

Publication rights information

Subject coverage

Computer and information sciences

Electrical, electronic and information engineering

Table of contents

Check that you have access through your institution to download or request the complete book or any of its chapters.

doi: 10.1007/11573548_55

Prosodic Reading Style Simulation for Text-to-Speech Synthesis

Oliver Jokisch; Hans Kruschke; Rüdiger Hoffmann

The simulation of different reading styles (mainly by adapting prosodic parameters) can improve the naturalness of synthetic speech and supports a more intelligent human machine interaction. The article exemplarily investigates the reading styles News and Tale. For comparison, all examined texts contained the same genre-neutral paragraphs which have been read without a specific style instruction: Normal but also faster, slower, rather monotone or more emotional which led to corresponding artificial styles.

The measured original intonation and durations style patterns control a diphone synthesizer (mapped contours). Additionally, the patterns are used to train a neural network (NN) model.

Within two separate listening tests, different stimuli presented as original signal/style, respectively, with mapped or NN-generated prosodic contours have been evaluated. The results show that both original utterances and artificial styles are basically perceived in their intended reading styles. Some reciprocal confusions indicate the similarities between different styles like News and Fast, Tale and Slow as well as Tale and Expressive. The confusions are more likely for synthetic speech. To produce e.g. the complex style Tale, different features of the prosodic variations Slow and Expressive are combined. The training method for the synthetic styles requires further improvement.

- Affective Speech Processing | Pp. 426-432

doi: 10.1007/11573548_56

F0 Contour of Prosodic Word in Happy Speech of Mandarin

Haibo Wang; Aijun Li; Qiang Fang

This paper focuses on analyzing the F0 contour of happy speech. We designed some declarative sentences and recorded them in happy and neutral expressive states. All of our speakers were asked to express these sentences in the same imaginary scene. It is known that emotion can be expressed through modifying acoustic features of speech in various ways, such as pitch, intensity, voice quality and so on. In this study, we compared the difference of F0 contour between happy and neutral speech through which we found that: (1) F0 contour plays an important role when happiness is expressed. (2) The F0 contour of happy speech displays a kind of declination pattern, but the degree of declination is less than that of neutral speech. (3) Contrasting to neutral speech, the pitch register of happy speech is higher, and the slope of F0 contour of the final syllable of each prosodic word is bigger, especially for the syllable at the end of the sentence.

- Affective Speech Processing | Pp. 433-440

doi: 10.1007/11573548_57

A Novel Source Analysis Method by Matching Spectral Characters of LF Model with STRAIGHT Spectrum

Zhen-Hua Ling; Yu Hu; Ren-Hua Wang

This paper presents a voice source analysis method by studying the spectral characters of the LF model and their representation in the output speech signal. The estimate of the source features is defined as the set of LF parameters whose spectrum has the most similar characters in the frequency domain, including glottal formant and spectral tilt, to the corresponding characters held by the STRAIGHT spectrum of the speech signal under analysis. Besides, the concept of an analyzable frame is introduced to ensure the feasibility and improve the reliability of the proposed method. Evaluation with synthetic speech shows that this method is able to estimate the LF parameters with satisfactory precision. Furthermore, the experiment with emotional speech shows the effectiveness of the proposed method in describing voice quality variation among speech with different emotions.

- Affective Speech Processing | Pp. 441-448

doi: 10.1007/11573548_58

Features Importance Analysis for Emotional Speech Classification

Jianhua Tao; Yongguo Kang

The paper analyzes prosodic features, including intonation, speaking rate, and intensity, based on classified emotional speech. As an important feature of voice quality, the voice source is also deduced for analysis. With the analysis results above, the paper creates both a CART model and a weight decay neural network model to find the importance of acoustic features for emotional speech classification and to disclose whether there is an underlying consistency between acoustic features and speech emotion. The results show that the proposed method can obtain the importance of each acoustic feature through its weight for emotional speech classification and further improve the emotional speech classification.

- Affective Speech Processing | Pp. 449-457

doi: 10.1007/11573548_59

Toward Making Humans Empathize with Artificial Agents by Means of Subtle Expressions

Takanori Komatsu

Can we assign attitudes to a computer based on its represented subtle expressions, such as beep sounds and simple animations? If so, which kinds of beep sounds or simple animations are perceived as specific attitudes, such as “disagreement”, “hesitation” or “agreement”? To examine this issue, I carried out two experiments to observe and clarify how participants perceive or assign an attitude to a computer according to beep sounds of different durations and F0 contour’s slopes (Experiment 1) or simple animations of different durations and objects’ velocities (Experiment 2). The results of these two experiments revealed that 1) subtle expressions with increasing intonations (Experiment 1) or velocities (Experiment 2) were perceived by participants as “disagreement”, 2) flat intonations and velocities with longer duration were interpreted as “hesitation”, and 3) decreasing intonations and velocities with shorter duration were taken as “agreement.”

- Evaluation of Affective Expressivity | Pp. 458-465

doi: 10.1007/11573548_61

Lexical Resources and Semantic Similarity for Affective Evaluative Expressions Generation

Alessandro Valitutti; Carlo Strapparava; Oliviero Stock

This paper presents resources and functionalities for the selection of affective evaluative terms. An affective hierarchy as an extension of the lexical database was developed in the first place. The second phase was the development of a semantic similarity function, acquired automatically in an unsupervised way from a large corpus of texts that allows us to put into relation concepts and emotional categories. The integration of the two components is a key element for several applications.

- Evaluation of Affective Expressivity | Pp. 474-481

doi: 10.1007/11573548_62

Because Attitudes Are Social Affects, They Can Be False Friends...

Takaaki Shochi; Véronique Aubergé; Albert Rilliard

The attitudes of the speaker during a verbal interaction are affects built by language and culture. Since they are a sophisticated material for expressing complex affects, using a channel of control that is surely not confused with emotions, they are the larger part of the affects expressed during an interaction, as has been shown on large databases by Campbell [3]. Twelve representative attitudes of Japanese were presented to both native Japanese listeners and native French listeners naive in Japanese. They include “y”, “”, “”, “”, “”, “”, “”, “”, and four socially referenced degrees of politeness: “”, “”, “” and “” (Sadanobu [11]). Two perception experiments using a closed forced choice were carried out, each attitude introduced by a definition and some examples of real situations. The 15 native Japanese subjects discriminate all the attitudes above chance, with some slight confusion inside the politeness class. French subjects do not process the concept of degree of politeness: they do not identify the typical Japanese politeness degrees. The prosody of “, highest degree of politeness in Japanese, is misunderstood by French subjects as the contrary meanings “impoliteness”, “authority” and “irritation”.

- Evaluation of Affective Expressivity | Pp. 482-489

doi: 10.1007/11573548_63

Emotion Semantics Image Retrieval: An Brief Overview

Shangfei Wang; Xufa Wang

Emotion is the most abstract semantic structure of images. This paper overviews recent research on emotion semantics image retrieval. First, the paper introduces the general framework of emotion semantics image retrieval and points out the four main research issues: to extract sensitive features from images, to define users’ emotion information, to build an emotion user model and to individualize the user model. Then several algorithms to solve these four issues are analyzed in detail. After that, some future research topics, including construction of an emotion database, evaluation of the user model and computation of the user model, are discussed, and some strategies for resolving them are presented in elementary form.

- Evaluation of Affective Expressivity | Pp. 490-497

doi: 10.1007/11573548_64

Affective Modeling in Behavioral Simulations: Experience and Implementations

Robert A. Duisberg

Recent studies have convincingly demonstrated the critical role of affect in human cognitive development and expression, supporting the case for incorporating affective representation into behavioral simulations for artificial intelligence. Music provides a powerful and concise mechanism for evoking and indeed representing emotions, and thus studying the ways in which music represents affect can provide insights into computer representations. That music can be understood as a multidimensional structure leads to the consideration of systemic grammars for this representation. A systemic grammar of emotions is presented which has proven effective as the basis for a concrete – and marketable – implementation of behavioral simulations for virtual characters, by allowing the system to parse interactions between characters into representations of emotional states, and using the attributes of those determined states as determinants of subsequent behavior.

- Evaluation of Affective Expressivity | Pp. 498-504

doi: 10.1007/11573548_66

The Reliability and Validity of the Chinese Version of Abbreviated PAD Emotion Scales

Xiaoming Li; Haotian Zhou; Shengzun Song; Tian Ran; Xiaolan Fu

The study aimed at testing the reliability and validity of the Chinese version of Abbreviated PAD Emotion Scales using a Chinese sample. 297 Chinese undergraduate students were tested with the Chinese version of Abbreviated PAD Emotion Scales; 98 of them were retested with the same scales after seven days in order to assess the test-retest reliability; and 102 of them were tested with SCL-90 at the same time which was intended as criteria for validity to assess the criterion validity. The results showed that the Chinese version of Abbreviated PAD Emotion Scales displayed satisfying reliability and validity on P (pleasure-displeasure), only moderate reliability and validity on D (dominance-submissiveness), but quite low reliability and validity on A (arousal-nonarousal).

- Evaluation of Affective Expressivity | Pp. 513-518