Publications catalog - books
Affective Computing and Intelligent Interaction: 1st International Conference, ACII 2005, Beijing, China, October 22-24, 2005, Proceedings
Jianhua Tao ; Tieniu Tan ; Rosalind W. Picard (eds.)
Conference: 1st International Conference on Affective Computing and Intelligent Interaction (ACII). Beijing, China. October 22, 2005 - October 24, 2005
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2005 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-29621-8
Electronic ISBN
978-3-540-32273-3
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Copyright information
© Springer-Verlag Berlin Heidelberg 2005
Table of contents
doi: 10.1007/11573548_55
Prosodic Reading Style Simulation for Text-to-Speech Synthesis
Oliver Jokisch; Hans Kruschke; Rüdiger Hoffmann
The simulation of different reading styles (mainly by adapting prosodic parameters) can improve the naturalness of synthetic speech and supports a more intelligent human machine interaction. The article exemplarily investigates the reading styles News and Tale. For comparison, all examined texts contained the same genre-neutral paragraphs which have been read without a specific style instruction: Normal but also faster, slower, rather monotone or more emotional which led to corresponding artificial styles.
The measured original intonation and durations style patterns control a diphone synthesizer (mapped contours). Additionally, the patterns are used to train a neural network (NN) model.
Within two separate listening tests, different stimuli presented as original signal/style, respectively, with mapped or NN generated prosodic contours have been evaluated. The results show that both original utterances and artificial styles are basically perceived in their intended reading styles. Some reciprocal confusions indicate the similarities between different styles like News and Fast, Tale and Slow as well as Tale and Expressive. The confusions are more likely for synthetic speech. To produce e.g. the complex style Tale, different features of the prosodic variations Slow and Expressive are combined. The training method for the synthetic styles requires further improvement.
- Affective Speech Processing | Pp. 426-432
doi: 10.1007/11573548_56
F0 Contour of Prosodic Word in Happy Speech of Mandarin
Haibo Wang; Aijun Li; Qiang Fang
This paper focuses on analyzing the F0 contour of happy speech. We designed some declarative sentences and recorded them in happy and neutral expressive states. All of our speakers were asked to express these sentences in the same imaginary scene. It is known that emotion can be expressed through modifying acoustic features of speech in various ways, such as pitch, intensity, voice quality and so on. In this study, we compared the difference of F0 contour between happy and neutral speech through which we found that: (1) F0 contour plays an important role when happiness is expressed. (2) The F0 contour of happy speech displays a kind of declination pattern, but the degree of declination is less than that of neutral speech. (3) Contrasting to neutral speech, the pitch register of happy speech is higher, and the slope of F0 contour of the final syllable of each prosodic word is bigger, especially for the syllable at the end of the sentence.
- Affective Speech Processing | Pp. 433-440
doi: 10.1007/11573548_57
A Novel Source Analysis Method by Matching Spectral Characters of LF Model with STRAIGHT Spectrum
Zhen-Hua Ling; Yu Hu; Ren-Hua Wang
This paper presents a voice source analysis method by studying the spectral characters of the LF model and their representation in the output speech signal. The estimation of source features is defined as the set of LF parameters whose spectrum has the most similar characters in the frequency domain, including glottal formant and spectral tilt, to the corresponding characters held by the STRAIGHT spectrum of the speech signal under analysis. Besides, the concept of analyzable frame is introduced to ensure the feasibility and improve the reliability of the proposed method. Evaluation with synthetic speech shows that this method is able to estimate the LF parameters with satisfactory precision. Furthermore, the experiment with emotional speech shows the effectiveness of the proposed method in describing voice quality variety among speech with different emotions.
- Affective Speech Processing | Pp. 441-448
doi: 10.1007/11573548_58
Features Importance Analysis for Emotional Speech Classification
Jianhua Tao; Yongguo Kang
The paper analyzes the prosody features, which include intonation, speaking rate, and intensity, based on classified emotional speech. As an important feature of voice quality, the voice source is also deduced for analysis. With the analysis results above, the paper creates both a CART model and a weight decay neural network model to find acoustic importance towards the emotional speech classification and to disclose whether there is an underlying consistency between acoustic features and speech emotion. The result shows the proposed method can obtain the importance of each acoustic feature through its weight for emotional speech classification and further improve the emotional speech classification.
- Affective Speech Processing | Pp. 449-457
doi: 10.1007/11573548_59
Toward Making Humans Empathize with Artificial Agents by Means of Subtle Expressions
Takanori Komatsu
Can we assign attitudes to a computer based on its represented subtle expressions, such as beep sounds and simple animations? If so, which kinds of beep sounds or simple animations are perceived as specific attitudes, such as “disagreement”, “hesitation” or “agreement”? To examine this issue, I carried out two experiments to observe and clarify how participants perceive or assign an attitude to a computer according to beep sounds of different durations and F0 contour’s slopes (Experiment 1) or simple animations of different durations and objects’ velocities (Experiment 2). The results of these two experiments revealed that 1) subtle expressions with increasing intonations (Experiment 1) or velocities (Experiment 2) were perceived by participants as “disagreement”, 2) flat intonations and velocities with longer duration were interpreted as “hesitation”, and 3) decreasing intonations and velocities with shorter duration were taken as “agreement.”
- Evaluation of Affective Expressivity | Pp. 458-465
doi: 10.1007/11573548_61
Lexical Resources and Semantic Similarity for Affective Evaluative Expressions Generation
Alessandro Valitutti; Carlo Strapparava; Oliviero Stock
This paper presents resources and functionalities for the selection of affective evaluative terms. An affective hierarchy as an extension of the lexical database was developed in the first place. The second phase was the development of a semantic similarity function, acquired automatically in an unsupervised way from a large corpus of texts that allows us to put into relation concepts and emotional categories. The integration of the two components is a key element for several applications.
- Evaluation of Affective Expressivity | Pp. 474-481
doi: 10.1007/11573548_62
Because Attitudes Are Social Affects, They Can Be False Friends...
Takaaki Shochi; Véronique Aubergé; Albert Rilliard
The attitudes of the speaker during a verbal interaction are affects built by language and culture. Since they are a sophisticated material for expressing complex affects, using a channel of control that is surely not confused with emotions, they are the larger part of the affects expressed during an interaction, as could be shown on large databases by Campbell [3]. Twelve representative attitudes of Japanese are presented for listening both to native Japanese speakers and to native French speakers with no knowledge of Japanese. They include “y”, “”, “”, “”, “”, “”, “”, “”, and four socially referenced degrees of politeness: “”, “”, “” and “” (Sadanobu [11]). Two perception experiments using a closed forced choice were carried out, each attitude introduced by a definition and some examples of real situations. The 15 native Japanese subjects discriminate all the attitudes over chance, with some little confusion inside the politeness class. French subjects do not process the concept of degree of politeness: they do not identify the typical Japanese politeness degrees. The prosody of “, highest degree of politeness in Japanese, is misunderstood by French subjects with the contrary meanings of “impoliteness”, “authority” and “irritation”.
- Evaluation of Affective Expressivity | Pp. 482-489
doi: 10.1007/11573548_63
Emotion Semantics Image Retrieval: An Brief Overview
Shangfei Wang; Xufa Wang
Emotion is the most abstract semantic structure of images. This paper overviews recent research on emotion semantics image retrieval. First, the paper introduces the general frame of emotion semantics image retrieval and points out the four main research issues: to extract sensitive features from images, to define users’ emotion information, to build an emotion user model and to individualize the user model. Then several algorithms to solve these four issues are analyzed in detail. After that, some future research topics, including construction of an emotion database, evaluation of the user model and computation of the user model, are discussed, and some resolved strategies are presented elementarily.
- Evaluation of Affective Expressivity | Pp. 490-497
doi: 10.1007/11573548_64
Affective Modeling in Behavioral Simulations: Experience and Implementations
Robert A. Duisberg
Recent studies have convincingly demonstrated the critical role of affect in human cognitive development and expression, supporting the case for incorporating affective representation into behavioral simulations for artificial intelligence. Music provides a powerful and concise mechanism for evoking and indeed representing emotions, and thus studying the ways in which music represents affect can provide insights into computer representations. That music can be understood as a multidimensional structure leads to the consideration of systemic grammars for this representation. A systemic grammar of emotions is presented which has proven effective as the basis for a concrete – and marketable – implementation of behavioral simulations for virtual characters, by allowing the system to parse interactions between characters into representations of emotional states, and using the attributes of those determined states as determinants of subsequent behavior.
- Evaluation of Affective Expressivity | Pp. 498-504
doi: 10.1007/11573548_66
The Reliability and Validity of the Chinese Version of Abbreviated PAD Emotion Scales
Xiaoming Li; Haotian Zhou; Shengzun Song; Tian Ran; Xiaolan Fu
The study aimed at testing the reliability and validity of the Chinese version of Abbreviated PAD Emotion Scales using a Chinese sample. 297 Chinese undergraduate students were tested with the Chinese version of Abbreviated PAD Emotion Scales; 98 of them were retested with the same scales after seven days in order to assess the test-retest reliability; and 102 of them were tested with SCL-90 at the same time which was intended as criteria for validity to assess the criterion validity. The results showed that the Chinese version of Abbreviated PAD Emotion Scales displayed satisfying reliability and validity on P (pleasure-displeasure), only moderate reliability and validity on D (dominance-submissiveness), but quite low reliability and validity on A (arousal-nonarousal).
- Evaluation of Affective Expressivity | Pp. 513-518