Publications catalog - theses


Open Access Title

Modelado de estructuras prosódicas para el reconocimiento automático del habla

Enrique Marcelo Albornoz; Diego Humberto Milone; Humberto Torres; Horacio Leone; Marcelo Risk; Omar Chiotti; Hugo Leonardo Rufiner

acceptedVersion.

Abstract/Description – provided by the digital repository
Prosody describes certain physical quantities that can be measured in voice signals (energy, fundamental frequency, etc.). These quantities carry valuable information for identifying and classifying different aspects of voice production. Automatic Speech Recognition (ASR) is a multidisciplinary area of study whose ultimate purpose is to build a machine that recognizes words, and even understands their meaning, for any speaker in any environment. Current ASR systems use hidden Markov models (HMM) to perform a phonetic-acoustic characterization of speech, without explicitly considering prosodic information. This thesis aims to find clear links between prosodic features and the spoken words, and to define a new way to classify the accentual prominences of the language. Word models that categorize words according to their prosodic information are defined, and a way to incorporate the prosodic classifiers into standard ASR is proposed. Furthermore, an in-depth study is performed of the acoustic sequences, associated with words, that cause problems for the ASR. For these, specialized prosodic classifiers are generated for each word. This thesis also addresses the emotion recognition task. This part of the work begins with an exploration of classifiers based on Gaussian mixtures and HMM. The prosodic-acoustic features of emotions were analyzed in order to group the emotions in an unsupervised way. Then, hierarchical classification models that include these groupings of emotions were developed. The novel models show improved performance relative to standard classifiers.
Keywords – provided by the digital repository

Automatic speech recognition; Emotion recognition; Prosodic modeling; Language models; Prosodic-acoustic analysis; Hierarchical classifiers; Reconocimiento automático del habla; Reconocimiento de emociones; Modelado prosódico; Modelos de lenguaje; Análisis prosódico-acústicos; Clasificadores jerárquicos

Availability
Detected institution Publication year Browse Download Request
Not required 2013 Biblioteca Virtual de la Universidad Nacional del Litoral (SNRD) open access

Information

Resource type:

thesis

Publication languages

  • Spanish (Castilian)

Country of publication

Argentina

Publication date