Publications catalog - books

Open Access Title

Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing

Pim van Dijk; Deniz Başkent; Etienne Gaudrain; Emile de Kleine; Anita Wagner; Cris Lanting (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Neurosciences; Otorhinolaryngology

Availability

Detected institution: None required
Year of publication: 2016
Browse / Download: SpringerLink (open access)

Information

Resource type:

books

Print ISBN

978-3-319-25472-2

Electronic ISBN

978-3-319-25474-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2016

Publication rights information

© The Editor(s) (if applicable) and the Author(s) 2016

Subject coverage

Table of contents

Functional Organization of the Ventral Auditory Pathway

Yale E. Cohen; Sharath Bennur; Kate Christison-Lagay; Adam M. Gifford; Joji Tsunada

The fundamental problem in audition is determining the mechanisms required by the brain to transform an unlabelled mixture of auditory stimuli into coherent perceptual representations. This process is called auditory-scene analysis. The perceptual representations that result from auditory-scene analysis are formed through a complex interaction of perceptual grouping, attention, categorization and decision-making. Despite a great deal of scientific energy devoted to understanding these aspects of hearing, we still do not understand (1) how sound perception arises from neural activity and (2) the causal relationship between neural activity and sound perception. Here, we review the role of the “ventral” auditory pathway in sound perception. We hypothesize that, in the early parts of the auditory cortex, neural activity reflects the auditory properties of a stimulus. However, in later parts of the auditory cortex, neurons encode the sensory evidence that forms an auditory decision and are causally involved in the decision process. Finally, in the prefrontal cortex, which receives input from the auditory cortex, neural activity reflects the actual perceptual decision. Together, these studies indicate that the ventral pathway contains hierarchical circuits that are specialized for auditory perception and scene analysis.

Pp. 381-388

Neural Segregation of Concurrent Speech: Effects of Background Noise and Reverberation on Auditory Scene Analysis in the Ventral Cochlear Nucleus

Mark Sayles; Arkadiusz Stasiak; Ian M. Winter

Concurrent complex sounds (e.g., two voices speaking at once) are perceptually disentangled into separate “auditory objects”. This neural processing often occurs in the presence of acoustic-signal distortions from noise and reverberation (e.g., in a busy restaurant). A difference in periodicity between sounds is a strong segregation cue under quiet, anechoic conditions. However, noise and reverberation exert differential effects on speech intelligibility under “cocktail-party” listening conditions. Previous neurophysiological studies have concentrated on understanding auditory scene analysis under ideal listening conditions. Here, we examine the effects of noise and reverberation on periodicity-based neural segregation of concurrent vowels /a/ and /i/, in the responses of single units in the guinea-pig ventral cochlear nucleus (VCN): the first processing station of the auditory brain stem. In line with human psychoacoustic data, we find reverberation significantly impairs segregation when vowels have an intonated pitch contour, but not when they are spoken on a monotone. In contrast, noise impairs segregation independent of intonation pattern. These results are informative for models of speech processing under ecologically valid listening conditions, where noise and reverberation abound.
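The periodicity cue this chapter builds on can be made concrete: two concurrent harmonic sounds with different fundamental frequencies leave distinct peaks at their respective periods in an autocorrelation of the mixture. A minimal sketch with illustrative F0s and parameters, not the study's vowel stimuli:

```python
# Two concurrent harmonic "sources" with different F0s: the mixture's
# autocorrelation is elevated at each source's period, which is the
# periodicity cue exploited for segregation. All values are illustrative.
import numpy as np

fs = 16000                        # sampling rate (Hz)
t = np.arange(0, 0.2, 1 / fs)     # 200 ms

def harmonic_complex(f0, n_harmonics=10):
    """Equal-amplitude harmonics of f0: a crude stand-in for a voiced vowel."""
    return sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, n_harmonics + 1))

mixture = harmonic_complex(100.0) + harmonic_complex(125.0)

# Normalised autocorrelation of the mixture.
ac = np.correlate(mixture, mixture, mode="full")[len(mixture) - 1:]
ac /= ac[0]

for f0 in (100.0, 125.0):
    lag = int(round(fs / f0))     # lag of one F0 period, in samples
    print(f"F0 = {f0:.0f} Hz: autocorrelation at its period = {ac[lag]:.2f}")
```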

Pp. 389-397

Audio Visual Integration with Competing Sources in the Framework of Audio Visual Speech Scene Analysis

Attigodu Chandrashekara Ganesh; Frédéric Berthommier; Jean-Luc Schwartz

We introduce “Audio-Visual Speech Scene Analysis” (AVSSA) as an extension of the two-stage Auditory Scene Analysis model towards audiovisual scenes made of mixtures of speakers. AVSSA assumes that a coherence index between the auditory and the visual input is computed prior to audiovisual fusion, making it possible to determine whether the sensory inputs should be bound together. Previous experiments on the modulation of the McGurk effect by audiovisually coherent vs. incoherent contexts presented before the McGurk target have provided experimental evidence supporting AVSSA. Indeed, incoherent contexts appear to decrease the McGurk effect, suggesting that they produce lower audiovisual coherence and hence less audiovisual fusion. The present experiments extend the AVSSA paradigm by creating contexts made of competing audiovisual sources and measuring their effect on McGurk targets. The competing audiovisual sources have, respectively, high and low audiovisual coherence (that is, large vs. small audiovisual comodulations in time). The first experiment involves contexts made of two auditory sources and one video source associated with either the first or the second audio source. It appears that the McGurk effect is smaller after the context made of the visual source associated with the auditory source with less audiovisual coherence. In the second experiment with the same stimuli, the participants are asked to attend to one or the other source. The data show that the modulation of fusion depends on the attentional focus. Altogether, these two experiments shed light on audiovisual binding, the AVSSA process and the role of attention.
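The abstract does not spell out the coherence index AVSSA assumes; one simple stand-in is the correlation between the audio amplitude envelope and a visual articulatory parameter (e.g., lip aperture) tracked over the context. A hedged sketch with synthetic signals:

```python
# Illustrative audiovisual coherence index: Pearson correlation between a
# slowly varying audio envelope and a visual parameter track. The comodulated
# pair scores high, the unrelated pair near zero. Signals are synthetic
# stand-ins, not the study's stimuli.
import numpy as np

rng = np.random.default_rng(0)

def smooth_noise(n, width=20):
    """Low-pass random track, standing in for an envelope sampled over time."""
    x = rng.standard_normal(n + width - 1)
    return np.convolve(x, np.ones(width) / width, mode="valid")

audio_env = smooth_noise(200)
lip_coherent = audio_env + 0.3 * smooth_noise(200)    # comodulated video
lip_incoherent = smooth_noise(200)                    # independent video

def coherence_index(a, v):
    return np.corrcoef(a, v)[0, 1]

print("coherent pair:  ", round(coherence_index(audio_env, lip_coherent), 2))
print("incoherent pair:", round(coherence_index(audio_env, lip_incoherent), 2))
```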

Pp. 399-408

Relative Pitch Perception and the Detection of Deviant Tone Patterns

Susan L. Denham; Martin Coath; Gábor P. Háden; Fiona Murray; István Winkler

Most people are able to recognise familiar tunes even when they are played in a different key. It is assumed that this depends on a general capacity for relative pitch perception: the ability to recognise the pattern of inter-note intervals that characterises the tune. However, when healthy adults are required to detect rare deviant melodic patterns in a sequence of randomly transposed standard patterns, they perform close to chance. Musically experienced participants perform better than naïve participants, but even they find the task difficult, despite the fact that musical education includes training in interval recognition.

To understand the source of this difficulty we designed an experiment to explore the relative influence of the size of within-pattern intervals and between-pattern transpositions on detecting deviant melodic patterns. We found that task difficulty increases when patterns contain large intervals (5–7 semitones) rather than small intervals (1–3 semitones). While task difficulty increases substantially when transpositions are introduced, the effect of transposition size (large vs small) is weaker. Increasing the range of permissible intervals to be used also makes the task more difficult. Furthermore, providing an initial exact repetition followed by subsequent transpositions does not improve performance. Although musical training correlates with task performance, we find no evidence that violations to musical intervals important in Western music (i.e. the perfect fifth or fourth) are more easily detected. In summary, relative pitch perception does not appear to be amenable to simple explanations based exclusively on invariant physical ratios.
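The transposition invariance underlying the task is easy to state: a melody's identity lives in its inter-note intervals, not its absolute pitches. A minimal sketch in semitone (MIDI-style) units with made-up note values:

```python
# A transposed pattern keeps its inter-note intervals; a deviant changes them.
def intervals(notes):
    """Successive intervals in semitones: the transposition-invariant code."""
    return [b - a for a, b in zip(notes, notes[1:])]

standard = [60, 65, 62]                  # a three-note standard pattern
transposed = [n + 7 for n in standard]   # same pattern, 7 semitones higher
deviant = [60, 65, 67]                   # different interval pattern

print(intervals(standard))    # [5, -3]
print(intervals(transposed))  # [5, -3] -> same melody in a different key
print(intervals(deviant))     # [5,  2] -> detectable deviant
```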

Pp. 409-417

Do Zwicker Tones Evoke a Musical Pitch?

Hedwig E. Gockel; Robert P. Carlyon

It has been argued that musical pitch, i.e. pitch in its strictest sense, requires phase locking at the level of the auditory nerve. The aim of the present study was to assess whether a musical pitch can be heard in the absence of peripheral phase locking, using Zwicker tones (ZTs). A ZT is a faint, decaying tonal percept that arises after listening to a band-stop (notched) broadband noise. The pitch is within the frequency range of the notch. Several findings indicate that ZTs are unlikely to be produced mechanically at the level of the cochlea and, therefore, there is unlikely to be phase locking to ZTs in the auditory periphery. In stage I of the experiment, musically trained subjects adjusted the frequency, level, and decay time of an exponentially decaying sinusoid so that it sounded similar to the ZT they perceived following a broadband noise, for various notch positions. In stage II, subjects adjusted the frequency of a sinusoid so that its pitch was a specified musical interval below that of either a preceding ZT or a preceding sinusoid (as determined in stage I). Subjects selected appropriate frequency ratios for ZTs, although the standard deviations of the adjustments were larger for the ZTs than for the equally salient sinusoids by a factor of 1.1–2.2. The results suggest that a musical pitch may exist in the absence of peripheral phase locking.
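In stage II the target frequency follows directly from equal-temperament arithmetic: a tone n semitones below a reference has frequency f_ref · 2^(−n/12). A short worked example; the reference frequency is illustrative, not a measured Zwicker-tone match:

```python
# Equal-temperament frequency for a musical interval below a reference pitch.
def interval_below(f_ref, semitones):
    """Frequency `semitones` below f_ref in 12-tone equal temperament."""
    return f_ref * 2.0 ** (-semitones / 12.0)

f_ref = 2000.0                                   # illustrative reference (Hz)
print(round(interval_below(f_ref, 7), 1))        # perfect fifth below: 1334.8 Hz
print(round(interval_below(f_ref, 5), 1))        # perfect fourth below: 1498.3 Hz
```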

Pp. 419-426

Speech Coding in the Midbrain: Effects of Sensorineural Hearing Loss

Laurel H. Carney; Duck O. Kim; Shigeyuki Kuwada

In response to voiced speech sounds, auditory-nerve (AN) fibres phase-lock to harmonics near best frequency (BF) and to the fundamental frequency (F0). Due to nonlinearities in the healthy ear, phase-locking in each frequency channel is dominated either by a single harmonic, for channels tuned near formants, or by F0, for channels between formants. The alternating dominance of these factors sets up a robust pattern of F0-synchronized rate across BF. This profile of a temporally coded measure is transformed into a mean rate profile in the midbrain (inferior colliculus, IC), where neurons are sensitive to low-frequency fluctuations. In the impaired ear, the F0-synchronized rate profile is affected by several factors: Reduced synchrony capture decreases the dominance of a single harmonic near BF on the response. Elevated thresholds also reduce the effect of rate saturation, resulting in increased F0-synchrony. Wider peripheral tuning results in a wider-band envelope with reduced F0 amplitude. In general, sensorineural hearing loss reduces the contrast in AN F0-synchronized rates across BF. Computational models for AN and IC neurons illustrate how hearing loss would affect the F0-synchronized rate profiles set up in response to voiced speech sounds.
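Phase locking to F0 is conventionally quantified with vector strength, the length of the mean phase vector of the spike times at the frequency of interest; this is the textbook measure, not necessarily the chapter's exact metric. A sketch with simulated spike trains:

```python
# Vector strength at F0: 1 = perfect phase locking, ~0 = none.
import numpy as np

def vector_strength(spike_times, freq):
    """Mean resultant length of spike phases at `freq` (times in s, freq in Hz)."""
    phases = 2 * np.pi * freq * np.asarray(spike_times)
    return np.abs(np.mean(np.exp(1j * phases)))

f0 = 100.0                                        # voice fundamental (Hz)
rng = np.random.default_rng(1)
locked = np.arange(50) / f0 + 0.0005 * rng.standard_normal(50)  # one spike/cycle
unlocked = rng.uniform(0, 0.5, 50)                              # random spikes

print(round(vector_strength(locked, f0), 2))    # near 1
print(round(vector_strength(unlocked, f0), 2))  # near 0
```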

Pp. 427-435

Sources of Variability in Consonant Perception and Implications for Speech Perception Modeling

Johannes Zaar; Torsten Dau

The present study investigated the influence of various sources of response variability in consonant perception. A distinction was made between source-induced variability and receiver-related variability. The former refers to perceptual differences induced by differences in the speech tokens and/or the masking noise tokens; the latter describes perceptual differences caused by within- and across-listener uncertainty. Consonant-vowel combinations (CVs) were presented to normal-hearing listeners in white noise at six different signal-to-noise ratios. The obtained responses were analyzed with respect to the considered sources of variability using a measure of the perceptual distance between responses. The largest effect was found across different CVs. For stimuli of the same phonetic identity, the speech-induced variability across and within talkers and the across-listener variability were substantial and of similar magnitude. Even time-shifts in the waveforms of white masking noise produced a significant effect, which was well above the within-listener variability (the smallest effect). Two auditory-inspired models in combination with a template-matching back end were considered to predict the perceptual data. In particular, an energy-based and a modulation-based approach were compared. The suitability of the two models was evaluated with respect to the source-induced perceptual distance and in terms of consonant recognition rates and consonant confusions. Both models captured the source-induced perceptual distance remarkably well. However, the modulation-based approach showed a better agreement with the data in terms of consonant recognition and confusions. The results indicate that low-frequency modulations up to 16 Hz play a crucial role in consonant perception.
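The template-matching back end can be illustrated generically: the internal representation of a noisy test token is compared against clean templates, and the best match is the predicted response. A sketch assuming vector representations and correlation as the similarity measure; the study's actual front-end representations and metric differ:

```python
# Generic template matching: pick the consonant template most similar to the
# noisy internal representation. Representations here are random vectors.
import numpy as np

def predict(test_rep, templates):
    """Label of the template most correlated with the test representation."""
    scores = {cv: np.corrcoef(test_rep, rep)[0, 1] for cv, rep in templates.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(2)
templates = {cv: rng.standard_normal(64) for cv in ("ba", "da", "ga")}
noisy_ba = templates["ba"] + 0.8 * rng.standard_normal(64)  # "ba" token in noise

print(predict(noisy_ba, templates))  # "ba"; confusions grow with the noise level
```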

Pp. 437-446

On Detectable and Meaningful Speech-Intelligibility Benefits

William M. Whitmer; David McShefferty; Michael A. Akeroyd

The most important parameter that affects the ability to hear and understand speech in the presence of background noise is the signal-to-noise ratio (SNR). Despite decades of research in speech intelligibility, it is not currently known how much improvement in SNR is needed to provide a meaningful benefit to someone. We propose that the underlying psychophysical basis to a meaningful benefit should be the just noticeable difference (JND) for SNR. The SNR JND was measured in a series of experiments using both adaptive and fixed-level procedures across participants of varying hearing ability. The results showed an average SNR JND of approximately 3 dB for sentences in same-spectrum noise. The role of the stimulus and link to intelligibility was examined by measuring speech-intelligibility psychometric functions and comparing the intelligibility JND estimated from those functions with measured SNR JNDs. Several experiments were then conducted to establish a just meaningful difference (JMD) for SNR. SNR changes that could induce intervention-seeking behaviour for an individual were measured with subjective scaling and report, using the same stimuli as the SNR JND experiment as pre- and post-benefit examples. The results across different rating and willingness-to-change tasks showed that the mean ratings increased near linearly with a change in SNR, but a change of at least 6 dB was necessary to reliably motivate participants to seek intervention. The magnitude of the JNDs and JMDs for speech-intelligibility benefits measured here suggest a gap between what is achievable and what is meaningful.
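For orientation, SNR in dB is 10·log10(P_signal/P_noise), so the ~3 dB JND reported here amounts to roughly doubling the signal-to-noise power ratio, and the ~6 dB JMD to roughly quadrupling it. A worked example with illustrative powers:

```python
# SNR in dB, and what 3 dB / 6 dB changes mean in power-ratio terms.
import math

def snr_db(p_signal, p_noise):
    """Signal-to-noise ratio in dB from signal and noise powers."""
    return 10 * math.log10(p_signal / p_noise)

base = snr_db(1.0, 1.0)                       # 0 dB
print(f"{snr_db(2.0, 1.0) - base:.2f} dB")    # ~3.01 dB: about the average JND
print(f"{snr_db(4.0, 1.0) - base:.2f} dB")    # ~6.02 dB: about the JMD
```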

Pp. 447-455

Individual Differences in Behavioural Decision Weights Related to Irregularities in Cochlear Mechanics

Jungmee Lee; Inseok Heo; An-Chieh Chang; Kristen Bond; Christophe Stoelinga; Robert Lutfi; Glenis Long

An unexpected finding of previous psychophysical studies is that listeners show highly replicable, individualistic patterns of decision weights on frequencies affecting their performance in spectral discrimination tasks—what has been referred to as listening styles. We, like many other researchers, have attributed these listening styles to peculiarities in how listeners attend to sounds, but we now believe they partially reflect irregularities in cochlear micromechanics modifying what listeners hear. The most striking evidence for cochlear irregularities is the presence of low-level spontaneous otoacoustic emissions (SOAEs) measured in the ear canal and the systematic variation in stimulus frequency otoacoustic emissions (SFOAEs), both of which result from back-propagation of waves in the cochlea. SOAEs and SFOAEs vary greatly across individual ears and have been shown to affect behavioural thresholds, behavioural frequency selectivity and judged loudness for tones. The present paper reports pilot data providing evidence that SOAEs and SFOAEs are also predictive of the relative decision weight listeners give to a pair of tones in a level discrimination task. In one condition the frequency of one tone was selected to be near that of an SOAE and the frequency of the other was selected to be in a frequency region for which there was no detectable SOAE. In a second condition the frequency of one tone was selected to correspond to an SFOAE maximum, the frequency of the other tone, an SFOAE minimum. In both conditions a statistically significant correlation was found between the average relative decision weight on the two tones and the difference in OAE levels.
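Relative decision weights in tasks like this are commonly estimated by regressing trial-by-trial decisions on the random level perturbations applied to each tone; the sketch below runs that generic analysis on simulated data and is not the study's procedure:

```python
# Estimate per-tone decision weights from simulated two-tone level judgements.
import numpy as np

rng = np.random.default_rng(3)
n_trials = 2000
perturb = rng.normal(0, 2, size=(n_trials, 2))   # per-tone level jitter (dB)
true_w = np.array([0.8, 0.2])                    # listener over-weights tone 1
decisions = (perturb @ true_w + rng.normal(0, 1, n_trials)) > 0

# Linear (least-squares) approximation to the observer's weights.
w_hat, *_ = np.linalg.lstsq(perturb, decisions.astype(float), rcond=None)
print(np.round(w_hat / w_hat.sum(), 2))          # roughly [0.8, 0.2]
```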

Pp. 457-465

On the Interplay Between Cochlear Gain Loss and Temporal Envelope Coding Deficits

Sarah Verhulst; Patrycja Piktel; Anoop Jagadeesh; Manfred Mauermann

Hearing impairment is characterized by two potentially coexisting sensorineural components: (i) cochlear gain loss that yields wider auditory filters, elevated hearing thresholds and compression loss, and (ii) cochlear neuropathy, a noise-induced component of hearing loss that may impact temporal coding fidelity of supra-threshold sound. This study uses a psychoacoustic amplitude modulation (AM) detection task in quiet and multiple noise backgrounds to test whether these aspects of hearing loss can be isolated in listeners with normal to mildly impaired hearing ability. Psychoacoustic results were compared to distortion-product otoacoustic emission (DPOAE) thresholds and envelope-following response (EFR) measures. AM thresholds to pure-tone carriers (4 kHz) in normal-hearing listeners depended on temporal coding fidelity. AM thresholds in hearing-impaired listeners were normal, indicating that reduced cochlear gain may counteract how reduced temporal coding fidelity degrades AM thresholds. The amount by which a 1-octave-wide masking noise worsened AM detection was inversely correlated with DPOAE thresholds. The narrowband noise masker was shown to affect the hearing-impaired listeners more than the normal-hearing listeners, suggesting that this masker may be targeting a temporal coding deficit. This study offers a window into how psychoacoustic difference measures can be adopted in the differential diagnostics of hearing deficits in listeners with mixed forms of sensorineural hearing loss.
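The AM-detection stimulus is a sinusoidally amplitude-modulated carrier; with a 4 kHz carrier as in the study, the listener's task is to detect the modulation depth m, often expressed as 20·log10(m) dB. A minimal sketch; modulation rate and depth are illustrative:

```python
# Sinusoidally amplitude-modulated (SAM) tone: (1 + m*sin(2*pi*fm*t)) * carrier.
import numpy as np

fs = 44100
t = np.arange(0, 0.5, 1 / fs)       # 500 ms

fc, fm, m = 4000.0, 20.0, 0.2       # carrier (Hz), modulation rate (Hz), depth
sam_tone = (1 + m * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

print(f"modulation depth = {20 * np.log10(m):.1f} dB")  # -14.0 dB
```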

Pp. 467-475