MIP4 Exploiting periodicity and aperiodicity to segregate target speech from background sounds

Numerous studies have shown that perceiving speech in noise is much more difficult for listeners with any kind of hearing impairment than for normal-hearing listeners. To address this problem, a deeper understanding of the acoustic factors involved in the perception of speech in noise is needed.

One key aspect of speech intelligibility that is not yet well understood is the role of periodicity and aperiodicity. In the context of speech, periodicity and aperiodicity simply denote whether a sound is produced with or without periodic vibration of the vocal folds. The perceptual consequence of this vibration is that the resulting sounds have a prominent sonorous character and that we can attribute a stable pitch to them. The ability to produce periodic sounds is also a key feature of many musical instruments, where it is usually achieved by making a string or air column vibrate. Aperiodic sounds, on the other hand, do not possess these qualities and thus sound rough and noise-like. Since the acoustic nature of these two classes of speech sounds is so different, it seems reasonable to assume that this difference can be exploited to segregate target speech from background sounds. To test this hypothesis, we are presenting normal-hearing listeners, hearing-impaired listeners and cochlear implant users with a variety of English sentences, either in quiet or embedded in background noise, where both the sentences and the noise vary in their amount of periodicity and aperiodicity. While natural speech is characterised by a mix of periodic and aperiodic segments, dedicated computer software also makes it possible to produce speech that is either entirely periodic or entirely aperiodic.
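The acoustic contrast between periodic and aperiodic sounds described above can be illustrated with a minimal sketch. It is not the software or the stimuli used in the project: the harmonic complex standing in for a voiced sound, the Gaussian noise standing in for an unvoiced sound, the function names, the 8 kHz sampling rate and the 75 to 400 Hz pitch range are all assumptions made for illustration. The sketch measures periodicity as the highest normalised autocorrelation value within the range of lags that corresponds to plausible voice pitch.

```python
import math
import random


def harmonic_complex(f0, duration, fs, n_harmonics=10):
    """Periodic stand-in for a voiced sound: equal-amplitude harmonics of f0 (assumption)."""
    n = int(duration * fs)
    return [
        sum(math.sin(2 * math.pi * f0 * h * t / fs) for h in range(1, n_harmonics + 1))
        for t in range(n)
    ]


def white_noise(duration, fs, seed=0):
    """Aperiodic stand-in for an unvoiced sound: Gaussian white noise (assumption)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(int(duration * fs))]


def periodicity_strength(signal, fs, f0_min=75.0, f0_max=400.0):
    """Highest normalised autocorrelation over lags between 1/f0_max and 1/f0_min."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    min_lag = int(fs / f0_max)
    max_lag = int(fs / f0_min)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy
        best = max(best, r)
    return best


fs = 8000  # sampling rate in Hz, chosen only to keep the example fast
periodic = periodicity_strength(harmonic_complex(100.0, 0.1, fs), fs)
aperiodic = periodicity_strength(white_noise(0.1, fs), fs)
print(f"periodic:  {periodic:.2f}")   # high: the waveform repeats every 10 ms
print(f"aperiodic: {aperiodic:.2f}")  # low: no lag at which the noise repeats
```

A signal that repeats at a fixed interval correlates strongly with a delayed copy of itself at that interval, which is why the harmonic complex scores high and the noise scores low. Natural speech would fall between the two extremes, depending on the proportion of voiced and unvoiced segments.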

So far, we have tested normal-hearing listeners in a series of experiments and have found that periodic background sounds are generally much less effective maskers than aperiodic ones, and that speech embedded in background noise is generally easier to understand when the amount of periodicity in both the target and the masker is high. Furthermore, our results indicate that, contrary to what has previously been believed, it is more difficult to benefit from the dips of a fluctuating masker if the masker is periodic. We hope that at least some of these effects will transfer to hearing-impaired listeners and cochlear implant users, since these findings may prove valuable for improving hearing prostheses and cochlear implants. Additionally, we plan to use cortical electroencephalography (EEG) in future studies to examine how neural processing differs for specific combinations of target speech and background noise, which may also help to explain the differences in performance between subjects.

Fellow: Kurt Steinmetzger

Main host institution: University College London

Second host institution: Technical University of Denmark

Industry partner: Royal National Throat, Nose and Ear Hospital, London

