Models that quantify intelligibility have been studied since the early days of telephony. Until recently, however, the focus has been on macroscopic models, which provide average numerical estimates of overall intelligibility in slowly varying additive noise or in the presence of reverberant energy. Microscopic intelligibility prediction, by contrast, has great potential for modelling speech perception, both for groups and for individuals. Recent models, developed by INSPIRE members and others, aim to explain listener responses at the level of individual tokens such as words and syllables.
In INSPIRE, this microscopic approach will be taken much further by using a large corpus of common listener confusions to diagnose, refine and evaluate computational models that predict responses across the corpus. Intelligibility modelling also supports rapid prototyping and shortens the development cycle for speech enhancement algorithms by bypassing expensive listener panel testing. Finally, it will improve the tuning of individual hearing aids.
Project MIP-1 focuses on the early processes of signal decomposition at the auditory periphery, energetic masking, and signal component recombination, a stage at which audible energy from both target speech and background noise can be misallocated.
Project MIP-2 adopts a complementary approach, examining the binaural processes involved in improving intelligibility. It will also investigate how sequential grouping is organised to undo the effects of informational masking, with and without reverberation.
The goal of Project MIP-3 is to better understand why current intelligibility models often fail to predict the intelligibility improvements obtained by noise reduction techniques, and to develop improved models that accurately estimate the effect of signal processing on intelligibility. This work is especially relevant to the development and evaluation of algorithms for hearing instruments. It also has applications in context- and listener-specific speech modification, which aims to improve communication with speech technology for groups such as elderly or non-native listeners in adverse conditions.
Project MIP-4 will focus on how periodicity and aperiodicity can be used to segregate speech from masking sound sources. It will investigate the ability of four groups of listeners to perceive speech targets against a background of speech maskers, in a variety of conditions that mix the presence and absence of periodicity: (1) those with normal hearing, listening both normally and to simulations of cochlear implant (CI) processing or hearing loss, (2) hearing aid (HA) users, (3) CI users, and (4) those using a combination of HA and CI.