How does NELE work in real life?

NELE (Near End Listening Enhancement) algorithms are meant to improve the intelligibility of speech playback. Real-world applications are many and varied: PA systems in airports or train stations, TV sets and radios, computer loudspeakers, telephones, smart home devices and essentially any electronic device that plays speech. In recent decades scientists have tried different NELE approaches, and several algorithms have proven to offer substantial intelligibility benefits, measured both with human listeners and with objective metrics. However, studies are typically conducted in controlled lab conditions, i.e. with artificial noise. Moreover, additive noise and reverberation are usually treated as separate problems, whereas in real life they occur together.
So even if we have good results in the lab, how can we be sure that our algorithm is going to work fine in a train station or in your living room?
For this reason we wanted to build a realistic test platform for NELE algorithms. We simulated two common acoustic environments in which speech playback may occur: a small household room (the living room) and a large public space (the cafeteria). These two environments are representative because they have opposite characteristics: the living room has a short reverberation time and a highly fluctuating noise (children playing), while the cafeteria has a long reverberation time and a fairly stationary noise (a crowd talking). The image below shows the layout of our acoustic simulations (adapted from the documentation of the HRIR database that we used for the study).

We selected three state-of-the-art NELE algorithms: SSDRC, AdaptDRC and ADOE, which is a combination of AdaptDRC and an algorithm against reverberation. We processed speech with these algorithms and mixed it with noise in our two simulated environments. We ran listening tests with normal-hearing listeners and found that all of the algorithms increased intelligibility in both environments. Our results may provide useful insight into the strategies that are worth pursuing in order to create new technologies for real-world scenarios.
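To give a rough idea of what "simulating an environment" means, here is a minimal Python sketch that applies reverberation to a signal and then adds noise, so that both degradations occur together. It is not the setup used in the study: the study used measured responses from an HRIR database, whereas this sketch assumes a toy single-channel impulse response (decaying white noise), and the function names, the reverberation times (0.3 s and 1.2 s) and the random stand-in signals are all hypothetical.

```python
import numpy as np

def synthetic_rir(rt60_s, fs, rng):
    """Toy room impulse response (assumption: NOT a measured HRIR/BRIR).

    White noise shaped by an exponential envelope that falls by 60 dB
    over rt60_s seconds, with a unit direct path at sample 0.
    """
    n = int(round(rt60_s * fs))
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / rt60_s)  # -60 dB amplitude at t = rt60_s
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct sound
    return rir

def simulate_playback(speech, noise, rir):
    """Convolve speech with the room response, then add the noise."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    reverberant = np.convolve(speech, rir)[: len(speech)]
    return reverberant + noise[: len(speech)]

fs = 16000
rng = np.random.default_rng(0)
speech = rng.standard_normal(fs)       # stand-in for 1 s of speech
noise = 0.1 * rng.standard_normal(fs)  # stand-in for recorded noise

# Hypothetical reverberation times: short for the room, long for the hall.
living_room = simulate_playback(speech, noise, synthetic_rir(0.3, fs, rng))
cafeteria = simulate_playback(speech, noise, synthetic_rir(1.2, fs, rng))
```

A binaural simulation would repeat the convolution with one impulse response per ear, which is what makes headphone listening meaningful.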

You can find all the details of this study in the paper "Evaluating Near End Listening Enhancement Algorithms in Realistic Environments", which was awarded the prize for Best Student Paper at Interspeech 2019, Graz, Austria.
You can listen to some of the samples from the listening tests below; the algorithm-modified files should be compared to the plain files. Plain speech is mixed with noise at an SNR that is meant to yield 75% intelligibility, which means that you should understand 75% of the words on average. The modified files should yield more than that (up to 88%). Please listen via headphones for a correct representation of spatial cues.
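For readers unfamiliar with the term, mixing "at an SNR" means scaling the noise so that the ratio of speech power to noise power matches a target value in decibels. The Python sketch below shows one common way to do this; the function name mix_at_snr, the random stand-in signals and the -5 dB target are hypothetical, and the actual SNRs used in the listening tests are given in the paper, not here.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power equals snr_db, then mix.

    Returns the mixture and the gain that was applied to the noise.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Solve speech_power / (gain**2 * noise_power) = 10**(snr_db / 10) for gain.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise, gain

rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)  # stand-in for a speech recording
noise = rng.standard_normal(16000)   # stand-in for a noise recording

# Hypothetical target: speech 5 dB below the noise.
mixture, gain = mix_at_snr(speech, noise, snr_db=-5.0)
```

If the modified speech is normalized to the same power as the plain speech (an assumption in this sketch), the same noise gain keeps the SNR identical across conditions, so any change in intelligibility comes from the processing rather than from a louder signal.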

Audio samples: plain, SSDRC, AdaptDRC and ADOE, each in the living room and in the cafeteria.


Get in touch: c.chermaz@ed.ac.uk


This project has received funding from the EU's H2020 research and innovation programme under MSCA grant agreement No. 675324.