The Automatic Sound Engineer (ASE) beta
This page contains some audio samples in support of the paper "A Sound Engineering Approach to Near End Listening Enhancement" by Carol Chermaz and Simon King. The paper describes the beta version of The Automatic Sound Engineer (ASE); you can watch my Interspeech presentation here.
ASE is built on sound engineering knowledge. The beta version is simpler: it does not include all the automation features of the full version, which also offers higher audio quality.
ASE beta was entered into the Hurricane Challenge 2.0, which it won by a significant margin, achieving intelligibility improvements of up to +58% words understood on average in noise and reverberation, compared with unmodified speech played at the same SNR. ASE will be the baseline for the next Challenge.
As we do not have access to the stimuli used in that Challenge, we have created mock-up stimuli from the speech and noise used in this study; the two studies have comparable noise conditions.
In these examples, speech is taken from the Hurricane Natural Corpus, a recording of the Harvard sentences. Binaural cafeteria noise and impulse responses are taken from the Oldenburg HRIR database. Plain speech is meant to yield a 25% word accuracy rate (WAR) at the low SNR, 50% at the mid SNR, and 75% at the high SNR.
The same utterance is processed with ASE beta in all conditions and mixed with noise at exactly the same SNRs as plain speech (the goal of the Challenge is to improve intelligibility without raising the volume).
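The mixing rule above (noise level set relative to the unmodified speech, speech level never raised) can be sketched as follows. This is a minimal illustration, not the Challenge's actual mixing code; the function name and the RMS-based level estimate are our assumptions.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Mix speech with noise at a target SNR (in dB).

    Only the noise is scaled; the speech level is left untouched,
    matching the Challenge rule that intelligibility must improve
    without raising the volume.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    rms_speech = np.sqrt(np.mean(speech ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))
    # Gain that makes 20*log10(rms_speech / (gain * rms_noise)) == snr_db
    gain = rms_speech / (rms_noise * 10 ** (snr_db / 20))
    return speech + gain * noise
```

Because the gain is applied to the noise only, running the same function on the ASE-processed utterance with the same `snr_db` reproduces the "exactly the same SNRs" condition described above.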
To better hear the sound modification, we provide an example of ASE beta in reverberation only, and one in silence without reverberation.
For comparison we also provide the same utterance processed with SSDRC, a benchmark algorithm and the second highest-scoring entry in this Challenge.
[Audio examples, no-noise condition: unmodified speech vs. ASE beta, with SSDRC for comparison]
This project has received funding from the EU's H2020 research and innovation programme under MSCA grant agreement 675324.