ASE was declared the winner of the Hurricane Challenge 2.0 during the special session "Intelligibility Enhancing Speech Modifications" at Interspeech 2020. My algorithm, which provided intelligibility gains up to +58% words understood in noise (which corresponds to raising the volume by 7 dB), was chosen as the new baseline for the next challenge. Such a rewarding moment in my career, and such a strange way to experience it: I turned off Zoom at the end of the session and realised I was alone in my living room. No time for those long fruitful discussions and cheering with colleagues, which were the best part of in-person conferences. I hope the world goes back to normal... but in the meantime I will try to improve ASE even further.
The image below is a screenshot from the main presentation of the session, which can be found in the proceedings of Interspeech 2020.
Last year Mark Wright came to CSTR and interviewed a few of us: those recordings are now knit together into a beautiful story. My colleagues and I will tell you about all the things we do, from building voices for those who lost them, to recreating realistic acoustic environments. Mark gives a unique perspective with his story-telling, which managed to surprise me - like someone taking a picture of your home from an unexpected angle. I loved listening to the preview, and also got very emotional at some point, when I heard the ambient noise recordings from my beloved workplace (which I haven't been allowed to visit for months now).
I hope you enjoy this as much as I did.
My research was featured in "Il Piccolo", the local newspaper of my hometown Trieste (Italy). Such an honor! Thanks to Riccardo Tosques for writing this article, and giving me the chance to get my research out there to a wider audience!
I was invited to participate as a panelist in a talk for the series We need to talk about AI. We talked about the role and the ethics of artificial intelligence in assistive technologies. Am I qualified enough to talk about this? Not sure, but it was certainly a great honor to be invited! It was really interesting to get to know the other panelists, and it was great to have the opportunity to discuss with the online audience! Click on the image below to watch the video on YouTube.
I was selected as a finalist for the "Three Minute Thesis" competition of the University of Edinburgh - having won the round of the College of Engineering. This time I didn't win, but that is not the most important thing. I was given the chance to share my thoughts with quite a large audience, which is what matters to me.
I do science communication for three reasons:
1) People have the right to know where public money goes.
2) People have the right to know what technology is already out there, and what's in store for the future. And they need to be able to discuss it and raise questions.
3) There is a reason that is very specific to my field.
My goal is to make life a bit easier for users of hearing prosthetics, which entails two tasks: one as a scientist and one as a human. The task of the scientist is to make better algorithms that improve the quality of hearing devices. The task of the human is to do whatever I can to raise awareness of how difficult it is to live with hearing impairment. As if the practical difficulties weren't enough, users have to face the constant worry of being judged by a society that doesn't understand their needs. Being judged for wearing a prosthetic device.
One thing that really bothers me is that we could achieve much better quality if we were allowed to make bigger devices, with a bigger battery and all. But industry is making them smaller and smaller - not just on grounds of portability and comfort, but because many users prefer to conceal them. Why does this happen, while other prosthetic devices are widely accepted? Think of eyeglasses. They are prosthetic devices, but they have become fashionable items. What does it take for a device to cross the dividing line between representing disability and becoming fashionable?
My suggestion here is that if we could make hearing devices that include new features, which appeal to the general audience (the "normal hearing"), maybe we would see a shift in mentality.
But this is just an idea, I don't have the right answer, and probably there are many right answers. The best that I can do is to make you think, and if I can achieve that, then I have won.
Our lives changed so much during lockdown, but scientists kept on working from home. Mark Wright reached out to me to find out how I was doing, and featured my story on the Listening Across Disciplines Blog.
"What I am listening for now is not just intelligibility, but also the pleasantness of sound. By this I mean that speech has to sound natural, and require no effort to be processed".
It's official, ASE beta won the Hurricane Challenge 2.0!
The Challenge was run in three languages, using 3 reverb conditions and binaural cafeteria noise. ASE beta showed amazing performance, with a record of +7.3 dB Equivalent Intensity Change. In lay terms, this means that when you listen to a voice over loudspeakers in a crowded place and understand 15% of words being said, if we process that voice with ASE beta (and play it at the same volume) you can understand up to 73% of words. If we don't process speech with ASE, you can only obtain this percentage by physically raising the volume of the loudspeaker by 7.3 dB.
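To put 7.3 dB in perspective: decibels are logarithmic, so a gain in dB maps to a multiplicative factor in acoustic power (and a smaller one in sound pressure). The short Python sketch below only applies the standard dB definitions; the function names are illustrative and not part of ASE.

```python
def db_to_power_ratio(gain_db):
    """Factor by which acoustic power grows for a gain in dB (10 * log10 rule)."""
    return 10.0 ** (gain_db / 10.0)


def db_to_amplitude_ratio(gain_db):
    """Factor by which sound pressure (amplitude) grows for the same gain (20 * log10 rule)."""
    return 10.0 ** (gain_db / 20.0)


gain_db = 7.3  # Equivalent Intensity Change quoted above
print(f"{gain_db} dB = x{db_to_power_ratio(gain_db):.2f} in power, "
      f"x{db_to_amplitude_ratio(gain_db):.2f} in amplitude")
```

In other words, matching the processed speech by turning up the loudspeaker would take roughly five times the acoustic power.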
Results will be fully disclosed at Interspeech 2020, which will host a special session for the Challenge. The version of the algorithm I submitted to the Challenge was the beta, which was much simpler than the current, full version of ASE. I can't wait to see what this one is capable of!
In the picture below, you can see a depiction of a psychometric curve. This represents human perception of speech in noise. As you raise the volume of the loudspeaker in the cafeteria (represented by the horizontal axis, SNR), you'll understand a higher percentage of words (vertical axis, Word Accuracy Rate). Human perception is non-linear, hence the sigmoid curve. In the Challenge, Spanish listeners understood an average of 15% in the near reverberation condition, when plain (unprocessed) speech was played at -17.5 dB SNR (which is basically faint speech in loud noise). When that speech was processed with ASE (and played at the same volume) they understood 73%. This is an average, but when I analysed the results in detail, I was flabbergasted at seeing how for some participants ASE boosted the intelligibility from 0% to 100%. Crazy!
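For readers who prefer code to curves, here is a rough Python sketch of the idea. It is NOT the way EIC is actually calculated in the Challenge: it assumes a plain two-parameter logistic (sigmoid) curve, and it assumes that the 15%, 73%, -17.5 dB and +7.3 dB figures quoted in this post all belong to the same listening condition. The function names and the fitted midpoint and slope are purely illustrative.

```python
import math


def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def fit_logistic(point_a, point_b):
    """Midpoint (dB SNR) and slope of the logistic curve through two (snr_db, accuracy) points."""
    (snr_a, acc_a), (snr_b, acc_b) = point_a, point_b
    slope = (logit(acc_b) - logit(acc_a)) / (snr_b - snr_a)
    midpoint_db = snr_a - logit(acc_a) / slope
    return midpoint_db, slope


def word_accuracy(snr_db, midpoint_db, slope):
    """Psychometric curve: fraction of words understood at a given SNR."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def snr_for_accuracy(accuracy, midpoint_db, slope):
    """Inverse of the curve: SNR at which plain speech reaches a given accuracy."""
    return midpoint_db + logit(accuracy) / slope


# Figures quoted in this post (assumed to describe one and the same condition).
PLAIN_SNR_DB = -17.5       # SNR at which plain speech was played
PLAIN_ACCURACY = 0.15      # words understood, plain speech
PROCESSED_ACCURACY = 0.73  # words understood, processed speech, same SNR
QUOTED_GAIN_DB = 7.3       # quoted Equivalent Intensity Change

# Hypothetical curve for plain speech, pinned down by the two points above.
midpoint_db, slope = fit_logistic(
    (PLAIN_SNR_DB, PLAIN_ACCURACY),
    (PLAIN_SNR_DB + QUOTED_GAIN_DB, PROCESSED_ACCURACY),
)

# Horizontal distance along the curve between 15% and 73% words understood.
shift_db = snr_for_accuracy(PROCESSED_ACCURACY, midpoint_db, slope) - PLAIN_SNR_DB

print(f"midpoint: {midpoint_db:.1f} dB SNR, slope: {slope:.2f} per dB")
print(f"plain speech needs {shift_db:.1f} dB more to reach {PROCESSED_ACCURACY:.0%}")
```

The key step is the last one: the gain is read off horizontally, as the extra dB that plain speech would need in order to reach the accuracy that processed speech achieves at the original volume.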
Wanna hear ASE beta in action? Click here.
Disclaimer: the picture below does NOT depict the exact way EIC (Equivalent Intensity Change) is calculated; it is only meant to help the reader visualize the idea. For an explanation of EIC please refer to this paper.
We have submitted our virtual Show and Tell to ICASSP 2020!
I am so proud of this video - there is much more than science to it, there is all the love that we put into our work. ... And if you think you're doing great as a scientist on your own, you'll see you can do better with collaborators - it's like playing in a band.
We didn't know what the appropriate format for the video was - nobody did, as this is a first. We can only say we did our best.
I am receiving polarised opinions on the fact that there is music in the background. Some like it, some find it distracting. I think it helps the flow of the narration and communicates on a meta-level. We're not only telling you what our research is, but also how we feel about it. I apologise to those who would rather listen without it, but I think it is an important element of our story-telling and we (as a group) agreed that the intelligibility of the voice-over was not compromised by it. In any case, I will be happy to hear your opinions about it, so that we can make the next video better.
The whole of the scientific world is making a great effort to move everything online, so everybody has to make videos, even if they don't have any experience or the right tools. I recorded the voiceover for this video in my bedroom, with a makeshift setup - with my head among sweaters in the wardrobe to avoid reverberation and nylon tights on a coathanger as a DIY anti-pop shield. The result was far from ideal anyway, but I had a card up my sleeve: ASE, which I used to process the recording. Using the product of your research for your everyday work is priceless... and avoiding the hassle of voiceover production in a video is an immense relief. If you have watched the video, then you can judge for yourself the quality of the output.
I feel the need to state that ASE is not intended to substitute human sound engineers, as in audio productions (for music or whatever it might be) intelligibility is not the only thing that matters. There is an artistic side to it that cannot, and should not, be substituted. However, this tool (ASE) can be incredibly helpful in mundane tasks, like mastering the audio of a scientific presentation, a conference or a podcast, where the person who is recording might not have any sound engineering experience. And even if they do, like me, it is a great relief to push a button and get everything done.
Click on the image below to watch me talk about ASE on the Scottish final of Famelab! You can see my part at 5'30" in the video.
I didn't pass this round, but it was a very tough competition: participants were highly skilled and had very good topics. You always learn a lot from these experiences, even if you don't bring a prize home.
Every time I participate in science communication competitions I am very nervous, and before starting I wonder "Why did I bring this upon myself?!"... Especially in very short talks, I think I give the worst of myself - as I usually feel tense for the first few minutes and then I calm down and perform better. But if there are only 3 minutes, there is only the nervous part. So why do I do this? Maybe because I like to push myself, and because the joy of explaining science wins over everything else.
I have passed the Edinburgh round at Famelab! Telling your research story in 3 minutes is such a challenging task, you really put yourself to the test and learn so much - and you may also get some new friends in the process.
The next round is going to be held via video-conferencing on April 15th. Stay tuned for news!
The ENRICH group presented their research in a public understanding event at the Royal Institution in London. So exciting to get out of the lab to talk to non-scientists! We got useful questions and feedback from members of the public. In the picture below, Simon (my supervisor) is presenting ASE, my algorithm. Click on the image to see the full picture. This is probably the last event we hold together as ENRICH, as the project is approaching completion. I am so grateful I was granted the opportunity to be part of this initiative, and I hope we will keep on collaborating after the end of the project. In the picture below, you can see my beautiful extended family.
We have been accepted at the Show and Tell of ICASSP 2020!
16/03/2020 Update: Unfortunately the conference has been moved to fully virtual due to the Covid-19 outbreak. We were looking forward to being in Barcelona... Yet we can consider ourselves very lucky to be able to communicate remotely, and make science move forward notwithstanding the difficult situation.
I submitted my entry to the Hurricane Challenge 2.0! I processed the speech files with the beta version of ASE.
The Challenge is the second edition of the 2013 large-scale evaluation of NELE algorithms. This time the listening tests will be run in 3 different languages, and real noise recordings plus reverberation will be used in the acoustic simulations.
I participated in Points of Listening #50 in London, at the London College of Communication. The title of this event was: "Deaf Gain: exploring sound, technology and hearing diversity", made in collaboration with Listening Across Disciplines II.
I was given the opportunity to explain my research to a general audience, and even give a mock-up listening test!
Beyond my wildest expectations, Evaluating Near End Listening Enhancement Algorithms in Realistic Environments won the Best Student Paper award at Interspeech 2019, Graz, Austria. It's hard to describe how rewarding this is, after the massive amount of work we put into this study.
I feel really honoured to have been chosen among such a selection of brilliant papers, and I can't express enough gratitude to my co-authors, without whom this would not have happened. You are the best! I presented the study in the main hall at the Messecongress Graz. Such an amazing stage!
We're at the International Congress on Acoustics, Aachen, Germany. I participated in the contest "5 minute research story". I found out how challenging it is to summarize your research in such a short time, with one slide and no props... definitely more difficult and scary than giving a 1-hour talk! I was awarded the 3rd prize. The ENRICH group also had their own special session within the Congress:
This project has received funding from the EU's H2020 research and innovation programme under the MSCA GA 675324