Workshop on Machine Learning in Speech and Language Processing 2021

6 Sep, 2021

MLSLP is a recurring workshop, often held jointly with machine learning or speech/NLP conferences. Prior workshops were held in 2011, 2012, 2016, 2017, and 2018. While research in speech and language processing has always involved machine learning (ML), current research is benefiting from even closer interaction between these fields. Speech and language processing is continually mining new ideas from ML, and ML, in turn, is devoting more interest to speech and language applications.

This workshop aims to be a venue for identifying and incubating the next waves of research directions for interaction and collaboration. The workshop will discuss emerging research ideas with potential for impact in speech/language and bring together relevant researchers from ML and speech/language who may not regularly interact at conferences.

MLSLP is a workshop of SIGML, the SIG on machine learning in speech and language processing of ISCA (the International Speech Communication Association).


Registration is now open! Please register to receive the links to the talks and the poster session. The registration deadline is Aug 16, 2021.


Important Dates

Jun 24, 2021: Papers/abstracts due
Jul 26, 2021: Notification of acceptance
Aug 16, 2021: Registration deadline
Aug 30, 2021: Final paper/poster deadline
Sep 6, 2021: Workshop

Invited Speakers

Michael Auli, Facebook
Michael Auli is a scientist at Facebook AI Research in Menlo Park, California. During his PhD, he worked on natural language processing and parsing at the University of Edinburgh, where he was advised by Adam Lopez and Philipp Koehn. Michael led the team that developed convolutional sequence-to-sequence models, which were the first non-recurrent models to outperform RNNs for neural machine translation. He also led and co-led the teams that ranked first in several WMT news translation tasks in 2018 and 2019. Currently, Michael works on semi-supervised and self-supervised learning applied to natural language processing and speech recognition.
Jan Chorowski, University of Wrocław
Jan Chorowski is an Associate Professor at the Faculty of Mathematics and Computer Science at the University of Wrocław and Head of AI at NavAlgo. He received his M.Sc. degree in electrical engineering from the Wrocław University of Technology, Poland, and his PhD in electrical engineering from the University of Louisville, Kentucky, in 2012. He has worked with several research teams, including Google Brain, Microsoft Research, and Yoshua Bengio’s Lab at the University of Montreal. He led a research topic at the JSALT 2019 workshop. His research interests are applications of neural networks to problems that are intuitive and easy for humans but difficult for machines, such as speech and natural language processing.
Hung-yi Lee, National Taiwan University
Hung-yi Lee is currently an associate professor in the Department of Electrical Engineering of National Taiwan University (NTU), with a joint appointment at the Department of Computer Science & Information Engineering. He received his Ph.D. degree from NTU in 2012. From 2012 to 2013, he was a postdoctoral fellow at the Research Center for Information Technology Innovation, Academia Sinica. From 2013 to 2014, he was a visiting scientist at the Spoken Language Systems Group of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). He gave tutorials at ICASSP 2018, APSIPA 2018, ISCSLP 2018, Interspeech 2019, SIPS 2019, and Interspeech 2020. He co-organized the special session, New Trends in self-supervised speech processing, at Interspeech 2020, and the workshop, Self-Supervised Learning for Speech and Audio Processing, at NeurIPS 2020. He is a member of the IEEE Speech and Language Processing Technical Committee (SLTC).
Yuzong Liu, Amazon
Yuzong Liu is an Applied Science Manager at AWS AI. His team develops speech recognition technology that supports a variety of cloud-based speech-to-text applications, including Amazon Transcribe. Currently, Yuzong and his team work on acoustic modeling research and development, and on self-trained and label-free approaches to speech recognition. Before joining AWS, Yuzong spent three years on the Alexa ASR team, working on acoustic modeling and on-device speech recognition, and was the lead scientist in designing the language ID system and the multilingual Alexa mode. Yuzong obtained his PhD from the University of Washington in 2016, and his dissertation focused on semi-supervised learning for acoustic modeling.
Odette Scharenborg, TU Delft
Odette Scharenborg is an Associate Professor and Delft Technology Fellow at the Multimedia Computing Group at Delft University of Technology, the Netherlands, working on automatic speech processing inspired by human speech processing. She has an interdisciplinary background in automatic speech recognition and psycholinguistics. She uses knowledge of how humans process speech to improve automatic speech recognition systems, with a focus on low-resource languages (including unwritten languages) and low-resource types of speech (oral cancer speech and dysarthric speech), and uses visual information for multi-modal speech learning and processing. In 2017, she co-organised a six-week Frederick Jelinek Memorial Summer Workshop. Since 2017, she has been on the Board of the International Speech Communication Association (ISCA). Since 2018, she has been a member of the IEEE Speech and Language Processing Technical Committee, and since 2019 an Associate Editor of IEEE Signal Processing Letters.
Chengyi Wang, Microsoft
Chengyi Wang is a third-year joint PhD student at Microsoft Research and Nankai University. Her research interests lie in speech recognition and speech translation. She has been interning at Microsoft since 2017. During the internship, she has published several papers at AAAI, ACL, ICML, ICASSP, and Interspeech. One paper was nominated for the Best Student Paper Award at Interspeech 2020. In 2020, she won the National Scholarship as an outstanding PhD student.
Yu Zhang, Google
Yu Zhang is currently a research scientist at Google Brain. He received his Ph.D. degree in computer science from the Massachusetts Institute of Technology in 2017. During his Ph.D., he worked on improving speech recognition performance. He is a fan of open source projects and has contributed to or been involved in the development of CNTK, MXNet, and ESPNet to facilitate ASR research. Currently, his research interests are improving ML model performance for various speech processing applications, with a focus on sequence-to-sequence modeling. Yu is a main contributor to Google's next generation RNNT ASR model and Tacotron-based text-to-speech system.

Organizing Committee

Anton Ragni, University of Sheffield
Preethi Jyothi, IIT Bombay
Yao Qian, Microsoft
Hao Tang, The University of Edinburgh

Scientific Committee

Xiaodong Cui, IBM
Bhuvana Ramabhadran, Google
Liang Lu, Microsoft
Huayun Zhang, A*STAR
Mark Hasegawa-Johnson, University of Illinois
Yanzhang (Ryan) He, Google
Andrew Liu, CUHK
Rohit Prabhavalkar, Facebook
Karen Livescu, TTIC
Vimal Manohar, Facebook
Ralf Schlüter, RWTH Aachen University


Please visit the homepage of SIGML and subscribe to our mailing list for the latest news.