|Date||Jun 25, 2012|
|Title||Why Deep Neural Networks Are Promising for Speech Recognition|
|Abstract||We recently proposed and developed the context-dependent deep neural network (DNN) hidden Markov model (CD-DNN-HMM) for large vocabulary speech recognition (LVSR) and demonstrated its superior performance on several benchmark tasks. In this talk I will share our observations and thoughts on why DNNs can be more powerful than shallow neural networks and why CD-DNN-HMMs can outperform conventional CD-GMM-HMM systems and earlier ANN/HMM hybrid systems. At the end of the talk I will discuss how CD-DNN-HMMs can be further improved to achieve even better recognition accuracy.|
|Bio||Dr. Dong Yu joined Microsoft Corporation in 1998 and the Microsoft Speech Research Group in 2002, where he is a researcher. His recent work focuses on deep learning and its application to large vocabulary speech recognition. The context-dependent deep neural network hidden Markov model (CD-DNN-HMM) that he co-proposed and developed has been seriously challenging the dominant position of conventional GMM-based systems for large vocabulary speech recognition. Dr. Yu has published around 100 papers in speech processing and machine learning and is the inventor or co-inventor of more than 40 granted or pending patents. He has served as an associate editor of IEEE Transactions on Audio, Speech, and Language Processing since 2011, and previously served as an associate editor of IEEE Signal Processing Magazine (2008-2011) and as the lead guest editor of the IEEE Transactions on Audio, Speech, and Language Processing special issue on deep learning for speech and language processing (2010-2011).|