Korin Richmond

Centre for Speech Technology Research

All publications related to speech synthesis


[1] K. Richmond and S. King. Smooth talking: articulatory join costs for unit selection. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 5150–5154. Shanghai, China, March 2016. [pdf]

[2] Q. Hu, J. Yamagishi, K. Richmond, K. Subramanian, and Y. Stylianou. Initial investigation of speech synthesis based on complex-valued neural networks. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 5630–5634. Shanghai, China, March 2016. [pdf]

[3] R. Dall, S. Brognaux, K. Richmond, C. Valentini-Botinhao, G. E. Henter, J. Hirschberg, and J. Yamagishi. Testing the consistency assumption: pronunciation variant forced alignment in read and spontaneous speech synthesis. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 5155–5159. Shanghai, China, March 2016. [pdf]

[4] K. Richmond, Z. Ling, and J. Yamagishi. The use of articulatory movement data in speech synthesis applications: an overview - application of articulatory movements using machine learning algorithms [invited review]. Acoustical Science and Technology, 36(6):467–477, 2015. doi:10.1250/ast.36.467. [doi]

[5] K. Richmond, J. Yamagishi, and Z.-H. Ling. Applications of articulatory movements based on machine learning. Journal of the Acoustical Society of Japan, 70(10):539–545, 2015.

[6] Q. Hu, Z. Wu, K. Richmond, J. Yamagishi, Y. Stylianou, and R. Maia. Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning. In Proc. Interspeech. Dresden, Germany, September 2015. [pdf]

[7] Q. Hu, Y. Stylianou, R. Maia, K. Richmond, and J. Yamagishi. Methods for applying dynamic sinusoidal models to statistical parametric speech synthesis. In Proc. ICASSP. Brisbane, Australia, April 2015. [pdf]

[8] Q. Hu, Y. Stylianou, K. Richmond, R. Maia, J. Yamagishi, and J. Latorre. A fixed dimension and perceptually based dynamic sinusoidal model of speech. In Proc. ICASSP, 6311–6315. Florence, Italy, May 2014. [pdf]

[9] J. Cabral, K. Richmond, J. Yamagishi, and S. Renals. Glottal spectral separation for speech synthesis. IEEE Journal of Selected Topics in Signal Processing, 8(2):195–208, April 2014. doi:10.1109/JSTSP.2014.2307274. [doi]

[10] Q. Hu, Y. Stylianou, R. Maia, K. Richmond, J. Yamagishi, and J. Latorre. An investigation of the application of dynamic sinusoidal models to statistical parametric speech synthesis. In Proc. Interspeech, 780–784. Singapore, September 2014. [pdf]

[11] K. Richmond, Z. Ling, J. Yamagishi, and B. Uría. On the evaluation of inversion mapping performance in the acoustic domain. In Proc. Interspeech. Lyon, France, August 2013. [pdf]

[12] M. Astrinaki, A. Moinet, J. Yamagishi, K. Richmond, Z.-H. Ling, S. King, and T. Dutoit. Mage - HMM-based speech synthesis reactively controlled by the articulators. In Proc. 8th ISCA Workshop on Speech Synthesis, 243. Barcelona, Spain, August 2013. [pdf]

[13] Q. Hu, K. Richmond, J. Yamagishi, and J. Latorre. An experimental comparison of multiple vocoder types. In Proc. 8th ISCA Workshop on Speech Synthesis, 155–160. Barcelona, Spain, August 2013. [pdf]

[14] M. Astrinaki, A. Moinet, J. Yamagishi, K. Richmond, Z.-H. Ling, S. King, and T. Dutoit. Mage - reactive articulatory feature control of HMM-based parametric speech synthesis. In Proc. 8th ISCA Workshop on Speech Synthesis, 227–231. Barcelona, Spain, August 2013. [pdf]

[15] Z. Ling, K. Richmond, and J. Yamagishi. Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression. IEEE Transactions on Audio, Speech, and Language Processing, 21(1):207–219, January 2013. doi:10.1109/TASL.2012.2215600. [doi]

[16] I. Steiner, K. Richmond, and S. Ouni. Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis. In Proc. 3rd International Symposium on Facial Analysis and Animation. Vienna, Austria, 2012. [pdf]

[17] Z. Ling, K. Richmond, and J. Yamagishi. Vowel creation by articulatory control in HMM-based parametric speech synthesis. In Proc. The Listening Talker Workshop, 72. Edinburgh, UK, May 2012. [pdf]

[18] Z.-H. Ling, K. Richmond, and J. Yamagishi. Vowel creation by articulatory control in HMM-based parametric speech synthesis. In Proc. Interspeech. Portland, Oregon, USA, September 2012. [pdf]

[19] Z.-H. Ling, K. Richmond, and J. Yamagishi. Feature-space transform tying in unified acoustic-articulatory modelling for articulatory control of HMM-based speech synthesis. In Proc. Interspeech, 117–120. Florence, Italy, August 2011. [pdf]

[20] M. Lei, J. Yamagishi, K. Richmond, Z.-H. Ling, S. King, and L.-R. Dai. Formant-controlled HMM-based speech synthesis. In Proc. Interspeech, 2777–2780. Florence, Italy, August 2011. [pdf]

[21] J. P. Cabral, S. Renals, J. Yamagishi, and K. Richmond. HMM-based speech synthesiser using the LF-model of the glottal source. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4704–4707. May 2011. doi:10.1109/ICASSP.2011.5947405. [pdf | doi]

[22] K. Richmond, R. Clark, and S. Fitt. On generating Combilex pronunciations via morphological analysis. In Proc. Interspeech, 1974–1977. Makuhari, Japan, September 2010. [pdf]

[23] D. Felps, C. Geng, M. Berger, K. Richmond, and R. Gutierrez-Osuna. Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database. In Proc. Interspeech, 1990–1993. Makuhari, Japan, September 2010. [pdf]

[24] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Transforming voice source parameters in a HMM-based speech synthesiser with glottal post-filtering. In Proc. 7th ISCA Speech Synthesis Workshop (SSW7), 365–370. NICT/ATR, Kyoto, Japan, September 2010. [pdf]

[25] I. Steiner and K. Richmond. Towards unsupervised articulatory resynthesis of German utterances using EMA data. In Proc. Interspeech, 2055–2058. Brighton, UK, September 2009. [pdf]

[26] K. Richmond, R. Clark, and S. Fitt. Robust LTS rules with the Combilex speech technology lexicon. In Proc. Interspeech, 1295–1298. Brighton, UK, September 2009. [pdf]

[27] Z. Ling, K. Richmond, J. Yamagishi, and R. Wang. Integrating articulatory features into HMM-based parametric speech synthesis. IEEE Transactions on Audio, Speech and Language Processing, 17(6):1171–1185, August 2009. IEEE SPS 2010 Young Author Best Paper Award. doi:10.1109/TASL.2009.2014796. [doi]

[28] I. Steiner and K. Richmond. Generating gestural timing from EMA data using articulatory resynthesis. In Proc. 8th International Seminar on Speech Production. Strasbourg, France, December 2008.

[29] Z.-H. Ling, K. Richmond, J. Yamagishi, and R.-H. Wang. Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge. In Proc. Interspeech, 573–576. Brisbane, Australia, September 2008. [pdf]

[30] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Glottal spectral separation for parametric speech synthesis. In Proc. Interspeech, 1829–1832. Brisbane, Australia, September 2008. [pdf]

[31] K. Richmond, V. Strom, R. Clark, J. Yamagishi, and S. Fitt. Festival Multisyn voices for the 2007 Blizzard Challenge. In Proc. Blizzard Challenge Workshop (in Proc. SSW6). Bonn, Germany, August 2007. [pdf]

[32] R. A. J. Clark, K. Richmond, and S. King. Multisyn: open-domain unit selection for the Festival speech synthesis system. Speech Communication, 49(4):317–330, 2007. doi:10.1016/j.specom.2007.01.014. [pdf | doi]

[33] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Towards an improved modeling of the glottal source in statistical parametric speech synthesis. In Proc. 6th ISCA Workshop on Speech Synthesis. Bonn, Germany, 2007. [pdf]

[34] R. Clark, K. Richmond, V. Strom, and S. King. Multisyn voices for the Blizzard Challenge 2006. In Proc. Blizzard Challenge Workshop (Interspeech Satellite). Pittsburgh, USA, September 2006. (http://festvox.org/blizzard/blizzard2006.html). [pdf]

[35] G. Hofer, K. Richmond, and R. Clark. Informed blending of databases for emotional speech synthesis. In Proc. Interspeech. Lisbon, Portugal, September 2005. [pdf | ps]

[36] R. A. J. Clark, K. Richmond, and S. King. Multisyn voices from ARCTIC data for the Blizzard Challenge. In Proc. Interspeech. Lisbon, Portugal, September 2005. [pdf]

[37] R. A. J. Clark, K. Richmond, and S. King. Festival 2 – build your own general purpose unit selection speech synthesiser. In Proc. 5th ISCA Workshop on Speech Synthesis. Pittsburgh, USA, June 2004. [pdf | ps]