Srikanth Ronanki
PhD Student
Centre for Speech Technology Research
Informatics Forum 3.33
10 Crichton Street
Edinburgh - EH8 9AB
srikanth.ronanki (at) ed.ac.uk
About me

I am a PhD student at CSTR, University of Edinburgh, under the supervision of Prof. Simon King. My goal is to build a robust prosody system for generating natural-sounding, expressive audiobooks (e.g., children's storybooks). My research interests include text-to-speech synthesis, speech recognition, and machine learning.

Before starting my PhD at CSTR, I worked with several speech groups, including CMU Sphinx, iLabs at [24]7 inc., and Red Hen Lab at UCLA.

Before that, I completed my Bachelor's at IIIT-Hyderabad in 2011 and my Master's (by Research) in 2014 at the Speech and Vision Lab, IIIT-H, under the supervision of Dr. Kishore Prahallad.

News and Updates
Research Highlights
 
Indian Languages: Demos
Speech synthesis in Indian languages has seen a lot of progress over the past decade, partly due to the annual Blizzard Challenges. These systems assume the text is written in Devanagari or Dravidian scripts, which have nearly phonemic orthographies. However, the most common form of computer interaction among Indians is transliterated text written in ASCII. Such text is generally noisy, with many spelling variations for the same word. We first convert the ASCII text to a phonetic script using a grapheme-to-phoneme (G2P) approach, and then train a deep neural network to synthesize speech from it. This grapheme-to-phoneme conversion also enabled us to build indic-search, a search engine that helps end-users use ASCII to search for pages written in Unicode. Text-to-speech interfaces with ASCII input additionally let users type in their own pronunciation rather than conforming to a specific notation.
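
As a rough illustration of why a phonetic intermediate form helps with noisy ASCII input, here is a minimal sketch in Python. It is not the G2P model used in this work: a small hand-written rule table stands in for the learned conversion, and the names RULES and to_phonetic_key, as well as the phone symbols, are hypothetical and chosen only for this example. The point is simply that different spellings of the same word can collapse to one phonetic key, which can then be used for search or passed on to a synthesizer.

```python
# Toy stand-in for a learned grapheme-to-phoneme (G2P) model.
# The rule table and phone symbols below are invented for illustration.
RULES = {
    "aa": "a",  # long/short vowel spellings are treated alike here
    "ee": "i",
    "oo": "u",
    "ey": "e",
    "th": "t",  # aspiration is often written inconsistently in ASCII
    "dh": "d",
    "w": "v",
}


def to_phonetic_key(word):
    """Map an ASCII-transliterated word to a space-separated phone string.

    Uses greedy longest-match over RULES (two-letter rules are tried
    before one-letter rules); letters with no rule are kept unchanged.
    """
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        for length in (2, 1):
            chunk = word[i:i + length]
            if len(chunk) == length and chunk in RULES:
                phones.append(RULES[chunk])
                i += length
                break
        else:
            # No rule matched at this position: keep the letter as-is.
            phones.append(word[i])
            i += 1
    return " ".join(phones)


if __name__ == "__main__":
    for spelling in ("namaste", "namasthe", "Namastey"):
        print(spelling, "->", to_phonetic_key(spelling))
```

With this table, the three spellings above all map to the same key, so a search index keyed on the phonetic form would treat them as one word. A learned G2P model plays the same role without hand-written rules.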

Indic-Search: Indic Search (2016)
Modi Speaks: Text-To-Speech Interface (2016)
   
 
Publications

  • Srikanth Ronanki, Oliver Watts, Simon King. (2017). A Hierarchical Encoder-Decoder Model for Statistical Parametric Speech Synthesis. Proc. Interspeech. Stockholm, Sweden.

  • pdf

  • Srikanth Ronanki, Zhizheng Wu, Oliver Watts, Simon King. (2016). A Demonstration of the Merlin Open Source Neural Network Speech Synthesis System. Proc. special demo session, 9th ISCA Speech Synthesis Workshop (SSW9). Sunnyvale, CA, USA.

  • pdf

  • Srikanth Ronanki, Oliver Watts, Simon King, Gustav Eje Henter. (2016). Median-Based Generation of Synthetic Speech Durations using a Non-Parametric Approach. In submission to SLT Workshop. Puerto Rico, USA.

  • pdf

  • Thomas Merritt, Srikanth Ronanki, Zhizheng Wu, Oliver Watts. (2016). The CSTR entry to the Blizzard Challenge 2016. In Proceedings of The Blizzard Challenge workshop. Cupertino, USA.

  • pdf

  • Srikanth Ronanki, Gustav Eje Henter, Zhizheng Wu, Simon King. (2016). A template-based approach for speech synthesis intonation generation using LSTMs. In Proceedings of Interspeech. San Francisco, USA.

  • pdf

  • Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King. (2016). DNN-based speech synthesis for Indian languages from ASCII text. In Proceedings of 9th ISCA Speech Synthesis Workshop (SSW9). Sunnyvale, USA.

  • pdf

  • Gustav Eje Henter, Srikanth Ronanki, Oliver Watts, Mirjam Wester, Zhizheng Wu, Simon King. (2016). Robust TTS duration modelling using DNNs. In Proceedings of ICASSP. Shanghai, China.

  • pdf

  • Oliver Watts, Srikanth Ronanki, Zhizheng Wu, Tuomo Raitio, Antti Suni. (2015). The NST–GlottHMM entry to the Blizzard Challenge 2015. In Proceedings of The Blizzard Challenge workshop. Berlin, Germany.

  • pdf

  • Srikanth Ronanki, Zhizheng Wu, Robert A. J. Clark. (2015). Joint modeling of F0 and duration in deep neural network based speech synthesis. In Proceedings of The UKSpeech workshop. Norwich, UK.

  • pdf

  • Srikanth Ronanki, Li-Bo, James Salsman. (2012). Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx. In Proceedings of SLP-TED workshop, Coling. Mumbai, India. p. 61-68.

  • pdf

  • Srikanth Ronanki, Bajibabu Bollepalli and Kishore S. Prahallad. (2012). Duration Modelling in Voice Conversion Using Artificial Neural Networks. In Proceedings of IWSSIP. Vienna, Austria. p. 11-13.

  • pdf

  • Bajibabu Bollepalli, Srikanth Ronanki, Sathya Adithya Thati, Bhiksha Raj, B. Yegnanarayana and Kishore S. Prahallad. (2011). A Comparison of Prosody Modification Using Instants of Significant Excitation and Mel-cepstral Vocoder. In Proceedings of Centenary Conference. IISc Bangalore, India.
  • pdf
Technical Reports

  • Srikanth Ronanki, Zhizheng Wu, Robert A. J. Clark. (2015). Joint Modeling of F0 and Duration in Deep Neural Network Based Speech Synthesis. In Proceedings of UKSpeech workshop. University of East Anglia, Norwich, UK.

  • pdf

  • Srikanth Ronanki, Oliver Watts, Simon King, Robert A. J. Clark. (2013). Syllable based models for prosody modeling in HMM based speech synthesis. Simple4All Internship Report. CSTR, University of Edinburgh. Feb-May, 2013.

  • pdf

  • Srikanth Ronanki, Kishore S. Prahallad. (2013). Prosody Modeling for Voice Conversion. Research Project Report. Speech and Vision Lab, IIIT-Hyderabad, India.

  • pdf

  • Srikanth Ronanki, Peri Bhaskararao, Kishore S. Prahallad. (2012). Acoustic correlates of syllable-level prominence in Telugu. Research Project Report. Speech and Vision Lab, IIIT-Hyderabad, India.

  • pdf
Master's Thesis

  • Srikanth Ronanki. (2014). Generation of syllable level templates using dynamic programming for statistical speech synthesis. Masters by Research Dissertation. IIIT-Hyderabad, India. April, 2014.

  • pdf