Research supervision
I've enjoyed working with many, many people over the course of my career so far, conducting research on a wide range of topics. The people listed below are some of those I've worked with in a research supervision or mentoring role.
PhD Research
PhD | Topic | Graduation |
---|---|---|
Jinzuomu Zhong | Accent and style control for text-to-speech synthesis | (2028) |
Nicholas Sanders | Personification and Controllable TTS | (2025) |
Siqi Sun | TTS Frontend modelling | (2025) |
Dan Wells | Low-resource text-to-speech synthesis using self-supervised speech representations | (2025) |
Emelie Van de Vreken | Evaluating Emotive Synthetic Speech | (2025) |
Jason Taylor | Pronunciation modelling in end-to-end text-to-speech synthesis | 2022 |
Qiong Hu | Statistical parametric speech synthesis based on sinusoidal models | 2016 |
João Cabral | HMM-based speech synthesis using an acoustic glottal source model | 2010 |
Research Staff
Researcher | Project |
---|---|
Cassia Valentini Botinhao | SGILE (NRC-funded "Speech generation for Indigenous language education") |
Lorena Aldana | SGILE (NRC-funded "Speech generation for Indigenous language education") |
Niamh Corkey | SGILE (NRC-funded "Speech generation for Indigenous language education") |
Cassia Valentini Botinhao | Ultrax2020 (EPSRC Healthcare Partnerships Programme grant EP/P02338X/1) |
Sam Ribeiro | Ultrax2020 (EPSRC Healthcare Partnerships Programme grants EP/I027696/1 and EP/P02338X/1) |
Aciel Eskey | Ultrax2020 (EPSRC Healthcare Partnerships Programme grants EP/I027696/1 and EP/P02338X/1) |
Research Visitors
I've hosted a number of research visits, usually lasting 6 months to 1 year, working with researchers ranging from PhD students to professors. Here's a selection of them, with links to some of the published research outcomes from our collaboration. I'm always open to further collaboration, so if you'd like to propose a research visit project, please do get in touch!
MSc Research
I enjoy supervising a number of MSc projects each year, including the following from recent years. (I proposed most of the project topics; the overall average grade is 72%, which is a distinction, with 28 distinctions to date!)
MSc | Project |
---|---|
Ben Hunt | The Best Bang for your Bucca: Making Best Use of Electromagnetic Articulography Data for Acoustic to Articulatory Inversion |
Jinzuomu Zhong | AccentBox: High-Fidelity Zero-Shot Accent Generation |
You-Xuan Lin | Articulation Encoding in the Self-Supervised Speech Model and Effects on Acoustic-to-Articulatory Inversion |
Zulong Chen | Leveraging Speech-to-Text Models in Silent Speech Recognition |
Mingli Zhou | Acoustic-to-Articulatory Inversion Mapping of Depressive Speech |
Spencer Jensen | ContinuousAccent: An L2 Speech Dataset For Foreign Accent Intensity |
Noe Berger | Low-Resource Accent TTS Support via Large Multi-Accent Neural Frontend Pronunciation Knowledge Transfer |
Eilish Newmark | Improving Homograph Disambiguation in Whole-Sentence Neural TTS Frontend Modelling |
Talia Apter | Mer Yidish! Extending a Text-to-Speech System for Yiddish |
Lian-Hui Tan | Articulatory-to-Acoustic Inversion Mapping using Single Speaker Bilingual Data |
Chloe Austin | Disentangling Speaker Accent and Identity in TTS - Investigating the Roles of Speaker and Accent Embeddings |
Jacob Rosen | Acoustic-to-Articulatory Inversion with Extracted Tongue Contour Features from Ultrasound Tongue Imaging |
Rachel Beeson | Multi-speaker Speech Recognition using Articulator Pose Estimation on Silent versus Voiced Speech |
Jemima Goodall | Automatic Hypernasality Severity Classification using a Combined CNN-LSTM Network |
Jannis Spiekermann | Uncertainty in Natural Language Processing: Bayesian Neural Networks for Deep Sequence Tagging |
Brent Ho | Pedagogical Charts, Personalized Audio, and Visual Feedback in Lexical Tone Learning |
Adam Drazsky | An Alternative Method for Deriving Phoneme Durations for Voice Puppetry |
Aidan Pine | Low Resource Speech Synthesis |
Stephen Walters | Utilising Native Language Information for Improved Foreign Accent Classification |
Samuel Lo | The First Text-to-Speech System for Yiddish |
Ronghua Chen | Speech-driven Smile Detection Using Neural Networks |
Elisa Gambicchia | I-Vector Extraction and Visualisation Analysis for Multi-Accent Speech Recognition |
Dan Lyth | Modeling vocal effort in speech synthesis with variational autoencoders |
Daniel Jordan | Linguistically-augmented approaches to G2P conversion |
Caela Northey | Neural approaches to morphological decomposition for pronunciation learning |
Noel Kleber | Beyond Speaker Independent Articulatory Inversion, using a Locally Linear Embedding to Predict Speaker-Independent Vocal Tract Shapes |
Danil Khristov | |
John Stockdale | Combining Multiple Data Sources for Articulatory Inversion-Mapping Using Neural Networks |
Alexandra Antonides | Ultrasound-based Audio-Visual Speech Recognition for Children with Speech Disorders |
Yu Bai | Feature Extraction Based on MFCCs and DCT Cepstral Coefficients for Replay Attack Detection |
Shannon Wotherspoon | Human vs. Machine Detection of Replay Spoofing Attacks |
Mark Leisten | Quantitative Target Approximation Modelling of Fundamental Frequency in Statistical Parametric Speech Synthesis |
Harriet Collier | Investigating the Success of an Articulatory Join Cost for Unit Selection Synthesis |
Maiia Bikmetova | Grapheme-to-Metaphoneme Conversion for Unisyn and Combilex Baseform Transcriptions |
Zeb Taylor | Using Spectral Discontinuity Features to Detect Unit-Selection-generated Spoofing Attacks |
Yue Liu | Speaker Verification Countermeasures against Synthetic Speech Spoofing and their Vulnerability to Re-vocoded Speech |
Maria Naka | Replay Spoofing Attacks and Countermeasures |
Ruiduan Li | Examining the Effects of Vocoder and Statistical Model Type on Spoofing Countermeasure Performance |
Alessandro Di Martino | Spotting Anomalies in Carbon Fibre Composites using Deep Convolutional Neural Networks |
Phoebe Parsons | Acoustic-to-Articulatory Inversion Using the DoubleTalk Corpus |
Ricardo Cortez | Deep Neural Networks for Factored Speech Synthesis Modelling |
Terence Simms | Investigating Non-Uniqueness in the Acoustic-Articulatory Inversion Mapping |
Simon Hammond | Using Keyword Metaphones in HMM-based Speech Synthesis |
Matthieu Chassot | Magnetic Field Optimization from Limited Data |
Alexis Grant | Dae ye ken me?: Speech Synthesis in the Gorbals Region of Glasgow |
Gregor Hofer | Emotional Speech Synthesis |
Emina Kurtić | Polyglot voice design for unit selection speech synthesis |
Steinthor Steingrimsson | Bilingual Voice for Unit Selection Speech Synthesis |
Yoko Saikachi | Building a Unit-selection Voice for Festival |