Research

I'm interested in understanding human communication using machine learning and statistical models, and in constructing systems that can recognise and interpret communication scenes. My research career is grounded in speech processing, and our approaches start from the signals.

Speech Recognition and Synthesis

How can we improve conversational speech recognition? How can we make speech synthesis more natural? We are working on speech recognition systems that are better adapted or normalised to new domains and speakers, that can be ported across languages, and that are robust to different acoustic environments. We are particularly interested in models based on deep neural networks, for both acoustic modelling and language modelling. Current research students in speech recognition and synthesis include Pawel Swietojanski and Siva Reddy Gangireddy. Researchers working with me on speech recognition include Peter Bell, Joris Driesen, Liang Lu, and Fergus McInnes. In speech synthesis, these days I just try to keep up with the great work of Simon King, Junichi Yamagishi, and their colleagues, as well as working with Korin Richmond on tongue modelling and Shinji Takaki on deep neural network acoustic models.
Read more...
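
As an illustration of the kind of model mentioned above, here is a minimal sketch of a deep neural network acoustic model in the hybrid style: a feed-forward network mapping a spliced window of acoustic feature frames to posteriors over tied HMM states. The layer sizes, state inventory, and framework (PyTorch) are illustrative assumptions, not a description of the models actually used in the group.

import torch
import torch.nn as nn

class DNNAcousticModel(nn.Module):
    # Maps a spliced window of feature frames to logits over HMM states.
    # All sizes below are illustrative, not the group's actual configuration.
    def __init__(self, n_feats=40, context=5, n_states=2000, n_hidden=1024):
        super().__init__()
        input_dim = n_feats * (2 * context + 1)   # central frame +/- context frames
        self.net = nn.Sequential(
            nn.Linear(input_dim, n_hidden), nn.Sigmoid(),
            nn.Linear(n_hidden, n_hidden), nn.Sigmoid(),
            nn.Linear(n_hidden, n_states),         # logits over tied HMM states
        )

    def forward(self, x):
        # x: (batch, input_dim) spliced filterbank features
        return self.net(x)

model = DNNAcousticModel()
frames = torch.randn(8, 40 * 11)                   # a toy batch of spliced frames
log_posteriors = model(frames).log_softmax(dim=-1)

At decode time, posteriors like these are typically divided by state priors to give scaled likelihoods for the HMM decoder.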

Multimodal Interaction

Human communication is factored across more than one modality. The analysis and interpretation of multimodal interaction presents a number of challenges, ranging from ways to model multiple asynchronous streams of data to the construction of systems that can interpret aspects of multiparty human communication. A lot of this work is about augmenting communication in meetings (previously in the AMI and AMIDA Integrated Projects, and now in the InEvent project); we are also interested in the development of systems for home care. Current students in multimodal interaction include Karl Isaac, and I work with Catherine Lai, Qiang Huang, Jonathan Kilgour, and Jean Carletta in these areas. And I try to keep up with Hiroshi Shimodaira's work on synthesising conversational agents and social signals.
Read more...
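
One concrete face of the asynchronous-streams problem is simply getting two modalities onto a common frame clock before any joint modelling. The toy sketch below resamples an audio and a video feature stream to a shared rate by nearest-frame lookup and concatenates them; the dimensions and frame rates are illustrative assumptions, not any project's actual pipeline.

import numpy as np

def align_streams(audio, audio_hz, video, video_hz, out_hz=100.0):
    # Resample both streams to out_hz by nearest-frame lookup and
    # stack them frame-by-frame over their common duration.
    duration = min(len(audio) / audio_hz, len(video) / video_hz)
    t = np.arange(0.0, duration, 1.0 / out_hz)
    a = audio[np.minimum((t * audio_hz).astype(int), len(audio) - 1)]
    v = video[np.minimum((t * video_hz).astype(int), len(video) - 1)]
    return np.hstack([a, v])                # (frames, audio_dim + video_dim)

audio = np.random.randn(1000, 40)           # e.g. 100 Hz filterbank features
video = np.random.randn(250, 20)            # e.g. 25 Hz visual features
fused = align_streams(audio, 100.0, video, 25.0)

Nearest-frame lookup is the crudest possible alignment; the research questions above start where such fixed alignments break down, for streams whose timing relationship is itself variable.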

Projects

I'm principal investigator of the following projects: Natural Speech Technology, Ultrax, InEvent, and uDialogue. I'm a co-investigator on SSPNet, EU-Bridge, and Simple4All. Previous projects include the AMI and AMIDA Integrated Projects.

Opportunities

We are always looking for excellent research students: see the page about PhD opportunities at CSTR. I am not looking for visiting interns for the foreseeable future.

Teaching

This year I am teaching Speech Processing (Learn) and Automatic Speech Recognition (jointly with Hiroshi Shimodaira).