About me

Organising

I’m currently part of the UKSpeech organising committee, and co-chaired the associated workshop in 2022. Recently, I was on the organising committee for Interspeech 2023. In the past, I have served on the Student Advisory Committee of the International Speech Communication Association (ISCA-SAC). A long time ago, I helped organise a workshop on New Tools and Methods for Very-Large-Scale Phonetics. I also served on the organising committee of the 2010 Young Researchers’ Roundtable on Spoken Dialogue Systems (YRRSDS).

How did I get here?

I’m now a lecturer in the Department of Linguistics and English Language (LEL), but it’s been a long and winding road!

Before taking up my position in LEL, I was a post-doc/senior researcher in the School of Informatics. Before I started lecturing, I was Principal Investigator on a grant funded by Toyota (in Europe and Japan) researching ‘Spoken Dialogue Processing for Robot Companions’, where we looked at extracting topical and affective information from conversational speech. The hope was (is!) that this sort of research will help make assistive technologies more robust, as well as provide new ways for linguists and other social scientists to explore speech and video data. I’m still working in this general area, but as you’ll see from my publications, my interests have shifted/widened a bit.

Before the Toyota project, I had a post-doc position with Johanna Moore. We worked on various topics around prosody and discourse structure, as well as on emotion recognition with our PhD student Leimin Tian. That work led pretty directly to getting the grant from Toyota.

I originally came to Edinburgh to work on the EU project InEvent, where I had the extremely good fortune to join CSTR and work with Steve Renals and Jean Carletta. The goal of InEvent was to use and improve automatic processing of audiovisual data to aid browsing of large video archives. I mainly worked on how features often associated with speaker affect, e.g. prosody and measures of participation, can be used for summarization and affect detection. I also ended up doing a bit of work on human-computer interaction and interface evaluation. You’ll see from my publications that I also worked on a lot of other projects with colleagues in CSTR and beyond.

Before that, I was a graduate student in the Department of Linguistics at the University of Pennsylvania. At Penn, I worked in the phonetics lab, where I took advice from Jiahong Yuan, Mark Liberman, Florian Schwarz, and many other people. My PhD was about where intonational features, particularly final pitch rises, fit in with semantic and pragmatic theories. I also worked on various other topics, including iterated learning in language change, gradability and modality in semantics, tone and stress in Chinese, and the prosody of second language learners.

Quite a long time ago (now), I did a research masters at the University of Melbourne, Australia. I was part of the Language Technology Group where my supervisor was Steven Bird. Back in the day, I researched querying and manipulating linguistically annotated structured data. I decided to do a PhD in linguistics after this because I found understanding language and interaction to be an irresistible research topic (I still do!).

And for completeness, before that I studied maths and computer science at the University of Melbourne. Back then, I preferred the more abstract/theory stuff (theory of computation, pure maths), so it’s quite funny that I mostly do empirical stuff now. But looking back, I always found the formal language theory bits really interesting: a gateway into the syntax of non-programming languages, you might say! I was lucky enough to do a summer research internship with Graham Byrnes working on bioinformatics, and I wrote my honours thesis on phylogenetic hypothesis testing. I liked the project, but I realised that there was a lot of cross-over in methods between bioinformatics and computational linguistics/natural language processing, and I found linguistics much more interesting!

Before my research life really kicked off, I worked as a shop assistant, medical secretary, and tutor. It’s funny to look back and realise that a good part of my job as a medical secretary was essentially speech recognition (i.e., transcription of medical letters). I certainly learned a lot, implicitly, about the value of language models for ASR! Back then, one of the doctors I worked for was trying a very new speech recognition gizmo, Dragon NaturallySpeaking (later part of Nuance). The automatic transcripts needed a lot of editing. They still do now, but back then we wouldn’t have dreamed of sending them off without at least two sets of eyes checking for errors.