Publications: M. Aylett


Thomas, L., Farrow, E., Aylett, M., Briggs, P. (2018) A life story in three parts: the use of triptychs to make sense of personal digital data. Personal and Ubiquitous Computing. 2, pages 1-15.
Wester, M., Aylett, M. P., & Braude, D. A. (2017) Bot or not: exploring the fine line between cyber and human identity. ICMI, pages 506-507.
Wester, M., Braude, D. A., Potard, B., Aylett, M. P., Shaw, F. (2017) Real-time reactive speech synthesis: incorporating interruptions. Proc. Interspeech 2017, 3996-4000.
Mendelson, J., Aylett, M. (2017) Beyond the Listening Test: An interactive approach to TTS Evaluation. Interspeech 2017, 249-253.
Aylett, M. P., Vinciarelli, A., Wester, M. (2017) Speech Synthesis for the Generation of Artificial Personality. IEEE Transactions on Affective Computing.
Potard, B., Aylett, M.P., Braude, D.A. (2016) Cross Modal Evaluation of High Quality Emotional Speech Synthesis with the Virtual Human Toolkit. IVA.
Potard, B., Aylett, M.P., Braude, D.A., Motlicek, P. (2016) Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser based on DNN. Interspeech.
Aylett, M.P., Lawson, S. (2016) The Smartphone: A Lacanian Stain, A Tech Killer, and an Embodiment of Radical Individualism. CHI '16.
Sun, Y., Aylett, M.P., Vazquez-Alvarez, Y. (2016) e-Seesaw: A Tangible, Ludic, Parent-child, Awareness System. CHI EA '16.
Aylett, M.P., Thomas, L., Green, D.P., Shamma, D.A., Briggs, P., Kerrigan, F. (2016) My Life On Film. CHI EA '16 Workshop.
Munteanu, C., Irani, P., Oviatt, S., Aylett, M.P., Penn, G., Sharma, N., Rudzicz, F., Gomez, R. (2016) Designing Speech and Multimodal Interactions for Mobile, Wearable, and Pervasive Applications. CHI EA '16 Workshop.
Aylett, M.P., Pullin, G., Braude, D.A., Potard, B., Henning, S., Antunes Ferreira, M. (2016) Don't Say Yes, Say Yes: Interacting with Synthetic Speech Using Tonetable. CHI EA '16 Demo.
Vazquez-Alvarez, Y., Aylett, M. P., Brewster, S., von Jungenfeld, R., and Virolainen, A. (2015) Designing Interactions with Multilevel Auditory Displays in Mobile Audio-Augmented Reality. ACM Transactions on Computer-Human Interaction, vol 23:1
Wester, M., Aylett, M., Tomalin, M., and Dall, R. (2015) Artificial Personality and Disfluency. Interspeech 2015
Aylett, M.P., Quigley, A.J. (2015) The Broken Dream of Pervasive Sentient Ambient Calm Invisible Ubiquitous Computing. CHI Extended Abstracts.
Aylett, M.P., Farrow, E., Pschetz, L., and Dickinson, T. (2015) Generating Narratives from Personal Digital Data: Triptychs. CHI Extended Abstracts.
Aylett, M. P., Vazquez-Alvarez, Y., Baillie, L. (2015) Interactive Radio: A New Platform for Calm Computing. CHI Extended Abstracts.
Aylett, M.P., Dall, R., Ghoshal, A., Eje Henter, G., Merritt, T. (2014) A flexible front-end for HTS. In Proc. Interspeech, pages 1283-1287.
Aylett, M.P., Kristensson, P.O., Whittaker, S., Vazquez-Alvarez, Y. (2014) None of a CHInd: Relationship Counselling for HCI and Speech Technology. CHI Extended Abstracts pp749-760.
Vazquez-Alvarez, Y., Aylett, M.P., Brewster, S.A., von Jungenfeld, R., Virolainen, A. (2014) Multilevel Auditory Displays for Mobile Eyes-free Location-based Interaction. CHI Extended Abstracts pp1567-1572.
Munteanu, C., Jones, M., Whittaker, S., Oviatt, S., Aylett, M.P., Penn, G., Brewster, S.A., D'Alessandro, N. (2014) Designing Speech and Language Interactions, CHI Extended Abstracts pp75-78.
Kane, J., Aylett, M.P., Yanushevskaya, I., Gobl, C. (2014) Phonetic Feature Extraction for Context-sensitive Glottal Source Processing, Speech Communication Vol. 59 pp10-21.
Aylett, M.P., Potard, B., Pidcock, C.J. (2013) Expressive Speech Synthesis: Synthesising Ambiguity, 8th ISCA Speech Synthesis Workshop, Barcelona.
Kane, J., Scherer, S., Aylett, M., Morency, L-P., Gobl, C., (2013) Speaker and language independent voice quality classification applied to unlabelled corpora of expressive speech, ICASSP 2013, Vancouver, Canada.
Aylett, M.P., Vazquez-Alvarez, Y., Baillie, L. (2013) Evaluating Speech Synthesis in a Mobile Context: Audio Presentation of Facebook, Twitter and RSS. Information Technology Interfaces ITI 2013.
Potard, B., Aylett, M.P., Pidcock, C.J. (2012) Proper Name Splicing in Computer Games with TTS. Interspeech 2012, Portland.
Aylett, M.P., Potard, B. (2012) Synthesising and evaluating cross-modal emotional ambiguity in virtual agents. IVA2012, Santa Cruz, U.S.A, Proceedings. Lecture Notes in Computer Science 7502 Springer, pp471-3.
Stan, A., Yamagishi, J., King, S., Aylett, M. (2011) The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate, Speech Communication 53:3, pp442-50.
Aylett, M.P., Kimball, T., Andert, G. (2010) Scalable Mobile Implementation of High Quality Real Time Text to Speech Synthesis. Fifth Workshop on Speech in Mobile and Pervasive Environments, Lisbon.
Andersson, S., Georgila, K., Traum, D., Aylett, M., Clarke, R. (2010) Prediction and Realisation of Conversational Characteristics by Utilising Spontaneous Speech for Unit Selection, Speech Prosody 2010.
Aylett, M.P., King, S., Yamagishi, J. (2009) Speech Synthesis Without a Phone Inventory. Interspeech 2009, Brighton, 2087-90.
Aylett, M.P., Pidcock, C.J. (2009) The CereProc Blizzard Entry 2009: Some dumb algorithms that don't work. Blizzard Challenge Workshop, Edinburgh.
Andersson, J.S., Badino, L., Watts, O.S., Aylett, M.P. (2008) The CSTR/Cereproc Blizzard Entry 2008: The Inconvenient Data. (University of Edinburgh, UK / CereProc Ltd, UK), Blizzard Challenge Workshop, Brisbane.
Aylett, M.P., Yamagishi, J., (2008) Combining Statistical Parametric Speech Synthesis and Unit-Selection for Automatic Voice Cloning. LangTech 2008, Rome.
Aylett, M.P., Pidcock, C.J., (2007) The CereVoice Characterful Speech Synthesiser SDK (Industrial Demo). IVA 2007, Paris, France, Proceedings. Lecture Notes in Computer Science 4722 Springer.
Aylett, M.P., Andersson, J.S., Badino, L., Pidcock, C.J. (2007) The Cerevoice Blizzard Entry 2007: Are Small Database Errors Worse than Compression Artifacts? Blizzard Challenge Workshop, Bonn.
Aylett, M.P., King, S. (2007) Single Speaker Segmentation and Inventory Selection Using Dynamic Time Warping Self Organization and Joint Multigram Mapping, Proceedings of ISCA Speech Synthesis Workshop, Bonn 2007.
Aylett, M.P., Pidcock, C.J., (2007) The CereVoice Characterful Speech Synthesiser SDK, AISB 2007, Newcastle. pp.174-8
Aylett, M.P., Pidcock, C.J., Fraser, M.E. (2006) The Cerevoice Blizzard Entry 2006: A prototype Database Unit Selection Engine, Blizzard Challenge Workshop, Pittsburgh.
Aylett, M., Turk, A. (2006) Language Redundancy Predicts Syllabic Duration and the Spectral Characteristics of Vocalic Syllable Nuclei, JASA, 119:3048-58
Aylett, M.P. (2006) Detecting High Level Structure without Lexical Information. ICASSP 2006, Toulouse.
Aylett, M.P. (2005) Extracting the Acoustic Features of Interruption Points Using Non-Lexical Prosodic Analysis. DISS 2005, Aix-en-Provence, 17-20
Aylett, M.P. (2005) Synthesising Hyperarticulation in Unit Selection TTS. Interspeech 2005, Lisbon, 2521-24
Aylett, M., Turk, A. (2004) The Smooth Signal Redundancy Hypothesis: A Functional Explanation for Relationships between Redundancy, Prosodic Prominence and Duration in Spontaneous Speech. Language and Speech, Volume 47(1), 31-56
Aylett, M.P. (2004) Merging Data Driven and Rule Based Prosodic Models for Unit Selection TTS. Proceedings of ISCA Speech Synthesis Workshop, Pittsburgh 2004, published online.
Bard, E. G., Aylett, M. P. (2004) Referential Form, Word Duration, and Modeling the Listener in Spoken Dialogue. In John C. Trueswell and Michael K. Tanenhaus, eds. Approaches to Studying World-Situated Language Use: Bridging the Language-as-Product and Language-as-Action Traditions. Cambridge, MA: MIT Press.
Aylett, M.P. (2003) Disfluency and Speech Recognition Profile Factors. Proceedings of DiSS 03, Disfluency in Spontaneous Speech Workshop, Göteborg University, Sweden. Robert Eklund (ed.), Gothenburg Papers in Theoretical Linguistics 89, ISSN 0349 1021, pp. 49-52.
Aylett, M.P., Fackrell, J. & Rutten P. (2003) My Voice, Your Prosody: Sharing a Speaker Specific Prosody Model Across Speakers in Unit Selection TTS. Eurospeech-2003 Geneva.
Aylett, M.P. (2002) Stochastic Suprasegmentals: Relationships Between the Spectral Characteristics of Vowels, Redundancy and Prosodic Structure. ICSLP-2002 Denver.
Bard, E.G., Lickley, R.J. & Aylett, M.P. (2001) Is Disfluency Just Difficult? In Proceedings of DISS '01, An ISCA Tutorial and Research Workshop, Edinburgh.
Aylett, M.P. (2001) Modelling Care of Articulation with HMMs is Dangerous. In Proceedings of Eurospeech-2001, Aalborg.
Bard, E.G., Sotillo C., Kelly M.L., Aylett, M.P. (2001) Taking the Hit: Leaving some Lexical Competition to be Resolved Post-Lexically. Language and Cognitive Processes, Volume 16, 5-6 p173-176
Aylett, M.P. (2000) Stochastic Suprasegmentals - Relationships between Redundancy, Prosodic Structure and Care of Articulation in Spontaneous Speech. PhD Thesis, Department of Linguistics, University of Edinburgh.
Bard, E.G., Anderson, A.H., Sotillo, C., Aylett, M., Doherty-Sneddon, G., and Newlands, A. (2000) Controlling the Intelligibility of Referring Expressions in Dialogue. Journal of Memory and Language, Volume 42-1 p1-22.
Aylett, M.P. (2000) Stochastic Suprasegmentals: Relationships between Redundancy, Prosodic Structure and Care of Articulation in Spontaneous Speech. In Proceedings of ICSLP-2000, Beijing.
Aylett, M.P. (2000) Modelling clarity change in spontaneous speech. In R.J. Baddeley, P.J.B. Hancock, and P.Foldiak, editors, Information Theory and the Brain. Cambridge University Press, New York.
Bard, E.G. & Aylett, M.P. (1999) The Dissociation of Deaccenting, Givenness and Syntactic Role in Spontaneous Speech. In Proceedings of ICPhS-99, San Francisco.
Aylett, M.P. (1999) Stochastic Suprasegmentals: Relationships between Redundancy, Prosodic Structure and Syllabic Duration. In Proceedings of ICPhS-99, San Francisco.
Bull, M.C. & Aylett, M.P. (1998) An Analysis of the Timing of Turn-Taking in a Corpus of Goal-Orientated Dialogue. In Proceedings of ICSLP-98 Sydney, Australia (4)1175-8pp.
Aylett, M.P. & Bull M.C. (1998) The Automatic Marking of Prominence in Spontaneous Speech Using Duration and Part of Speech Information. In Proceedings of ICSLP-98 Sydney, Australia (5)2123-6pp.
Aylett, M.P. (1998) Building a Statistical Model of the Vowel Space for Phoneticians. In Proceedings of SST-98 ICSLP-98 Sydney, Australia.
Aylett, M. & Turk, A. (1998) Vowel quality in spontaneous speech: What makes a good vowel? In Proceedings of ICSLP-98 Sydney, Australia.
Mayo, C., Aylett, M. & Ladd, D. (1997) Prosodic Transcription of Glasgow English: An Evaluation Study of GlaToBI. In Botinis, A., Kouroupetroglou, G. & Carayiannis, G. editors, Proceedings of an ESCA Workshop: Intonation: Theory, Models and Applications. Athens, Greece. ESCA and The University of Athens, 231-234pp.

Thesis: Prosodic structure, statistical redundancy and care of articulation in spontaneous speech.
Speech Technology: Prosodic control in Unit Selection Synthesis. The use of sub-word units in concatenative speech synthesis. Voice cloning.
Dialogue: The relationships between dialogue structure, use of reference and care of articulation.