Publications: M. Aylett
Publications
Aylett, M.P., Clark, L., Cowan, B.R. & Torre, I. (2021) Building and
Designing Expressive Speech Synthesis. In The Handbook on Socially
Interactive Agents: 20 years of Research on Embodied Conversational
Agents, Intelligent Virtual Agents, and Social Robotics Volume 1:
Methods, Behavior, Cognition. (pp. 173-212).
Doyle, P.R., Rough, D.J., Edwards, J., Cowan, B.R., Clark, L.,
Porcheron, M., Schlögl, S., Torres, M.I., Munteanu, C. & Murad,
C. (2021) CUI@ IUI: Theoretical and Methodological Challenges. In
Intelligent Conversational User Interface Interactions, 26th
International Conference on Intelligent User
Interfaces. (pp. 12-14).
Murad, C., Munteanu, C., Cowan, B.R., Clark, L., Porcheron, M.,
Candello, H., Schlögl, S., Aylett, M.P., Sin, J. & Moore,
R.J. (2021) Let’s Talk About CUIs: Putting Conversational User
Interface Design Into Practice. In Extended Abstracts of the 2021 CHI
Conference on Human Factors in Computing Systems. (pp. 1-6).
Munteanu, C., Clark, L., Cowan, B., Schlögl, S., Torres, M.I.,
Edwards, J., Murad, C., Aylett, M., Porcheron, M. & Candello,
H. (2020) CUI: Conversational User Interfaces: A Workshop on New
Theoretical and Methodological Perspectives for Researching
Speech-based Conversational Interactions. In Proceedings of the 25th
International Conference on Intelligent User Interfaces
Companion. (pp. 15-16).
Aylett, M.P. & Vazquez-Alvarez, Y. (2020) Voice Puppetry:
Speech Synthesis Adventures in Human Centred AI. In Proceedings of the
25th International Conference on Intelligent User Interfaces
Companion. (pp. 108-109).
Aylett, M.P. & Vazquez-Alvarez, Y. (2020) Voice Puppetry:
Towards Conversational HRI WoZ Experiments with Synthesised
Voices. In Companion of the 2020 ACM/IEEE International Conference on
Human-Robot Interaction. (pp. 69-69).
Aylett, M.P., Vazquez-Alvarez, Y. & Butkute, S. (2020) Creating
Robot Personality: Effects of Mixing Speech and Semantic Free
Utterances. In Companion of the 2020 ACM/IEEE International Conference on
Human-Robot Interaction. (pp. 110-112).
Porcheron, M., Clark, L., Jones, M., Candello, H., Cowan, B.R.,
Murad, C., Sin, J., Aylett, M.P., Lee, M. & Munteanu, C. (2020)
CUI@ CSCW: Collaborating through Conversational User Interfaces. In
Conference Companion Publication of the 2020 on Computer Supported
Cooperative Work and Social Computing. (pp. 483-492).
Hanson, D., Storm, F., Huang, W., Krisciunas, V., Darrow, T.,
Brown, A., Lei, M., Aylett, M. & Pickrell, A. (2020) SophiaPop:
Experiments in Human-AI Collaboration on Popular Music. arXiv
preprint arXiv:2011.10363.
Aylett, M., Braude, D., Pidcock, C., & Potard, B. (2019) Voice Puppetry:
Exploring Dramatic Performance to Develop Speech Synthesis. In
Proc. 10th ISCA Speech Synthesis Workshop (pp. 117-120).
Clark, L., Doyle, P., Garaialde, D., Gilmartin, E., Schlögl, S.,
Edlund, J., Aylett, M.P., Cabral, J., Munteanu, C., Edwards, J. &
Cowan, B. R. (2019) The State of Speech in
HCI: Trends, Themes and Challenges, Interacting with Computers.
Scott, K. M., Ashby, S., Braude, D. A., & Aylett, M. P. (2019)
Who owns your voice?: ethically sourced voices for non-commercial
TTS applications. In Proceedings of the 1st International Conference
on Conversational User Interfaces (p. 17). ACM.
Aylett, M. P., Sutton, S. J., & Vazquez-Alvarez, Y. (2019) The right kind of unnatural: designing a robot voice. In
Proceedings of the 1st International Conference on Conversational User
Interfaces (p. 25). ACM.
Braude, D. A., Aylett, M. P., Laoide-Kemp, C., Ashby, S., Scott,
K. M., Raghallaigh, B. O. & Stan, A. (2019) All Together Now:
The Living Audio Dataset. Proc. Interspeech 2019, 1521-1525.
Aylett, M. P., Cowan, B. R., & Clark, L. (2019) Siri, Echo and
Performance: You have to Suffer Darling. In Extended Abstracts of the
2019 CHI Conference on Human Factors in Computing Systems. ACM.
Buchanan, C. G., Aylett, M. P., & Braude, D.A. (2018) Adding
Personality to Neutral Speech Synthesis Voices. International
Conference on Speech and Computer, SPECOM 2018, 49-57
Aylett, M. P., & Braude, D. A. (2018) Designing speech
interaction for the Sony Xperia Ear and Oakley Radar Pace
smartglasses. In Proceedings of the 20th International Conference on
Human-Computer Interaction with Mobile Devices and Services Adjunct
(pp. 379-384). ACM.
Thomas, L., Farrow, E., Aylett, M., Briggs, P. (2018) A life story in three parts: the use of
triptychs to make sense of personal digital data. Personal and
Ubiquitous Computing. 2, pages 1-15.
Wester, M., Aylett, M. P., & Braude, D. A. (2017) Bot or not:
exploring the fine line between cyber and human identity. ICMI, pages
506-507.
Wester, M., Braude, D. A., Potard, B., Aylett, M. P., Shaw,
F. (2017) Real-time reactive speech synthesis: incorporating
interruptions. Proc. Interspeech 2017, 3996-4000.
Mendelson, J., Aylett, M. (2017) Beyond the Listening Test: An
interactive approach to TTS Evaluation. Interspeech 2017,
249-253.
Aylett, M. P., Vinciarelli, A., Wester, M. (2017) Speech Synthesis
for the Generation of Artificial Personality. IEEE Transactions on
Affective Computing.
Potard, B., Aylett, M.P., Braude,
D.A. (2016) Cross Modal
Evaluation of High Quality Emotional Speech Synthesis with the
Virtual Human Toolkit. IVA.
Potard, B., Aylett, M.P., Braude, D.A., Motlicek, P. (2016)
Idlak Tangle: An Open
Source Kaldi Based Parametric Speech Synthesiser based on DNN. Interspeech.
Aylett, M.P., Lawson, S. (2016) The Smartphone: A Lacanian Stain, A
Tech Killer, and an Embodiment of Radical Individualism. CHI '16.
Sun, Y., Aylett, M.P., Vazquez-Alvarez, Y. (2016) e-Seesaw: A
Tangible, Ludic, Parent-child, Awareness System. CHI EA '16.
Aylett, M.P., Thomas, L., Green, D.P., Shamma, D.A., Briggs, P.,
Kerrigan, F. (2016) My Life On Film. CHI EA '16 Workshop.
Munteanu, C., Irani, P., Oviatt, S., Aylett, M.P., Penn, G.,
Sharma, N., Rudzicz, F., Gomez, R. (2016) Designing Speech and
Multimodal Interactions for Mobile, Wearable, and Pervasive
Applications. CHI EA '16 Workshop.
Aylett, M.P., Pullin, G., Braude, D.A., Potard, B., Henning, S.,
Antunes Ferreira, M. (2016) Don't Say Yes, Say Yes: Interacting
with Synthetic Speech Using Tonetable. CHI EA '16 Demo.
Vazquez-Alvarez, Y., Aylett, M. P., Brewster, S., von Jungenfeld,
R., and Virolainen, A. (2015) Designing Interactions with
Multilevel Auditory Displays in Mobile Audio-Augmented Reality. ACM Transactions on Computer-Human Interaction, vol 23:1
Wester, M., Aylett, M., Tomalin, M., and Dall,
R. (2015) Artificial
Personality and Disfluency. Interspeech 2015
Aylett, M.P., Quigley, A.J. (2015) The Broken Dream of
Pervasive Sentient Ambient Calm Invisible Ubiquitous Computing. CHI
Extended Abstracts.
Aylett, M.P., Farrow, E., Pschetz, L., and Dickinson,
T. (2015) Generating Narratives from Personal Digital Data:
Triptychs. CHI Extended Abstracts.
Aylett, M. P., Vazquez-Alvarez, Y., Baillie, L. (2015)
Interactive Radio: A New Platform for Calm Computing. CHI Extended
Abstracts.
Aylett, M.P., Dall, R., Ghoshal, A., Eje Henter,
G., Merritt, T. (2014) A flexible front-end for HTS. In
Proc. Interspeech, pages 1283-1287
Aylett, M.P., Kristensson, P.O., Whittaker, S., Vazquez-Alvarez,
Y. (2014) None of a CHInd: Relationship Counselling for HCI and
Speech Technology. CHI Extended Abstracts pp749-760.
Vazquez-Alvarez, Y., Aylett, M.P., Brewster, S.A., von Jungenfeld,
R., Virolainen, A. (2014) Multilevel Auditory Displays for
Mobile Eyes-free Location-based Interaction. CHI Extended Abstracts
pp1567-1572.
Munteanu, C., Jones, M., Whittaker, S., Oviatt, S.,
Aylett, M.P., Penn, G., Brewster, S.A., D'Alessandro, N. (2014)
Designing Speech and Language Interactions, CHI Extended Abstracts
pp75-78.
Kane, J., Aylett, M.P., Yanushevskaya, I., Gobl, C. (2014)
Phonetic Feature Extraction for Context-sensitive Glottal Source
Processing, Speech Communication Vol. 59 pp10-21.
Aylett, M.P., Potard, B., Pidcock,
C.J. (2013) Expressive Speech
Synthesis: Synthesising Ambiguity, 8th ISCA Speech Synthesis
Workshop, Barcelona.
Kane, J., Scherer, S., Aylett, M., Morency, L-P., Gobl, C. (2013)
Speaker and language independent voice quality classification applied
to unlabelled corpora of expressive speech, ICASSP 2013, Vancouver, Canada.
Aylett, M.P., Vazquez-Alvarez, Y., Baillie, L. (2013)
Evaluating Speech
Synthesis in a Mobile Context: Audio Presentation of Facebook, Twitter
and RSS. Information Technology Interfaces ITI 2013.
Potard, B., Aylett, M.P., Pidcock,
C.J. (2012) Proper Name
Splicing in Computer Games with TTS. Interspeech 2012,
Portland.
Aylett, M.P., Potard, B. (2012) Synthesising and
evaluating cross-modal emotional ambiguity in virtual agents.
IVA2012, Santa Cruz, U.S.A, Proceedings. Lecture Notes in Computer
Science 7502 Springer, pp471-3.
Stan, A., Yamagishi, J., King, S., Aylett, M. (2011) The Romanian
speech synthesis (RSS) corpus: Building a high quality HMM-based
speech synthesis system using a high sampling rate, Speech
Communication 53:3, pp442-50.
Aylett, M.P., Kimball, T., Andert, G. (2010) Scalable Mobile Implementation of High Quality Real Time Text to Speech Synthesis. Fifth Workshop on Speech in Mobile and Pervasive Environments, Lisbon.
Andersson, S., Georgila, K., Traum, D., Aylett, M., Clarke,
R. (2010) Prediction and Realisation of Conversational
Characteristics by Utilising Spontaneous Speech for Unit Selection,
Speech Prosody 2010.
Aylett, M.P., King, S., Yamagishi, J. (2009) Speech Synthesis Without a Phone Inventory. Interspeech 2009, Brighton,
2087-90
Aylett, M.P., Pidcock, C.J. (2009) The CereProc Blizzard Entry 2009: Some dumb algorithms that don't work. Blizzard Challenge
Workshop, Edinburgh.
Andersson, J.S., Badino, L., Watts, O.S., Aylett, M.P. (2008) The
CSTR/Cereproc Blizzard Entry 2008: The Inconvenient Data.
(University of Edinburgh, UK / CereProc Ltd, UK), Blizzard Challenge
Workshop, Brisbane.
Aylett, M.P., Yamagishi, J. (2008) Combining Statistical Parametric Speech Synthesis and Unit-Selection for Automatic Voice Cloning. LangTech 2008, Rome.
Aylett, M.P., Pidcock, C.J. (2007) The CereVoice Characterful Speech Synthesiser SDK (Industrial Demo). IVA 2007, Paris, France, Proceedings. Lecture Notes in Computer Science 4722 Springer.
Aylett, M.P., Andersson, J.S., Badino, L., Pidcock, C.J. (2007) The Cerevoice Blizzard Entry 2007: Are Small Database Errors Worse than Compression Artifacts? Blizzard Challenge Workshop, Bonn.
Aylett, M.P., King, S. (2007) Single
Speaker Segmentation and Inventory Selection Using Dynamic Time
Warping Self Organization and Joint Multigram Mapping, Proceedings
of ISCA Speech Synthesis Workshop, Bonn 2007
Aylett, M.P., Pidcock, C.J. (2007) The
CereVoice Characterful Speech Synthesiser SDK, AISB 2007,
Newcastle. pp.174-8
Aylett, M.P., Pidcock, C.J., Fraser, M.E. (2006) The Cerevoice
Blizzard Entry 2006: A prototype Database Unit Selection Engine,
Blizzard Challenge Workshop, Pittsburgh.
Aylett, M., Turk, A. (2006) Language Redundancy
Predicts Syllabic Duration and the Spectral Characteristics of Vocalic
Syllable Nuclei, JASA, 119:3048-58
Aylett, M.P. (2006) Detecting
High Level Structure without Lexical Information. ICASSP 2006,
Toulouse.
Aylett, M.P. (2005) Extracting
the Acoustic Features of Interruption Points Using Non-Lexical
Prosodic Analysis. DISS 2005, Aix-en-Provence, 17-20
Aylett, M.P. (2005) Synthesising
Hyperarticulation in Unit Selection TTS. Interspeech 2005, Lisbon,
2521-24
Aylett, M., Turk, A. (2004) The Smooth Signal Redundancy
Hypothesis: A Functional Explanation for Relationships between
Redundancy, Prosodic Prominence and Duration in Spontaneous
Speech. Language and Speech, Volume 47(1), 31-56
Aylett, M.P. (2004) Merging Data Driven and Rule
Based Prosodic Models for Unit Selection TTS. Proceedings of ISCA
Speech Synthesis Workshop, Pittsburgh 2004, published online.
Bard, E. G., Aylett, M. P. (2004) Referential Form, Word Duration,
and Modeling the Listener in Spoken Dialogue. In John C. Trueswell and
Michael K. Tanenhaus, eds. Approaches to Studying World-Situated
Language Use: Bridging the Language-as-Product and Language-as-Action
Traditions. Cambridge, MA: MIT Press.
Aylett, M.P. (2003) Disfluency
and Speech Recognition Profile Factors. Proceedings of DiSS 03,
Disfluency in Spontaneous Speech Workshop, Göteborg University,
Sweden. Robert Eklund (ed.), Gothenburg Papers in Theoretical
Linguistics 89, ISSN 0349 1021, pp. 49-52.
Aylett, M.P., Fackrell, J. & Rutten, P. (2003) My
Voice, Your Prosody: Sharing a Speaker Specific Prosody Model Across
Speakers in Unit Selection TTS. Eurospeech-2003 Geneva.
Aylett, M.P. (2002) Stochastic
Suprasegmentals: Relationships Between the Spectral Characteristics of
Vowels, Redundancy and Prosodic Structure. ICSLP-2002 Denver.
Bard, E.G., Lickley, R.J. & Aylett, M.P. (2001) Is
Disfluency Just Difficult? In Proceedings of DISS '01, An ISCA
Tutorial and Research Workshop, Edinburgh.
Aylett, M.P. (2001) Modelling
Care of Articulation with HMMs is Dangerous. In Proceedings of
Eurospeech-2001, Aalborg.
Bard, E.G., Sotillo, C., Kelly, M.L., Aylett, M.P. (2001) Taking
the Hit: Leaving some Lexical Competition to be Resolved
Post-Lexically. Language and Cognitive Processes, Volume 16, 5-6
p173-176
Aylett, M.P. (2000) Stochastic
Suprasegmentals - Relationships between Redundancy, Prosodic Structure
and Care of Articulation in Spontaneous Speech. PhD Thesis,
Department of Linguistics, University of Edinburgh.
Bard, E.G., Anderson, A.H., Sotillo, C., Aylett, M.,
Doherty-Sneddon, G., and Newlands, A. (2000) Controlling the
Intelligibility of Referring Expressions in Dialogue. Journal of
Memory and Language, Volume 42-1 p1-22.
Aylett, M.P. (2000) Stochastic
Suprasegmentals: Relationships between Redundancy, Prosodic Structure
and Care of Articulation in Spontaneous Speech. In Proceedings of
ICSLP-2000, Beijing.
Aylett, M.P. (2000) Modelling
clarity change in spontaneous speech. In R.J. Baddeley,
P.J.B. Hancock, and P. Foldiak, editors, Information Theory and the
Brain. Cambridge University Press, New York.
Bard, E.G. & Aylett, M.P. (1999) The
Dissociation of Deaccenting, Givenness and Syntactic Role in
Spontaneous Speech. In Proceedings of ICPhS-99, San Francisco.
Aylett, M.P. (1999) Stochastic
Suprasegmentals: Relationships between Redundancy, Prosodic Structure
and Syllabic Duration. In Proceedings of ICPhS-99, San
Francisco.
Bull, M.C. & Aylett, M.P. (1998) An
Analysis of the Timing of Turn-Taking in a Corpus of Goal-Orientated
Dialogue. In Proceedings of ICSLP-98 Sydney, Australia
(4)1175-8pp.
Aylett, M.P. & Bull, M.C. (1998) The
Automatic Marking of Prominence in Spontaneous Speech Using Duration
and Part of Speech Information. In Proceedings of ICSLP-98 Sydney,
Australia (5)2123-6pp.
Aylett, M.P. (1998) Building
a Statistical Model of the Vowel Space for Phoneticians. In
Proceedings of SST-98 ICSLP-98 Sydney, Australia.
Aylett, M. & Turk, A. (1998) Vowel
quality in spontaneous speech: What makes a good vowel? In
Proceedings of ICSLP-98 Sydney, Australia.
Mayo, C., Aylett, M. & Ladd, D. (1997) Prosodic
Transcription of Glasgow English: An Evaluation Study of GlaToBI.
In Botinis, A., Kouroupetroglou, G. & Carayiannis, G. editors,
Proceedings of an ESCA Workshop: Intonation: Theory, Models and
Applications. Athens, Greece. ESCA and The University of Athens,
231-234pp.