Five PhD studentships
25 November 2011 14:44 Filed in:
Other
The Centre for Speech Technology Research at the University of
Edinburgh invites applications for
five fully-funded PhD studentships
in speech and language processing.
For details, see http://www.ed.ac.uk/schools-departments/informatics/postgraduate/fees/research-grant-funding/speechtechnologyphd
Simple4All kick off
25 November 2011 14:44 Filed in:
HTS
The EC FP7 Simple4All project has
started.
This is a 3-year EC collaboration project.
The partners are Aalto University, the University of Helsinki, Universidad Politécnica de Madrid and the Technical University of Cluj-Napoca.
URL http://simple4all.org/

uDialogue kick off meeting
21 October 2011 17:56 Filed in:
HTS
The JST CREST "uDialogue" kick-off meeting
was held in Nagoya, Japan.
This is a 5-year project supported by JST CREST.

English PodCastle
12 October 2011 17:34 Filed in:
Other
PodCastle is a service that enables
users to find speech data that include a search term, read full
texts of their recognition results, and easily correct recognition
errors by simply selecting from a list of candidates.
http://en.podcastle.jp/
An English version of PodCastle is now running. This utilises the CSTR's speech recogniser, which was developed under their EU projects "FP6 AMI" and "FP6 AMIDA".
http://www.aist.go.jp/aist_j/press_release/pr2011/pr20111012/pr20111012.html
Three PhD positions available
05 September 2011 14:13
Three PhD positions available
Centre for Speech Technology Research
University of Edinburgh
uDialogue is a new JST CREST project, involving Nagoya Institute of Technology and the University of Edinburgh. Its objective is to develop a new spoken dialogue system framework based on user-generated content, and to advance speech recognition and synthesis technologies that strongly attract crowdsource dialogue content creators. uDialogue starts in October 2011, and has a duration of 5 years.
Three PhD positions in speech recognition and speech synthesis are available at CSTR, Edinburgh:
1. PhD position in speech recognition
2. PhD position in speech synthesis
3. PhD position in dialogue-content processing (content linking etc)
The PhD positions will start in September 2012, and have a duration of 4 years. They will include a 6-month internship at Nagoya. For details, please contact us by e-mail.
Video Recordings
05 September 2011 14:04
Two video recordings that may be of interest: Simon's lecture at
IRCAM and my presentation at Odyssey 2010.
http://www.dailymotion.com/video/xjswae_advances-in-speech-technologies-ircam-simon-king_tech
http://www.superlectures.com/odyssey/lecture.php?lang=en&id=29&query=Junichi%20Yamagishi
New Journal Paper
05 September 2011 13:59 Filed in:
Journal papers
Sebastian’s paper was published in
Speech Communication. Congratulations!
http://dx.doi.org/10.1016/j.specom.2011.08.001
Abstract:
Spontaneous conversational speech has many characteristics that are currently not modelled well by HMM-based speech synthesis and in order to build synthetic voices that can give an impression of someone partaking in a conversation, we need to utilise data that exhibits more of the speech phenomena associated with conversations than the more generally used carefully read aloud sentences. In this paper we show that synthetic voices built with HMM-based speech synthesis techniques from conversational speech data, preserved segmental and prosodic characteristics of frequent conversational speech phenomena. An analysis of an evaluation investigating the perception of quality and speaking style of HMM-based voices confirms that speech with conversational characteristics are instrumental for listeners to perceive successful integration of conversational speech phenomena in synthetic speech. The achieved synthetic speech quality provides an encouraging start for the continued use of conversational speech in HMM-based speech synthesis.
Three new grants awarded
18 July 2011 15:28 Filed in:
Other
Three new grants awarded!
- Deep architectures for statistical speech synthesis (EPSRC Career Acceleration Fellowship): £914k
- Silent speech interface for MND patients (EMC Seedcorn funding): £5k
- uDialogue (JST CREST Project): £700k
Lecture in Granada
11 July 2011 15:56 Filed in:
Other
I attended a summer school on
"Application of Speech technology" in Granada, Spain, and gave a
lecture on text-to-speech synthesis and the Festival/HTS toolkits. I
had a lot of nice seafood dishes there. Information about the
course is available here:
http://ceres.ugr.es/THA/eng/

USTC
06 April 2011 18:01 Filed in:
Other
Korin and I visited USTC and iFlytek
under the RSE-NSFC joint project to discuss further
collaboration.
Speech synthesis seminar series at the University of Cambridge
14 February 2011 18:56
I attended the speech synthesis seminar
series at the University of Cambridge and gave a talk about
adaptive speech synthesis and our current projects. My slides are
available at
http://mi.eng.cam.ac.uk/mi/Main/SeminarsSpeech
A new journal paper
02 February 2011 16:54 Filed in:
HTS | Journal papers
A new journal paper by Adriana (Technical University of
Cluj-Napoca) was published in Speech
Communication!
doi:10.1016/j.specom.2010.12.002
This paper introduces a new speech corpus named "RSS" and HMM-based speech synthesis systems using higher sampling rates such as 48 kHz. The abstract follows.
This paper first introduces a newly-recorded high quality Romanian speech corpus designed for speech synthesis, called “RSS”, along with Romanian front-end text processing modules and HMM-based synthetic voices built from the corpus. All of these are now freely available for academic use in order to promote Romanian speech technology research. The RSS corpus comprises 3500 training sentences and 500 test sentences uttered by a female speaker and was recorded using multiple microphones at 96 kHz sampling frequency in a hemianechoic chamber. The details of the new Romanian text processor we have developed are also given.
Using the database, we then revisit some basic configuration choices of speech synthesis, such as waveform sampling frequency and auditory frequency warping scale, with the aim of improving speaker similarity, which is an acknowledged weakness of current HMM-based speech synthesisers. As we demonstrate using perceptual tests, these configuration choices can make substantial differences to the quality of the synthetic speech. Contrary to common practice in automatic speech recognition, higher waveform sampling frequencies can offer enhanced feature extraction and improved speaker similarity for HMM-based speech synthesis.
IEEE SPS 2010 Young Author Best Paper Award
25 January 2011 12:02
Dr. Zhen-Hua Ling's journal paper,
published by IEEE in 2009 (titled "Integrating Articulatory
Features into HMM-based Parametric Speech Synthesis"), has won the IEEE SPS
2010 Young Author Best Paper Award. Congratulations!
http://www.signalprocessingsociety.org/awards-fellows/award-recipient/
Christophe and the Voice reconstruction project
25 January 2011 11:59
Welcome to Christophe Veaux, who joins
us from IRCAM. He'll be working on a new voice reconstruction
project. Some preliminary experimental results can be seen in my
tutorial slides
http://homepages.inf.ed.ac.uk/jyamagis/ISCSLP2010/ISCLSP-Tutorial.pdf
RSE-NSFC
25 January 2011 11:55
The Royal Society of Edinburgh -
National Natural Science Foundation of China travel grant has been awarded to
CSTR and USTC for further joint and linked research on our novel
framework for speech synthesis. Korin Richmond and I will visit
USTC in China.

