Installation Instructions

0. Download Voice Clone Toolkit and untar it

% tar xvfz VoiceCloneBSD-ver0.51.tar.gz

1. Due to the license conditions, HTK-3.4.1 and HDecode-3.4.1 need to be downloaded separately. You can download them from


2. After downloading HTK-3.4.1.tar.gz and HDecode-3.4.1.tar.gz, copy them into the HTS-2.2_for_HTK-3.4.1 directory

% cp HTK-3.4.1.tar.gz HTS-2.2_for_HTK-3.4.1/
% cp HDecode-3.4.1.tar.gz HTS-2.2_for_HTK-3.4.1/

3. Then please run make.

% make
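
If make fails early, one thing worth ruling out is that the two archives from step 2 are not where the build expects them. The following is a minimal sketch of such a check; the function name missing_archives is hypothetical and not part of the toolkit, and it assumes you run it from the directory that contains HTS-2.2_for_HTK-3.4.1.

```python
import os

# Archive names and target directory as given in steps 1 and 2.
REQUIRED_ARCHIVES = ("HTK-3.4.1.tar.gz", "HDecode-3.4.1.tar.gz")


def missing_archives(directory="HTS-2.2_for_HTK-3.4.1"):
    """Return the required archive names that are not present in `directory`."""
    return [
        name
        for name in REQUIRED_ARCHIVES
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

An empty list means both archives are in place; otherwise the list names the files that still need to be copied.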

This tool requires python (we recommend version 2.6) and wxpython (version 2.7 or newer) to be pre-installed. If they are not installed, please download and install them before running make.

------- python -------
How to check the version of python
% python -V
Python 2.6.1

How to use a different version of python:
Please modify the line below so that it specifies your version
setenv PYTHONPATH $ADAPT_DEMO_ROOT/lib/python2.6/site-packages

------- wxpython --------
How to check the version of your wxpython
% python
>>> import wx
>>> import wx.richtext
>>> wx.version()

How to install or update wxpython

Known issues
Speech tools has an issue with Debian 5 and gcc 4.3.2.
The only solution we can offer at this time is to use gcc4.2 on Debian:

How to get gcc4.2
apt-get install libncurses5-dev gcc-4.2 g++-4.2

How to use gcc4.2 for speech_tools
Please add the following two lines at the bottom of speech_tools/config/config:
CXX = g++-4.2
CC = gcc-4.2

To run the tool
0) Open a terminal and type ./

To create a new voice:

1) Click on the 'New Voice' Button.
2) Give the new voice a unique name.
3) Follow the instructions for recording a voice

To record a voice:
1) Place the microphone on the subject
2) Set levels: Click the 'Check Levels' button and adjust the level on the Icicle so that clipping does not occur. Click 'Stop' when you are happy with the level.
3) Calibrate Silence: be quiet and click 'Calibrate Silence'
4) Hit record to start recording.
- To restart a sentence, click 'Stop' then 'Record' again.
- If the background noise level changes, click 'Stop' and re-calibrate silence.
- If a lot of clipping occurs, adjust the Icicle level knob and, ideally, re-calibrate silence.
- Cmd-, opens a preference pane where you can switch off the auto proceed forward feature if necessary
5) When you have recorded enough, click 'Stop' and then close the recording window.

Note: Please use 48kHz sampling rate

To use existing speech data [recommended only for experts]
1) Please create the directories Research-Demo/database/wav/Eng/Name-of-data and Research-Demo/database/txt/Eng/Name-of-data
"Name-of-data" is the name of the speaker, which you can choose freely. You have to use the same name for both the wav and txt directories.
Waveforms should be 1) RIFF format, 2) 48kHz, 3) single channel, and 4) 2 bytes per sample.
2) Please copy waveforms to Research-Demo/database/wav/Eng/Name-of-data
3) Please copy texts/prompts to Research-Demo/database/txt/Eng/Name-of-data
4) Run the voice clone toolkit ./
5) Click "Rescan"
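
Before clicking "Rescan", it may be worth checking that the copied data meets the requirements above. The sketch below uses Python's standard wave module to test each property (RIFF, 48kHz, single channel, 2 bytes, read here as 2 bytes per sample) and checks that both speaker directories exist. The function names check_wav and missing_speaker_dirs are hypothetical helpers, not part of the toolkit, and the default root path assumes you run from the directory containing Research-Demo.

```python
import contextlib
import os
import wave

# Expected properties, taken from the waveform requirements above.
EXPECTED_RATE = 48000      # 48kHz
EXPECTED_CHANNELS = 1      # single channel
EXPECTED_SAMPLE_WIDTH = 2  # 2 bytes per sample (assumed reading of "2 bytes")


def check_wav(path):
    """Return a list of problems found in the file at `path` (empty if none)."""
    try:
        wav = wave.open(path, "rb")
    except (wave.Error, EOFError) as err:
        # wave.open rejects files that are not RIFF/WAVE or are truncated.
        return ["not a readable RIFF/WAVE file: %s" % err]
    problems = []
    with contextlib.closing(wav):
        if wav.getframerate() != EXPECTED_RATE:
            problems.append("sampling rate is %d Hz, expected %d Hz"
                            % (wav.getframerate(), EXPECTED_RATE))
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append("%d channels, expected %d"
                            % (wav.getnchannels(), EXPECTED_CHANNELS))
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append("%d bytes per sample, expected %d"
                            % (wav.getsampwidth(), EXPECTED_SAMPLE_WIDTH))
    return problems


def missing_speaker_dirs(name, root="Research-Demo/database"):
    """Return the wav/txt directories for speaker `name` that do not exist."""
    wanted = [os.path.join(root, kind, "Eng", name) for kind in ("wav", "txt")]
    return [d for d in wanted if not os.path.isdir(d)]
```

Calling check_wav on every file in the wav directory and printing any non-empty result gives a quick report of files that need to be converted before the voice is built.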

To build a voice:
1) Select a voice and click the 'Build Voice' Button

To abort building a voice:
1) Not recommended!

To synthesise with a voice:
1) Select an available voice
2) Type text in the box and click 'speak'
- Errors should be reported in the festival server window. If files are missing, the voice failed to build.

To synthesise with a voice from a command line:
1) You can synthesize a voice using flite+hts_engine.
A sample shell script is also available for generating speech:
% csh -model Research-Demo/inter-module/hts_engine/Eng/SLT -text wave/example.txt -out wave