Basic operations

Here are some examples of basic usage of Ossian. All the voice-building instructions in this documentation use tiny amounts of training data (a few minutes of speech). This keeps the demos quick to run, but it means the quality of the resulting voices is very low. All the databases used are publicly available in full, however, so building full versions of these voices is possible. Voices trained on larger data sets will be made available in the near future.

Installing existing voices

Releases of Ossian include examples of voices that have already been built and packaged, in $OSSIAN/example_voices, to demonstrate how such voices are installed. rm-rss_toy-naive_example is a voice trained on just a couple of minutes of the Romanian Speech Synthesis Database. To install it, just unpack it:

cd $OSSIAN/  ## voice will unpack relative to this location
tar xvf ./example_voices/rm-rss_toy-naive_example.tar

The data for building this simple voice is included under ./corpus.
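
Before unpacking an archive, it can be useful to check which top-level directories it will write into relative to $OSSIAN/. The following is a minimal sketch using only standard tar, sed, cut and sort; the helper name list_top_level_dirs is hypothetical and not part of Ossian.

```shell
# Hypothetical helper (not part of Ossian): print the top-level
# directories that a tar archive would unpack into.
list_top_level_dirs() {
    # "tar tf" lists member paths without extracting anything.
    # Strip a leading "./", drop empty lines, keep the first path
    # component of each member, and remove duplicates.
    tar tf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Example (run from $OSSIAN/):
#   list_top_level_dirs ./example_voices/rm-rss_toy-naive_example.tar
```

Nothing is extracted by this check, so it is safe to run before the tar xvf command above.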

Synthesis from installed voices

To synthesise from the voice we have just installed, using a text file as input:

cd $OSSIAN/
./scripts/speak.py -l rm -s rss_rnd1 \
        -play naive ./test/txt/romanian.txt

To do the same, but store audio to file:

./scripts/speak.py -l rm -s rss_rnd1 \
        -o ./test/wav/romanian_rnd1_naive.wav naive ./test/txt/romanian.txt

The script speak.py accepts input from stdin in the absence of filenames on the command line, so we can use it interactively:

cd $OSSIAN/
echo "Type your text here:"
while read -r textin ; do
    ## quote the variable so that the text is passed on unchanged
    echo "$textin" | ./scripts/speak.py -l rm -s rss_rnd1 \
        -play naive ;
    echo "Type your text here:" ;
done

... and press CTRL+C to end the session.

Try the same with the small 2-minute voice for comparison:

cd $OSSIAN/
echo "Type your text here:"
while read -r textin ; do
    ## quote the variable so that the text is passed on unchanged
    echo "$textin" | ./scripts/speak.py -l rm -s rss_toy_demo \
        -play naive_example ;
    echo "Type your text here:" ;
done
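
The values passed to -l and -s, and the final voice name, in the commands above appear to correspond to the directory levels voices/<language>/<speaker>/<recipe>/ under $OSSIAN/. That layout is an assumption inferred from the paths used elsewhere on this page. Under that assumption, the sketch below lists the voices that are currently installed; list_voices is a hypothetical helper and not part of Ossian.

```shell
# Hypothetical helper (not part of Ossian): list installed voices as
# "language speaker recipe", one per line.
# ASSUMPTION: voices live in <root>/voices/<language>/<speaker>/<recipe>/.
list_voices() {
    root="${1:-.}"
    # The trailing slash makes the glob match directories only.
    for dir in "$root"/voices/*/*/*/; do
        [ -d "$dir" ] || continue          # glob matched nothing
        rel="${dir#"$root"/voices/}"       # e.g. rm/rss_toy_demo/naive/
        rel="${rel%/}"
        lang="${rel%%/*}"
        rest="${rel#*/}"
        speaker="${rest%%/*}"
        recipe="${rest#*/}"
        printf '%s %s %s\n' "$lang" "$speaker" "$recipe"
    done
}

# Example (run from $OSSIAN/):
#   list_voices .
```

Each output line gives, in order, a candidate value for -l, for -s, and for the voice name argument of speak.py.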

Training a new voice from an existing recipe

This shows how to build a Romanian (rm) voice on the very small rss_toy_demo speech corpus using the naive recipe – it will be the same as the existing voice installed above. This recipe is essentially that used for the voices described in this paper.

cd $OSSIAN/  ## paths below are relative to this location
## Assuming that we want to start from scratch:
rm -r ./train/rm/speakers/rss_toy_demo/naive/ ./voices/rm/rss_toy_demo/naive/

## Train:
python ./scripts/train.py -s rss_toy_demo -l rm -text wikipedia_10K_words naive

The -text flag specifies an additional large text corpus, which is used alongside the speech transcripts to train the vector space models.

To test the resulting voice (it should be the same as the installed one apart from the name):

./scripts/speak.py -l rm -s rss_toy_demo \
        -o ./test/wav/romanian_toy_naive.wav naive ./test/txt/romanian.txt

Packing up a trained voice

To make a packed version of a voice that others can install:

cd $OSSIAN/  ## pack up voice relative to this location
## Rename the voice as you like by appending something to the name:
cp -r  ./voices/rm/rss_toy_demo/naive/ ./voices/rm/rss_toy_demo/naive_example_02/
tar cvf ./example_voices/rm-rss_toy-naive_example_02.tar voices/rm/rss_toy_demo/naive_example_02/
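
The two packing steps above can be wrapped in a small function that refuses to run if the source voice is missing or the new name is already taken. This is only a sketch under the directory layout used in the commands above; pack_voice and its argument order are hypothetical and not part of Ossian.

```shell
# Hypothetical helper (not part of Ossian): copy a trained voice under a
# new name and pack the copy into a tar archive.
# Run from $OSSIAN/ so that the archive paths are relative to it.
# Usage: pack_voice <language> <speaker> <recipe> <new_name> <archive>
pack_voice() {
    src="voices/$1/$2/$3"
    dst="voices/$1/$2/$4"
    archive="$5"
    if [ ! -d "$src" ]; then
        echo "no such voice: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "already exists: $dst" >&2
        return 1
    fi
    # Copy first, then archive only the renamed copy.
    cp -r "$src" "$dst" && tar cf "$archive" "$dst"
}

# Example, equivalent to the commands above:
#   pack_voice rm rss_toy_demo naive naive_example_02 \
#       ./example_voices/rm-rss_toy-naive_example_02.tar
```

The archive name is passed explicitly rather than derived, since the example archive names on this page do not simply repeat the speaker directory name.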
