Results of the 2010 Blizzard Challenge

The evaluation results of the 2010 Blizzard Challenge are out! Below are the mean opinion scores (MOS) for naturalness on tasks EH1, EH2, and ES1. My new HTS systems, which I introduced in earlier news posts, are labeled system V. System A is natural speech, included for reference, and system B is a standard Festival unit-selection system.

[Figure: voiceEH1_all_mos]

Task EH1 (4 hours of speech data, Speaker RJS)

According to Wilcoxon signed-rank tests with a Bonferroni-corrected alpha (1% level), system V is significantly worse than systems M, J, and T. There is no significant difference between systems V and B.
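For readers unfamiliar with the procedure, here is a minimal Python sketch of pairwise Wilcoxon signed-rank tests with a Bonferroni-corrected significance level. It is only an illustration, not the organizers' actual analysis script: the function names, the system labels X, Y, Z, and the scores are all invented for this example, and the p-value uses the normal approximation, which is only reasonable for larger samples.

```python
import math
from itertools import combinations


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Uses the normal approximation with a tie correction.
    Returns (W, p), where W is the smaller of the two signed rank sums.
    """
    # Paired differences; zero differences are discarded.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1  # size of this tie group
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Mean and tie-corrected variance of W under the null hypothesis.
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p


def pairwise_significance(scores, alpha=0.01):
    """Test every pair of systems against alpha divided by the number of pairs."""
    pairs = list(combinations(sorted(scores), 2))
    corrected_alpha = alpha / len(pairs)  # Bonferroni correction
    return {
        (a, b): wilcoxon_signed_rank(scores[a], scores[b])[1] < corrected_alpha
        for a, b in pairs
    }


# Invented 1-to-5 ratings from 30 hypothetical listeners (NOT challenge data).
scores = {
    "X": [5] * 30,
    "Y": [4, 3, 2] * 10,
    "Z": [3, 4, 2] * 10,
}
results = pairwise_significance(scores, alpha=0.01)
for pair, significant in results.items():
    print(pair, "significant" if significant else "not significant")
```

With three systems there are three pairwise comparisons, so each test is compared against 0.01 / 3 rather than 0.01. In this toy data, X differs significantly from both Y and Z, while Y and Z do not differ from each other.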

[Figure: voiceEH2_all_mos]

Task EH2 (1 hour of speech data, Arctic sentences, Speaker Roger)

Likewise, according to Wilcoxon signed-rank tests with a Bonferroni-corrected alpha (1% level), system V is the second best and is significantly better than system B.

[Figure: voiceES1_all_mos]

Task ES1 (100 Arctic sentences, Speaker Roger)

Likewise, according to Wilcoxon signed-rank tests with a Bonferroni-corrected alpha (1% level), systems M and V are jointly the best.

In summary, the new HTS system performs very well on the small data set (100 sentences) and on the 1-hour data set. Even on the 4-hour data set, it is on par with the Festival unit-selection system, with no significant difference between the two.