Comparison of context clustering decision trees for speaker-adaptive HMM-based speech synthesis


Tree Structure: Phonetic, Shared single
4oa, 4oi


System configuration
Training data for average voice model: SI-84 set of the WSJ corpora
Adaptation data: 40 'block adaptation' sentences from the November 1993 CSR H2 task
Model: state-tied context-dependent MSD-HSMMs
Adaptation: CSMAPLR+MAP
Acoustic features: STRAIGHT mel-cepstrum (40 dimensions), log F0 and aperiodicity, plus their delta and delta-delta
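
As an illustration of the delta and delta-delta features listed above, here is a minimal sketch of how dynamic features can be appended to a single static feature track. The window coefficients (-0.5, 0, 0.5) and (1, -2, 1) are a common choice but are an assumption here, since the configuration does not state which windows were used. The function names are hypothetical, and the sketch ignores the unvoiced frames that an MSD treatment of log F0 would need to handle.

```python
# Assumed 3-tap regression windows; the poster does not specify the actual ones.
DELTA_WIN = (-0.5, 0.0, 0.5)   # first-order dynamic feature (delta)
ACCEL_WIN = (1.0, -2.0, 1.0)   # second-order dynamic feature (delta-delta)


def apply_window(track, window):
    """Apply a 3-tap window to a 1-D feature track, repeating the edge frames."""
    padded = [track[0]] + list(track) + [track[-1]]
    return [
        sum(w * padded[t + k] for k, w in enumerate(window))
        for t in range(len(track))
    ]


def add_dynamic_features(track):
    """Return one [static, delta, delta-delta] vector per frame."""
    delta = apply_window(track, DELTA_WIN)
    accel = apply_window(track, ACCEL_WIN)
    return [list(frame) for frame in zip(track, delta, accel)]


# Toy track standing in for one coefficient over four frames.
frames = add_dynamic_features([1.0, 2.0, 4.0, 7.0])
for frame in frames:
    print(frame)
```

In a full system the same windows would be applied to every static dimension of each stream, so each stream's observation vector would be three times its static size.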