Comparison of the number of states and frame-shifts for speaker-adaptive HMM-based speech synthesis
System configuration
Training data for average voice model: SI-84 set of WSJ corpora
Adaptation data: 40 'block adaptation' sentences included in the November 1993 CSR H2 task
Model: state-tied context-dependent MSD-HSMMs
Adaptation: CSMAPLR+MAP
Acoustic features: STRAIGHT mel-cepstrum (40-dim), logF0 and aperiodicity + their delta and delta-delta
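
The "delta and delta-delta" part of the acoustic features can be illustrated with a minimal sketch. The configuration above does not state the regression windows, so the coefficients below (-0.5, 0, 0.5 for delta and 1, -2, 1 for delta-delta) are an assumption based on common HTS-style setups, as is the edge-frame replication at utterance boundaries. The function names are hypothetical.

```python
# Assumed 3-tap windows (not given in the configuration above).
DELTA_WIN = (-0.5, 0.0, 0.5)    # first-order dynamic feature
DELTA2_WIN = (1.0, -2.0, 1.0)   # second-order dynamic feature


def apply_window(frames, win):
    """Apply a 3-tap window along time; boundary frames are replicated."""
    num_frames = len(frames)
    out = []
    for t in range(num_frames):
        prev = frames[max(t - 1, 0)]
        cur = frames[t]
        nxt = frames[min(t + 1, num_frames - 1)]
        out.append([win[0] * p + win[1] * c + win[2] * n
                    for p, c, n in zip(prev, cur, nxt)])
    return out


def add_dynamic_features(frames):
    """Return per-frame vectors [static, delta, delta-delta]."""
    delta = apply_window(frames, DELTA_WIN)
    delta2 = apply_window(frames, DELTA2_WIN)
    return [s + d + a for s, d, a in zip(frames, delta, delta2)]


if __name__ == "__main__":
    # Toy 2-dimensional static features over 5 frames (a linear ramp).
    static = [[float(t), 2.0 * t] for t in range(5)]
    for vec in add_dynamic_features(static):
        print(vec)
```

Under these assumptions each static stream triples in dimensionality, so a 40-dimensional mel-cepstrum vector would become a 120-dimensional observation per frame.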