Speech enhancement for noise-robust TTS

Experiments with noisy data

OMLSA: baseline speech enhancement method
RNN-A: RNN enhancement using acoustic-derived features
RNN-AT: RNN enhancement using acoustic- and text-derived features

Natural and vocoded speech samples:

         FEMALE                                MALE
         Bus 7.5 dB         Cafe 12.5 dB       Bus 7.5 dB         Cafe 12.5 dB
         Natural  Vocoded   Natural  Vocoded   Natural  Vocoded   Natural  Vocoded
CLEAN
NOISY
OMLSA
RNN-A    X                  X                  X                  X
RNN-AT   X                  X                  X                  X
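
The noise conditions above are labelled with a level in dB (bus at 7.5 dB, cafe at 12.5 dB). Reading these as signal-to-noise ratios, which is an assumption since the page does not say so, the following minimal sketch shows how a noisy sample at a target SNR could be built from a clean signal and a noise recording. The function names `mix_at_snr` and `measure_snr_db` are hypothetical, and the signals are synthetic stand-ins; this is not necessarily the procedure used to create the samples listed here.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `target_snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measure_snr_db(speech, noisy):
    """SNR in dB of `noisy` relative to the known clean `speech`."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


# Synthetic stand-ins (assumption): a 200 Hz tone as "speech", Gaussian noise as "bus" noise.
rng = random.Random(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy = mix_at_snr(speech, noise, 7.5)
print(round(measure_snr_db(speech, noisy), 3))
```

The gain is computed from average power over the whole signal, so the stated SNR is a global figure; the instantaneous ratio varies wherever the speech or the noise is not stationary.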

HMM-based synthetic speech samples:

         FEMALE                                      MALE
         Sample 1   Sample 2   Sample 3   Sample 4   Sample 1   Sample 2   Sample 3   Sample 4
CLEAN
NOISY
OMLSA
RNN-A
RNN-AT