Symposium on Machine Learning in Speech and Language Processing (MLSLP)
September 14, 2012
Portland, Oregon, USA
Speaker: Mark Gales (University of Cambridge)
Title: Log-Linear Models for Speech Recognition
Abstract:
Generative models, normally in the form of hidden Markov models, have
been the dominant form of acoustic model for automatic speech
recognition for more than two decades. In recent years there has been
interest in applying structured discriminative models to this
task. This talk discusses one particular form of discriminative model,
log-linear models, and how they may be applied to continuous speech
recognition tasks. Two important issues will be discussed in detail:
the appropriate form of features for this model, and the training
criterion to be used. Generative models are proposed to extract the
features for the discriminative log-linear model. This combination of
generative and discriminative models enables state-of-the-art
adaptation and noise robustness approaches to be used to handle
mismatches between the training and test conditions. An interesting
aspect of these features is that the conditional independence
assumptions of the underlying generative models are not necessarily
reflected in the features that are derived from the models. Various
forms of training criteria, including minimum Bayes' risk and
large-margin approaches, are discussed. The relationship between
large-margin training of log-linear models and structured support
vector machines is described. Results are presented on two noise-robustness tasks:
AURORA-2 and AURORA-4. This is joint work with Anton Ragni, Austin Zhang
and Federico Flego.
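The core idea the abstract describes, a log-linear model whose features are log-likelihoods produced by generative models, can be illustrated with a minimal sketch. This is not the talk's actual system: the 1-D Gaussians below are hypothetical stand-ins for the HMM-based generative models, and the weights are illustrative rather than trained with the discussed criteria.

```python
import math

def gaussian_loglik(x, mean, var):
    # Log-likelihood of x under a 1-D Gaussian; in the talk's setting the
    # features would instead come from HMM/GMM generative models.
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def features(x):
    # Feature vector phi(x): one generative-model log-likelihood per class,
    # plus a constant bias term (hypothetical feature design).
    return [gaussian_loglik(x, 0.0, 1.0), gaussian_loglik(x, 2.0, 1.0), 1.0]

def posterior(x, weights):
    # Log-linear model: P(class | x) proportional to exp(w_class . phi(x)),
    # computed with the usual max-subtraction trick for numerical stability.
    scores = [sum(w * f for w, f in zip(wc, features(x))) for wc in weights]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Illustrative weights: each class attends only to its own model's
# log-likelihood, which reduces to a generative classifier with equal priors.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = posterior(0.0, weights)
```

Note that discriminative training is free to move the weights away from this generative special case, e.g. weighting one model's log-likelihood under both classes, which is one way the features can escape the conditional independence assumptions of the underlying generative models.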