Masked Autoregressive Flow for Density Estimation

George Papamakarios, Theo Pavlakou, and Iain Murray.

Autoregressive models are among the best performing neural density estimators. We describe an approach for increasing the flexibility of an autoregressive model, based on modelling the random numbers that the model uses internally when generating data. By constructing a stack of autoregressive models, each modelling the random numbers of the next model in the stack, we obtain a type of normalizing flow suitable for density estimation, which we call Masked Autoregressive Flow. This type of flow is closely related to Inverse Autoregressive Flow and is a generalization of Real NVP. Masked Autoregressive Flow achieves state-of-the-art performance in a range of general-purpose density estimation tasks.
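For concreteness, here is a minimal NumPy sketch of a single MAF layer with a standard normal base density. The function `shift_log_scale_fn` is a hypothetical stand-in for the MADE conditioner: its outputs for dimension i must depend only on x[:i]. This is an illustrative sketch, not the paper's implementation.

```python
import numpy as np

def maf_log_prob(x, shift_log_scale_fn):
    """Log-density of one MAF layer under a standard normal base.

    shift_log_scale_fn(x) returns (mu, alpha); the autoregressive
    property requires mu[i] and alpha[i] to depend only on x[:i].
    """
    mu, alpha = shift_log_scale_fn(x)
    u = (x - mu) * np.exp(-alpha)                  # map data back to noise
    log_base = -0.5 * np.sum(u**2 + np.log(2 * np.pi))
    return log_base - np.sum(alpha)                # log|det du/dx| = -sum(alpha)

def maf_sample(dim, shift_log_scale_fn, rng):
    """Sampling is sequential: x[i] needs x[:i] to be filled in first."""
    u = rng.standard_normal(dim)
    x = np.zeros(dim)
    for i in range(dim):
        mu, alpha = shift_log_scale_fn(x)          # only entries < i are used
        x[i] = u[i] * np.exp(alpha[i]) + mu[i]
    return x
```

Density evaluation needs only one pass through the conditioner, whereas sampling is inherently sequential; this asymmetry (reversed in Inverse Autoregressive Flow) is why MAF suits density estimation.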

Advances in Neural Information Processing Systems 30, 2017.
[PDF, Supplementary, DjVu, GoogleViewer, arXiv, BibTeX].

Code: github, snapshot. Data.

Also, Dillon et al.'s TensorFlow Distributions work provided an implementation in tf.contrib, now part of TensorFlow Probability.
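As a usage pointer, here is a minimal sketch with the TensorFlow Probability implementation (following the tfp.bijectors documentation; the dimensionality and hidden layer sizes below are arbitrary choices for illustration):

```python
import tensorflow_probability as tfp

tfd, tfb = tfp.distributions, tfp.bijectors

# A two-dimensional MAF: a standard normal base pushed through one
# masked autoregressive bijector, with a MADE network producing the
# per-dimension shift and log-scale (params=2).
maf = tfd.TransformedDistribution(
    distribution=tfd.Sample(tfd.Normal(loc=0., scale=1.), sample_shape=[2]),
    bijector=tfb.MaskedAutoregressiveFlow(
        shift_and_log_scale_fn=tfb.AutoregressiveNetwork(
            params=2, hidden_units=[32, 32])))

x = maf.sample(5)       # five 2-D samples (sequential under the hood)
lp = maf.log_prob(x)    # log-densities in a single parallel pass
```

Stacking several such bijectors with tfb.Chain (interleaving tfb.Permute to reorder dimensions between layers) gives the multi-layer flow described in the paper.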

Related papers: This paper considers different ways of stacking MADE for density estimation of real-valued variables. See previous work on autoregressive distribution estimation for alternatives and the discrete case.

This work, and other flows, can be made more flexible by adding monotonic splines: Neural Spline Flows (NeurIPS, 2019). Code is available.