George Papamakarios

I'm a PhD student at the School of Informatics, University of Edinburgh, supervised by Iain Murray. I'm a member of the Centre for Doctoral Training in Data Science, and my studentship is co-funded by Microsoft Research.

I'm interested in probabilistic approaches to machine learning, especially Bayesian inference. In the past I've worked on optimization, computer vision, and parallel computing. I studied Electrical and Computer Engineering at the Aristotle University of Thessaloniki and Advanced Computing at Imperial College London.

CV

Studies

MSc by Research in Data Science, University of Edinburgh.
Grade 92/100, with Distinction. Won the MSc by Research in Data Science Class Prize.

MSc in Advanced Computing, Imperial College London.
Grade 90/100, with Distinction. Won the Corporate Partnership Programme Award for Academic Excellence and the Winton Capital Applied Computing MSc Project Prize.

MEng in Electrical and Computer Engineering, Aristotle University of Thessaloniki.
Grade 8.96/10, with Distinction.

Work

Research intern, Microsoft Research Cambridge.
I worked on performing Bayesian inference in computer vision models using Infer.NET. My supervisor was John Winn.

Teaching assistant, University of Edinburgh.
I've tutored and/or marked the following courses: Machine Learning & Pattern Recognition; Probabilistic Modelling & Reasoning; Informatics 2B - Algorithms, Data Structures & Learning; Introduction to Theoretical Computer Science.

Research assistant, Information Technologies Institute, Centre for Research & Technology Hellas.
I participated in the EU-funded project Adapt4EE and the Greek-funded project EnNoisis. Most of my work focused on automatic activity recognition in smart homes with ambient sensors and Kinect cameras, and involved quite a lot of machine learning and computer vision.

Research assistant, Aristotle University of Thessaloniki.
I participated in the EU-funded project AutoGPU, where I developed software for fast parallel low-level image processing on GPUs. I was writing a lot of CUDA back then.

Masked Autoregressive Flow

Autoregressive models and normalizing flows are types of neural networks that achieve state-of-the-art performance in density estimation. We developed Masked Autoregressive Flow (MAF), a normalizing flow whose layers are autoregressive models. MAF is obtained by stacking a number of MADEs (Masked Autoencoders for Distribution Estimation), such that each MADE models the random numbers that drive the next MADE in the stack. MAF has close connections to Inverse Autoregressive Flow and RealNVP, and yields state-of-the-art performance in several general-purpose density estimation tasks.
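
To make the stacking explicit: one MAF layer maps data x to random numbers u by shifting and scaling each variable with functions of the preceding variables, and the next layer models those u's in turn. Below is a minimal NumPy sketch of this computation, with toy stand-in conditioners in place of the actual MADE networks; it is illustrative only, not the released implementation.

    import numpy as np

    def maf_layer(x, mu_fn, alpha_fn):
        # One MAF layer: u_i = (x_i - mu_i) * exp(-alpha_i), where mu_i and
        # alpha_i depend only on x_1..x_{i-1} (in MAF they are MADE outputs;
        # here mu_fn and alpha_fn are hypothetical stand-ins).
        mu, alpha = mu_fn(x), alpha_fn(x)
        u = (x - mu) * np.exp(-alpha)
        logdet = -np.sum(alpha, axis=-1)  # log |det Jacobian| of x -> u
        return u, logdet

    def maf_log_density(x, layers):
        # Push x through the stack, accumulating Jacobian terms; the final
        # u's are scored under a standard normal base density.
        total, u = 0.0, x
        for mu_fn, alpha_fn in layers:
            u, logdet = maf_layer(u, mu_fn, alpha_fn)
            total += logdet
        d = u.shape[-1]
        base = -0.5 * np.sum(u ** 2, axis=-1) - 0.5 * d * np.log(2 * np.pi)
        return base + total

    # Toy 2D conditioners respecting the autoregressive structure:
    # variable 0 depends on nothing, variable 1 depends on variable 0.
    mu_fn = lambda x: np.stack([np.zeros_like(x[..., 0]), 0.5 * x[..., 0]], axis=-1)
    alpha_fn = lambda x: np.zeros_like(x)
    x = np.random.randn(5, 2)
    print(maf_log_density(x, [(mu_fn, alpha_fn)] * 3))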

For more details you can have a look at the paper. Code that reproduces the experiments can be found here.

Fast ε-free Inference of Simulation Models

Suppose we have a probabilistic model which we can simulate forward to generate data from, but whose likelihood we can't evaluate. How can we do Bayesian inference in such a model? We propose using simulated data from the model to train a Bayesian neural network to return the intractable posterior. By using preliminary fits to the posterior to guide future simulations, we can dramatically speed up the process. Our approach improves over the state-of-the-art in likelihood-free inference in three ways: (a) it targets the exact posterior, (b) it represents the posterior parametrically, and (c) it significantly reduces the number of required simulations.
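
Here is a minimal sketch of the round-by-round structure, under loudly flagged simplifications: the paper fits a Bayesian mixture density network as the conditional density estimator and corrects for the proposal so that the exact posterior is targeted, whereas the stand-in below fits a plain linear-Gaussian model of theta given x and omits that correction. The simulator and all numbers are made up.

    import numpy as np

    def simulate(theta):
        # Hypothetical black-box simulator: easy to sample from,
        # impossible (by assumption) to evaluate the likelihood of.
        return theta + 0.5 * np.random.randn(*theta.shape)

    x_obs = np.array([1.2])                          # observed data
    propose = lambda n: 2.0 * np.random.randn(n, 1)  # round 1: the prior N(0, 4)

    for r in range(3):
        # 1. Simulate (theta, x) pairs from the current proposal.
        thetas = propose(500)
        xs = simulate(thetas)

        # 2. Fit a conditional density estimator q(theta | x).
        #    (Stand-in: linear-Gaussian fit by least squares.)
        A = np.column_stack([xs, np.ones(len(xs))])
        coef, *_ = np.linalg.lstsq(A, thetas, rcond=None)
        sigma = (thetas - A @ coef).std()

        # 3. Condition on x_obs to get the current posterior estimate, and
        #    reuse it as the proposal for the next round, so that later
        #    simulations concentrate where they are most informative.
        mean = (np.array([x_obs[0], 1.0]) @ coef).item()
        propose = lambda n, m=mean, s=sigma: m + s * np.random.randn(n, 1)
        print(f"round {r}: posterior approx N({mean:.3f}, {sigma:.3f}^2)")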

For more details you can have a look at the paper. Code that reproduces the experiments can be found here. Dennis Prangle wrote a very nice blog post about our work.

Distilling Model Knowledge

In machine learning, many good models are large, expensive, or intractable. Knowledge distillation is the idea of training a convenient model to mimic a good but cumbersome one, so that we retain the good performance of the original model in a much more compact and convenient form. We apply this idea to: (a) model compression, where we compress large discriminative models, such as ensembles of neural nets, into models of much smaller size; (b) Bayesian inference, where we distil streams of MCMC samples into closed-form predictive distributions; (c) intractable generative models, where we distil unnormalized models such as RBMs into tractable models such as NADEs.
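
As a toy illustration of the model-compression case, here is a self-contained NumPy sketch: a "student" is trained by gradient descent to match a "teacher's" softmax outputs under a cross-entropy loss. Both models are tiny linear maps purely for brevity; in the thesis the teachers are ensembles of neural nets, MCMC sample streams, or RBMs.

    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    rng = np.random.default_rng(0)

    # Hypothetical "cumbersome" teacher: here just a fixed linear map,
    # standing in for, e.g., an ensemble of neural networks.
    W_teacher = rng.normal(size=(10, 3))
    X = rng.normal(size=(256, 10))
    p_teacher = softmax(X @ W_teacher)

    # Student: trained to mimic the teacher's output distribution by
    # minimizing the cross-entropy H(p_teacher, p_student).
    W_student = np.zeros((10, 3))
    for step in range(300):
        p_student = softmax(X @ W_student)
        # The gradient of the cross-entropy w.r.t. the student's logits
        # is simply (p_student - p_teacher).
        W_student -= 0.5 * X.T @ (p_student - p_teacher) / len(X)

    kl = np.mean(np.sum(p_teacher * np.log(p_teacher / softmax(X @ W_student)), axis=-1))
    print(f"average KL(teacher || student): {kl:.5f}")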

You can read more in the relevant MSc thesis, or the accompanying poster. Code can be found here.

Robust Low-Rank Modelling on Matrices and Tensors

When represented as matrices, real-world data often have a low-rank structure, whereas corruptions are often sparse. Based on this observation, several optimization-based algorithms that aim to separate the low-rank component from the sparse component have been developed. In this work, we make three contributions in the area of robust low-rank modelling: (a) we review and compare existing matrix-based methods; (b) we extend matrix-based methods to tensors, introducing several tensor-based algorithms; (c) we apply both matrix-based and tensor-based algorithms in practical computer vision tasks.
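
For the matrix case, the prototypical method is principal component pursuit: minimize the nuclear norm of the low-rank part plus an l1 penalty on the sparse part, subject to the two parts summing to the data. Below is a minimal NumPy sketch using the standard inexact augmented Lagrangian iteration; it is one representative of this family of algorithms, not the thesis code, and the parameter defaults follow the usual conventions.

    import numpy as np

    def svd_shrink(X, tau):
        # Singular value thresholding: proximal operator of the nuclear norm.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        return (U * np.maximum(s - tau, 0.0)) @ Vt

    def soft(X, tau):
        # Elementwise soft thresholding: proximal operator of the l1 norm.
        return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

    def rpca(M, n_iter=200):
        # Principal component pursuit by the inexact augmented Lagrangian
        # method: split M into low-rank L plus sparse S.
        m, n = M.shape
        lam = 1.0 / np.sqrt(max(m, n))       # usual sparsity weight
        mu = 0.25 * m * n / np.abs(M).sum()  # usual penalty parameter
        L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
        for _ in range(n_iter):
            L = svd_shrink(M - S + Y / mu, 1.0 / mu)
            S = soft(M - L + Y / mu, lam / mu)
            Y += mu * (M - L - S)
        return L, S

    # Toy data: rank-2 matrix plus 5% gross sparse corruptions.
    rng = np.random.default_rng(1)
    low = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 50))
    sparse = (rng.random((50, 50)) < 0.05) * rng.normal(scale=10, size=(50, 50))
    L, S = rpca(low + sparse)
    print("relative recovery error:", np.linalg.norm(L - low) / np.linalg.norm(low))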

You can read more in the relevant MSc thesis. This thesis won the Winton Capital Applied Computing MSc Project Prize.

Comparison of Modern Stochastic Optimization Algorithms

Stochastic gradient-based optimization algorithms have become the standard method for training machine learning models such as neural nets, due to their good scalability to large datasets. Nevertheless, standard stochastic gradient descent has a slower theoretical convergence rate compared to batch gradient descent. Semi-stochastic algorithms, such as S2GD and SAG, combine fast convergence with scalability. In this project, we compare the performance of semi-stochastic methods to standard stochastic and batch methods in convex machine learning problems. We find that semi-stochastic methods indeed converge to the optimum much faster, but this doesn't necessarily translate to better generalization performance.
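
The shared idea behind these methods is variance reduction: occasionally compute a full gradient at a snapshot point, then take cheap stochastic steps whose noise is cancelled against the snapshot. S2GD and SAG differ in the details (SAG stores per-example gradients; S2GD randomizes the inner-loop length), so the sketch below uses the simplest SVRG-like form, on a made-up least-squares problem rather than the experiments from the report.

    import numpy as np

    def semi_stochastic_gd(grad_i, w0, n, lr=0.01, epochs=20, inner=None):
        # Outer loop: full gradient at a snapshot. Inner loop: stochastic
        # steps using grad_i(w) - grad_i(w_snap) + full_grad, whose variance
        # vanishes as w approaches the optimum.
        w = w0.copy()
        inner = inner or n
        rng = np.random.default_rng(0)
        for _ in range(epochs):
            w_snap = w.copy()
            full_grad = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
            for _ in range(inner):
                i = rng.integers(n)
                w -= lr * (grad_i(w, i) - grad_i(w_snap, i) + full_grad)
        return w

    # Toy strongly convex problem: least squares, with per-example gradient
    # (x_i . w - y_i) x_i.
    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 5))
    w_true = rng.normal(size=5)
    y = X @ w_true
    grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]
    w = semi_stochastic_gd(grad_i, np.zeros(5), n=200)
    print("distance to optimum:", np.linalg.norm(w - w_true))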

This is a small project I did with Peter Richtárik. You can read more in the technical report, or the accompanying poster. MATLAB code with my implementation of the algorithms and scripts to reproduce the experiments can be found here.

Fast Convolution and Local Correlation Coefficients on GPUs

Convolution and correlation, with or without local normalization, are fundamental low-level operations in image processing applications. In this project, we develop algorithms and software for their fast computation, based on (a) use of the Fourier domain for large templates, and (b) parallelization on GPUs. We've developed the FLCC Library, a software tool that automatically determines which algorithm/platform combination is fastest for the problem at hand and then executes it.
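
The library itself is written in C and CUDA; the NumPy sketch below only illustrates the Fourier-domain half of the idea: a "valid" cross-correlation computed via the FFT (cheap for large templates), then normalized by local sums and energies so the score becomes a local correlation coefficient, invariant to local brightness and contrast. The function names and test image are made up for the example.

    import numpy as np

    def corr_fft(image, template):
        # "Valid" cross-correlation via the FFT: correlation equals
        # convolution with the flipped template; keeping only the fully
        # overlapping region avoids circular wraparound artifacts.
        H, W = image.shape; h, w = template.shape
        F = np.fft.rfft2(image)
        G = np.fft.rfft2(template[::-1, ::-1], s=image.shape)
        full = np.fft.irfft2(F * G, s=image.shape)
        return full[h - 1:H, w - 1:W]

    def local_correlation(image, template):
        # Normalized cross-correlation: at each offset, subtract the local
        # mean and divide by the local energy, computed with box filters
        # that are themselves FFT correlations with an all-ones template.
        h, w = template.shape; n = h * w
        t = (template - template.mean()) / (template.std() * np.sqrt(n))
        ones = np.ones_like(template)
        s1 = corr_fft(image, ones)        # local sums
        s2 = corr_fft(image ** 2, ones)   # local sums of squares
        denom = np.sqrt(np.maximum(s2 - s1 ** 2 / n, 1e-12))
        return corr_fft(image, t) / denom

    rng = np.random.default_rng(3)
    img = rng.random((128, 128))
    tpl = img[40:56, 60:76].copy()
    score = local_correlation(img, tpl)
    print("peak of", score.max(), "at", np.unravel_index(score.argmax(), score.shape))
    # expect a peak of ~1.0 at (40, 60), where the template was cut out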

For more information see the relevant MEng thesis. The latest version of the code is hosted on the FLCC Library website and the AutoGPU website.

Preprints

G. Papamakarios, T. Pavlakou, and I. Murray. Masked Autoregressive Flow for Density Estimation. arXiv:1705.07057. 2017.
pdf bibtex code

Conferences

G. Papamakarios and I. Murray. Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation. Advances in Neural Information Processing Systems. 2016.
pdf poster bibtex code

G. Papamakarios, Y. Panagakis, and S. Zafeiriou. Generalised Scalable Robust Principal Component Analysis. In Proceedings of the British Machine Vision Conference. 2014.
pdf poster bibtex code

G. Papamakarios, D. Giakoumis, K. Votis, S. Segouli, D. Tzovaras, and C. Karagiannidis. Synthetic Ground Truth Data Generation for Automatic Trajectory-based ADL Detection. In Proceedings of the IEEE-EMBS International Conference on Biomedical and Health Informatics, pages 33–36. 2014.
web bibtex

G. Papamakarios, G. Rizos, N. P. Pitsianis, and X. Sun. Fast Computation of Local Correlation Coefficients on Graphics Processing Units. In Proceedings of SPIE, volume 7444, pages 744412-1–744412-8. 2009.
pdf bibtex

Workshops

G. Papamakarios and I. Murray. Distilling Intractable Generative Models. Probabilistic Integration Workshop at the Neural Information Processing Systems Conference. 2015.
pdf slides bibtex code

G. Papamakarios, D. Giakoumis, M. Vasileiadis, K. Votis, D. Tzovaras, S. Segouli, and C. Karagiannidis. A Tool to Monitor and Support Physical Exercise Interventions for MCI and AD Patients. 2nd Patient Rehabilitation Techniques Workshop at the 8th International Conference on Pervasive Computing Technologies for Healthcare. 2014.
web bibtex

Book chapters

G. Papamakarios, D. Giakoumis, M. Vasileiadis, A. Drosou, and D. Tzovaras. Human Computer Confluence in the Smart Home Paradigm: Detecting Human States and Behaviours for 24/7 Support of Mild-Cognitive Impairments. In Human Computer Confluence: Transforming Human Experience Through Symbiotic Technologies, chapter 16, pages 275–293, De Gruyter Open, 2016.
pdf bibtex

Theses

G. Papamakarios. Distilling Model Knowledge. MSc by Research Thesis, Centre for Doctoral Training in Data Science, University of Edinburgh. 2015.
pdf poster bibtex code

G. Papamakarios. Robust Low-Rank Modelling on Matrices and Tensors. MSc Thesis, Department of Computing, Imperial College London. 2014.
pdf slides bibtex

G. Papamakarios and G. Rizos. FLCC: A Library for Fast Computation of Convolution and Local Correlation Coefficients. MEng Thesis, Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki. 2011.
pdf bibtex code

Work Address
Room 2.25, Informatics Forum
University of Edinburgh
10 Crichton Street, EH8 9AB
Edinburgh, UK