I'm a final-year PhD student at the School of Informatics, University of Edinburgh, supervised by Iain Murray. I'm a member of the Centre for Doctoral Training in Data Science and co-funded by Microsoft Research.
I'm interested in probabilistic approaches to machine learning. Currently my work focuses on deep learning methods for density estimation and Bayesian inference. Previously, I studied Electrical and Computer Engineering at the Aristotle University of Thessaloniki and Advanced Computing at Imperial College London.
MSc by Research in Data Science, University of Edinburgh.
Grade 92%, with Distinction. Won the MSc by Research in Data Science Class Prize.
MSc in Advanced Computing, Imperial College London.
Grade 90%, with Distinction. Won the Corporate Partnership Programme Award for Academic Excellence and the Winton Capital Applied Computing MSc Project Prize.
MEng in Electrical and Computer Engineering, Aristotle University of Thessaloniki.
Grade 89.6%, with Distinction.
Teaching assistant, University of Edinburgh.
I've tutored and/or marked the following courses: Machine Learning & Pattern Recognition; Introductory Applied Machine Learning; Probabilistic Modelling & Reasoning; Informatics 2B - Algorithms, Data Structures & Learning; Introduction to Theoretical Computer Science.
Research assistant, Information Technologies Institute, Centre for Research & Technology Hellas.
I've participated in the EU-funded project Adapt4EE and the Greek-funded project EnNoisis. Most of my work focused on automatic activity recognition in smart homes with ambient sensors and Kinect cameras, which involved quite a lot of machine learning and computer vision.
Research assistant, Aristotle University of Thessaloniki.
I've participated in the EU-funded project AutoGPU, where I developed software for fast parallel low-level image processing on GPUs. I was writing a lot of CUDA back then.
Sequential Neural Likelihood is a fast and robust algorithm for inference in simulator models, which are models we can simulate but whose likelihood we can't compute. SNL works by training a Masked Autoregressive Flow on simulated data to learn the simulator model's intractable likelihood. During training, preliminary fits to the likelihood are used to suggest which simulations to run next, which reduces the total number of simulations dramatically. SNL brings together ideas from likelihood-free inference and neural density estimation, and it's a more robust alternative to related methods that learn the posterior directly.
Autoregressive models and normalizing flows are types of neural networks that achieve state-of-the-art performance in density estimation. We developed Masked Autoregressive Flow, which is a normalizing flow whose layers are autoregressive models. MAF is obtained by stacking together a number of MADEs, such that each MADE models the random numbers that drive the next MADE in the stack. MAF has close connections to Inverse Autoregressive Flow and RealNVP, and yields state-of-the-art performance in several general-purpose density estimation tasks.
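As a rough illustration of the stacking idea above (not the code from the paper), here is a minimal NumPy sketch of an affine autoregressive flow. It assumes each layer is a linear "MADE" whose masking is done with strictly lower-triangular weight matrices, whereas a real MADE uses a masked neural network; the function names and the weight matrices W_mu and W_alpha are hypothetical. Each layer maps data to the noise that drives it, and that noise is passed on as the data of the next layer.

```python
import numpy as np

def made_params(x, W_mu, W_alpha):
    # Strictly lower-triangular weights: output i depends only on x[..., :i].
    mu = x @ np.tril(W_mu, k=-1).T
    alpha = x @ np.tril(W_alpha, k=-1).T
    return mu, alpha

def layer_forward(x, W_mu, W_alpha):
    """Data -> noise in one pass. Returns the noise u and log|det du/dx|."""
    mu, alpha = made_params(x, W_mu, W_alpha)
    u = (x - mu) * np.exp(-alpha)
    return u, -alpha.sum(axis=-1)

def layer_inverse(u, W_mu, W_alpha):
    """Noise -> data, generated one dimension at a time."""
    x = np.zeros_like(u)
    for i in range(u.shape[-1]):
        mu, alpha = made_params(x, W_mu, W_alpha)
        x[..., i] = u[..., i] * np.exp(alpha[..., i]) + mu[..., i]
    return x

def log_prob(x, layers):
    """Log density of x under a stack of layers with a standard normal base."""
    total = np.zeros(x.shape[0])
    for W_mu, W_alpha in layers:
        # The noise of this layer becomes the data modelled by the next one.
        x, logdet = layer_forward(x, W_mu, W_alpha)
        total += logdet
    dim = x.shape[-1]
    total += -0.5 * (x ** 2).sum(axis=-1) - 0.5 * dim * np.log(2 * np.pi)
    return total
```

Evaluating the density takes a single pass per layer, whereas sampling through layer_inverse is sequential over dimensions. The sketch keeps the same variable ordering in every layer for simplicity.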
Suppose we have a probabilistic model which we can simulate forward to generate data from, but whose likelihood we can't evaluate. How can we do Bayesian inference in such a model? We propose using simulated data from the model to train a Bayesian neural network to return the intractable posterior. By using preliminary fits to the posterior to guide future simulations, we can dramatically speed up the process. Our approach improves over the state of the art in likelihood-free inference in three ways: (a) it targets the exact posterior, (b) it represents the posterior parametrically, and (c) it significantly reduces the number of required simulations.
In machine learning, many good models are large, expensive or intractable. Knowledge distillation is the idea of training a convenient model to mimic a good but cumbersome model. This way we obtain the good performance of the original model in a much more convenient compact form. We apply this idea in: (a) model compression, where we compress large discriminative models, such as ensembles of neural nets, into models of much smaller size; (b) Bayesian inference, where we distil streams of MCMC samples into closed-form predictive distributions; (c) intractable generative models, where we distil unnormalizable models such as RBMs into tractable models such as NADEs.
When represented as matrices, real-world data often have a low-rank structure, whereas corruptions are often sparse. Based on this observation, several optimization-based algorithms that aim to separate the low-rank component from the sparse component have been developed. In this work, we make three contributions in the area of robust low-rank modelling: (a) we review and compare existing matrix-based methods; (b) we extend matrix-based methods to tensors, introducing several tensor-based algorithms; (c) we apply both matrix-based and tensor-based algorithms in practical computer vision tasks.
You can read more in the relevant MSc thesis. This thesis won the Winton Capital Applied Computing MSc Project Prize.
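For concreteness, a widely used matrix-based formulation of the low-rank plus sparse separation described above is principal component pursuit. It is given here only as a representative example: the symbols X (observed data matrix), L (low-rank component), S (sparse component) and λ (trade-off weight) are notation introduced for this sketch, and the methods in the thesis may be formulated differently.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad X = L + S
```

Here the nuclear norm (the sum of the singular values of L) acts as a convex surrogate for rank, and the elementwise l1 norm of S acts as a convex surrogate for sparsity.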
Stochastic gradient-based optimization algorithms have become the standard method for training machine learning models such as neural nets, due to their good scalability to large datasets. Nevertheless, standard stochastic gradient descent has a slower theoretical convergence rate compared to batch gradient descent. Semi-stochastic algorithms, such as S2GD and SAG, combine fast convergence with scalability. In this project, we compare the performance of semi-stochastic methods to standard stochastic and batch methods in convex machine learning problems. We find that semi-stochastic methods indeed converge to the optimum much faster, but this doesn't necessarily translate to better generalization performance.
This is a small project I did with Peter Richtárik. You can read more in the technical report. MATLAB code with my implementation of the algorithms and scripts to reproduce the experiments can be found here.
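To give a flavour of how a semi-stochastic method differs from plain SGD, here is a minimal NumPy sketch on a least-squares problem. It is not the MATLAB code mentioned above: the function names are hypothetical, and it uses a fixed number of inner steps per epoch in the style of SVRG, whereas S2GD draws the number of inner steps at random. Each epoch computes one full gradient at a snapshot and then takes cheap stochastic steps that are corrected using that full gradient.

```python
import numpy as np

def grad_i(w, A, b, i):
    # Gradient of the i-th loss term 0.5 * (A[i] @ w - b[i]) ** 2.
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w, A, b):
    # Gradient of the average loss over all data points.
    return A.T @ (A @ w - b) / len(b)

def semi_stochastic_gd(A, b, step, epochs, inner_steps, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(A.shape[1])
    for _ in range(epochs):
        snapshot = w.copy()
        g = full_grad(snapshot, A, b)  # one full pass over the data
        for _ in range(inner_steps):
            i = rng.integers(len(b))
            # Stochastic gradient, corrected by the snapshot to reduce variance.
            correction = grad_i(snapshot, A, b, i) - g
            w = w - step * (grad_i(w, A, b, i) - correction)
    return w
```

The corrected direction is still an unbiased estimate of the full gradient, but its variance shrinks as both w and the snapshot approach the optimum, which is what allows a constant step size.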
Convolution and correlation, with or without local normalization, are fundamental low-level operations in image processing applications. In this project, we develop algorithms and software for their fast computation, based on (a) use of the Fourier domain for large templates, and (b) parallelization on GPUs. We've developed the FLCC Library, a software tool which automatically determines which algorithm/platform combination is fastest for the particular problem at hand and then executes it appropriately.
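The Fourier-domain idea in (a) can be illustrated with a small NumPy sketch on the CPU. This is not the FLCC Library's code or API: the function names are hypothetical, and local normalization and GPU execution are left out. It computes the same unnormalized correlation of a template with an image in two ways, directly and via the FFT.

```python
import numpy as np

def correlate_direct(image, template):
    """Slide the template over the image; cost grows with template size."""
    H, W = image.shape
    h, w = template.shape
    out = np.empty((H - h + 1, W - w + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + h, c:c + w] * template)
    return out

def correlate_fft(image, template):
    """Same result via the FFT; cost is independent of template size."""
    h, w = template.shape
    # Correlation equals convolution with the flipped template.
    F_img = np.fft.rfft2(image)
    F_tpl = np.fft.rfft2(template[::-1, ::-1], s=image.shape)
    full = np.fft.irfft2(F_img * F_tpl, s=image.shape)
    # Discard positions affected by circular wrap-around.
    return full[h - 1:, w - 1:]
```

The direct version costs on the order of H·W·h·w operations, while the FFT version costs on the order of H·W·log(H·W) regardless of the template size, which is why the Fourier domain pays off for large templates and the direct method can win for small ones.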
Robust low-rank tensor modelling using Tucker and CP decomposition.
In Proceedings of the 25th European Signal Processing Conference, pages 1185–1189. 2017.
Synthetic Ground Truth Data Generation for Automatic Trajectory-based ADL Detection.
In Proceedings of the IEEE-EMBS International Conference on Biomedical and Health Informatics, pages 33–36. 2014.
N. P. Pitsianis,
Fast Computation of Local Correlation Coefficients on Graphics Processing Units.
In Proceedings of SPIE, volume 7444, pages 744412–744412-8. 2009.
A Tool to Monitor and Support Physical Exercise Interventions for MCI and AD Patients.
2nd Patient Rehabilitation Techniques Workshop at the 8th International Conference on Pervasive Computing Technologies for Healthcare, 2014.
Human Computer Confluence in the Smart Home Paradigm: Detecting Human States and Behaviours for 24/7 Support of Mild-Cognitive Impairments.
In Human Computer Confluence: Transforming Human Experience Through Symbiotic Technologies, chapter 16, pages 275–293, De Gruyter Open, 2016.
FLCC: A Library for Fast Computation of Convolution and Local Correlation Coefficients.
MEng Thesis, Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki. 2011.