Asa Cooper Stickland


I'm a PhD student in the EPSRC Centre for Doctoral Training in Data Science at the University of Edinburgh. I'm supervised by Iain Murray, with Ivan Titov as my second supervisor.

I'm interested in transfer learning and robustness, particularly for multilingual models, and in parameter-efficient ways to use and train large text encoder models like BERT. I recently did an internship at Facebook AI, where I mainly worked on using pre-trained models for machine translation. I've previously worked on approximate inference (e.g. variational inference, MCMC, ABC), and I'm excited about Bayesian deep learning and, more recently, improving the calibration of neural networks.

I did my undergrad at Durham (MPhys in Physics), where my master's project used Bayesian linear regression to find which properties of proteins were most effective at killing bacteria. I did a research internship at Durham running fluid dynamics simulations, and in summer 2017 I interned at Five AI, a startup building autonomous vehicles.




Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation

Asa Cooper Stickland, Xian Li, Marjan Ghazvininejad

The result of my Facebook AI internship: we examine which parameters to leave frozen when fine-tuning large pre-trained sequence-to-sequence models on machine translation, for both monolingual and multilingual pre-trained models: link. EACL, 2021.


Deep Transformers with Latent Depth

Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong

We model the choice of which transformer layer to use as a latent variable, allowing us to train deeper models and, for example, learn which layers to share between languages in multilingual machine translation: arxiv link. NeurIPS, 2020.

Diverse Ensembles Improve Calibration

Asa Cooper Stickland, Iain Murray

Short paper on whether calibration and accuracy improve when ensembling models trained with a different data augmentation for each member: arxiv link. ICML Workshop on Uncertainty and Robustness in Deep Learning, 2020.
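To illustrate the quantities involved (this is not code from the paper; the data is a made-up binary task), here is a minimal pure-Python sketch of ensembling by averaging member probabilities and measuring expected calibration error (ECE):

```python
def ensemble_probs(member_probs):
    """Average positive-class probabilities across ensemble members."""
    n = len(member_probs)
    return [sum(ps) / n for ps in zip(*member_probs)]

def ece(probs, labels, n_bins=10):
    """Expected calibration error for binary predictions: bin examples by
    confidence max(p, 1-p) and take the size-weighted average of the
    |accuracy - confidence| gap in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p, 1.0 - p)
        correct = int(p >= 0.5) == y
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    total = len(probs)
    err = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(ok for _, ok in b) / len(b)
            err += (len(b) / total) * abs(acc - avg_conf)
    return err

# Toy example: two members (e.g. trained with different augmentations).
member1 = [0.9, 0.8, 0.6, 0.3]
member2 = [0.7, 0.9, 0.4, 0.1]
probs = ensemble_probs([member1, member2])  # ≈ [0.8, 0.85, 0.5, 0.2]
print(ece(probs, [1, 1, 0, 0]))
```

A well-calibrated model would have ECE near zero: among examples predicted with, say, 80% confidence, about 80% would be correct.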


BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

Asa Cooper Stickland, Iain Murray

A project on finding an efficient way to add parameters to a large pre-trained model, BERT, to get good performance on the tasks in the GLUE benchmark: arxiv link. ICML 2019, and featured in the NAACL 2019 transfer learning tutorial.



Intern at Five AI

During my summer internship with Five AI I was tasked with correcting for the motion of the car during a LIDAR sweep. The method was validated by real-world testing on Five AI's prototype car. For this project I used C++ and ROS.

I also worked on a side project attempting to predict depth from pairs of stereo images using convolutional neural networks. I used ROS and OpenCV to preprocess the data, and TensorFlow for depth prediction.

Durham Masters Project

The project was to build an algorithm to predict the ability of a peptide-like molecule to fight bacteria. I used sparse Bayesian linear regression to handle a problem where the data is high-dimensional and there are only a few training instances.
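To show the flavor of the approach, here is a tiny sketch of sparsity-inducing regression on a p > n toy problem. Note this is not the project code, and it uses the lasso (a non-Bayesian method) via coordinate descent as a simpler stand-in for sparse Bayesian regression; both drive most coefficients to (near) zero when features outnumber examples:

```python
def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso: cycle through features, applying
    soft-thresholding to one coefficient at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's contribution removed.
            r = [y[i] - sum(w[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            if z == 0.0:
                continue
            # Soft-threshold: coefficients with weak evidence go to exactly 0.
            if rho > lam:
                w[j] = (rho - lam) / z
            elif rho < -lam:
                w[j] = (rho + lam) / z
            else:
                w[j] = 0.0
    return w

# Toy data: 3 examples, 5 features, y depends only on the first feature.
X = [[1.0, 0.1, 0.2, 0.0, 0.3],
     [2.0, 0.0, 0.1, 0.5, 0.2],
     [3.0, 0.2, 0.0, 0.1, 0.1]]
y = [2.0, 4.0, 6.0]
print(lasso_cd(X, y, lam=0.1))  # only the first coefficient survives
```

The Bayesian version (e.g. automatic relevance determination) achieves a similar effect by learning a per-feature prior variance, which additionally gives uncertainty estimates over the coefficients.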