
Asa Cooper Stickland

About


I am a PhD student in the EPSRC Centre for Doctoral Training in Data Science at the University of Edinburgh, supervised by Iain Murray, with Ivan Titov as my second supervisor.

I'm interested in multi-task learning and continual learning, and I've been working on finding efficient ways to use and train large text encoder models like BERT. Recently I did an internship at Facebook AI, where I mainly worked on using pre-trained models for machine translation. I've previously worked on approximate inference (e.g. variational inference, MCMC, ABC), and I'm excited about Bayesian deep learning.

I did my undergrad at Durham (MPhys in Physics), where my master's project used Bayesian linear regression to find which properties of proteins were most effective at killing bacteria. I did a research internship at Durham doing fluid dynamics simulations, and in summer 2017 I interned at Five AI, a startup developing autonomous vehicles.

CV

Publications


2019

BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

This project was about finding an efficient way to add parameters to a large pre-trained model, BERT, to get good performance on the tasks in the GLUE benchmark: arxiv link. Published at ICML 2019, and featured in the NAACL 2019 transfer learning tutorial.
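In rough outline, the idea is to keep the shared BERT weights fixed across tasks and add a small task-specific attention block that operates in a projected, low-dimensional space alongside each layer. The PyTorch sketch below illustrates that idea; the class name, dimensions, and the residual-style way the output is added back are illustrative choices, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class ProjectedAttentionLayer(nn.Module):
    """Sketch of a PAL: a small multi-head attention block acting in a
    low-dimensional projected space, added alongside a BERT layer.
    All sizes here are illustrative, not the paper's exact settings."""
    def __init__(self, hidden_dim=768, pal_dim=204, num_heads=12):
        super().__init__()
        self.down = nn.Linear(hidden_dim, pal_dim)   # project to small space
        self.attn = nn.MultiheadAttention(pal_dim, num_heads, batch_first=True)
        self.up = nn.Linear(pal_dim, hidden_dim)     # project back up

    def forward(self, hidden_states):
        x = self.down(hidden_states)
        x, _ = self.attn(x, x, x)                    # cheap attention in pal_dim
        return self.up(x)

# Illustrative usage: the PAL output is added to the shared layer's output,
# e.g. layer_output = bert_layer(h) + pal(h), with one PAL per task.
```

Because each task only adds a small PAL while the big encoder is shared, the per-task parameter cost stays low.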

Projects


2017

Intern at Five AI

During my summer internship at Five AI I was tasked with correcting for the motion of the car during a LIDAR sweep. My method was validated by real-world testing with Five AI's prototype car. I used C++ and ROS for this project.
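The underlying issue is that the car moves while the spinning LIDAR collects a sweep, so points captured at different times are expressed in different sensor poses. Below is a simplified NumPy sketch of the de-skewing idea under a constant-velocity, constant-yaw-rate assumption; the actual work was in C++ and ROS, and the function name and conventions here are hypothetical.

```python
import numpy as np

def deskew_sweep(points, timestamps, linear_vel, yaw_rate, t_end):
    """Transform every point into the sensor frame at time t_end.

    points:     (N, 3) array, each point in the sensor frame at its capture time
    timestamps: (N,) capture time of each point
    linear_vel: (3,) car velocity in the sensor frame (assumed constant)
    yaw_rate:   yaw rate in rad/s (assumed constant)
    """
    corrected = np.empty_like(points)
    for i, (p, t) in enumerate(zip(points, timestamps)):
        dt = t_end - t                    # time between capture and sweep end
        yaw = yaw_rate * dt               # yaw accumulated after this point
        c, s = np.cos(yaw), np.sin(yaw)
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        # Undo the translation and rotation the car undergoes after capture.
        corrected[i] = R.T @ (p - linear_vel * dt)
    return corrected
```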

I also worked on a side project attempting to predict depth from pairs of stereo images using convolutional neural networks. I used ROS and OpenCV to preprocess the data, and TensorFlow to predict depth.
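The geometry underneath is standard stereo triangulation: depth is inversely proportional to disparity, depth = f * B / d, for focal length f (in pixels) and camera baseline B (in metres). The toy PyTorch sketch below shows a network in that mould; the original code used TensorFlow, and this architecture and its names are purely illustrative.

```python
import torch
import torch.nn as nn

class StereoDepthNet(nn.Module):
    """Toy CNN mapping a concatenated stereo pair to a disparity map.
    Illustrative only; the internship used TensorFlow and a larger model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),    # 6 = left + right RGB
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Softplus() # disparity >= 0
        )

    def forward(self, left, right):
        return self.net(torch.cat([left, right], dim=1))

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    # Standard pinhole stereo geometry: depth = f * B / disparity.
    return focal_px * baseline_m / (disparity + eps)
```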

Durham Master's Project

The project was to build an algorithm that predicts how effective a peptide-like molecule is at fighting bacteria. I used sparse Bayesian linear regression to handle a problem with high-dimensional data and only a small number of training instances.
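As a minimal sketch of this kind of model, scikit-learn's ARDRegression (automatic relevance determination) gives a sparse Bayesian linear regression: each weight gets its own prior precision, so irrelevant features are driven toward zero. The data below are synthetic and the specific model class is an assumption, not necessarily the project's exact method.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Toy setup mirroring the p >> n regime: many molecular descriptors,
# few training molecules (all data here is synthetic).
rng = np.random.default_rng(0)
n_samples, n_features = 40, 200
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:5] = rng.normal(size=5)            # only a few properties matter
y = X @ true_w + 0.1 * rng.normal(size=n_samples)

# ARD places a separate prior precision on each weight, pruning
# irrelevant coefficients (a sparse Bayesian linear model).
model = ARDRegression().fit(X, y)
relevant = np.argsort(-np.abs(model.coef_))[:5]
print("most relevant feature indices:", relevant)
```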