Informatics, Edinburgh. Machine Learning PhD Scholarship: Better Sample Efficiency in Deep Reinforcement Learning

Supervisors: Amos Storkey (School of Informatics, University of Edinburgh), Kamil Ciosek (Microsoft Research, Cambridge)

New advert date 10 Oct 2020. Applying/contacting earlier is advantageous.

A PhD scholarship is available for a student based in the School of Informatics, University of Edinburgh for a PhD position starting September 2020 or later. This studentship is a PhD Scholarship from Microsoft Research, Cambridge. It will fund full fees and stipend for a UK or EU student. It may be possible to fund a non-EU student if other funds can be found to cover the difference.

To apply

Informal contact can be made by emailing amos+msrphd2020@inf.ed.ac.uk with a CV, transcripts and an outline research proposal. Email responses are likely to be subject to some delay, but it can be helpful to make informal contact. Please follow up with a second email if you have not heard back within a week.

A formal application can be made by selecting PhD Informatics: ANC: Machine Learning, Computational Neuroscience, Computational Biology - 3 Years (Full-time) on the right-hand side of the following page:

Informatics:ANC:PhD Applications form

Please ensure you state “Amos Storkey (MSR Scholarship)” under the Potential Supervisor(s) section of the form. Please also email me to state that you have applied. Note an application is not marked as complete until all parts are in (including both references), so please push referees to provide the references promptly, and remind them until they confirm they have submitted.

More about the PhD

Deep reinforcement learning has had huge empirical success and is a major enabling technology for many applications of AI. However, recent RL algorithms still require millions of samples to obtain good performance. Since obtaining environment interactions is often costly and since challenging environments are rarely static, this inhibits many practical applications. This project will investigate ways of reducing this cost, aiming to find more sample-efficient RL algorithms. We aim for the algorithms to be deployable in realistic settings, where agents use deep networks to represent knowledge about the environment. Improving the sample efficiency of RL has immediate applications to Microsoft’s efforts in applying RL to games. It is also likely to lead to improved performance of other systems making automated decisions.

Candidates for this post must have evidence of excellent capability in mathematics and in programming (but the previous degree could be from one of many disciplines), and must be able to show a solid understanding of the fundamentals of machine learning across the breadth of the field. Experience of programming in PyTorch is desirable, as is some training in reinforcement learning. We are looking for candidates with the potential and drive to excel and be a future research leader in this field.

More about the PhD Supervisors

Amos Storkey is Professor of Machine Learning and Artificial Intelligence at the University of Edinburgh and has been involved in research in neural networks and practical Bayesian algorithms for over 20 years. Previously, together with Marc Toussaint, he formulated the variable horizon MDP and POMDP planning problem as an instantiation of probabilistic inference. He was responsible for the first application of convolutional neural networks for playing Go, work which was included in the open-source Fuego implementation. More recently his team, along with researchers from OpenAI and Berkeley (Burda et al., 2018), obtained state-of-the-art sample efficiency in Montezuma's Revenge, an RL benchmark notorious for difficulty of exploration. In another paper, a substantial evaluation of exploration-driven RL has clarified the importance (and more to the point, unimportance) of certain design elements in exploration-driven learning. Amos has a long history of successful PhD projects. He has been primary or acting supervisor for 30 PhD students, and has a long history of prioritising student supervision.

Kamil Ciosek is a researcher at MSR Cambridge and focuses on Reinforcement Learning, particularly exploration. His recent work includes a new actor-critic method (Ciosek et al., 2019) and work on regularization in RL (Igl et al., 2019). He also has experience in actor-critic methods for reinforcement learning (Ciosek and Whiteson, 2018; Fellows et al., 2018). He is also interested in the applications of RL, particularly how it can be applied in games. More broadly, the Malmo project at MSR Cambridge is an important hub of Reinforcement Learning research within Microsoft. The lab itself has had strong Bayesian credentials since its inception and is involved in recent innovation in uncertainty estimation.

More about the School of Informatics

The PhD studentship will be located in the School of Informatics, University of Edinburgh.

The School of Informatics at the University of Edinburgh is unique in its Machine Learning and AI research and provision in Europe. The school has over 100 and over 400 PhD students. Researching in AI continuously since 1963, it is now home to over 60 academic staff across the breadth of AI and 4 Centres for Doctoral Training in AI. Its AI staff have received over 40 top awards and fellowships. It has a long-standing relationship with Microsoft Research, Cambridge. The student will join an immediate team of 15 researchers, and have access to a broader set of more than 50 researchers in related fields. The student will receive active hands-on supervision from both supervisors.

The immediate research group has more than 40 GPUs directly available just to the group, and the school as a whole has 600 GPUs available for student use; this is in addition to any compute made available through this project.

More about the funding

Full fees and stipend funding is available for a UK or EU student. It may be possible to obtain funding for a non-EU student, but that is subject to us acquiring other funds. The funding includes substantial resource for conference travel, laptop and other facilities. There will be an opportunity to apply for an internship at Microsoft Research, Cambridge, though that will be subject to the usual recruitment process.

Equality and Diversity

We value diversity and inclusiveness and believe that maximising the contribution of every individual enables us all. Whilst welcoming and supporting freedom of thought and expression, we also seek to embed a culture where all students and staff are treated with respect and feel safe and fulfilled within our community. The School holds a Silver Athena SWAN award, in recognition of its commitment to advancing the representation of women in science, mathematics, engineering and technology.

Research Group

Further information on projects, my research group, publications and a research blog can be found on the Bayeswatch group website:

www.bayeswatch.com