Postdoctoral position on graph databases and provenance management

Project: ADAPT: A Diagnostics Approach to Advanced Persistent Threat Prevention
Supervisor: James Cheney
Deadline: February 12, 2016, 5pm GMT

We have an opening for a Research Associate in Graph Data Management on the project "ADAPT: A Diagnostics Approach to Advanced Persistent Threat Prevention", part of the "Transparent Computing" programme funded by the US Defense Advanced Research Projects Agency (DARPA).

Project description

"ADAPT: A Diagnostics Approach to Advanced Persistent Threat Prevention" is part of the "Transparent Computing" programme funded by the US Defense Advanced Research Projects Agency (DARPA). Transparent Computing (TC) is a $60 million research initiative to use provenance to improve the security of critical systems in the face of advanced persistent threats (APTs): attackers who gradually infiltrate a system in order to achieve long-term (and often highly damaging) objectives.

ADAPT is one of several TC-funded research projects working together to instrument mainstream systems to collect provenance, manage and analyse the resulting massive amounts of provenance graph data, and diagnose or identify potential attacks and attackers. ADAPT is a joint project between Galois Inc., Xerox PARC, Oregon State University, and the University of Edinburgh. The University of Edinburgh team is focusing on applying provenance and database expertise to support the provenance graph queries and segmentation/feature extraction needed by the normalcy detection, classification, and diagnosis techniques provided by the other ADAPT partners.

About the position

This position is based at the University of Edinburgh. You will be expected to take a leading role in investigating graph database techniques applied to provenance management problems in the ADAPT project. In particular, you will be expected to contribute substantially to the successful completion of one or more of the following project tasks:

  • Specify and implement techniques for identifying segments or features in provenance graphs
  • Analyse and improve the performance of such queries or extraction techniques on large-scale provenance graph data
  • If necessary, develop new query language features or optimisations tailored to the provenance graph queries needed by other parts of the project
  • Investigate incremental or stream-based techniques for extracting needed data from provenance graphs
  • Develop and implement abstractions for hierarchical modelling and causal linking of activities in provenance graphs, as needed by other parts of the project

The position will require system development as part of an international research project, as well as independent, ideas-led research. You will be expected to work effectively with other researchers to produce prototypes, production-quality systems, high-quality publications and demonstrations, and to contribute to dissemination activities for the project, for example by participating in project meetings and publishing papers in top conferences and journals. Duties will also include intermittent travel to project meetings.

Background required

The successful candidate will have expertise in databases, particularly graph data management, provenance management, query languages and optimisation, or incremental computation. The emphasis of this position is on systems-oriented research and development, so experience with database systems implementation is essential; programming languages preferred for the project include Haskell, Scala, and Python. Familiarity with container technology (Docker), message queues (Apache Kafka), graph databases (Titan/Cassandra; Gremlin), or provenance querying and standards (e.g. W3C PROV) would be especially advantageous.

Applicants must, at a minimum, have a PhD degree (or be close to completion) in Computer Science, with either a track record of high quality publications or industrial experience adequate to the needs of the project. A strong background in graph databases/systems (or the ability to learn new systems quickly) is required. We expect that the project will involve practical systems development informed by conceptual or foundational research, so an ideal candidate will have strong development skills and the ability to engage with theory. Previous research experience on provenance or related topics such as machine learning/classification and information flow security would be desirable.

Please ensure that your application includes:

  • a CV listing relevant education, research experience and publications.
  • a 1-2 page statement of your research interests and how they relate to this position.

Applications that do not include these documents may not receive full consideration.

Duration and starting date

The postdoctoral position is available for 18 months starting on or as soon as possible after March 1, 2016.

The Transparent Computing programme as a whole will run from July 2015 until June 2019. This postdoctoral position may be extended beyond the initial 18 months, contingent on the availability of funding.

Prospective applicants are encouraged to contact James Cheney before applying to discuss the position.

Application process and deadlines

A complete application consists of a CV and a 1-2 page research statement summarizing your background, previous research experience, and how they relate to this position.

Applications must be submitted by 5pm GMT on February 12, 2016, through the University of Edinburgh recruitment site:

Reference number: 035230


Interviews will likely be held (either in person or via Skype) in mid-February.

The Team

The ADAPT project is a joint effort between four sites, with expertise ranging from databases and provenance management to classification, normalcy detection and model-based diagnosis:

  • Galois Inc., led by David Archer, David Burke and Rogan Creswick
  • Oregon State University, led by Alan Fern
  • Xerox PARC, led by Johan de Kleer and Hoda Eldardiry
  • The University of Edinburgh, led by James Cheney

This postdoctoral position will be under the supervision of the Edinburgh PI, Dr. James Cheney, whose group currently includes one postdoctoral researcher (Dr. Wilmer Ricciotti) and three PhD students, all working on topics involving provenance, programming languages, security, and databases. The current team is supported by funding from AFOSR, Microsoft Research, Google, the Royal Society, and the European Union. The successful candidate will also have the opportunity to supervise MSc or PhD students and to collaborate with other world-leading experts on data provenance, graph databases, and data integration as part of the Edinburgh Database Group.


The University of Edinburgh School of Informatics brings together world-class research groups in theoretical computer science, artificial intelligence and cognitive science. The School led the UK 2014 REF rankings in volume of internationally recognized or internationally excellent research. In 2013, the School of Informatics received an Athena SWAN Silver Award in recognition of its commitment to advancing the careers of women in science, technology, engineering, mathematics and medicine (STEMM) in higher education and research. Overall, the University of Edinburgh has achieved a Silver Award.

The Laboratory for Foundations of Computer Science (LFCS), established by Burstall, Milner and Plotkin in 1986, is recognized worldwide for groundbreaking research on topics in programming languages, semantics, type theory, proof theory, algorithms and complexity, databases, security, and systems biology. Formal aspects of databases, XML and provenance (Libkin, Fan, Buneman), language-based security (Aspinall, Stark, Gordon), and Web programming languages (Wadler) are active areas of investigation in LFCS complementary to this project.

The Edinburgh Database Group is part of the Laboratory for the Foundations of Computer Science and includes six faculty members, five postdoctoral researchers, and six PhD students. Interests of the group span all aspects of database systems and theory. Topics of current interest include graph databases, XML, data integration, novel approaches to query processing and storage, data provenance, archiving and annotation. Many of these topics are relevant to scientific data management, an area in which Edinburgh has unique strengths.

For more information about studying in Edinburgh and about the School of Informatics, see the School's web pages.

Last modified: Thu Jan 14 11:13:33 GMT 2016