Recommender system for an interactive theorem prover

Proposer: Alan Bundy, 502716, A.Bundy@ed.ac.uk

Self-Proposed: No

Supervisor: Alan Bundy, 502716, A.Bundy@ed.ac.uk

Other Suggested Supervisors: Hazel Duncan

Subject Areas: Automated Reasoning/Theorem Proving, Machine Learning/Neural Networks/Connectionist Computing,

Suitable for the following degrees: BSc in Artificial Intelligence and Computer Science, BSc in Artificial Intelligence and Mathematics, BEng in Artificial Intelligence and Software Engineering, BSc in Computer Science, BEng in Computer Science, BEng in Computer Science and Electronics, BSc in Computer Science and Management Science, BSc in Computer Science and Mathematics, BSc in Computer Science and Physics, BEng in Electronics and Software Engineering, BEng in Software Engineering,

Principal goal of the project: To incorporate a recommendation option into the interactive theorem prover Isabelle, that will use an analysis of previous proofs to suggest possible proofs steps.

Description of the project:

For her PhD, Hazel Duncan has used data-mining and genetic programming techniques to extract proof tactics from large corpora of proofs. A tactic is a program that guides a theorem prover by specifying which proof steps it should make, based on an analysis of the current goal(s). In Duncan's project, the proofs were interactively constructed by previous users of the Isabelle interactive theorem prover. Data-mining techniques, such as Variable Length Markov Models, were first used to identify common sequences of proof steps. Genetic programming was then used to evolve tactics from these common sequences, e.g. by incorporating branching and iteration. The extracted tactics were tested to show that they found proofs with less redundant search than brute-force rule application.

The results of the data-mining phase of Duncan's program could be used to make a recommender system for Isabelle. The common sequences are constructed by calculating the probability of a proof step following from a previous sequence of steps. This probabilistic knowledge could be used directly, rather than as part of automatic tactic construction. Suppose that an Isabelle user has just taken steps A and B. A recommender system could look-up any common sequences it had stored in which A and B were the initial steps. It could then return with recommendations of the form "60% of previous users who made steps A and B then made step C". This recommender system should be implemented, reusing Duncan's code where possible, and then evaluated to see if Isabelle users found it useful or irritating, and how often they took its advice.

Resources Required: Access to the Isabelle system and corpora of proofs developed with it. Access to Hazel Duncan's PhD program.

Degree of Difficulty: A simple implementation is fairly straightforward, but there are lots of interesting optional extensions to challenge a gifted student.

Background Needed: Some knowledge of automated theorem proving and machine learning techniques is highly desirable, eg attendence at the Automated Reasoning and Learning from Data modules.

References: