Claire Grover

Senior Research Fellow in the Language Technology Group, which is part of the Institute for Language, Cognition and Computation (ILCC) in the School of Informatics


Tel: +44 131 650 4441
Fax: +44 131 650 4587



Current and Recent Projects


The Palimpsest project uses natural language processing technology, informed by input from literary scholars, to text-mine literary works set in Edinburgh and to visualise the results in accessible ways.


The focus of the Hiberlink project is to assess the extent of so-called ‘reference rot’. This two-year study investigates how web links in online scientific and other academic articles fail to lead to the resources that were originally referenced.

Trading Consequences

The Trading Consequences project is a multi-institutional, international collaboration between environmental historians in Canada and computer scientists in the UK. It uses text-mining software to explore thousands of pages of historical documents related to international commodity trading in the British Empire during the 19th century, involving Canada in particular, and the impact of that trade on the economy and environment.


A SICSA Smart Tourism project. Botanitours is a service that provides access to information about plants and gardens within a given locality.

DEEP (The Digitisation of English Placenames)

This project aims to digitise the 86 volumes of the Survey of English Place-Names, a county-by-county survey started in 1922 by the specialists of the English Place-Name Society (EPNS). Our role is to convert the semi-structured text into a fully structured resource from which a historical gazetteer is derived for use with the Edinburgh Geoparser as embedded in Unlock Text. A browsable view of the gazetteer can be found at

SYNC3 (Synergetic Content Creation and Communication)

The goal of SYNC3 is to create a framework for structuring the extensive user-provided content that is located in personal blogs and refers to running news issues, making that content more accessible, and enabling its collaborative creation. The project is funded by the European Union's 7th Framework Programme: Information and Communication Technologies (ICT).


GeoDigRef is a short project investigating the advantages of metadata enrichment across three diverse resource collections funded under the JISC Digitisation programme.


Text Mining for Biomedical Content Curation

EASIE (Edinburgh And Stanford Information Extraction)

Combining Shallow Semantics and Domain Knowledge: the project builds on existing techniques for information extraction (IE) in order to develop and implement improved methods for extracting semantic content from text.


Smart Qualitative Data: Methods and Community Tools for Data Mark-Up.

SEER (Stanford Edinburgh Entity Recognition)

Named Entity Recognition with emphasis on bootstrapping, machine learning and porting to new domains (Biomedicine, Astronomy, Archaeology)


Summarisation of Legal Texts

CROSSMARC (CROSS-lingual Multi-Agent retail Comparison)

Information Extraction from web pages

DISP (Data Intensive Semantics and Pragmatics)

Corpus-based lexical semantics in the biomedical domain