Dr. Beatrice Alex

Research Fellow in Text Mining

Contact Details:
University of Edinburgh
School of Informatics
10 Crichton Street, Room 4.38
Edinburgh, EH8 9AB, UK
Tel: +44 (131) 650 2684


I lead the Edinburgh Language Technology Group, a cross-college research group that specialises in research and development of natural language processing and text mining methods with applications in healthcare, medical history, population science and literary studies.

My Research

My research is on text mining for different domains and speech transcript analytics. I am also working on text mining methods for humanities and social science use cases as well as for electronic healthcare records.

I am currently collaborating as joint co-investigator with medical historian Dr. Lukas Engelmann on mining historical plague reports written about the third global plague epidemic.

I also lead a Turing project on large-scale and robust text mining methods for radiology reports. This follows on from an ongoing collaboration with Dr. William Whiteley and other clinicians and scientists at the Centre for Clinical Brain Sciences, developing text mining methods for brain image reports for disease observations specific to stroke.

I was previously working on geoparsing of literary text in the Palimpsest project, one of the Digital Transformations in the Arts and Humanities, Big Data projects funded by AHRC. Palimpsest's main goal was to mine and geo-reference Edinburgh's literature. It was a collaboration with English Literature scholars and visualisation experts. Our aim was to adapt the Edinburgh Geoparser to do fine-grained geo-referencing (at the street and building level) using an Edinburgh gazetteer which we are in the process of aggregating. We also analysed the context of geo-referenced locations in text in order to visualise Edinburgh's literature in different ways. The resulting web interface is called LitLong.

In the past, I also worked on the following projects:

  • BotaniTours, a project on aggregating and mining botanical information (wild plants and gardens) for tourists to the Scottish Borders.
  • Trading Consequences, a Digging Into Data project focussed on historical text mining.
  • TXM and TXV, work on building text mining pipelines for biomedicine and the recruitment sector and user evaluation.

I hold a PhD in Computational Linguistics from the University of Edinburgh (ESRC and Edinburgh-Stanford-Link-funded). My PhD thesis is on the automatic detection of foreign inclusions in text. It involved interfacing an English inclusion classifier with a statistical parser in order to improve parsing performance (EMNLP 2007).

During my PhD, I was also involved in the following projects:

  • SEER - Machine learning of entity recognisers for modular retargetable natural language processing
  • SUM - The use of rhetorical and discourse structure information for the generation of flexible, high-compression summaries in the legal domain
  • CROSSMARC - CROSS-lingual Multi-Agent Retail Comparison