
Dr. Beatrice Alex

Research Fellow in Text Mining

Bea Alex

Contact Details:
University of Edinburgh
School of Informatics
10 Crichton Street, Room 4.38
Edinburgh, EH8 9AB, UK
Tel: +44 (131) 650 2684

My Research

I'm a Research Fellow at the Institute for Language, Cognition and Computation (ILCC) in the School of Informatics at the University of Edinburgh and a Turing Fellow of the Alan Turing Institute. My research interests are in text mining for different domains and speech transcript analytics. I am currently pursuing research on text mining news broadcasts in collaboration with the British Library as part of my Turing fellowship, and on conversational speech in collaboration with Toyota Motor Europe. I plan to build on this line of work with the aim of making speech archives more accessible. I also work on text mining methods for electronic healthcare records with scientists and clinicians in Edinburgh.

I was a lead researcher on the Palimpsest project, one of the Digital Transformations in the Arts and Humanities, Big Data projects funded by AHRC. Palimpsest's main goal was to mine and geo-reference Edinburgh's literature. It was a collaboration with English Literature scholars and visualisation experts. Our aim was to adapt the Edinburgh Geoparser to do fine-grained geo-referencing (at the street and building level) using an Edinburgh gazetteer which we are in the process of aggregating. We also analysed the context of geo-referenced locations in text in order to visualise Edinburgh's literature in different ways. The web interface to this work is called LitLong.

I was also involved in a collaboration with William Whiteley on text mining brain image reports for disease information (Targeted treatment for acute stroke: development of prognostic models & decision support tools, MRC).

In the past, I worked on the following projects:

I hold a PhD in Computational Linguistics from the University of Edinburgh (funded by ESRC and the Edinburgh-Stanford Link). My PhD thesis is on the automatic detection of foreign inclusions in text. It involved interfacing an English inclusion classifier with a statistical parser in order to improve parsing performance (EMNLP 2007).

During my PhD, I was also involved in the following projects:

  • SEER - Machine learning of entity recognisers for modular retargetable natural language processing
  • SUM - The use of rhetorical and discourse structure information for the generation of flexible, high-compression summaries in the legal domain
  • CROSSMARC - CROSS-lingual Multi-Agent Retail Comparison