Dr. Beatrice Alex

Research Fellow in Text Mining

Contact Details:
University of Edinburgh
School of Informatics
10 Crichton Street, Room 4.38
Edinburgh, EH8 9AB, UK
balex@staffmail.ed.ac.uk
Tel: +44 (131) 650 2684

My Research

My research is on text mining for different domains and on speech transcript analytics. I am also working on text mining methods for humanities and social science use cases, as well as for electronic healthcare records.

I am currently collaborating as joint co-investigator with medical historian Dr. Lukas Engelmann on mining historical plague reports written about the third global plague epidemic.

I also lead a Turing project on large-scale and robust text mining methods for radiology reports. This follows on from an ongoing collaboration with Dr. William Whiteley and other clinicians and scientists at the Centre for Clinical Brain Sciences, in which we develop text mining methods that identify stroke-specific disease observations in brain imaging reports.

I previously worked on geoparsing of literary text in the Palimpsest project, one of the Digital Transformations in the Arts and Humanities, Big Data projects funded by AHRC. Palimpsest's main goal was to mine and geo-reference Edinburgh's literature. It was a collaboration with English Literature scholars and visualisation experts. Our aim was to adapt the Edinburgh Geoparser to do fine-grained geo-referencing (at the street and building level) using an Edinburgh gazetteer which we are in the process of aggregating. We also analysed the context of geo-referenced locations in text in order to visualise Edinburgh's literature in different ways. The resulting web interface is called LitLong and can be found at www.litlong.org

In the past, I also worked on the following projects:

I hold a PhD in Computational Linguistics from the University of Edinburgh (ESRC and Edinburgh-Stanford-Link-funded). My PhD thesis is on the automatic detection of foreign inclusions in text. It involved interfacing an English inclusion classifier with a statistical parser in order to improve parsing performance (EMNLP 2007).

During my PhD, I was also involved in the following projects:

  • SEER - Machine learning of entity recognisers for modular retargetable natural language processing
  • SUM - The use of rhetorical and discourse structure information for the generation of flexible, high-compression summaries in the legal domain
  • CROSSMARC - CROSS-lingual Multi-Agent Retail Comparison