XML-Based NLP for Analysing and Annotating Medical Language

Grover, C., Klein, E., Lascarides, A. and Lapata, M. [2002] XML-Based NLP for Analysing and Annotating Medical Language, Proceedings of the Second Workshop on NLP and XML (NLPXML-02), Coling 2002, Taipei.

We describe the use of a suite of highly flexible XML-based NLP tools in a project for processing and interpreting text in the medical domain. The main aim of the paper is to demonstrate the central role that XML mark-up and XML NLP tools have played in the analysis process and to describe the resultant annotated corpus of MedLine abstracts. In addition to the XML tools, we have succeeded in integrating a variety of non-XML 'off-the-shelf' NLP tools into our pipelines, so that their output is added into the mark-up. We demonstrate the utility of the resulting annotations in two ways. First, we investigate how they can be used to improve the parse coverage of a hand-crafted grammar that generates logical forms. Second, we investigate how they contribute to automatic lexical semantic acquisition processes.


@inproceedings{grover:etal:2002,
author = {Claire Grover and Ewan Klein and Alex Lascarides and Mirella Lapata},
year = {2002},
title = {{\sc xml}-based {\sc nlp} Tools for Analysing and Annotating Medical Language},
booktitle = {Proceedings of the Second International Workshop on {\sc nlp} and {\sc xml}},
address = {Coling 2002, Taipei, Taiwan}
}