Joel Lang and Mirella Lapata. 2010. Unsupervised Induction of Semantic Roles. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 939-947. Los Angeles, CA.

Datasets annotated with semantic roles are an important prerequisite to developing high-performance role labeling systems. Unfortunately, the reliance on manual annotations, which are both difficult and highly expensive to produce, presents a major obstacle to the widespread application of these systems across different languages and text genres. In this paper we describe a method for inducing the semantic roles of verbal arguments directly from unannotated text. We formulate the role induction problem as one of detecting alternations and finding a canonical syntactic form for them. Both steps are implemented in a novel probabilistic model, a latent-variable variant of the logistic classifier. Our method increases the purity of the induced role clusters by a wide margin over a strong baseline.


@InProceedings{lang-lapata:2010:NAACLHLT,
  author    = {Lang, Joel and Lapata, Mirella},
  title     = {Unsupervised Induction of Semantic Roles},
  booktitle = {Human Language Technologies: The 2010 Annual Conference
               of the North American Chapter of the Association for
               Computational Linguistics},
  month     = {June},
  year      = {2010},
  address   = {Los Angeles, California},
  publisher = {Association for Computational Linguistics},
  pages     = {939--947},
  url       = {http://www.aclweb.org/anthology/N10-1137}
}