Shay Cohen

Institute for Language, Cognition and Computation
School of Informatics
University of Edinburgh

"The problem with the Internet is that nobody can confirm the information there is correct." -- Carl Friedrich Gauss

Prospective students who might be interested in working with me: please see the note here.

About me

My broad interests are at the intersection of computational linguistics and statistical learning. I am most interested in predicting structure from text. Such structure underlies natural language at all its levels, from discourse through semantics to syntax.

My current interests, all interacting at some level, are: (1) the use of linear algebra for learning statistical models, especially those with hidden variables that are not observed in the data; (2) semantic representations, especially abstract meaning representation and discourse representation structures, and their applications such as document summarization. In my work I use various statistical learning algorithms and language formalisms, including neural networks, spectral methods, probabilistic grammars and others.

Click here for a bio.

Jiangming's new document-level DRS parser (ACL 2019) is available as a demo here.

Slides from the Semantic Representation Learning workshop at Cardiff are available here.

A demo of Marco's AMR-to-text generator from NAACL 2019 is available here.

A second edition of the book Bayesian Analysis in Natural Language Processing, with a chapter on neural networks and representation learning, is out (website, hardcopy on Amazon).

A demo of Jiangming's DRS parser from ACL 2018 is available here.

A demo of our XSum abstractive summarization system from EMNLP 2018 is available here.

A demo of our reinforcement learning extractive summarization system from NAACL 2018 is available here.

Our work on crime drama in the news: BBC, New Scientist, Scottish Legal, The Register, The Telegraph, Daily Mail, The Scotsman, Scottish Daily Mail, Digital Trends.

We released the Rainbow Parser, a parser with spectral learning algorithms (and EM) for latent-variable PCFGs. It is on github.

The slides from my Mathematics of Language 2017 talk are available here.

A new book about Bayesian Analysis in Natural Language Processing is out (website, hardcopy on Amazon).

Marco has developed a new AMR parser called AMREager and a new set of evaluation metrics for AMR.

Teaching

  • Accelerated Natural Language Processing (ANLP). Course website (Autumn 2019; Autumn 2020).
  • Foundations of Natural Language Processing (FNLP). Course website (Spring 2020).
  • Processing Formal and Natural Languages (INF2A). Course website (Autumn 2018; Autumn 2017; Autumn 2015).
  • Topics in Natural Language Processing (INFR11113). Course website (Spring 2015; Spring 2016; Spring 2017; Spring 2018). Course on PATH. Click here for a synopsis.
  • Lecture on linear classification at the Lisbon Machine Learning School (LxMLS), July 2015.
  • A tutorial on spectral learning algorithms for NLP (NAACL, 2013). A similar tutorial with overlapping material was given at CMU (June, 2014).
  • Seminar at Columbia - Bayesian analysis for NLP (Spring, 2013).
  • A course at IBM about Probability and Structure in NLP (May, 2011).
Students and Post-docs
  • Esma Balkır (PhD student, 2016-)
  • Ronald Cardenas (PhD student, 2019-)
  • Javad Hosseini (PhD student, 2017-; co-advised with Mark Steedman)
  • Zheng Zhao (PhD student, 2020-; co-advised with Bonnie Webber)
  • Chunchuan Lyu (PhD student, 2017-21; co-advised with Ivan Titov)
  • Jiangming Liu (PhD student, 2017-21; co-advised with Mirella Lapata)
  • Emily Scher (PhD student, 2017-20; co-advised with Guido Sanguinetti; now postdoc at the University of Edinburgh)
  • Matthieu Labeau (postdoc, 2018-19; now Assistant Professor at Telecom Paris)
  • Maximin Coavoux (postdoc, 2018-19; Research Scientist at Naver Labs, then CNRS Researcher)
  • Shashi Narayan (postdoc, 2014-19; now Research Scientist at Google)
  • John Torr (PhD student, 2015-19; co-advised with Mark Steedman; now Research Scientist at Wluper)
  • Marco Damonte (PhD student, 2015-19; now Researcher at Amazon)
  • Nikos Papasarantopoulos (PhD student, 2016-19)
  • Joana Ribeiro (MPhil student, 2015-19)
Here is a group picture from summer 2020, a slightly unusual one.

Here is an older picture from summer 2019.

Here is an older picture from summer 2018.

Here is an older picture from summer 2017.

Here is an older picture from summer 2016.

Here is an even older picture from summer 2015.

Here is a link to our group page that includes code, project and demo pages.

Code and Data
Contact information

scohen [strudel]

10 Crichton Street
Informatics Forum 4.26
Edinburgh EH8 9AB
United Kingdom

Phone: +44 (0) 131 650 6542