Shay Cohen's homepage

Shay Cohen

Reader
Institute for Language, Cognition and Computation
School of Informatics
University of Edinburgh

Prospective students who might be interested in working with me: please see the note here.

About me

My interests are in the broad intersection of natural language processing and machine learning. I work mainly in the areas of text generation, parsing in its varieties, and representation learning and analysis. I use various tools, including large language models, neural networks, (multi)linear algebra and probabilistic grammars.

Click here for a bio.

How can RNA structure prediction be formalized using good old dependency parsing? Check out Ke's ICLR paper about DEPfold.

How have commercial machine translation systems changed in the six years since 2018? Check out Guojun's EMNLP paper about MT improvements over time.

How can we use concept erasure methods to mitigate hallucination? Check out Yifu's and Zheng's work about spectral editing of activations.

Can context-free grammars define a "language" of neural architectures? Check out Linus' work about einspace, using PCFGs to perform neural architecture search.

How far can we push symbolic modeling to summarize documents? Check out Ronald's work in JAIR that makes use of work in psycholinguistics to summarize scientific papers.

Searching for alternatives to probing? Check out Zheng's and Yftah's work about using matrix factorization to analyze representations in BlackBoxNLP 2022 and EMNLP 2023.

Product and service review language has become more terse and more polarized over the past 20 years. Why, and what does it say about our use of language? LRE paper here.

Slides from talk at the University of Rochester (10 February 2023) and RISE (9 March 2023) available here. Talk here.

Slides from talk at Heidelberg (11 May 2022) available here, and with a bit more recent fun with GPT-3 at ETH AI Center (30 September 2022).

Jiangming's new document-level DRS parser (ACL 2019) is available as a demo here.

Slides from the Semantic Representation Learning workshop at Cardiff are available here.

A demo of Marco's AMR-to-text generator from NAACL 2019 is available here.

A second edition (with a neural networks and representation learning chapter) of the book about Bayesian Analysis in Natural Language Processing is out (website, hardcopy on Amazon).

A demo of Jiangming's DRS parser from ACL 2018 is available here.

A demo of our XSum abstractive summarization system from EMNLP 2018 is available here.

A demo of our reinforcement learning extractive summarization system from NAACL 2018 is available here.

Our work on crime drama in the news: BBC, New Scientist, Scottish Legal, The Register, The Telegraph, Daily Mail, The Scotsman, Scottish Daily Mail, Digital Trends.

We released the Rainbow Parser, a parser with spectral learning algorithms (and EM) for latent-variable PCFGs. It is on GitHub.

The slides from my Mathematics of Language 2017 talk are available here.

A new book about Bayesian Analysis in Natural Language Processing is out (website, hardcopy on Amazon).

Marco has developed a new AMR parser called AMREager and a new set of evaluation metrics for AMR.

Publications

Click here for a list

Teaching

Natural Language Understanding, Generation, and Machine Translation. Course website (Spring 2025; Spring 2024).
Accelerated Natural Language Processing (ANLP). Course website (Autumn 2019; Autumn 2020; Autumn 2021; Autumn 2023).
Foundations of Natural Language Processing (FNLP). Course website (Spring 2020).
Processing Formal and Natural Languages (INF2A). Course website (Autumn 2018; Autumn 2017; Autumn 2015).
Topics in Natural Language Processing (INFR11113). Course website (Spring 2015; Spring 2016; Spring 2017; Spring 2018). Course on PATH. Click here for a synopsis.

Natural language processing is an application area in computer science, heavily supported by industry, with new applications emerging constantly. The goal of this course is to offer a different angle on natural language processing. We will explore basic concepts in computer science, machine learning, and statistics that make natural language processing such a rich area of research. You will learn how to apply generic methods to the specific problems you need to address in order to make use of natural language. As such, we will take a method-oriented view of NLP instead of an application-oriented one.

Topics we will discuss include: basic probability and statistics used in NLP, structured prediction with log-linear models, Bayesian inference, finite state transducers, context-free grammars and other constructs, latent-variable modeling, and basic concepts in learning theory.

Hopefully, after taking the class, when using a generic NLP tool such as a part-of-speech tagger or a syntactic parser, you will be able to hypothesize how the tool generally works under the hood and why. This class can also assist you later in research in natural language processing, should you choose to pursue a PhD degree in the area.
Lecture on linear classification at the Lisbon Machine Learning School (LxMLS), July 2015.
A tutorial about Spectral learning algorithms for NLP (NAACL, 2013). Similar tutorial with overlapping material at CMU (June, 2014).
Seminar at Columbia - Bayesian analysis for NLP (Spring, 2013).
A course at IBM about Probability and Structure in NLP (May, 2011).

Students and Post-docs

Zheng Zhao (PhD student, 2020-; co-advised with Bonnie Webber)
Matt Grenander (PhD student, 2021-; co-advised with Mark Steedman)
Balint Gyevnar (PhD student, 2021-; co-advised with Stefano Albrecht and Chris Lucas)
Nickil Maveli (PhD student, 2022-; co-advised with Antonio Vergari)
Yifu Qiu (PhD student, 2022-; co-advised with Edoardo Ponti and Anna Korhonen; ELLIS student and Apple AI/ML Scholar)
Ke Wang (PhD student, 2023-)
Dominik Grabarczyk (PhD student, 2023-; co-advised with Javier Alfaro)

Alumni:

Marcio Fonseca (PhD student, 2021-2024)
Ronald Cardenas (PhD student, 2019-2024)
Esma Balkır (PhD student, 2016-2021; now Researcher at National Research Council Canada)
Chunchuan Lyu (PhD student, 2017-21; co-advised with Ivan Titov)
Javad Hosseini (PhD student, 2017-; co-advised with Mark Steedman; now Researcher at Google)
Jiangming Liu (PhD student, 2017-21; co-advised with Mirella Lapata)
Emily Scher (PhD student, 2017-20; co-advised with Guido Sanguinetti; now postdoc at the University of Edinburgh)
Matthieu Labeau (postdoc, 2018-19; now Assistant Professor at Telecom Paris)
Maximin Coavoux (postdoc, 2018-19; Research Scientist at Naver Labs, then CNRS Researcher)
Shashi Narayan (postdoc, 2014-19; now Research Scientist at Google)
John Torr (PhD student, 2015-19; co-advised with Mark Steedman; now Research Scientist at Wluper)
Marco Damonte (PhD student, 2015-19; now Researcher at Amazon)
Nikos Papasarantopoulos (PhD student, 2016-19)
Joana Ribeiro (MPhil student, 2015-19)
Zheng Zhao (MRes student, 2020)
Nickil Maveli (MRes student, 2021)
Waylon Li (MRes student, 2021-2022; now PhD student at Edinburgh)
Shun Shao (MRes student, 2022-2023; now PhD student at Cambridge)
Chenmien Tan (MRes student, 2024)

Outreach

Informatics Circle, a recurring (online) event aimed at teaching Computer Science concepts to kids.

Events

Workshop about the scaling behavior of large language models at EACL 2024 (SCALE-LLM's website).
Workshop about representation learning in NLP at ACL 2017 (workshop's website).
Workshop about representation learning in NLP at ACL 2016 (workshop's website).
Workshop about vector space modelling in NLP at NAACL 2015 (workshop's website with post-workshop materials and pictures).

Contact information

scohen [strudel] inf.ed.ac.uk

10 Crichton Street
Informatics Forum 4.26
Edinburgh EH8 9AB
United Kingdom

Phone: +44 (0) 131 650 6542
