Research

I am currently a professor in computer science at the School of Informatics at the University of Edinburgh and a member of EdinburghNLP, the Natural Language Processing Group at the University of Edinburgh.

My overarching research goal is to develop AI systems that not only follow patterns but reason, generalize, and adapt to novel situations. I aim to create models capable of understanding requests, aggregating and conveying information across modalities, forming long-term plans, and reasoning creatively about new challenges. My research interests focus on improving compositional generalization in deep learning models, enhancing cross-lingual transfer in multilingual language models, addressing limitations in understanding and generating long contexts, and creating verifiable systems, which generate responses with supporting evidence. Problems that I'm currently excited about include:

(1) Agent-based frameworks for collaborative writing tasks (e.g., writing a story or a book chapter).
(2) Improving the representation of long context (e.g., for movie summarization, video QA).
(3) Parameter-efficient approaches for LLM generalization to new tasks.
(4) Language grounding (e.g., semantic parsing, multimodal representations).

PhD Students

Akash Gupta (since September 2024)
Ashutosh Adhikari (since September 2024)
Alex Gurung (since September 2023)
Irina Saparina (since September 2022)
Argyrios Papoudakis (since September 2022; co-supervised with Frank Keller)
Danna Zheng (since September 2022; co-supervised with Jeff Pan)
Agostina Calabrese (since September 2020; co-supervised with Björn Ross)
Danyang Liu (since September 2020; co-supervised with Frank Keller)
Parag Jain (since September 2019)

Postdoctoral Researchers

Miao Li (is joining us soon!)
Iñigo Alonso (since 2025)
Louis Mahon (since 2023)
Laura Perez-Beltrachini (since 2018)

Alumni

Tom Hosking (PhD 2024, Learning Weakly Structured Representations for Text-to-Text Generation)
Yao Fu (PhD 2024, Improving Complex Reasoning in Large Language Models)
Tom Sherborne (PhD 2024, Modelling Cross-lingual Transfer for Semantic Parsing)
Hao Zheng (PhD 2023, Towards Human-like Compositional Generalization with Neural Models)
Nelly Papalampidi (PhD 2022, Structure-aware Narrative Summarization from Multiple Views)
Yumo Xu (PhD 2022, Document Summarization with Neural Query Modeling)
Ratish Puduppully (PhD 2022, Data-to-Text Generation with Neural Planning)
Reinald Kim Amplayo (PhD 2022, Opinion Summarization of Multiple Reviews: Data Synthesis and Modeling)
Rui Cai (PhD 2021, Neural Semantic Role Labeling with more and less Supervision)
Jonathan Mallinson (PhD 2021, Universal Rewriting via Machine Translation)
Jiangming Liu (PhD 2021, Understanding and Generating Language with Discourse Representation Structures)
Yang Liu (PhD 2020, Neural Document Modeling and Summarization)
Li Dong (PhD 2019, Learning Natural Language Interfaces with Neural Models)
Stefanos Angelidis (PhD 2019, Weakly Supervised Sentiment Analysis and Opinion Extraction)
Jianpeng Cheng (PhD 2019, The Lifecycle of Neural Semantic Parsing)
Philip Gorinski (PhD 2018, Automatic Movie Analysis and Summarization)
Xingxing Zhang (PhD 2017, Natural Language Generation as Sequence Learning and Beyond)
Siva Reddy (PhD 2017, Syntax-Mediated Semantic Parsing)
Lea Frermann (PhD 2016, Bayesian Models of Category Acquisition and Meaning Development)
Carina Silberer (PhD 2015, Learning Visually Grounded Meaning Representations)
Ioannis Konstas (PhD 2014, Joint Models for Concept-to-text Generation)
Trevor Fountain (PhD 2013, Modelling the Acquisition of Natural Language Categories)
Joel Lang (PhD 2012, Unsupervised Induction of Semantic Roles)
Yansong Feng (PhD 2011, Automatic Caption Generation for News Images)
Neil McIntyre (PhD 2011, Learning to Tell Tales: Automatic Story Generation from Corpora)
Jeff Mitchell (PhD 2011, Composition in Distributional Models of Semantics)
Sam Brody (PhD 2009, Closing the Gap in WSD: Supervised results with Unsupervised Methods)
James Clarke (PhD 2008, Global Inference for Sentence Compression: An Integer Linear Programming Approach)
Sebastian Padó (PhD 2007, Cross-Lingual Annotation Projection Models for Role-Semantic Information)

Journals and Conference Papers


This page contains publications for the past three years. For older papers, please consult my Google Scholar profile and/or the ACL Anthology.

Publications 2024


Hierarchical Indexing for Retrieval-Augmented Opinion Summarization
Tom Hosking, Hao Tang, Mirella Lapata. TACL (2024) Paper Code

DOLOMITES: Domain-Specific Long-Form Methodical Tasks
Chaitanya Malaviya, Priyanka Agrawal, Kuzman Ganchev, Pranesh Srinivasan, Fantine Huot, Jonathan Berant, Mark Yatskar, Dipanjan Das, Mirella Lapata, Chris Alberti. TACL (2024) Paper Code

AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries
Irina Saparina, Mirella Lapata. NeurIPS (2024) Paper Code

Finding the Right Moment: Human-Assisted Trailer Creation via Task Decomposition
Pinelopi Papalampidi, Frank Keller, Mirella Lapata. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:1, 292-304 Paper Code

Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts
Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata. EMNLP (2024) Paper Code

Low-Rank Adaptation for Multilingual Summarization: An Empirical Study
Chenxi Whitehouse, Fantine Huot, Jasmijn Bastings, Mostafa Dehghani, Chu-Cheng Lin, Mirella Lapata. NAACL Findings (2024) Paper Code

BookWorm: A Dataset for Character Description and Analysis
Argyrios Papoudakis, Mirella Lapata, Frank Keller. EMNLP Findings (2024) Paper Code

CHIRON: Rich Character Representations in Long-Form Narratives
Alex Gurung, Mirella Lapata. EMNLP Findings (2024) Paper Code

Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA
Wenyu Huang, Guancheng Zhou, Hongru Wang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan. EMNLP Findings (2024) Paper Code

Archer: A Human-Labeled Text-to-SQL Dataset with Arithmetic, Commonsense and Hypothetical Reasoning
Danna Zheng, Mirella Lapata, Jeff Z. Pan. EACL (2024) Paper Code

Improving Generalization in Semantic Parsing by Increasing Natural Language Variation
Irina Saparina, Mirella Lapata. EACL (2024) Paper Code

μPLAN: Summarizing Using a Content Plan as Cross-Lingual Bridge
Fantine Huot, Joshua Maynez, Chris Alberti, Reinald Kim Amplayo, Priyanka Agrawal, Constanza Fierro, Shashi Narayan, Mirella Lapata. EACL (2024) Paper Code

PixT3: Pixel-based Table-to-Text Generation
Iñigo Alonso, Eneko Agirre, Mirella Lapata. ACL (2024) Paper Code

A Modular Approach for Multimodal Summarization of TV Shows
Louis Mahon, Mirella Lapata. ACL (2024) Paper Code

Learning to Plan and Generate Text with Citations
Constanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata. ACL (2024) Paper Code

Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
Agostina Calabrese, Leonardo Neves, Neil Shah, Maarten Bos, Björn Ross, Mirella Lapata, Francesco Barbieri. ACL (2024) Paper Code

Little Red Riding Hood Goes around the Globe: Crosslingual Story Planning and Generation with Large Language Models
Evgeniia Razumovskaia, Joshua Maynez, Annie Louis, Mirella Lapata, Shashi Narayan. LREC (2024) Paper Code

Publications 2023


Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing
Tom Sherborne, Tom Hosking, Mirella Lapata. TACL (2023), volume 11 Paper Code

Meta-Learning a Cross-lingual Manifold for Semantic Parsing
Tom Sherborne, Mirella Lapata. TACL (2023), volume 11 Paper Code

QAmeleon: Multilingual QA with Only 5 Examples
Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata. TACL (2023), volume 11 Paper Code

Conditional Generation with a Question-Answering Blueprint
Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Anders Sandholm, Dipanjan Das, Mirella Lapata. TACL (2023), volume 11 Paper Code

Attributable and Scalable Opinion Summarization
Tom Hosking, Hao Tang, Mirella Lapata. ACL (2023) Paper Code

Semantic Parsing for Conversational Question Answering
Laura Perez-Beltrachini, Parag Jain, Emilio Monti, Mirella Lapata. EACL (2023) Paper Code

Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation
Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata. EACL (2023) Paper Demo

Conversational Semantic Parsing Using Dynamic Context Graphs
Parag Jain, Mirella Lapata. EMNLP (2023) Paper Code

Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning
Hao Zheng, Mirella Lapata. ACL Findings (2023) Paper Code

Hierarchical3D Adapters for Long Video-to-Text Summarization
Pinelopi Papalampidi, Mirella Lapata. EACL Findings (2023) Paper Code

Text Summarization with Oracle Expectation
Yumo Xu, Mirella Lapata. ICLR (2023) Paper Code

Multilingual Summarization with Factual Consistency Evaluation
Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata. ACL Findings (2023) Paper Code

Visual Storytelling with Question-Answer Plans
Danyang Liu, Mirella Lapata, Frank Keller. EMNLP Findings (2023) Paper Code

Retrieval Augmented Generation with Rich Answer Encoding
Wenyu Huang, Mirella Lapata, Pavlos Vougiouklis, Nikos Papasarantopoulos, Jeff Pan. Joint COLING and AACL (2023) Paper Code

Publications 2022


Document Summarization for Latent Queries
Yumo Xu, Mirella Lapata. TACL (2022), Volume 10 Paper Code

Data-to-text Generation with Variational Sequential Planning
Ratish Puduppully, Yao Fu, Mirella Lapata. TACL (2022), Volume 10 Paper Code

Explainable Abuse Detection as Intent Classification
Agostina Calabrese, Björn Ross, Mirella Lapata. TACL (2022), Volume 10 Paper Code

A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata. ACL (2022) Paper Code

Hierarchical Sketch Induction for Paraphrase Generation
Tom Hosking, Hao Tang, Mirella Lapata. ACL (2022) Paper Code

Zero-Shot Cross-lingual Semantic Parsing
Tom Sherborne, Mirella Lapata. ACL (2022) Paper Code

Disentangled Sequence to Sequence Learning for Compositional Generalization
Hao Zheng, Mirella Lapata. ACL (2022) Paper Code

Current Projects

UKRI AI Centre for Doctoral Training in Responsible and Trustworthy in-the-world NLP, Research Training Grant, funded by UKRI, 2024-2032.
Addressing Hallucinations in Generative Language Models, Innovate UK award, funded by UKRI, 02/2024-02/2021-2025.
Teaching Machines to Reason like Humans, Turing AI Fellowship, funded by UKRI, 2021-2026.
UKRI Centre for Doctoral Training in Natural Language Processing, Research Training Grant, funded by UKRI, 2019-2027.

Past Projects

Multimodal Summarization of Creative Content, funded by Amazon, 01/2022-01/2023.
Generating Opinion Summaries and their Explanations from User Reviews, funded by Megagon Labs, 08/2019-07/2020.
Question-Answering with Knowledge Graph Queries, funded by Huawei, 2021-2024.
Real-world Semantic Parsing, funded by Amazon, 2020-2021.
Foreign Language Automated Information Retrieval (FLAIR), funded by IARPA-BAA-16-11, 1/10/17-31/08/21.
Translating from Multiple Modalities into Text, Consolidator Grant (GoG), funded by the ERC, 09/2016-08/2021.
A Unified Model of Compositional and Distributional Semantics: Theory and Applications, Research Project Grant, funded by the EPSRC, 03/2013-03/2016.
Readers: Evaluation and Development of Reading Systems, Research Project Grant, funded by the EPSRC.
An integrated model of syntactic and semantic prediction in human language processing, Research Project Grant, funded by the EPSRC, 07/2011-02/2015.
Global Inference for Summarization Using Integer Linear Programming, Research Project Grant, funded by the EPSRC, 01/2009-01/2012.
Ranking Word Senses for Disambiguation: Models and Applications, Research Project Grant, funded by the EPSRC, 09/2005-09/2008.
Statistical Models for Text-to-text Generation, Advanced Fellowship, funded by the EPSRC, 02/2005-01/2010.
Application-based Text-to-Text Generation, Research Project Grant, funded by the EPSRC, 06/2006-06/2009.
Robust Pragmatics for Narrative Text, funded by the EPSRC, 01/2002-03/2005.

AI @ Edinburgh

The Generative AI Laboratory (GAIL) at the University of Edinburgh is a centre for excellence dedicated to researching all aspects of generative artificial intelligence (AI) in society. Uniting the diverse research expertise across the University with generative AI at its core, GAIL taps into a thriving AI landscape with recognised strengths in natural language processing, machine learning, and data-driven innovation.

The Edinburgh Laboratory for Integrated Artificial Intelligence (ELIAI) at the School of Informatics is seeking to enhance neural network models with reasoning capabilities, a skill required to enable many AI applications.

The UKRI AI Centre for Doctoral Training (CDT) in Responsible and Trustworthy in-the-world NLP is a PhD training programme aiming to develop doctoral graduates who represent a new paradigm of interdisciplinary NLP researcher, ready to realise the full potential of NLP-based systems and enable richer interactions that allow genuine partnerships between humans and AI.