First NLG Challenge on Generating Instructions in Virtual Environments (GIVE)

Part of Generation Challenges 2009
Endorsed by SIGGEN, SIGDIAL, and SIGSEM

Try it for yourself!

Download the software package for participants:
(tar.gz)   (zip)   (Javadoc)

Overview

Evaluating natural language generation systems is a notoriously hard problem: Unlike NL interpretation, where annotated corpora may provide a gold standard against which a system can be measured, there are generally multiple equally good outputs that an NLG system might produce. On the other hand, access to human experimental subjects who could judge the quality of the system's output is usually too expensive for large-scale use. Nevertheless, there has recently been an increased interest in shared tasks and new methodologies for evaluating and comparing NLG systems.

We invite participation in the first installment of the Challenge on Generating Instructions in Virtual Environments (GIVE). In this scenario, a human user must perform a game-like task in a virtual 3D environment. The NLG system's job is to generate, in real time, a sequence of natural-language instructions that help the user perform this task. Users will connect to the system over the Internet; they will then be shown the generated instructions and attempt to solve the task by following them. The system's performance will be evaluated with respect to measures such as average task completion accuracy, speed, and efficiency. Because the user and the system don't need to be physically in the same place, access to experimental subjects over the Internet becomes easy, and we anticipate high numbers of evaluation runs per system.

To get a better idea of how this works, we invite you to try a prototype for yourself.

The GIVE challenge is a theory-neutral, end-to-end evaluation effort for NLG systems. It involves research opportunities in text planning, sentence planning, realization, and situated communication. One particularly interesting aspect of situating the generation problem in a virtual environment is that spatial and relational expressions will play a bigger role than usual.

In this first installment, we particularly invite contributions from students and student teams, but contributions from anyone who is interested are welcome as well. All participating systems will be evaluated, and the results will be compared at a workshop in 2009. We anticipate making the GIVE challenge an ongoing event and repeating it at regular intervals.

The Challenge

During the evaluation period of the challenge, each participating team will run their NLG system as a server at their own institution. We will provide a central website from which users can start the 3D client, which lets them move around in and manipulate the virtual environment. Clients will connect over the Internet to a central matchmaking service, which we will also provide. This service will assign each client at random to one of the participating NLG systems. After each evaluation run, a log of the run will be stored in our database. These logs will then be used to evaluate each NLG system's performance, and can be replayed for later analysis.
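
The matchmaking and log evaluation described above can be illustrated with a minimal Python sketch. It is not part of the GIVE software: the record fields, the function names, and the choice of summary statistics are assumptions made for illustration only.

```python
import random
from collections import defaultdict, namedtuple

# One evaluation run. All field names are illustrative assumptions,
# not the format of the actual GIVE logs.
RunLog = namedtuple("RunLog", ["system", "completed", "seconds"])


def assign_system(systems, rng):
    """Pick one participating NLG system uniformly at random."""
    return rng.choice(systems)


def summarize(logs):
    """Per system: run count, completion rate, mean time of completed runs."""
    by_system = defaultdict(list)
    for log in logs:
        by_system[log.system].append(log)
    summary = {}
    for system, runs in by_system.items():
        done = [r for r in runs if r.completed]
        summary[system] = {
            "runs": len(runs),
            "completion_rate": len(done) / len(runs),
            # No completed runs means there is no meaningful mean time.
            "mean_seconds": sum(r.seconds for r in done) / len(done) if done else None,
        }
    return summary
```

Assigning clients uniformly at random means that, over many runs, each system sees a comparable mix of users, which is what makes the per-system summaries comparable.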

The NLG systems will have complete symbolic information about the world and the task. We will also provide easy access to a planning system to enable the NLG system to compute the sequence of actions that the user must perform.
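
To give a feel for what computing the user's action sequence from symbolic world information can look like, here is a small Python sketch. It uses plain breadth-first search over a made-up toy world; the planner shipped with the GIVE package may work quite differently, and every state, action name, and function below is hypothetical.

```python
from collections import deque


def plan(start, is_goal, successors):
    """Breadth-first search; returns a shortest list of action names, or None."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if is_goal(state):
            return actions
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None


# Toy world (invented for this sketch): state = (room, door_open, has_trophy).
def successors(state):
    room, door_open, has_trophy = state
    if room == "hall" and not door_open:
        yield "press-button", (room, True, has_trophy)
    if room == "hall" and door_open:
        yield "go-to-vault", ("vault", door_open, has_trophy)
    if room == "vault":
        yield "go-to-hall", ("hall", door_open, has_trophy)
        if not has_trophy:
            yield "take-trophy", (room, door_open, True)


# The goal is any state in which the user holds the trophy.
steps = plan(("hall", False, False), lambda s: s[2], successors)
```

An NLG system could then turn each action in `steps` into an instruction, replanning whenever the user deviates from the expected sequence.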

We will provide each participating team with a complete package including software, demo worlds, and documentation, by May 1. The evaluation will then take place over a period of three months next winter, from October 1 to December 31. We will analyze the logs early in 2009 and distribute the results to the participants. In Spring 2009, we will organize a workshop where participants can present their results and experiences. At this workshop we will also discuss ways in which the GIVE challenge can be further improved and refined.

If you or your students are interested in participating in the challenge, please send us an e-mail at a.koller@ed.ac.uk.

You can find further information about the challenge in our Working Group Report from the Workshop for Shared Tasks and Comparative Evaluation in NLG, which took place in April 2007 in Arlington, VA.

Important dates

1 May: materials distributed to participants
1 October - 31 December: evaluation period
Early 2009: results distributed to participants
Spring 2009: workshop

Organizing committee

Donna Byron, Northeastern University
Justine Cassell, Northwestern University
Robert Dale, Macquarie University
Alexander Koller, University of Edinburgh
Johanna Moore, University of Edinburgh
Jon Oberlander, University of Edinburgh
Kristina Striegnitz, Union College

Last modified: Thu Jul 17 11:27:18 BST 2008