Code

Neural Attention Tradeoff (NEAT) Model: This implementation contains the code for the neural attention model of human reading presented in Hahn and Keller (2016). The model captures skipping behavior and reading times as recorded in eye-tracking data, and has been evaluated against the Dundee corpus.

Psycholinguistically Motivated Tree-Adjoining Grammar (PLTAG) Parser: This implementation contains a fully incremental PLTAG parser, with incremental semantic role labeling capability and discriminative reranking. The parser is described in Demberg et al. (2013) and in subsequent papers.

WebExp: A software package for conducting experiments over the world-wide web. Web-based experimentation gives access to a large and varied set of potential subjects, and experiments can be administered without the overheads of lab setups, attendance schedules, and so on. WebExp is written in Java, and uses XML as the description language for defining experiments and storing results. The software is described in Keller et al. (2009).

Datasets

MultiSense Dataset: 9,504 images paired with translation-ambiguous verbs. Each image is annotated with an English verb and its translations in German and Spanish. The data can be used to train multilingual, multimodal sense disambiguation models. A 995-image subset of MultiSense is also annotated with English descriptions and their German translations. This subset can be used to evaluate the sense disambiguation capabilities of multimodal translation models. The dataset is described in Gella et al. (2019).

Verb Senses in Images (VerSe) Dataset: 3,518 images, each annotated with one of 90 verbs and with the OntoNotes sense realized for the verb in the image. The images are taken from two existing multimodal datasets (COCO and TUHOI). The dataset is described in Gella et al. (2019).

Pascal 2007 Center-click Annotation Dataset: This dataset provides center-click annotations collected on Amazon Mechanical Turk for all 20 classes of the whole trainval set of Pascal VOC 2007. Each image is annotated by two different annotators for each class in the image. This results in 14,612 clicks in total for the 5,011 trainval images. We also provide the localizations produced by our center-click object localization approach. The approach and the dataset are described in Papadopoulos et al. (2017).

Pascal Objects Eye Tracking (POET) Dataset: 6,270 images from ten Pascal VOC 2012 object classes (cat, dog, bicycle, motorbike, boat, aeroplane, horse, cow, sofa, diningtable). Each image is annotated with the eye movement record of five participants, whose task was to identify which object class was present in the image. The dataset is described in Papadopoulos et al. (2014).

Comparing Image Description Measures: This is the dataset and code used to estimate the correlation of different text-based evaluation measures for automatic image description on the Flickr8K dataset. The measures compared include BLEU4, TER, Meteor, and ROUGE-SU4. The work is described in Elliott and Keller (2014).
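As a rough illustration of this kind of comparison, the sketch below computes a rank correlation between the scores an automatic measure assigns to a set of descriptions and the corresponding human judgments. The choice of Spearman's rho, the function names, and the data layout are assumptions made for illustration; the released code may compute the correlation differently.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(measure_scores, human_judgments):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(measure_scores), average_ranks(human_judgments))
```

Calling spearman once per measure on the same set of human judgments gives one coefficient per measure, which can then be compared directly.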

Visual and Linguistic Treebank: 2,424 images with human-generated image descriptions; 341 of these images are also annotated with object boundaries and Visual Dependency Representations. The dataset is described in Elliott and Keller (2013).

Object Naming Dataset: 100 images with eye-tracking data from 24 participants performing an object naming task. The data includes manually annotated object boundaries and object labels produced by participants. The dataset is described in Clarke et al. (2013).

Task Classification Dataset: Eye-movement dataset containing 1,756 unique trials across three tasks: visual search, image description, and object naming. For each trial, the following standard features are extracted: (a) number of fixations, (b) mean fixation duration, (c) mean saccade amplitude, (d) percentage of the image covered by fixations, assuming a 1° circle around the fixation position, and the proportion of dwell time on (e) faces, (f) bodies, and (g) objects. A set of 15 additional features is also provided. The dataset is described in Coco and Keller (2014).
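For readers unfamiliar with features (a) to (c), the following minimal sketch shows how they might be derived from the fixation sequence of a single trial. The (x, y, duration) record layout and the function name are hypothetical and do not reflect the dataset's actual file format; saccade amplitude is approximated as the Euclidean distance between consecutive fixations, in whatever unit the coordinates use.

```python
import math


def basic_fixation_features(fixations):
    """Compute three per-trial features from fixations in temporal order.

    fixations: list of (x, y, duration) tuples (hypothetical layout).
    """
    if not fixations:
        raise ValueError("need at least one fixation")
    n_fixations = len(fixations)
    mean_duration = sum(duration for _, _, duration in fixations) / n_fixations
    # Approximate each saccade amplitude by the distance between
    # the positions of two consecutive fixations.
    amplitudes = [
        math.dist(a[:2], b[:2]) for a, b in zip(fixations, fixations[1:])
    ]
    mean_amplitude = sum(amplitudes) / len(amplitudes) if amplitudes else 0.0
    return {
        "n_fixations": n_fixations,
        "mean_fixation_duration": mean_duration,
        "mean_saccade_amplitude": mean_amplitude,
    }
```

A trial with a single fixation has no saccades, so the sketch reports a mean amplitude of 0.0 in that case; a real pipeline might prefer to mark the value as missing.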

Scan Pattern Dataset: An image description dataset, which contains the eye-movement data of 24 participants describing 24 visual scenes. It includes scan patterns, transcribed sentences, and pairwise similarity scores for scan patterns and sentences. The dataset is described in Coco and Keller (2012).

Padó Plausibility Dataset: Plausibility judgments for 207 verbs, with two arguments each, annotated with PropBank and FrameNet 1.2 semantic roles. Likert-scale judgments of plausibility were obtained in a web-based experiment from 100 participants. The dataset is described in Padó et al. (2006).

Bigram Plausibility Dataset: Plausibility judgments for seen and unseen adjective-noun, noun-noun, and verb-object bigrams (90 items each). Magnitude estimation judgments of plausibility were obtained in a web-based experiment from 27 to 40 participants per item. The dataset is described in Keller and Lapata (2003).