HL Dataset: Visually-grounded Description of Scenes, Actions and Rationales

Michele Cafagna; Kees van Deemter; Albert Gatt

doi:10.18653/v1/2023.inlg-main.21

Published September 11, 2023 | Version v2

Conference paper | Open access

1. University of Malta
2. University of Utrecht

Current captioning datasets focus on object-centric captions that describe the visible objects in an image and often end up stating what is obvious to humans, e.g. “people eating food in a park”. Although these datasets are useful for evaluating the ability of Vision & Language models to recognize and describe visual content, they do not support controlled experiments, whether model testing or fine-tuning, with the more high-level captions that humans find easy and natural to produce. For example, people often describe images based on the type of scene they depict (“people at a holiday resort”) and the actions performed in them (“people having a picnic”). Such concepts are based on personal experience and contribute to forming common sense assumptions. We present the High-Level Dataset, a dataset extending 14,997 images from the COCO dataset, aligned with a new set of 134,973 human-annotated (high-level) captions collected along three axes: scenes, actions and rationales. We further extend this dataset with confidence scores collected from an independent set of readers, as well as a set of narrative captions generated synthetically by combining the three axes. We describe this dataset and analyse it extensively. We also present baseline results for the High-Level Captioning task.
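
The counts in the abstract can be sanity-checked with a short sketch. The Python below is illustrative only: the `HLRecord` class, its field names and the image identifier are hypothetical and are not taken from the released dataset's schema. It models one image with captions grouped by axis, and checks that 134,973 captions over 14,997 images divide evenly into 9 captions per image. That would correspond to three captions per axis if captions are spread uniformly across the axes, which is an assumption; the abstract does not state the per-axis count.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three annotation axes named in the abstract.
AXES = ("scene", "action", "rationale")


@dataclass
class HLRecord:
    """Hypothetical in-memory layout for one image (not the released schema)."""

    image_id: str
    # Captions grouped by axis name.
    captions: Dict[str, List[str]] = field(default_factory=dict)
    # Reader confidence scores grouped by axis name (left empty here).
    confidence: Dict[str, List[float]] = field(default_factory=dict)

    def caption_count(self) -> int:
        """Total number of captions across all three axes."""
        return sum(len(self.captions.get(axis, [])) for axis in AXES)


# Figures quoted in the abstract.
TOTAL_IMAGES = 14_997
TOTAL_CAPTIONS = 134_973

# Integer quotient and remainder of captions over images.
per_image, remainder = divmod(TOTAL_CAPTIONS, TOTAL_IMAGES)

# Example record using the two example captions from the abstract.
# The identifier is made up, and no rationale caption is invented.
record = HLRecord(
    image_id="example-0001",
    captions={
        "scene": ["people at a holiday resort"],
        "action": ["people having a picnic"],
        "rationale": [],
    },
)

if __name__ == "__main__":
    print(f"captions per image: {per_image} (remainder {remainder})")
    print(f"captions in example record: {record.caption_count()}")
```

The division leaves no remainder, so the two figures in the abstract are mutually consistent under the uniform-distribution assumption.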

Files

Files (15.6 MB)

Name	MD5	Size
2023.inlg-main.21v2.1.pdf	babcf6a1170ec9f4d9ba0aaaf5c8c0ca	7.8 MB
2023.inlg-main.21v2.pdf	babcf6a1170ec9f4d9ba0aaaf5c8c0ca	7.8 MB

Additional details

URL: https://aclanthology.org/2023.inlg-main.21

Funding: European Commission, NL4XAI - Interactive Natural Language Technology for Explainable Artificial Intelligence (grant 860621)

Accepted: 2023-07-12

Repository URL: https://github.com/michelecafagna26/HL-dataset

	All versions	This version
Views	22	22
Downloads	38	38
Data volume	459.9 MB	459.9 MB
