Natural Language-Guided Programming User Study

Heyman, Geert; Huysegems, Rafael; Justen, Pascal; Van Cutsem, Tom

doi:10.5281/zenodo.5384768

Published September 2, 2021 | Version 0.0.1

Dataset | Open


1. Nokia Bell Labs

This dataset contains the user study data used in the paper "Natural Language-Guided Programming", which was accepted at Onward! 2021. A preprint is available at https://arxiv.org/pdf/2108.05198.pdf. The dataset consists of the following files:

benchmark.json contains 201 test cases. Each test case consists of context code, a natural language intent, and target code. The test cases are intended to evaluate a model that predicts code given a piece of context code and a natural language intent. They were derived from Jupyter notebooks crawled from GitHub projects with permissive licenses. The project_metadata field holds information about the original project, such as its git URL and license.
predictions-annotated.json contains the predictions of the three models used in the paper for 100 of the test cases in benchmark.json. Each prediction is accompanied by qualitative assessments from three annotators.
train-index.jsonl lists the GitHub projects that were used for training the models.
eval-index.jsonl lists the GitHub projects that were kept separate for evaluation. benchmark.json was created from a random subset of the projects in this list.

For more details we refer to the paper.
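
As a starting point for exploring the files described above, the following minimal Python sketch loads the .json and .jsonl files and prints a summary of their top-level structure. The exact schema is not documented on this page (only the project_metadata field is named), so the sketch deliberately inspects keys rather than assuming them. The helper names load_json, load_jsonl and summarize are illustrative, and the files are assumed to be in the current directory.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a regular JSON file such as benchmark.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_jsonl(path):
    """Load a JSON Lines file such as train-index.jsonl (one JSON value per non-empty line)."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(data):
    """Report container type, size and first-level keys without assuming a schema."""
    if isinstance(data, dict):
        return {"type": "dict", "size": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        first = data[0] if data else None
        keys = sorted(first) if isinstance(first, dict) else []
        return {"type": "list", "size": len(data), "keys": keys[:10]}
    return {"type": type(data).__name__, "size": None, "keys": []}


if __name__ == "__main__":
    # Assumption: the four files were downloaded into the current directory.
    for name in ("benchmark.json", "predictions-annotated.json"):
        if Path(name).exists():
            print(name, summarize(load_json(name)))
    for name in ("train-index.jsonl", "eval-index.jsonl"):
        if Path(name).exists():
            print(name, summarize(load_jsonl(name)))
```

Printing the keys first makes it easy to check which field names hold the context, intent and target code before writing any evaluation code against them.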

Files


Files (16.9 MB)

Name	MD5	Size
benchmark.json	e1080d051298444befa1014d4fa7bda0	4.2 MB
eval-index.jsonl	39f504c8d64f261b7862e3917aa259f7	632.6 kB
predictions-annotated.json	a50ad18a28bf8d1d750afba51118f00e	6.4 MB
train-index.jsonl	cfd046716905235ee8f7139ee4ee240f	5.7 MB
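
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard hashlib module. The checksum values are copied from the file list, while the function names md5_of and verify and the assumption that the files sit in a single local directory are illustrative.

```python
import hashlib
from pathlib import Path

# Checksums as listed for this version of the dataset.
EXPECTED_MD5 = {
    "benchmark.json": "e1080d051298444befa1014d4fa7bda0",
    "eval-index.jsonl": "39f504c8d64f261b7862e3917aa259f7",
    "predictions-annotated.json": "a50ad18a28bf8d1d750afba51118f00e",
    "train-index.jsonl": "cfd046716905235ee8f7139ee4ee240f",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Return {filename: True/False}; False means missing or checksum mismatch."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.exists() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

Reading in chunks keeps memory use flat, which matters little for files of a few megabytes but costs nothing.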

