Dataset Open Access

Natural Language-Guided Programming User Study

Heyman, Geert; Huysegems, Rafael; Justen, Pascal; Van Cutsem, Tom


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.5384768</identifier>
  <creators>
    <creator>
      <creatorName>Heyman, Geert</creatorName>
      <givenName>Geert</givenName>
      <familyName>Heyman</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-6276-424X</nameIdentifier>
      <affiliation>Nokia Bell Labs</affiliation>
    </creator>
    <creator>
      <creatorName>Huysegems, Rafael</creatorName>
      <givenName>Rafael</givenName>
      <familyName>Huysegems</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-6244-9864</nameIdentifier>
      <affiliation>Nokia Bell Labs</affiliation>
    </creator>
    <creator>
      <creatorName>Justen, Pascal</creatorName>
      <givenName>Pascal</givenName>
      <familyName>Justen</familyName>
      <affiliation>Nokia Bell Labs</affiliation>
    </creator>
    <creator>
      <creatorName>Van Cutsem, Tom</creatorName>
      <givenName>Tom</givenName>
      <familyName>Van Cutsem</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0003-4116-4290</nameIdentifier>
      <affiliation>Nokia Bell Labs</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Natural Language-Guided Programming User Study</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2021</publicationYear>
  <subjects>
    <subject>code completion</subject>
    <subject>code prediction</subject>
    <subject>natural language-guided programming</subject>
    <subject>example-centric programming</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2021-09-02</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/5384768</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.5384767</relatedIdentifier>
  </relatedIdentifiers>
  <version>0.0.1</version>
  <rightsList>
    <rights rightsURI="https://opensource.org/licenses/BSD-3-Clause">BSD 3-Clause "New" or "Revised" License</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;In this dataset you find the&amp;nbsp;user study data that was used in the &lt;strong&gt;&lt;em&gt;Natural Language-Guided Programming&lt;/em&gt;&lt;/strong&gt; paper, which is accepted for Onward! 2021. A preprint can be found here&amp;nbsp;&lt;a href="https://arxiv.org/pdf/2108.05198.pdf"&gt;https://arxiv.org/pdf/2108.05198.pdf&lt;/a&gt;. The dataset consists of the following files:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;benchmark.json contains 201 test cases. Each test case consists of a context, a natural language intent, and target code. The test cases are intended to evaluate a model that predicts code given a piece of context code and a natural language intent. They were derived from Jupyter notebooks crawled from GitHub projects with permissive licenses. The project_metadata field contains information about the original project, such as its git url&amp;nbsp;and&amp;nbsp;license.&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;predictions-annotated.json contains the predictions of the three models used in the paper for 100 of the test cases in benchmark.json. Each prediction is accompanied by qualitative assessments from three annotators.&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;train-index.jsonl is the list of GitHub projects that were used for training the models.&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;eval-index.jsonl is the list of GitHub projects that we kept separate for evaluation. benchmark.json was created from a random subset of the projects in this list.&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more details we refer to the paper.&lt;/p&gt;</description>
  </descriptions>
</resource>
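
The JSON and JSON Lines files described in the abstract above can be inspected with a few lines of Python. The snippet below is a minimal sketch: the per-test-case field names (context, intent, target_code, project_metadata) are assumptions based on the abstract, not verified key names, so check them against the actual files after downloading.

import json

# Load the 201 test cases (sketch; adjust paths and keys to the actual files).
with open("benchmark.json") as f:
    benchmark = json.load(f)

case = benchmark[0]
# The keys below are assumed from the abstract, not confirmed by the dataset.
print(case.get("context"))           # code context preceding the prediction point
print(case.get("intent"))            # natural language intent
print(case.get("target_code"))       # reference code to be predicted
print(case.get("project_metadata"))  # original project info, e.g. git url and license

# Predictions of the three models plus annotator assessments for 100 test cases.
with open("predictions-annotated.json") as f:
    predictions = json.load(f)

# The *.jsonl indexes hold one GitHub project record per line.
with open("train-index.jsonl") as f:
    train_projects = [json.loads(line) for line in f if line.strip()]
with open("eval-index.jsonl") as f:
    eval_projects = [json.loads(line) for line in f if line.strip()]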
                  All versions   This version
Views                       66             66
Downloads                    4              4
Data volume            16.9 MB        16.9 MB
Unique views                54             54
Unique downloads             1              1
