Dataset Open Access

A Code Token Type Taxonomy-enhanced dataset with pre-computed token types for Python150k

Le, Kim Tuyen; Rashidi, Gabriel; Andrzejak, Artur


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Le, Kim Tuyen</dc:creator>
  <dc:creator>Rashidi, Gabriel</dc:creator>
  <dc:creator>Andrzejak, Artur</dc:creator>
  <dc:date>2021-11-28</dc:date>
  <dc:description>Code Token Type Taxonomy (CT3) is a methodology for refined evaluation of ML-based code completion approaches.

We published the CT3-enhanced dataset with pre-computed token types for each token in the Python150k dataset.

The dataset was produced in the empirical study for the following paper:

Kim Tuyen Le, Gabriel Rashidi, and Artur Andrzejak. A Methodology for Refined Evaluation of ML-based Code Completion Approaches. In Special Issue on Programming Language Processing, Data Mining and Knowledge Discovery.

See the README.txt file for details on how the enhanced dataset is structured.</dc:description>
  <dc:identifier>https://zenodo.org/record/5733013</dc:identifier>
  <dc:identifier>10.5281/zenodo.5733013</dc:identifier>
  <dc:identifier>oai:zenodo.org:5733013</dc:identifier>
  <dc:relation>doi:10.5281/zenodo.5148585</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>code completion</dc:subject>
  <dc:subject>accuracy evaluation</dc:subject>
  <dc:subject>code token types</dc:subject>
  <dc:title>A Code Token Type Taxonomy-enhanced dataset with pre-computed token types for Python150k</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
