Conference paper Open Access

A Structural Model for Contextual Code Changes

Brody, Shaked; Alon, Uri; Yahav, Eran

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.4036303</identifier>
  <creators>
    <creator>
      <creatorName>Brody, Shaked</creatorName>
    </creator>
    <creator>
      <creatorName>Alon, Uri</creatorName>
    </creator>
    <creator>
      <creatorName>Yahav, Eran</creatorName>
    </creator>
  </creators>
  <titles>
    <title>A Structural Model for Contextual Code Changes</title>
  </titles>
  <subjects>
    <subject>Programming Languages</subject>
    <subject>Machine Learning</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2020-09-18</date>
  </dates>
  <resourceType resourceTypeGeneral="ConferencePaper"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.4036302</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We refer to this task as the EditCompletion task and present a novel approach for tackling it. The main idea is to directly represent structural edits. This allows us to model the likelihood of the edit itself, rather than learning the likelihood of the edited code. We represent an edit operation as a path in the program&amp;rsquo;s Abstract Syntax Tree (AST), originating from the source of the edit to the target of the edit. Using this representation, we present a powerful and lightweight neural model for the EditCompletion task.&lt;/p&gt;
&lt;p&gt;We conduct a thorough evaluation, comparing our approach to a variety of representation and modeling approaches that are driven by multiple strong models such as LSTMs, Transformers, and neural CRFs. Our experiments show that our model achieves 28% relative gain over state-of-the-art sequential models and 2&amp;times; higher accuracy than syntactic models that learn to generate the edited code instead of modeling the edits directly. We make our code, dataset, and trained models publicly available.&lt;/p&gt;</description>
  </descriptions>
</resource>
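The abstract's central idea is representing an edit operation as a path through the program's AST, from the source node of the edit to the target node. As an illustration only (not the paper's implementation), the sketch below builds a toy AST with hypothetical node labels and extracts the label sequence along the path between two nodes, going up to their lowest common ancestor and back down:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A minimal AST node: a label, child list, and parent pointer."""
    label: str
    children: list = field(default_factory=list)
    parent: "Node | None" = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

def ancestors(node: Node) -> list:
    """Chain of nodes from `node` up to the root, inclusive."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain

def ast_path(src: Node, tgt: Node) -> list:
    """Labels along the tree path: src -> lowest common ancestor -> tgt."""
    src_anc = ancestors(src)
    tgt_pos = {id(n): i for i, n in enumerate(ancestors(tgt))}
    tgt_anc = ancestors(tgt)
    for i, n in enumerate(src_anc):
        if id(n) in tgt_pos:                      # n is the lowest common ancestor
            j = tgt_pos[id(n)]
            up = [x.label for x in src_anc[:i + 1]]           # src .. LCA
            down = [x.label for x in reversed(tgt_anc[:j])]   # below LCA .. tgt
            return up + down
    raise ValueError("nodes are not in the same tree")

# Toy tree: a method containing an if-statement with a condition and a body.
root = Node("MethodDecl")
iff = root.add(Node("IfStmt"))
cond = iff.add(Node("Name:x"))
block = iff.add(Node("Block"))
stmt = block.add(Node("Assign"))

print(ast_path(cond, stmt))  # ['Name:x', 'IfStmt', 'Block', 'Assign']
```

In the paper's setting such paths (connecting the source and target of an edit) are fed to a neural model that scores the edit itself; here they are just extracted symbolically to show the representation.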
                   All versions   This version
Views                        73             73
Downloads                    16             16
Data volume            488.8 MB       488.8 MB
Unique views                 70             70
Unique downloads             13             13

