Published March 15, 2019 | Version v1
Conference paper Open

PathMiner: A Library for Mining of Path-Based Representations of Code

Description

One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of path-based representation -- an approach that represents a snippet of code as a collection of paths in its syntax tree. Such a representation efficiently captures the structure of the code, which in turn carries its semantics and other information.
Building the path-based representation involves parsing the code and extracting the paths from its syntax tree; together, these steps amount to a substantial engineering effort. Because no common reusable toolkit exists for this task, the burden of mining diverts researchers from their essential work and hinders newcomers to the field of machine learning on code.
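
To make the two steps above concrete (parsing, then path extraction), here is a minimal sketch in Python. It is not PathMiner's API or implementation: it uses Python's standard `ast` module as the parser, and the names `token`, `root_to_leaf_trails`, `path_contexts`, and the `max_length` parameter are hypothetical, chosen for illustration. It assumes that terminals are identifiers, literals, and function arguments, and that a path runs from one terminal up to the lowest common ancestor and down to another, with `^` marking upward steps and `_` marking downward steps.

```python
import ast
from itertools import combinations

# Assumption: only identifiers, literals, and function arguments count as terminals.
TERMINALS = (ast.Name, ast.Constant, ast.arg)


def token(node):
    """Return the text carried by a terminal node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.arg):
        return node.arg
    return repr(node.value)  # ast.Constant


def root_to_leaf_trails(tree):
    """Collect the list of nodes from the root to every terminal."""
    trails = []

    def walk(node, trail):
        trail = trail + [node]
        if isinstance(node, TERMINALS):
            trails.append(trail)
            return  # do not descend below a terminal
        for child in ast.iter_child_nodes(node):
            walk(child, trail)

    walk(tree, [])
    return trails


def path_contexts(source, max_length=8):
    """Return (start token, path, end token) triples for all terminal pairs.

    Paths with more than max_length edges are skipped.
    """
    trails = root_to_leaf_trails(ast.parse(source))
    contexts = []
    for a, b in combinations(trails, 2):
        # Count the nodes shared by both trails; the last one is the
        # lowest common ancestor of the two terminals.
        shared = 0
        while a[shared] is b[shared]:
            shared += 1
        up = a[shared:][::-1] + [a[shared - 1]]  # terminal a up to the ancestor
        down = b[shared:]                        # ancestor down to terminal b
        if len(up) + len(down) - 1 > max_length:
            continue
        path = "^".join(type(n).__name__ for n in up)
        path += "".join("_" + type(n).__name__ for n in down)
        contexts.append((token(a[-1]), path, token(b[-1])))
    return contexts


if __name__ == "__main__":
    for context in path_contexts("x = y + 1"):
        print(context)
```

For the snippet `x = y + 1`, this sketch yields three triples, for example `('y', 'Name^BinOp_Constant', '1')`. A production miner would also need language-specific parsers, limits on path width, and storage of the results, which is the engineering effort the paragraph above refers to.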


In this paper, we present PathMiner -- an open-source library for mining path-based representations of code. PathMiner is fast, flexible, well-tested, and easily extensible to support input code in any common programming language. Preprint [https://doi.org/10.5281/zenodo.2595271]; released tool [https://doi.org/10.5281/zenodo.2595257].

Files

Name: pathminer-preprint.pdf
Size: 514.6 kB
Checksum: md5:f1707759e8f38c1e7ec53a4c5fd49462