Published January 26, 2023 | Version 1
Dataset Open

Source Code Snippets and Quality Analytics Dataset

  • 1. Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Greece

Description

This dataset contains the Java code snippets of CodeSearchNet, processed along with their abstract syntax trees and clustered according to their similarity. It also includes static analysis metrics, PMD violations and readability metrics for each snippet.

To use the dataset, follow these steps:

   1. Download the data.

   2. Navigate to the download folder and restore the data with the mongorestore command (https://docs.mongodb.com/manual/reference/program/mongorestore/). Remember to pass the --gzip flag, since the dump files are compressed.
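
The restore step above can be sketched in Python as follows. This is a minimal sketch, not the authors' own procedure: it assumes that mongorestore is installed and on the PATH, that a MongoDB server is reachable at the default local URI, and that the downloaded files sit in a single folder. The function names build_restore_command and restore_dataset are hypothetical helpers introduced here for illustration. Depending on how the dump files are laid out, additional mongorestore options (for example, to select the target database) may be needed.

```python
import shutil
import subprocess
from pathlib import Path


def build_restore_command(dump_dir, uri="mongodb://localhost:27017"):
    """Build the mongorestore argument list for a gzip-compressed dump.

    The URI default is an assumption (a local MongoDB instance).
    """
    return [
        "mongorestore",
        "--gzip",                    # the dump files are gzip-compressed
        f"--uri={uri}",              # target MongoDB deployment
        f"--dir={Path(dump_dir)}",   # folder holding the downloaded files
    ]


def restore_dataset(dump_dir, uri="mongodb://localhost:27017"):
    """Run mongorestore on dump_dir; raises if the tool is missing or fails."""
    if shutil.which("mongorestore") is None:
        raise RuntimeError("mongorestore was not found on the PATH")
    subprocess.run(build_restore_command(dump_dir, uri), check=True)
```

Calling restore_dataset(".") from inside the download folder would then run the restore against the local server.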

Files

Files (924.6 MB)

Checksum                                Size
md5:1fcd7433aea2d159440d33ce5fa35f77    50.9 MB
md5:b78d640447d72a4e0295fb5c726d82c8    179 Bytes
md5:8710c09631ce91a7bc02569de502c6a8    203.2 MB
md5:92a55d4815588912e70ddbe1eddf2c7a    180 Bytes
md5:0ccde44f37440a19f4faf745f1127735    636.9 MB
md5:c2191c479033ddb6aee56d229dfdc62a    204 Bytes
md5:b8429a2ac5ed22eee87ec1718d91ab6a    33.6 MB
md5:c96f7004dd4e84600d890cb102057f8e    175 Bytes
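
The MD5 checksums listed above can be used to confirm that each file was downloaded intact before it is restored. The following is a minimal sketch using Python's standard hashlib module. The helper names md5_of_file and matches_listed_checksum are hypothetical, and the path passed in is whatever name the downloaded file has locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed checksum such as 'md5:1fcd...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed_checksum(local_path, "md5:1fcd7433aea2d159440d33ce5fa35f77") would return True only if the local file's digest equals the first checksum in the listing.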