Published January 26, 2023 | Version 1
Dataset
Open
Source Code Snippets and Quality Analytics Dataset
- 1. Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Greece
Description
This dataset contains the Java code snippets of CodeSearchNet, processed along with their abstract syntax trees and clustered according to their similarity. It also includes static analysis metrics, PMD violations and readability metrics for each snippet.
To use the dataset, follow these steps:
1. Download the data.
2. Navigate to the download folder and run the mongorestore command (https://docs.mongodb.com/manual/reference/program/mongorestore/), making sure to pass the --gzip flag.
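The restore step can be scripted. The following is a minimal sketch in Python, not part of the dataset itself: it assumes that `mongorestore` is installed and on the PATH, that a MongoDB server is reachable at the tool's default address, and that the downloaded files sit together in one directory. The function names and the optional target database name are hypothetical and only serve the illustration; the record does not state which database name the dump expects.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_restore_command(dump_dir: str, db_name: Optional[str] = None) -> List[str]:
    """Build the mongorestore argument list for a gzip-compressed dump.

    dump_dir is the folder holding the downloaded files (an assumption).
    db_name is an optional, hypothetical target database name.
    """
    cmd = ["mongorestore", "--gzip"]  # --gzip is required by the dataset description
    if db_name is not None:
        cmd += ["--db", db_name]
    cmd.append(str(Path(dump_dir)))
    return cmd


def restore(dump_dir: str, db_name: Optional[str] = None) -> None:
    """Run mongorestore and raise if the tool is missing or exits with an error."""
    if shutil.which("mongorestore") is None:
        raise FileNotFoundError("mongorestore was not found on PATH")
    subprocess.run(build_restore_command(dump_dir, db_name), check=True)
```

For example, `restore("downloads")` would run `mongorestore --gzip downloads` against the default local server. Building the argument list separately from running it makes it easy to print and inspect the command before anything is written to the database.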
Files
Total size: 924.6 MB

Checksum | Size |
---|---|
md5:1fcd7433aea2d159440d33ce5fa35f77 | 50.9 MB |
md5:b78d640447d72a4e0295fb5c726d82c8 | 179 Bytes |
md5:8710c09631ce91a7bc02569de502c6a8 | 203.2 MB |
md5:92a55d4815588912e70ddbe1eddf2c7a | 180 Bytes |
md5:0ccde44f37440a19f4faf745f1127735 | 636.9 MB |
md5:c2191c479033ddb6aee56d229dfdc62a | 204 Bytes |
md5:b8429a2ac5ed22eee87ec1718d91ab6a | 33.6 MB |
md5:c96f7004dd4e84600d890cb102057f8e | 175 Bytes |
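Since the listing gives an MD5 checksum for every file, each download can be checked before restoring. The sketch below is one possible way to do that with Python's standard library; the helper names are hypothetical, and the file path in the usage note is a placeholder because the listing above does not show file names.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat even for the largest files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as in the listing ('md5:<hex>')."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listing("path/to/downloaded_file", "md5:1fcd7433aea2d159440d33ce5fa35f77")` returns `True` only if the file on disk has the listed checksum. A mismatch usually points to an incomplete or corrupted download, in which case the file should be fetched again before running the restore.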