Published May 31, 2018 | Version v1
Dataset
Open
Zenodo Machine Learning
Creators
Description
The Zenodo-ML dataset is a collection of just under 10K records from Zenodo, a service that generates digital object identifiers (DOIs) for software and associated digital resources. In practical terms, someone writes a codebase for their software and links it to Zenodo so that others can find and cite it. For this dataset, it means that we can find these codebases and do the following:
- convert each script file into a set of 80x80 images, with each character converted to its ordinal value, for use with machine learning
- generate a file hierarchy tree for graph analysis
- extract complete metadata like domain, authors, and description for the software
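The first step in the list above, turning a script into 80x80 grids of character ordinals, could look roughly like the following. This is a minimal sketch, not the dataset's actual pipeline: the function name `text_to_ordinal_images`, the truncation of lines longer than 80 characters, and the padding with the ordinal of a space (32) are all assumptions made for illustration.

```python
def text_to_ordinal_images(text, size=80, pad=32):
    """Split text into size x size grids of character ordinals.

    Assumptions (not taken from the dataset description): lines longer
    than `size` are truncated, and short lines and a short final chunk
    are padded with `pad`, the ordinal of a space.
    """
    rows = []
    for line in text.splitlines():
        # Map each character to its ordinal, keeping at most `size` per line.
        codes = [ord(ch) for ch in line[:size]]
        codes += [pad] * (size - len(codes))
        rows.append(codes)

    images = []
    for start in range(0, len(rows), size):
        # Group `size` consecutive lines into one image, padding the last one.
        chunk = rows[start:start + size]
        chunk += [[pad] * size for _ in range(size - len(chunk))]
        images.append(chunk)
    return images


sample = "import os\nprint(os.getcwd())\n"
images = text_to_ordinal_images(sample)
print(len(images), len(images[0]), len(images[0][0]))  # 1 80 80
print(images[0][0][:6])  # [105, 109, 112, 111, 114, 116]
```

Each returned image is a list of 80 rows of 80 integers, so a file with more than 80 lines yields several images. Other choices, such as wrapping long lines instead of truncating them, would be equally plausible under the description.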
Files
(20.7 GB)
Checksum | Size |
---|---|
md5:bff9f8ca3632fa7372f0b9e440b85c5a | 20.7 GB |