Published May 31, 2018 | Version v1 | Dataset | Open

Zenodo Machine Learning
Creators
Description
The Zenodo-ML dataset is a collection of just under 10K records from Zenodo, a service that generates digital object identifiers (DOIs) for software and associated digital resources. In practical terms, someone writes a codebase for their software and links it to Zenodo so that others can find and cite it. For this dataset, it means that we can retrieve these codebases and do the following:
- convert each script file into a set of 80x80 images, with each character converted to its ordinal value, for use with machine learning
- generate a file hierarchy tree for graph analysis
- extract complete metadata, such as domain, authors, and description, for the software
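
The first two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used to build the dataset: the function names `text_to_images` and `build_tree` are hypothetical, and the choices to truncate long lines at 80 characters, pad short lines with spaces, and fill the last image with blank rows are assumptions that the description does not confirm.

```python
def text_to_images(text, size=80, pad_char=" "):
    """Split source text into size x size grids of character ordinals.

    Hypothetical helper; padding and truncation rules are assumptions.
    """
    pad = ord(pad_char)
    rows = []
    for line in text.splitlines():
        # Assumption: lines longer than `size` are truncated, not wrapped.
        codes = [ord(ch) for ch in line[:size]]
        # Assumption: shorter lines are right-padded with the pad character.
        codes += [pad] * (size - len(codes))
        rows.append(codes)
    # Assumption: the final partial image is filled with blank rows.
    while len(rows) % size != 0:
        rows.append([pad] * size)
    # Each image is a list of `size` rows, each row a list of `size` ints.
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def build_tree(paths):
    """Build a nested-dict file hierarchy from slash-separated paths.

    Hypothetical helper; files and directories are both dict nodes.
    """
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree


# Example input invented for illustration only.
script = "import os\nprint(os.getcwd())\n"
images = text_to_images(script)
tree = build_tree(["src/main.py", "src/util/io.py", "README.md"])
```

Here a two-line script yields a single 80x80 grid whose first cell holds the ordinal of `i`, and the three example paths yield a nested dictionary rooted at `src` and `README.md`. A nested dictionary is used for the tree because it converts easily to an edge list for graph libraries.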
Files
      
        Files
         (20.7 GB)
        
      
    
| Checksum | Size | Download |
|---|---|---|
| md5:bff9f8ca3632fa7372f0b9e440b85c5a | 20.7 GB | Download |