There is a newer version of this record available.

Dataset, Open Access

Open Source Repository and Dependency Metadata

Andrew Nesbitt; Benjamin Nickolls

JSON-LD Export

{
  "description": "<p><strong>What is in this release?</strong></p>\n\n<p>In this release you will find data about software distributed and/or crafted publicly on the Internet. You will find information about its development, its distribution and its relationship with other software included as a dependency. You will not find any information about the individuals who create and maintain these projects.</p>\n\n<p>Further information and documentation on this dataset can be found at</p>\n\n<p>For enquiries please contact</p>\n\n<p>This dataset contains seven CSV files:</p>\n\n<p><strong>Projects</strong></p>\n\n<p>A project is a piece of software available on any one of the 33 package managers supported by</p>\n\n<p><strong>Versions</strong></p>\n\n<p>A version is an immutable published version of a Project from a package manager. Not all package managers have a concept of publishing versions, often relying directly on tags/branches from a revision control tool.</p>\n\n<p><strong>Tags</strong></p>\n\n<p>A tag is equivalent to a tag in a revision control system. Tags are sometimes used instead of Versions where a package manager does not use the concept of versions. Tags are often semantic version numbers.</p>\n\n<p><strong>Dependencies</strong></p>\n\n<p>Dependencies describe the relationship between a project and the software it builds upon. Dependencies belong to a Version. Each Version can have different sets of dependencies. Dependencies point at a specific Version or range of versions of other projects.</p>\n\n<p><strong>Repositories</strong></p>\n\n<p>A repository represents a publicly accessible source code repository from either, or Repositories are distinct from Projects: they are not distributed via a package manager and are typically applications for end users rather than components to build upon.</p>\n\n<p><strong>Repository dependencies</strong></p>\n\n<p>A repository dependency is a dependency upon a Version from a package manager that has been specified in a manifest file, either as a manually added dependency committed by a user or as a generated dependency listed in a lockfile that has been automatically generated by a package manager and committed.</p>\n\n<p><strong>Projects with related Repository fields</strong></p>\n\n<p>This is an alternative projects export that denormalizes a project's related source code repository inline to reduce the need to join between two datasets.</p>\n\n<p><strong>Licence</strong></p>\n\n<p>This dataset is released under the Creative Commons Attribution-ShareAlike 4.0 International Licence.</p>\n\n<p>This licence provides the user with the freedom to use, adapt and redistribute this data. In return the user must publish any derivative work under a similarly open licence, attributing as a data source. The full text of the licence is included in the data.</p>\n\n<p><strong>Access, Attribution and Citation</strong></p>\n\n<p>The dataset is available to download from Zenodo at\u00a0</p>\n\n<p>Please attribute as a data source by including the words \u2018Includes data from\u2019 and reference the Digital Object Identifier: 10.5281/Zenodo.808273.</p>",
  "license": "",
  "creator": [
    {
      "affiliation": "",
      "@type": "Person",
      "name": "Andrew Nesbitt"
    },
    {
      "affiliation": "",
      "@type": "Person",
      "name": "Benjamin Nickolls"
    }
  ],
  "url": "",
  "datePublished": "2017-06-15",
  "version": "1.0.0",
  "keywords": [
    "open source",
    "package managers"
  ],
  "@context": "",
  "distribution": [
    {
      "contentUrl": "",
      "encodingFormat": "zip",
      "@type": "DataDownload"
    }
  ],
  "identifier": "",
  "@id": "",
  "@type": "Dataset",
  "name": " Open Source Repository and Dependency Metadata"
}
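
The description says the "Projects with related Repository fields" export inlines each project's repository to avoid a join between two files. The following is a minimal sketch of what that denormalization could look like. The column names (`project_id`, `repository_id`, `host`, `stars`) and the sample rows are hypothetical, since this record does not list the actual CSV headers.

```python
# Hypothetical rows; the real CSV column names are not given in this record.
projects = [
    {"project_id": 1, "name": "example-lib", "repository_id": 10},
    {"project_id": 2, "name": "orphan-lib", "repository_id": None},
]
repositories = [
    {"repository_id": 10, "host": "example-host", "stars": 42},
]


def denormalize(projects, repositories):
    """Left-join each project to its repository and inline the repository fields."""
    by_id = {repo["repository_id"]: repo for repo in repositories}
    rows = []
    for project in projects:
        # Projects without a known repository keep only their own fields.
        repo = by_id.get(project["repository_id"], {})
        row = dict(project)
        for key, value in repo.items():
            if key != "repository_id":
                # Prefix inlined fields so they cannot clash with project fields.
                row[f"repository_{key}"] = value
        rows.append(row)
    return rows


flat = denormalize(projects, repositories)
```

With the inlined form, a consumer can read one row per project without loading the repositories file, at the cost of repeating repository fields for every project that shares a repository.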
                   All versions   This version
Views              18,715         4,760
Downloads          43,283         1,558
Data volume        770.4 TB       9.2 TB
Unique views       14,840         4,189
Unique downloads   11,320         1,165

