There is a newer version of this record available.

Dataset Open Access Open Source Repository and Dependency Metadata

Andrew Nesbitt; Benjamin Nickolls

JSON Export

  "files": [
      "links": {
        "self": ""
      "checksum": "md5:053fcb882d0ee038632a819a7829e1b8", 
      "bucket": "5768c386-c8c8-44a7-872a-1ee5adfb8984", 
      "key": "", 
      "type": "gz", 
      "size": 7572176930
  "owners": [
  "doi": "10.5281/zenodo.1068916", 
  "stats": {
    "version_unique_downloads": 6742.0, 
    "unique_views": 3937.0, 
    "views": 4194.0, 
    "version_views": 11541.0, 
    "unique_downloads": 724.0, 
    "version_unique_views": 9322.0, 
    "volume": 6459066921290.0, 
    "version_downloads": 30824.0, 
    "downloads": 853.0, 
    "version_volume": 509216495072882.0
  "links": {
    "doi": "", 
    "conceptdoi": "", 
    "bucket": "", 
    "conceptbadge": "", 
    "html": "", 
    "latest_html": "", 
    "badge": "", 
    "latest": ""
  "conceptdoi": "10.5281/zenodo.808272", 
  "created": "2017-11-30T14:19:35.910630+00:00", 
  "updated": "2020-02-13T21:10:15.276415+00:00", 
  "conceptrecid": "808272", 
  "revision": 11, 
  "id": 1068916, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.1068916", 
    "description": "<p><strong>What is in this release?</strong></p>\n\n<p>In this release you will find data about software distributed and/or crafted publicly on the Internet. You will find information about its development, its distribution and its relationship with other software included as a dependency. You will not find any information about the individuals who create and maintain these projects.</p>\n\n<p>Further information and documentation on this data set can be found at</p>\n\n<p>For enquiries please contact</p>\n\n<p>This dataset contains seven CSV files:</p>\n\n<p><strong>Projects</strong></p>\n\n<p>A project is a piece of software available on any one of the 34 package managers supported by</p>\n\n<p><strong>Versions</strong></p>\n\n<p>A version is an immutable published version of a Project from a package manager. Not all package managers have a concept of publishing versions, often relying directly on tags/branches from a revision control tool.</p>\n\n<p><strong>Tags</strong></p>\n\n<p>A tag is equivalent to a tag in a revision control system. Tags are sometimes used instead of Versions where a package manager does not use the concept of versions. Tags are often semantic version numbers.</p>\n\n<p><strong>Dependencies</strong></p>\n\n<p>Dependencies describe the relationship between a project and the software it builds upon. Dependencies belong to a Version. Each Version can have a different set of dependencies. Dependencies point at a specific Version or range of versions of other projects.</p>\n\n<p><strong>Repositories</strong></p>\n\n<p>A repository represents a publicly accessible source code repository from either, or Repositories are distinct from Projects: they are not distributed via a package manager and are typically an application for end users rather than a component to build upon.</p>\n\n<p><strong>Repository dependencies</strong></p>\n\n<p>A repository dependency is a dependency upon a Version from a package manager that has been specified in a manifest file, either as a manually added dependency committed by a user or as a generated dependency listed in a lockfile that has been automatically generated by a package manager and committed.</p>\n\n<p><strong>Projects with related Repository fields</strong></p>\n\n<p>This is an alternative projects export that denormalizes a project's related source code repository inline to reduce the need to join between two data sets.</p>\n\n<p><strong>Licence</strong></p>\n\n<p>This dataset is released under the Creative Commons Attribution-ShareAlike 4.0 International Licence.</p>\n\n<p>This licence provides the user with the freedom to use, adapt and redistribute this data. In return the user must publish any derivative work under a similarly open licence, attributing as a data source. The full text of the licence is included in the data.</p>\n\n<p><strong>Access, Attribution and Citation</strong></p>\n\n<p>The dataset is available to download from Zenodo at&nbsp;</p>\n\n<p>Please attribute as a data source by including the words &lsquo;Includes data from; and reference the Digital Object Identifier: 10.5281/Zenodo.1068916.</p>", 
    "language": "eng", 
    "title": " Open Source Repository and Dependency Metadata", 
    "license": {
      "id": "CC-BY-SA-4.0"
    "relations": {
      "version": [
          "count": 6, 
          "index": 2, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "808272"
          "is_last": false, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "3626071"
    "version": "1.1.0", 
    "keywords": [
      "open source", 
      "package managers"
    "publication_date": "2017-11-29", 
    "creators": [
        "affiliation": "", 
        "name": "Andrew Nesbitt"
        "affiliation": "", 
        "name": "Benjamin Nickolls"
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    "related_identifiers": [
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.808272", 
        "relation": "isVersionOf"
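
The record above lists an MD5 checksum and a byte size for the single gzipped file. A minimal sketch of how a downloaded copy could be checked against those two values follows. The function names and the local path in the usage note are hypothetical (the record's "key" field is empty here, so the real file name is not known); only the checksum and size constants come from the record.

```python
import hashlib
import os

# Values copied from the "files" entry of the record above.
EXPECTED_MD5 = "053fcb882d0ee038632a819a7829e1b8"
EXPECTED_SIZE = 7572176930  # bytes


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat, which matters for a ~7.5 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Hypothetical helper: True if the file matches both size and MD5."""
    # The size check is cheap, so do it first and skip hashing on a mismatch.
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5
```

Usage would look like `verify_download("/path/to/downloaded-archive.gz")`, where the path is a placeholder for wherever the file was saved. MD5 is used only because it is the checksum the record publishes; it guards against corrupted transfers, not deliberate tampering.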
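Views: 11,541 (all versions), 4,194 (this version)
Downloads: 30,824 (all versions), 853 (this version)
Data volume: 509.2 TB (all versions), 6.5 TB (this version)
Unique views: 9,322 (all versions), 3,937 (this version)
Unique downloads: 6,742 (all versions), 724 (this version)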

