Dataset Open Access

The Collaborative Organization of Knowledge: Data Set

Spinellis, Diomidis; Louridas, Panos


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/c7286458-a94e-4bc3-9374-031baa7cd162/full.out.bz2"
      }, 
      "checksum": "md5:d7c533b075894084627895aaecc80c37", 
      "bucket": "c7286458-a94e-4bc3-9374-031baa7cd162", 
      "key": "full.out.bz2", 
      "type": "bz2", 
      "size": 105598743
    }
  ], 
  "owners": [
    57930
  ], 
  "doi": "10.5281/zenodo.2526703", 
  "stats": {
    "version_unique_downloads": 13.0, 
    "unique_views": 243.0, 
    "views": 289.0, 
    "version_views": 289.0, 
    "unique_downloads": 13.0, 
    "version_unique_views": 243.0, 
    "volume": 1372783659.0, 
    "version_downloads": 13.0, 
    "downloads": 13.0, 
    "version_volume": 1372783659.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.2526703", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.2526702", 
    "bucket": "https://zenodo.org/api/files/c7286458-a94e-4bc3-9374-031baa7cd162", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.2526702.svg", 
    "html": "https://zenodo.org/record/2526703", 
    "latest_html": "https://zenodo.org/record/2526703", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.2526703.svg", 
    "latest": "https://zenodo.org/api/records/2526703"
  }, 
  "conceptdoi": "10.5281/zenodo.2526702", 
  "created": "2018-12-26T07:55:55.124278+00:00", 
  "updated": "2020-01-24T19:25:23.613342+00:00", 
  "conceptrecid": "2526702", 
  "revision": 9, 
  "id": 2526703, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.2526703", 
    "description": "<p>Wikipedia is an ongoing endeavor to create a free encyclopedia through an open computer-mediated collaborative effort. How does Wikipedia grow and maintain its coverage? This page contains supporing material relevant to a publication that examines this question.</p>\n\n<ul>\n\t<li>Diomidis Spinellis and Panagiotis Louridas. The collaborative organization of knowledge. Communications of the ACM, 51(8):68&ndash;73, August 2008. (<a href=\"http://dx.doi.org/10.1145/1378704.1378720\">doi:10.1145/1378704.1378720</a>)</li>\n</ul>\n\n<p>In the above paper, a longitudinal study of Wikipedia&#39;s evolution shows that although Wikipedia&#39;s scope is increasing, its coverage is not deteriorating. This can be explained by the fact that referring to an non-existing entry typically leads to the establishment of an article for it. Wikipedia&#39;s evolution also demonstrates the creation of a large real world scale-free graph through a combination of incremental growth and preferential attachment.</p>\n\n<p>Though this data set you can download the processed results. The file starts with a header giving various attributes of the processed data set.</p>\n\n<pre>% Number of bins: 72\n% Total revisions: 28247658\n% Maximum revisions: 28273 (George W. Bush)\n% Maximum reverts: 9218 (George W. Bush)\n% Number of moves: 81380\n% Total pages: 1898139\n% Revisions from IP addresses: 8518913\n% Total contributors: 230130\n% Maximum different contributors: 2539 (George W. Bush)\n% Redirected pages: 631567\n% Restricted pages: 2441\n% Maximum number of contained references: 17577 (List of all three letter acrony\nms)\n% Pages with at least one revert: 211704\n% Total number of reverts across all pages: 1147151\n% Total time between reverts: 54524346346\n% Moved pages: 80332\n</pre>\n\n<p>Next comes one line of data for each one of Wikipedia&#39;s entries. 
Here is an example.</p>\n\n<pre>A (musical note):1128386876:Mailer diablo:1130566991:MrD9:10:7:18:0:0:0:0:0:0:0:\n0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:\n0:0:0:0:0:0:0:0:0:0:0:1:1:1:2:2:2:2:2:2:2:2:2:2:2:2:E\n</pre>\n\n<p>Each line contains the following fields.</p>\n\n<ul>\n\t<li>Entry name</li>\n\t<li>Time of first definition (in seconds since Unix epoch)</li>\n\t<li>Name of the contributor who first defined the entry</li>\n\t<li>Time of first reference (in seconds since Unix epoch)</li>\n\t<li>Name of the contributor who first referenced the entry</li>\n\t<li>Number of references</li>\n\t<li>Number of contributors</li>\n\t<li>Number of revisions</li>\n\t<li>Number of reverts</li>\n\t<li>For each one of the time period bins (72 in this file) the number of references to the entry</li>\n\t<li>The letter &quot;E&quot;</li>\n</ul>\n\n<p>The fields are colon-separated. Colons in the input data are converted to an underscore.</p>\n\n<p>Finally, come lines summarizing the data set&#39;s characteristics for each time period. Here is an example.</p>\n\n<pre>2001-07-01 4851 0  27106   15129        13458   531\n</pre>\n\n<p>Each line contains the following fields.</p>\n\n<ul>\n\t<li>Start date of this period</li>\n\t<li>Number of entries</li>\n\t<li>Number of entries that are stubs</li>\n\t<li>Number of references</li>\n\t<li>Number of referenced articles</li>\n\t<li>Number of undefined entries</li>\n\t<li>Number of active contributors in this period</li>\n</ul>", 
    "language": "eng", 
    "title": "The Collaborative Organization of Knowledge: Data Set", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "2526702"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "2526703"
          }
        }
      ]
    }, 
    "communities": [
      {
        "id": "zenodo"
      }
    ], 
    "keywords": [
      "Replication package", 
      "Wikipedia", 
      "Evolution"
    ], 
    "publication_date": "2008-06-03", 
    "creators": [
      {
        "orcid": "0000-0003-4231-1897", 
        "affiliation": "Athens University of Economics and Business", 
        "name": "Spinellis, Diomidis"
      }, 
      {
        "orcid": "0000-0002-3971-4612", 
        "affiliation": "Athens University of Economics and Business", 
        "name": "Louridas, Panos"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.1145/1378704.1378720", 
        "relation": "isSupplementTo"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.2526733", 
        "relation": "isSupplementTo"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.2526702", 
        "relation": "isVersionOf"
      }
    ]
  }
}
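
The record layout given in the description above can be read with a few lines of Python. The following is a minimal sketch, not the authors' own tooling: the names `Entry`, `PeriodSummary`, `parse_entry`, and `parse_period` are hypothetical, and the code assumes that each entry occupies a single physical line in the file (the example in the description appears wrapped at 80 columns for display), that every numeric field is present, and that the bin count is the 72 stated in the file header.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

NUM_BINS = 72  # taken from the header line "% Number of bins: 72"


@dataclass
class Entry:
    """One per-entry record (hypothetical container, field order as documented)."""
    name: str
    first_defined: int       # seconds since Unix epoch
    defined_by: str
    first_referenced: int    # seconds since Unix epoch
    referenced_by: str
    references: int
    contributors: int
    revisions: int
    reverts: int
    bins: List[int]          # references to the entry per time period bin


@dataclass
class PeriodSummary:
    """One per-period summary record (hypothetical container)."""
    start: date
    entries: int
    stubs: int
    references: int
    referenced_articles: int
    undefined_entries: int
    active_contributors: int


def parse_entry(line: str, num_bins: int = NUM_BINS) -> Entry:
    """Split a colon-separated entry line into its documented fields."""
    fields = line.rstrip("\n").split(":")
    # 9 leading fields + one field per bin + the closing letter "E"
    if len(fields) != 9 + num_bins + 1 or fields[-1] != "E":
        raise ValueError("unexpected entry line layout")
    return Entry(
        name=fields[0],
        first_defined=int(fields[1]),
        defined_by=fields[2],
        first_referenced=int(fields[3]),
        referenced_by=fields[4],
        references=int(fields[5]),
        contributors=int(fields[6]),
        revisions=int(fields[7]),
        reverts=int(fields[8]),
        bins=[int(v) for v in fields[9:9 + num_bins]],
    )


def parse_period(line: str) -> PeriodSummary:
    """Split a whitespace-separated period summary line into its fields."""
    parts = line.split()
    if len(parts) != 7:
        raise ValueError("unexpected period summary layout")
    counts = [int(p) for p in parts[1:]]
    return PeriodSummary(date.fromisoformat(parts[0]), *counts)


if __name__ == "__main__":
    # The period summary example quoted in the description.
    print(parse_period("2001-07-01 4851 0  27106   15129        13458   531"))
```

Because colons inside entry and contributor names are converted to underscores, a plain `split(":")` is sufficient and no quoting rules are needed. To read the published `full.out.bz2` directly, the standard-library call `bz2.open(path, "rt", encoding="utf-8")` could supply the lines; the UTF-8 encoding is an assumption, and header lines starting with `%` would need to be skipped before calling either parser.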