Dataset Open Access

Wiki-based Communities of Interest: Demographics and Outliers

Hiba Arnaout; Simon Razniewski; Jeff Z. Pan


JSON-LD (schema.org) Export

{
  "inLanguage": {
    "alternateName": "eng", 
    "@type": "Language", 
    "name": "English"
  }, 
  "description": "<p>These datasets contain&nbsp;statements about demographics and outliers&nbsp;of Wiki-based Communities of Interest.&nbsp;</p>\n\n<p><strong>Group-centric dataset (sample):</strong></p>\n\n<pre><code class=\"language-json\">{\n\t\"title\": \"winners of Priestley Medal\", \n\t\"recorded_members\": 83, \n\t\"topics\": [\"STEM.Chemistry\"], \n\t\"demographics\": [\n            \"occupation-chemist\",\n            \"gender-male\", \n            \"citizen-U.S.\"\n\t], \n\t\"outliers\": [\n\t\t{\n\t\t\t\"reason\": \"NOT(chemist) unlike 82 recorded members\", \n\t\t\t\"members\": [\n            \"Francis Garvan (lawyer, art collector)\"\n            ]\n\t\t}, \n\t\t{\n\t\t\t\"reason\": \"NOT(male) unlike 80 recorded members\", \n\t\t\t\"members\": [\n            \"Mary L. Good (female)\",\n            \"Darleane Hoffman (female)\", \n            \"Jacqueline Barton (female)\"\n            ]\n\t\t}\n\t]\n}</code></pre>\n\n<p><strong>Subject-centric dataset (sample):</strong></p>\n\n<pre><code class=\"language-json\">{\n\t\"subject\": \"Serena Williams\", \n\t\"statements\": [\n\t\t{\n\t\t\t\"statement\": \"NOT(sport-basketball) but (tennis) unlike 4 recorded winners of Best Female Athlete ESPY Award.\", \n\t\t\t\"score\": 0.36\n\t\t},\n  \t{\n\t\t\t\"statement\": \"NOT(occupation-politician) but (tennis player, businessperson, autobiographer) unlike 20 recorded  winners of Michigan Women's Hall of Fame.\",\n\t\t\t\"score\": 0.17\n\t\t}\n\t]\n}</code></pre>\n\n<p><strong>This data can also be browsed at:&nbsp;<a href=\"https://wikiknowledge.onrender.com/demographics/\">https://wikiknowledge.onrender.com/demographics/</a></strong></p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "Max Planck Institute for Informatics", 
      "@type": "Person", 
      "name": "Hiba Arnaout"
    }, 
    {
      "affiliation": "Max Planck Institute for Informatics", 
      "@type": "Person", 
      "name": "Simon Razniewski"
    }, 
    {
      "affiliation": "The University of Edinburgh", 
      "@type": "Person", 
      "name": "Jeff Z. Pan"
    }
  ], 
  "url": "https://zenodo.org/record/7537200", 
  "datePublished": "2022-12-07", 
  "keywords": [
    "wikipedia", 
    "wikimedia", 
    "wikidata", 
    "demography", 
    "trivia"
  ], 
  "@context": "https://schema.org/", 
  "distribution": [
    {
      "contentUrl": "https://zenodo.org/api/files/4114f475-9bed-403f-93f1-a03bd3fd7782/group_centric.jsonl", 
      "encodingFormat": "jsonl", 
      "@type": "DataDownload"
    }, 
    {
      "contentUrl": "https://zenodo.org/api/files/4114f475-9bed-403f-93f1-a03bd3fd7782/subject_centric.jsonl", 
      "encodingFormat": "jsonl", 
      "@type": "DataDownload"
    }
  ], 
  "identifier": "https://doi.org/10.5281/zenodo.7537200", 
  "@id": "https://doi.org/10.5281/zenodo.7537200", 
  "@type": "Dataset", 
  "name": "Wiki-based Communities of Interest: Demographics and Outliers"
}
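
The two distribution files are listed as JSON Lines, so each line is presumably one record shaped like the samples in the description. The following is a minimal sketch of reading such records, assuming every line carries the same fields as those samples. The helper names (`parse_jsonl`, `outlier_counts`, `top_statement`) are hypothetical and not part of the dataset, and the records are inlined from the description rather than downloaded.

```python
import json

# Sample records copied from the dataset description. The real files are
# ASSUMED to hold one such JSON object per line (JSON Lines).
GROUP_LINE = json.dumps({
    "title": "winners of Priestley Medal",
    "recorded_members": 83,
    "topics": ["STEM.Chemistry"],
    "demographics": ["occupation-chemist", "gender-male", "citizen-U.S."],
    "outliers": [
        {"reason": "NOT(chemist) unlike 82 recorded members",
         "members": ["Francis Garvan (lawyer, art collector)"]},
        {"reason": "NOT(male) unlike 80 recorded members",
         "members": ["Mary L. Good (female)",
                     "Darleane Hoffman (female)",
                     "Jacqueline Barton (female)"]},
    ],
})
SUBJECT_LINE = json.dumps({
    "subject": "Serena Williams",
    "statements": [
        {"statement": "NOT(sport-basketball) but (tennis) unlike 4 recorded "
                      "winners of Best Female Athlete ESPY Award.",
         "score": 0.36},
        {"statement": "NOT(occupation-politician) but (tennis player, "
                      "businessperson, autobiographer) unlike 20 recorded  "
                      "winners of Michigan Women's Hall of Fame.",
         "score": 0.17},
    ],
})


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def outlier_counts(group):
    """Map each outlier reason of a group record to its number of members."""
    return {o["reason"]: len(o["members"]) for o in group["outliers"]}


def top_statement(subject):
    """Return the highest-scoring statement of a subject record."""
    return max(subject["statements"], key=lambda s: s["score"])


groups = parse_jsonl([GROUP_LINE, ""])
subjects = parse_jsonl([SUBJECT_LINE])

for group in groups:
    print(group["title"], outlier_counts(group))
for subject in subjects:
    print(subject["subject"], top_statement(subject)["score"])
```

To run this against the published files, the same `parse_jsonl` helper could be given an open file handle for a locally downloaded `group_centric.jsonl` or `subject_centric.jsonl`, since iterating a text file yields one line at a time.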
                  All versions   This version
Views             109            52
Downloads         15             8
Data volume       1.2 GB         834.7 MB
Unique views      79             46
Unique downloads  11             5
