There is a newer version of this record available.

Dataset Open Access

Wiki-based Knowledge about Demographics and Outstanding Members

Hiba Arnaout; Simon Razniewski; Jeff Z. Pan

These datasets contain statements about the demographics and outliers of Wiki-based Communities of Interest.

Group-centric dataset (sample):

{
	"title": "winners of Priestley Medal",
	"recorded_members": 83,
	"topics": ["STEM.Chemistry"],
	"demographics": [
		"occupation-chemist",
		"gender-male",
		"citizen-U.S."
	],
	"outliers": [
		{
			"reason": "NOT(chemist) unlike 82 recorded members",
			"members": [
				"Francis Garvan (lawyer, art collector)"
			]
		},
		{
			"reason": "NOT(male) unlike 80 recorded members",
			"members": [
				"Mary L. Good (female)",
				"Darleane Hoffman (female)",
				"Jacqueline Barton (female)"
			]
		}
	]
}
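
As a rough illustration of how a group-centric record could be consumed, the sketch below flattens the "outliers" list into (member, reason) pairs. It assumes that group_centric.jsonl holds one JSON object per line with the fields shown in the sample above; the helper names iter_records and outlier_members are hypothetical and not part of the dataset.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one parsed JSON object per non-empty line (JSON Lines layout, assumed)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def outlier_members(record: Dict) -> List[Tuple[str, str]]:
    """Flatten a group record into (member, reason) pairs."""
    return [
        (member, outlier["reason"])
        for outlier in record.get("outliers", [])
        for member in outlier.get("members", [])
    ]


# The sample record above, serialized as a single JSONL line.
sample_line = json.dumps({
    "title": "winners of Priestley Medal",
    "recorded_members": 83,
    "topics": ["STEM.Chemistry"],
    "demographics": ["occupation-chemist", "gender-male", "citizen-U.S."],
    "outliers": [
        {
            "reason": "NOT(chemist) unlike 82 recorded members",
            "members": ["Francis Garvan (lawyer, art collector)"],
        },
        {
            "reason": "NOT(male) unlike 80 recorded members",
            "members": [
                "Mary L. Good (female)",
                "Darleane Hoffman (female)",
                "Jacqueline Barton (female)",
            ],
        },
    ],
})

for record in iter_records([sample_line]):
    for member, reason in outlier_members(record):
        print(f"{record['title']}: {member} <- {reason}")
```

To process the downloaded file instead of the inline sample, an open file handle can be passed to iter_records, since it only needs an iterable of text lines.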

Subject-centric dataset (sample):

{
	"subject": "Serena Williams",
	"statements": [
		{
			"statement": "NOT(sport-basketball) but (tennis) unlike 4 recorded winners of Best Female Athlete ESPY Award.",
			"score": 0.36
		},
		{
			"statement": "NOT(occupation-politician) but (tennis player, businessperson, autobiographer) unlike 20 recorded  winners of Michigan Women's Hall of Fame.",
			"score": 0.17
		}
	]
}
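
A subject-centric record could be filtered and ordered by its "score" field, as in the sketch below. This record does not say how the score is computed or what range it takes, so treating a higher score as a more noteworthy statement is an assumption; the helper name top_statements and the min_score parameter are hypothetical.

```python
from typing import Dict, List


def top_statements(record: Dict, min_score: float = 0.0) -> List[Dict]:
    """Return statements scoring at least min_score, highest score first.

    Assumes every statement carries a numeric "score", as in the sample.
    """
    kept = [s for s in record.get("statements", []) if s["score"] >= min_score]
    return sorted(kept, key=lambda s: s["score"], reverse=True)


# The sample record above, with the statement texts shortened for brevity.
sample = {
    "subject": "Serena Williams",
    "statements": [
        {"statement": "NOT(occupation-politician) but (tennis player, ...)", "score": 0.17},
        {"statement": "NOT(sport-basketball) but (tennis) ...", "score": 0.36},
    ],
}

for item in top_statements(sample, min_score=0.2):
    print(f"{sample['subject']} [{item['score']}]: {item['statement']}")
```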

The data can also be browsed at: https://wikiknowledge.onrender.com/demographics/

Files (116.2 MB)

Name                    Size      Checksum
group_centric.jsonl     31.0 MB   md5:12788b50f36bbfd3d81c26179aa2475b
subject_centric.jsonl   85.3 MB   md5:fbc0bc6fe19287d08ab5786f943c9fad
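
After downloading, the files can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard hashlib module; it assumes the files sit in the current working directory under their listed names, and the helper name md5_of_file is hypothetical.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "group_centric.jsonl": "12788b50f36bbfd3d81c26179aa2475b",
    "subject_centric.jsonl": "fbc0bc6fe19287d08ab5786f943c9fad",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


for name, expected in EXPECTED_MD5.items():
    path = Path(name)
    if not path.exists():
        # Assumed location is the current directory; adjust as needed.
        print(f"{name}: not found, skipping")
        continue
    status = "OK" if md5_of_file(path) == expected else "MISMATCH"
    print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat, which matters little at these sizes but costs nothing.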