Published December 7, 2022 | Version v9
Dataset Open

Wiki-based Communities of Interest: Demographics and Outliers

  • 1. Max Planck Institute for Informatics
  • 2. The University of Edinburgh


These datasets contain statements about the demographics and outliers of Wiki-based Communities of Interest.

Group-centric dataset (sample):

	"title": "winners of Priestley Medal", 
	"recorded_members": 83, 
	"topics": ["STEM.Chemistry"], 
	"demographics": [
	"outliers": [
			"reason": "NOT(chemist) unlike 82 recorded members", 
			"members": [
            "Francis Garvan (lawyer, art collector)"
			"reason": "NOT(male) unlike 80 recorded members", 
			"members": [
            "Mary L. Good (female)",
            "Darleane Hoffman (female)", 
            "Jacqueline Barton (female)"
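
A minimal sketch of how a group-centric record might be read, assuming each entry in "outliers" is an object with a "reason" string and a "members" list. The nesting is inferred from the sample above, since the enclosing braces are not shown there. The `record` literal, the `REASON` pattern and the `parse_outliers` helper are hypothetical names introduced for illustration, and "demographics" is left out because its contents do not appear in the sample.

```python
import re

# Hypothetical record rebuilt from the sample; the object nesting inside
# "outliers" is an assumption, not something the sample shows explicitly.
record = {
    "title": "winners of Priestley Medal",
    "recorded_members": 83,
    "topics": ["STEM.Chemistry"],
    "outliers": [
        {
            "reason": "NOT(chemist) unlike 82 recorded members",
            "members": ["Francis Garvan (lawyer, art collector)"],
        },
        {
            "reason": "NOT(male) unlike 80 recorded members",
            "members": [
                "Mary L. Good (female)",
                "Darleane Hoffman (female)",
                "Jacqueline Barton (female)",
            ],
        },
    ],
}

# Assumed shape of a reason string: NOT(<value>) unlike <n> recorded members
REASON = re.compile(r"NOT\((?P<value>.+?)\) unlike (?P<count>\d+) recorded members")


def parse_outliers(rec):
    """Return (majority value, majority count, number of outliers) per reason."""
    rows = []
    for outlier in rec.get("outliers", []):
        match = REASON.match(outlier["reason"])
        if match is None:
            continue  # skip reasons that do not follow the assumed pattern
        rows.append(
            (match.group("value"), int(match.group("count")), len(outlier["members"]))
        )
    return rows


for value, majority, n_outliers in parse_outliers(record):
    print(f"{record['title']}: {majority} {value}, {n_outliers} outlier(s)")
```

In this sample the majority count and the number of listed outliers add up to "recorded_members" (82 + 1 and 80 + 3 both give 83); whether that holds for every group in the dataset is not stated here.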

Subject-centric dataset (sample):

	"subject": "Serena Williams", 
	"statements": [
			"statement": "NOT(sport-basketball) but (tennis) unlike 4 recorded winners of Best Female Athlete ESPY Award.", 
			"score": 0.36
			"statement": "NOT(occupation-politician) but (tennis player, businessperson, autobiographer) unlike 20 recorded  winners of Michigan Women's Hall of Fame.",
			"score": 0.17
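
A short sketch of filtering and ranking the statements of a subject-centric record by "score". It assumes each entry in "statements" is an object with "statement" and "score" fields, and that a higher score marks a more noteworthy statement; neither point is confirmed by the sample. The `subject_record` literal and the `top_statements` helper are hypothetical.

```python
# Hypothetical record rebuilt from the sample; the object nesting inside
# "statements" is an assumption.
subject_record = {
    "subject": "Serena Williams",
    "statements": [
        {
            "statement": "NOT(sport-basketball) but (tennis) unlike 4 recorded "
                         "winners of Best Female Athlete ESPY Award.",
            "score": 0.36,
        },
        {
            "statement": "NOT(occupation-politician) but (tennis player, "
                         "businessperson, autobiographer) unlike 20 recorded "
                         "winners of Michigan Women's Hall of Fame.",
            "score": 0.17,
        },
    ],
}


def top_statements(rec, min_score=0.0):
    """Keep statements scoring at least min_score, highest score first."""
    kept = [s for s in rec.get("statements", []) if s["score"] >= min_score]
    return sorted(kept, key=lambda s: s["score"], reverse=True)


for entry in top_statements(subject_record, min_score=0.2):
    print(f"{entry['score']:.2f}  {entry['statement']}")
```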

This data can also be browsed at:


Files (235.8 MB)

Name Size
63.7 MB
172.0 MB