Published August 16, 2025 | Version v1
Dataset Open

Snapshot — WikiCite subclass hierarchy for the Wikidata scholarly graph split

Authors/Creators

  • Wikimedia Brasil

Description

This dataset is a modified version of a snapshot taken on August 15, 2025, of the WikiCite subclass hierarchy Google spreadsheet, which contained mostly machine-generated content. Column headers were rewritten for clarity.

The table contains the results of queries run on the Wikidata Query Service at an unknown point before the Wikidata graph split, which occurred on May 9, 2025.

It lists the P31 ("instance of") values that mark items as "scholarly" for the purposes of the split. An item that has any of these entities as a P31 value has all of its triples directed to the scholarly graph, which is queryable at the alternative endpoint (https://query-scholarly.wikidata.org).
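
The routing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the split: the function name route_item, the main-graph endpoint URL, and the example class list are hypothetical, and the real list of classes would come from the CSV file in this dataset.

```python
from typing import Iterable, Set

# Endpoint named in the dataset description.
SCHOLARLY_ENDPOINT = "https://query-scholarly.wikidata.org"
# Assumption: endpoint for the main graph; it is not stated in the description.
MAIN_ENDPOINT = "https://query.wikidata.org"


def route_item(p31_values: Iterable[str], scholarly_classes: Set[str]) -> str:
    """Return the endpoint whose graph would hold all of an item's triples."""
    # A single matching P31 value is enough to send the whole item
    # to the scholarly graph.
    if any(value in scholarly_classes for value in p31_values):
        return SCHOLARLY_ENDPOINT
    return MAIN_ENDPOINT


# Hypothetical one-entry class list, used here only as a placeholder.
example_classes = {"Q13442814"}  # Q13442814 is "scholarly article" on Wikidata

print(route_item(["Q13442814"], example_classes))
print(route_item(["Q5"], example_classes))  # Q5 is "human"
```

The sketch treats the rule as a plain set-membership test on direct P31 values, which is how the description states it; it does not follow subclass relations.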

The table also shows other entities that were considered for, but not included in, the final rule for the split.

The presence of sitelinks (links from Wikidata items to pages on other Wikimedia projects, such as English Wikipedia) was also considered in the analysis, as the team tried to minimize the number of items with sitelinks in the scholarly graph.


Files

WikiCite subclass hierarchy for the scholarly graph split.csv

Files (535.8 kB)