Video/Audio Open Access

Crowdsourcing historical text and data with the Chinese Text Project

Sturgeon, Donald

Paper presented on Friday 11 June 2021 at the Digital Medievalist Global Symposium, "The past, present, and future of Digital Medieval Studies", for the Asia & Oceania Panel, in the session "Engaging in Chinese Literature".

The Chinese Text Project is a crowdsourced digital library of premodern Chinese writing, containing over 35 million pages of scanned primary source material and billions of words of transcribed text. In this talk I describe the implementation of a crowdsourced semantic annotation system for these texts, as well as the joint construction of a crowdsourced knowledge graph recording data covering close to 3,000 years of Chinese history.

Files (1.0 GB)
Name: 5-Donald Sturgeon.mp4
Size: 1.0 GB