Published July 4, 2021 | Version v1
Dataset Open

humanities_keywords Dataset


The WE1S humanities_keywords dataset contains word-frequency and other non-consumptive-use data about 474,930 unique documents (no duplicates or close variants) mentioning the word "humanities" and other keywords related to the humanities in English-language news sources. Other keywords include "liberal arts," "the arts," "literature," "history," and "philosophy." The documents came from 850 U.S. and 437 international news sources with their associated blogs (including student newspapers), published mostly during 1989-2019. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.")


WE1S makes word-frequency data available for "non-consumptive use" only. This dataset cannot be used to access, read, or reconstruct the original texts.

The data has been archived in JSONL format (one JSON document per line, delimited by line breaks).
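Because the archive is large, it may be more practical to stream a JSONL file one line at a time than to load it whole. The following is a minimal Python sketch of that approach, not an official WE1S tool. The function name `iter_jsonl` and the file name in the usage note are hypothetical, and the sketch assumes UTF-8 encoding and makes no assumption about which fields each document contains.

```python
import json
from typing import Any, Iterator


def iter_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON document per non-empty line of a JSONL file.

    Hypothetical helper: assumes the file is UTF-8 encoded and that each
    non-empty line holds exactly one complete JSON document.
    """
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                # Tolerate blank lines, e.g. a trailing newline at end of file.
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(
                    f"{path}: invalid JSON on line {line_number}"
                ) from err
```

For example, `for doc in iter_jsonl("humanities_keywords.jsonl"): print(sorted(doc))` would list the keys of each document (assuming each document is a JSON object), which is one way to discover the record structure before processing further. The file name here is a placeholder for whichever file is extracted from the download.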


Files (30.8 GB)
