comparison_sciences Dataset
Creators
Description
The comparison_sciences dataset contains word-frequency and other non-consumptive-use data about 553,699 unique English-language news documents (no duplicate or close-variant documents) that contain the words "science" or "sciences." The documents came from U.S. mainstream and student news sources published during 1977-2019 (though mostly from 1985-2019). WE1S researchers use this data to understand how public discourse about the humanities compares to public discourse about science.
We gathered this data using keyword searches for "science," which found articles containing either or both of the words "science" and "sciences." We took data from the 10 highest-circulation newspapers in the U.S. and from University Wire sources (student newspapers). Documents in this dataset may also contain the word "humanities," just as documents in the humanities_keyword dataset may contain the words "science" or "sciences."
(See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.")
Notes
Files (33.7 GB)
Name | Size | Download all |
---|---|---|
md5:287ee7a2576caabecc3a151aab4ff05e | 33.7 GB | Download |
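
Because the download is large (33.7 GB), it may be worth checking the file against the MD5 value listed in the table above before unpacking it. The following is a minimal sketch, not part of the WE1S tooling: the function names `md5_of_file` and `matches_expected` are hypothetical, and the path you pass in is whatever name the downloaded file has on your machine (the record does not state it here).

```python
import hashlib

# MD5 value copied from the Files table of this record.
EXPECTED_MD5 = "287ee7a2576caabecc3a151aab4ff05e"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that a multi-gigabyte download
    does not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected`."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("path/to/downloaded_file")` with the real location of the download returns `True` only if the file arrived intact. A `False` result usually means the download was truncated or corrupted and should be repeated.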