A Dataset of American Poetry by Poets from Historically Underrepresented Groups in the HathiTrust Digital Library
Authors/Creators
Description
This dataset provides poem-level page boundaries for American poetry in selected collections from the HathiTrust Digital Library. It covers 9,321 poems from 113 collections by American poets from historically underrepresented groups, including African Americans, Asian Americans, Pacific Islanders, Latin Americans, and Native Americans. Each CSV file represents one poetry collection, and each poem is identified by its start and end page numbers in HathiTrust. The dataset can support computational analyses such as word frequency, topic modeling, word embeddings, and comparative analysis across poems from diverse communities.
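As a rough illustration of how the per-collection CSV files might be read, the sketch below loads every CSV in the archive and derives a page span for each poem. The column names `start_page` and `end_page` are assumptions made for this example, not names confirmed by the dataset; check the actual CSV headers and pass the real names in. The helper names `load_collections` and `poem_page_spans` are likewise hypothetical.

```python
import csv
import io
import zipfile


def load_collections(archive):
    """Return {csv_name: [row_dict, ...]} for every CSV file in the archive.

    `archive` may be a path to the downloaded zip or a binary file object.
    """
    collections = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a collection CSV
            with zf.open(name) as raw:
                # utf-8-sig is an assumption; it also tolerates a leading BOM
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                collections[name] = list(csv.DictReader(text))
    return collections


def poem_page_spans(rows, start_col="start_page", end_col="end_page"):
    """Yield (start, end, page_count) per poem.

    The default column names are hypothetical placeholders.
    """
    for row in rows:
        start, end = int(row[start_col]), int(row[end_col])
        yield start, end, end - start + 1
```

The page spans only locate each poem; retrieving the poem text itself would still require access to the corresponding HathiTrust volume pages.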
Files
| Name | Size | Checksum |
|---|---|---|
| htrc_sections_updated.zip | 128.0 kB | md5:d27d49a5c6c8c2c5255dc055cc2fdba6 |
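The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch using only the standard library; the helper name `md5_of_file` and the local file path in the usage note are assumptions for illustration.

```python
import hashlib

# Checksum as listed for htrc_sections_updated.zip
EXPECTED_MD5 = "d27d49a5c6c8c2c5255dc055cc2fdba6"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Assuming the archive was saved in the working directory under its original name, `md5_of_file("htrc_sections_updated.zip") == EXPECTED_MD5` should evaluate to `True` for an uncorrupted download.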