Bibliographic metadata on Peru's Truth and Reconciliation Commission: A curated dataset (2003-2024)
Authors/Creators
- Instituto de Analítica Social e Inteligencia Estratégica Pulso PUCP, Pontificia Universidad Católica del Perú
Description
This dataset contains bibliographic metadata for 300 scholarly works on Peru's Truth and Reconciliation Commission (Comisión de la Verdad y Reconciliación, CVR), compiled from OpenAlex and citation searching, and manually curated.
Data source and retrieval
Records were originally retrieved from OpenAlex (https://openalex.org), an open catalogue of scholarly works.
The following query was run against the OpenAlex API in January 2025 using the default.search filter, which at the time searched across titles, abstracts, and full text of works:
- Search terms: ("truth commission" OR "truth and reconciliation commission" OR "comision de la verdad") AND "peru"
- Date filter: publication_year: 2000-2024
Note: the default.search filter used for this query has since been deprecated following a major OpenAlex upgrade. The documentation for this filter as it existed at the time of retrieval is archived at:
https://web.archive.org/web/20250729194158/https://docs.openalex.org/api-entities/works/filter-works#default.search
Although the query covered 2000-2024, no records from 2000-2002 were retained after curation. The final dataset spans 2003-2024.
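For readers who want to see how the search terms and date filter above combine into a single request, the following is a minimal sketch that only builds the request URL and sends nothing. The endpoint https://api.openalex.org/works and the way the two filters are joined with a comma are assumptions about how the query was issued; the function name build_works_url is hypothetical. Because default.search has since been deprecated, this URL is not expected to reproduce the original results today.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for the OpenAlex works API.
OPENALEX_WORKS_ENDPOINT = "https://api.openalex.org/works"

# Search terms exactly as listed in this record.
SEARCH_TERMS = (
    '("truth commission" OR "truth and reconciliation commission" '
    'OR "comision de la verdad") AND "peru"'
)


def build_works_url(search_terms, first_year, last_year):
    """Build (but do not send) a works request with a search and a year filter.

    Hypothetical helper: the filter layout is an assumption, not the
    curators' actual script.
    """
    filters = [
        f"default.search:{search_terms}",
        f"publication_year:{first_year}-{last_year}",
    ]
    # Multiple filters are joined with commas inside one "filter" parameter.
    query = urlencode({"filter": ",".join(filters)})
    return f"{OPENALEX_WORKS_ENDPOINT}?{query}"


url = build_works_url(SEARCH_TERMS, 2000, 2024)

# Decode the query string again to confirm the filter survives URL encoding.
decoded_filter = parse_qs(urlsplit(url).query)["filter"][0]
print(decoded_filter)
```

Keeping the URL construction separate from any network call makes the query easy to inspect and archive alongside the data.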
Curation process
The initial OpenAlex results were manually curated as follows:
1. Records not relevant to the study scope were removed.
2. Additional records identified through snowballing (i.e. screening the references of included works) were added.
3. Only fields relevant to the study were retained from the original OpenAlex export.
4. Titles and abstracts of non-English works were translated into English during curation.
5. Subject terms were manually assigned from the UNBIS Thesaurus (https://metadata.un.org/thesaurus/).
Files
- peru_trc_bibliography_2003_2024.csv
Bibliographic metadata for 300 curated records. Encoded in UTF-8. Fields are comma-separated. 300 rows (excluding header). 13 columns.
- data_dictionary.csv
Description of all variables in peru_trc_bibliography_2003_2024.csv, including data type, missing value information, and allowed values.
Data format notes
- peru_trc_bibliography_2003_2024.csv is UTF-8 encoded and should be read with a standards-compliant CSV parser (e.g. read.csv() in R, pandas.read_csv() in Python).
- Abstract fields (abstract_eng, abstract_orig) may contain commas and other punctuation.
- Multiple authors are separated by a semicolon within the author field (e.g. "Last, First; Last, First").
- Subject terms (unbis_concept through unbis_concept_4) are stored in wide format, one column per term level. Records have between one and four terms assigned.
- DOI values do not include the resolver prefix. To resolve a DOI, prepend https://doi.org/ (e.g. https://doi.org/10.1353/tj.2004.0083).
- language_orig uses ISO 639-3 three-letter language codes (eng, spa, fra, por, deu).
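The format notes above can be sketched in code as follows, using Python's standard csv module so that quoted abstracts containing commas are parsed correctly. Several details are assumptions: the author column is taken to be named author, missing values are taken to be empty cells, and the helper names normalise_record and read_records are hypothetical. The two sample rows are invented for illustration and are not records from the dataset; only the DOI is the example given in the notes.

```python
import csv
import io

# Wide-format subject term columns, as named in this record.
UNBIS_COLUMNS = [
    "unbis_concept",
    "unbis_concept_2",
    "unbis_concept_3",
    "unbis_concept_4",
]


def normalise_record(row):
    """Return a copy of one CSV row with list-valued and derived fields added."""
    record = dict(row)
    # Authors are separated by semicolons; commas separate last and first names.
    # ASSUMPTION: the column is called "author".
    authors = (row.get("author") or "").split(";")
    record["authors"] = [a.strip() for a in authors if a.strip()]
    # Collapse the wide subject columns into one list, skipping empty cells.
    record["unbis_terms"] = [
        (row.get(col) or "").strip()
        for col in UNBIS_COLUMNS
        if (row.get(col) or "").strip()
    ]
    # DOIs are stored without the resolver prefix; empty cells become None.
    doi = (row.get("doi") or "").strip()
    record["doi_url"] = f"https://doi.org/{doi}" if doi else None
    return record


def read_records(handle):
    """Parse an open CSV file handle into normalised records."""
    return [normalise_record(row) for row in csv.DictReader(handle)]


# Invented sample rows (NOT from the dataset) so the sketch runs on its own.
SAMPLE = (
    "author,abstract_eng,unbis_concept,unbis_concept_2,"
    "unbis_concept_3,unbis_concept_4,doi\n"
    '"Last, First; Other, Name","An abstract, with commas.",'
    "TERM A,TERM B,,,10.1353/tj.2004.0083\n"
    '"Solo, Author",Short abstract.,TERM C,,,,\n'
)

records = read_records(io.StringIO(SAMPLE))
print(records[0]["authors"], records[0]["doi_url"])
```

To apply the same steps to the real file, one would open it with open("peru_trc_bibliography_2003_2024.csv", newline="", encoding="utf-8") and pass the handle to read_records.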
Missing data
- title_orig: Null for works originally published in English (235 records).
- abstract_eng: Null for 2 records where the abstract was not available at source.
- abstract_orig: Null for works originally published in English.
- unbis_concept_2: Null for 1 record with only one term assigned.
- unbis_concept_3: Null for 19 records with fewer than three terms.
- unbis_concept_4: Null for 259 records with fewer than four terms.
- doi: Null for 112 records where no DOI was available (common for books, book chapters, and theses).
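The documented null counts can serve as a quick integrity check after download. The sketch below counts empty cells per column and reports any column whose count differs from the figures listed above. It assumes nulls are stored as empty cells (if the file uses a marker such as NA, the emptiness test would need adjusting); the function names are hypothetical, and abstract_orig is left out because no count is stated for it. The three sample rows are invented so the example runs without the dataset.

```python
import csv
import io

# Null counts as stated in this record (abstract_orig has no stated count).
DOCUMENTED_NULLS = {
    "title_orig": 235,
    "abstract_eng": 2,
    "unbis_concept_2": 1,
    "unbis_concept_3": 19,
    "unbis_concept_4": 259,
    "doi": 112,
}


def count_missing(rows, columns):
    """Count rows in which each column is empty or whitespace only."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if not (row.get(col) or "").strip():
                counts[col] += 1
    return counts


def compare_with_documentation(counts, documented):
    """Return {column: (observed, documented)} for every mismatch."""
    return {
        col: (counts.get(col), expected)
        for col, expected in documented.items()
        if counts.get(col) != expected
    }


# Invented three-row sample (NOT from the dataset).
SAMPLE = "title_orig,doi\n,10.1000/example\nTitulo original,\n,\n"
sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))

sample_counts = count_missing(sample_rows, ["title_orig", "doi"])
mismatches = compare_with_documentation(sample_counts, {"title_orig": 2, "doi": 1})
print(sample_counts, mismatches)
```

On the real file, passing the parsed rows and DOCUMENTED_NULLS to these two functions should return an empty dictionary if the download matches the description.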
Files (941.6 kB)
| Checksum | Size |
|---|---|
| md5:41788a00c8fb8f35817bca9eae75529f | 3.1 kB |
| md5:08f819f6a25c9416a0a53b832d56ecdd | 938.5 kB |
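The MD5 checksums in the listing can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's hashlib; the helper names file_md5 and matches_checksum are hypothetical, and the self-check runs on a throwaway temporary file rather than on the dataset itself.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return file_md5(path) == expected


# Self-check on a temporary file, so the sketch runs without the dataset.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name

demo_digest = file_md5(demo_path)
demo_ok = matches_checksum(demo_path, "md5:" + hashlib.md5(b"hello").hexdigest())
demo_bad = matches_checksum(demo_path, "md5:" + "0" * 32)
os.remove(demo_path)
print(demo_digest, demo_ok, demo_bad)
```

To verify a download, pass the local file path together with the checksum shown for that file in the repository listing; a result of False suggests an incomplete or altered download.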