10.5281/zenodo.6327207
https://zenodo.org/records/6327207
oai:zenodo.org:6327207
Cameron Neylon
0000-0002-0068-716X
Centre for Culture and Technology, Curtin University
Bianca Kramer
0000-0002-5965-6560
Utrecht University
Analysis of references in the IPCC AR6 WG2 Report of 2022
Zenodo
2022
IPCC, Crossref, references, DOIs
2022-03-04
https://github.com/Curtin-Open-Knowledge-Initiative/ipcc-ar6/releases/tag/v0.8
10.5281/zenodo.6327206
https://zenodo.org/communities/coki
0.8
Creative Commons Public Domain Dedication and Certification
This repository contains data on 17,420 DOIs cited in the IPCC Working Group 2 contribution to the Sixth Assessment Report, and the code to link them to the dataset built at the Curtin Open Knowledge Initiative (COKI).
References were extracted from the report's PDFs (downloaded 2022-03-01) via Scholarcy and exported as RIS and BibTeX files. DOI strings were identified in the RIS files by pattern matching and saved as a CSV file. The list of DOIs for each chapter and cross-chapter paper was processed using a custom Python script to generate a pandas DataFrame, which was saved as a CSV file and uploaded to Google BigQuery.
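The pattern-matching step can be illustrated with a minimal sketch. The regular expression, the choice of RIS fields to scan (`DO` and `UR`), the lower-casing, and the function name `extract_dois` are all assumptions made for illustration; the steps actually used are those documented in preprocessing.txt.

```python
import re

# Assumed pattern: a DOI prefix "10.", 4-9 digits, a slash, then a suffix that
# runs until whitespace or a quote/angle bracket. Not taken from the repository.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE)


def extract_dois(ris_text):
    """Return unique, lower-cased DOIs found in RIS text, in order of first appearance."""
    seen = []
    for line in ris_text.splitlines():
        # Assumption: DOIs appear in "DO" fields or inside URLs in "UR" fields.
        if not line.startswith(("DO  -", "UR  -")):
            continue
        for match in DOI_PATTERN.findall(line):
            # Strip trailing punctuation that often clings to extracted DOIs.
            doi = match.rstrip(".,;)").lower()
            if doi not in seen:
                seen.append(doi)
    return seen


# Hypothetical RIS record, not taken from the dataset.
sample_ris = """TY  - JOUR
TI  - Example title
DO  - 10.1000/ABC123
UR  - https://doi.org/10.1000/xyz.456.
ER  - 
"""

dois = extract_dois(sample_ris)
print(dois)
```

Lower-casing is a design choice here: DOIs are case-insensitive, so normalising case before deduplication avoids counting the same reference twice.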
We used the main object table of the Academic Observatory, which combines information from Crossref, Unpaywall, Microsoft Academic, OpenCitations, the Research Organization Registry and GeoNames, to enrich the DOIs with bibliographic information, affiliations, and open access status. A custom query was used to join and format the data, and the resulting table was visualised in a Google Data Studio dashboard.
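The shape of such an enrichment join can be sketched as follows. This is not the query used in the repository: it runs against an in-memory SQLite database rather than BigQuery, and the table names (`ipcc_dois`, `doi_metadata`) and columns are hypothetical stand-ins for the uploaded DOI list and the Academic Observatory table.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; all names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ipcc_dois (doi TEXT, chapter TEXT);
    CREATE TABLE doi_metadata (doi TEXT PRIMARY KEY, title TEXT, is_oa INTEGER);
    INSERT INTO ipcc_dois VALUES
        ('10.1000/abc123', 'Chapter 1'),
        ('10.1000/xyz456', 'Chapter 2');
    INSERT INTO doi_metadata VALUES
        ('10.1000/abc123', 'Example title', 1);
""")

# A LEFT JOIN keeps every cited DOI, even when no metadata record matches,
# so unmatched references remain visible in the result as NULL columns.
rows = conn.execute("""
    SELECT c.doi, c.chapter, m.title, m.is_oa
    FROM ipcc_dois AS c
    LEFT JOIN doi_metadata AS m ON LOWER(c.doi) = LOWER(m.doi)
    ORDER BY c.doi
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

A left join rather than an inner join is assumed here so that cited DOIs without a match in the metadata table are still counted.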
A brief descriptive analysis was provided as a blogpost on the COKI website.
The repository contains the following content:
Data:
data/scholarcy/RIS/ - extracted references as RIS files
data/scholarcy/BibTeX/ - extracted references as BibTeX files
IPCC_AR6_WGII_dois.csv - list of DOIs
Processing:
preprocessing.txt - preprocessing steps for identifying and cleaning DOIs
process.py - Python script for transforming data and linking to COKI data through Google BigQuery
Outcomes:
Dataset on BigQuery - requires a Google account for access and a BigQuery account for querying
Data Studio Dashboard - interactive analysis of the generated data
Zotero library of references extracted via Scholarcy
PDF version of blogpost
Note on licenses:
Data are made available under CC0
Code is made available under Apache License 2.0
Archived version of Release 2022-03-04 of GitHub repository:
https://github.com/Curtin-Open-Knowledge-Initiative/ipcc-ar6