Published October 27, 2023 | Version v1
Conference paper Open

Text Mining Scholarly Publications using APIs

  • 1. Grinnell College
  • 2. University of Illinois Urbana-Champaign

Description

Researchers often create custom datasets for their work instead of using whole corpora of scholarly publications. In this extended abstract, I describe my work constructing a pipeline that makes these custom datasets easy to create. The pipeline will be reusable: given the Digital Object Identifiers (DOIs) of scholarly papers, it extracts their full texts where available, so that researchers can assemble their own datasets for analysis. It uses the Crossref, Elsevier, and Wiley text and data mining (TDM) APIs to handle licensing problems and other access issues in full-text extraction, which lets researchers focus on their analysis work.
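
As a rough illustration of the DOI lookup step, the sketch below queries the public Crossref REST API for a DOI and picks out the full-text links that the publisher has flagged for text mining, along with any license URLs. It is not taken from the project repository: the function names (crossref_url, fetch_crossref_record, text_mining_links, license_urls) and the User-Agent string are assumptions made for this example. It relies only on the Crossref works endpoint and on the "link" and "license" fields of its JSON response. Many records carry no text-mining links, and downloading from Elsevier or Wiley additionally requires publisher-issued credentials, which this sketch does not cover.

```python
import json
import urllib.parse
import urllib.request

# Public Crossref REST API endpoint for a single work.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi):
    """Build the Crossref lookup URL for a DOI (hypothetical helper)."""
    return CROSSREF_WORKS + urllib.parse.quote(doi.strip(), safe="/")


def fetch_crossref_record(doi, mailto=None, timeout=30):
    """Fetch the Crossref metadata record for a DOI.

    Passing a contact address in the User-Agent follows Crossref's
    etiquette guidance; the agent name itself is an assumption.
    """
    headers = {}
    if mailto:
        headers["User-Agent"] = f"tdm-sketch/0.1 (mailto:{mailto})"
    request = urllib.request.Request(crossref_url(doi), headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["message"]


def text_mining_links(record):
    """Return full-text URLs the publisher marked for text mining."""
    return [
        link["URL"]
        for link in record.get("link", [])
        if link.get("intended-application") == "text-mining" and "URL" in link
    ]


def license_urls(record):
    """Return the license URLs attached to the record, if any."""
    return [lic["URL"] for lic in record.get("license", []) if "URL" in lic]
```

A caller would typically run fetch_crossref_record on each DOI, check license_urls before downloading anything, and fall back to a publisher-specific API when text_mining_links returns an empty list.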

Files

ASIST_METSTI2023_poster_Sarraf_et_al.pdf

Files (343.3 kB)

md5:f5dfbbbb2887f22d4ff54fe516bfea18

Additional details

Related works

Is version of
Presentation: 2142/120049 (Handle)

Funding

National Science Foundation: Sustainable Diversity in the Computing Research Pipeline (award 1246649)
National Science Foundation: CAREER: Using network analysis to assess confidence in research synthesis (award 2046454)

Software

Repository URL
https://github.com/infoqualitylab/text-mining-scholarly-API
Programming language
Python