Published January 8, 2026 | Version 1.0
Dataset Open

Open Social Science Citation Index (OpenSSCI)

  • 1. GESIS Leibniz-Institute for the Social Sciences in Cologne

Description


The dataset OpenSSCI comprises reference metadata and citation links derived from 63,070 full-text academic documents archived in the SSOAR (Social Science Open Access Repository). The data has been curated within the OUTCITE project for downstream ingestion into OpenCitations. The main goal of OUTCITE is to research, develop, and deploy an open-source toolchain for linking literature references—including non-source items—to their sources. Demo system: <https://demo-outcite.gesis.org/>

Extracted and validated data in OpenSSCI (as of 26 November 2025):

  • Metadata: 2,306,779 (references after validation)
  • Citations: 3,126,779 (links, each a citing–cited ID pair)

Purpose:

Making bibliographic data available is important in all disciplines to ensure easy and fast access to the literature and other scientific resources such as research datasets. To this end, many publishers strive to index their publications in bibliographic databases enabling the linking of publications in a citation graph. Still, a significant part of citation data in disciplines such as social science is not accessible via bibliographic databases.

OpenSSCI provides validated reference metadata and citation links from SSOAR for ingestion into OpenCitations, enabling open, reproducible citation analysis. As a reusable dataset, it also supports research in bibliometrics, information retrieval (IR), and open science.

 

Dataset Structure:

metadata.csv — one row per reference.
Bibliographic metadata for each referenced item extracted from SSOAR documents. Contains 2,306,779 validated references with fields including: id, title, author, pub_date, venue, volume, issue, page, type, publisher, and editor.

Example:

  • id: doi:10.3386/w9305
  • title: Institutions Rule: The Primacy of Institutions over Geography and Integration in Economic Development
  • author: Rodrik, D; Subramanian, A; Trebbi, F
  • pub_date: 2004
  • venue: Journal of Economic Growth
  • volume: 9
  • issue: 2
  • page: 131-165
  • type: journal article
  • publisher: (empty)
  • editor: (empty)
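The metadata example above can be read with Python's standard csv module. This is a minimal sketch, not the project's own tooling: it assumes metadata.csv is comma-delimited with a header row and quoted fields, and that multiple authors are separated by ";" as in the example row. The in-memory SAMPLE string and the read_metadata helper are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical in-memory sample built from the example row; the real
# metadata.csv is assumed to be comma-delimited with a header row.
SAMPLE = '''id,title,author,pub_date,venue,volume,issue,page,type,publisher,editor
doi:10.3386/w9305,"Institutions Rule: The Primacy of Institutions over Geography and Integration in Economic Development","Rodrik, D; Subramanian, A; Trebbi, F",2004,Journal of Economic Growth,9,2,131-165,journal article,,
'''


def read_metadata(handle):
    """Yield one dict per reference, adding an 'authors' list."""
    for row in csv.DictReader(handle):
        # Assumption: authors are separated by ";" as in the example row.
        row["authors"] = [a.strip() for a in row["author"].split(";") if a.strip()]
        yield row


records = list(read_metadata(io.StringIO(SAMPLE)))
first = records[0]
print(first["id"], first["pub_date"], first["authors"])
```

To process the real file, the same helper could be given an open file handle, for example open("metadata.csv", newline="", encoding="utf-8"), where the path and encoding are assumptions about the unpacked archive.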
 

citations.csv — one row per citation link.
Directed links from each citing SSOAR document to its cited reference. Contains 3,126,779 citing–cited ID pairs, along with the corresponding publication dates (citing_publication_date, cited_publication_date).

Example:

  • citing_id: doi:10.37043/JURA.2011.3.1.3
  • citing_publication_date: 2011
  • cited_id: doi:10.3386/w9305
  • cited_publication_date: 2004
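As a rough illustration of how the citation links might be used, the sketch below counts incoming citations per cited_id and computes the citation age (citing year minus cited year). It assumes citations.csv is comma-delimited with the four column names listed above and that each date field begins with a four-digit year. SAMPLE, year_of, and summarize are hypothetical names, not part of the dataset or of OpenCitations.

```python
import csv
import io
from collections import Counter

# Hypothetical in-memory sample built from the example row above.
SAMPLE = """citing_id,citing_publication_date,cited_id,cited_publication_date
doi:10.37043/JURA.2011.3.1.3,2011,doi:10.3386/w9305,2004
"""


def year_of(date_text):
    """Return the leading four-digit year as an int, or None if absent."""
    head = date_text.strip()[:4]
    if len(head) == 4 and head.isascii() and head.isdigit():
        return int(head)
    return None


def summarize(handle):
    """Return (citations received per cited_id, list of citation ages in years)."""
    in_degree = Counter()
    ages = []
    for row in csv.DictReader(handle):
        in_degree[row["cited_id"]] += 1
        citing = year_of(row["citing_publication_date"])
        cited = year_of(row["cited_publication_date"])
        # Skip the age when either date is missing or malformed.
        if citing is not None and cited is not None:
            ages.append(citing - cited)
    return in_degree, ages


in_degree, ages = summarize(io.StringIO(SAMPLE))
print(in_degree.most_common(1), ages)
```

Streaming the rows through a Counter keeps memory use proportional to the number of distinct cited IDs rather than to the full set of links.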

 

For full details on the CSV schema and fields, see the OpenCitations GitHub documentation: https://github.com/opencitations/metadata/blob/master/documentation/csv_documentation-v1_1_2.pdf

 

 

Files

ssoar_oc_data.zip (204.5 MB)
md5:0e2fd9d75e4246cc6dfecab0c540d671
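The md5 checksum listed for ssoar_oc_data.zip can be used to check that a download is complete. The following is a minimal sketch using Python's hashlib; the local path passed to is_intact is an assumption, and md5_of and is_intact are hypothetical helper names.

```python
import hashlib

# Checksum as listed on this record for ssoar_oc_data.zip.
EXPECTED_MD5 = "0e2fd9d75e4246cc6dfecab0c540d671"


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of(path) == expected.lower()
```

For example, is_intact("ssoar_oc_data.zip") would return True for an unmodified copy saved under that name in the working directory.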

Additional details

Additional titles

Subtitle (En)
A Dataset of Metadata and Citation Links from SSOAR produced by the OUTCITE Project

Related works

Is referenced by
Publication: 10.1007/s00799-024-00404-6 (DOI)

References