Open Social Science Citation Index (OpenSSCI)
Authors/Creators
- 1. GESIS Leibniz-Institute for the Social Sciences in Cologne
Description
Description:
The dataset OpenSSCI comprises reference metadata and citation links derived from 63,070 full-text academic documents archived in the SSOAR (Social Science Open Access Repository). The data has been curated within the OUTCITE project for downstream ingestion into OpenCitations. The main goal of OUTCITE is to research, develop, and deploy an open-source toolchain for linking literature references—including non-source items—to their sources. Demo system: <https://demo-outcite.gesis.org/>
Extracted and validated data in OpenSSCI (date: 26/11/2025):
- Metadata: 2,306,779 (references after validation)
- Citations: 3,126,779 (links citing–cited ID pairs)
Purpose:
Making bibliographic data available is important in all disciplines to ensure easy and fast access to the literature and other scientific resources such as research datasets. To this end, many publishers strive to index their publications in bibliographic databases enabling the linking of publications in a citation graph. Still, a significant part of citation data in disciplines such as social science is not accessible via bibliographic databases.
OpenSSCI addresses this gap by providing validated reference metadata and citation links from SSOAR for ingestion into OpenCitations. The dataset enables open, reproducible citation analysis and offers a reusable resource for research in bibliometrics, information retrieval (IR), and open science.
metadata.csv — one row per reference.
Bibliographic metadata for each referenced item extracted from SSOAR documents. Contains 2,306,779 validated references with fields including: id, title, author, pub_date, venue, volume, issue, page, type, publisher, and editor.
Example:
| id | title | author | pub_date | venue | volume | issue | page | type | publisher | editor |
|---|---|---|---|---|---|---|---|---|---|---|
| doi:10.3386/w9305 | Institutions Rule: The Primacy of Institutions over Geography and Integration in Economic Development | Rodrik, D; Subramanian, A; Trebbi, F | 2004 | Journal of Economic Growth | 9 | 2 | 131-165 | journal article | | |
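As a rough illustration of how a row like the example above might be read, the following sketch parses reference rows with Python's standard `csv` module and splits the `author` field into individual names. It assumes the file is comma-delimited with a header row and that authors are separated by semicolons, as the example row suggests; the inline `SAMPLE` text and the helper name `read_references` are hypothetical and exist only for this illustration.

```python
import csv
import io

# Inline sample mirroring the example row above. The real metadata.csv is
# assumed (not verified here) to be comma-delimited with a header row.
SAMPLE = '''id,title,author,pub_date,venue,volume,issue,page,type,publisher,editor
doi:10.3386/w9305,"Institutions Rule: The Primacy of Institutions over Geography and Integration in Economic Development","Rodrik, D; Subramanian, A; Trebbi, F",2004,Journal of Economic Growth,9,2,131-165,journal article,,
'''


def read_references(handle):
    """Yield one dict per reference row, adding a parsed 'authors' list.

    Hypothetical helper: splits the 'author' field on ';', which is the
    separator seen in the example row.
    """
    for row in csv.DictReader(handle):
        row["authors"] = [name.strip() for name in row["author"].split(";") if name.strip()]
        yield row


references = list(read_references(io.StringIO(SAMPLE)))
first = references[0]
print(first["id"], first["pub_date"], first["authors"])
```

To read from the downloaded archive instead, the same function could be given a text handle obtained with `zipfile.ZipFile(...).open(...)` wrapped in `io.TextIOWrapper`; the member path inside the archive is not stated here and would need to be checked first.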
citations.csv — one row per citation link.
Directed links from each citing SSOAR document to its cited reference. Contains 3,126,779 citing–cited ID pairs, along with the corresponding publication dates (citing_publication_date, cited_publication_date).
Example:
| citing_id | citing_publication_date | cited_id | cited_publication_date |
|---|---|---|---|
| doi:10.37043/JURA.2011.3.1.3 | 2011 | doi:10.3386/w9305 | 2004 |
For full details on the CSV schema and fields, see the OpenCitations GitHub documentation: https://github.com/opencitations/metadata/blob/master/documentation/csv_documentation-v1_1_2.pdf
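The citation links lend themselves to simple aggregate analyses. The sketch below, which is one possible approach rather than part of the OUTCITE toolchain, counts how often each `cited_id` occurs and computes the citation age (citing year minus cited year) for each link. It assumes a comma-delimited file with a header row and dates that begin with a four-digit year; the inline `SAMPLE` text and the helper names `year_of` and `summarize` are hypothetical.

```python
import csv
import io
from collections import Counter

# Inline sample mirroring the example link above. The real citations.csv is
# assumed (not verified here) to be comma-delimited with a header row.
SAMPLE = '''citing_id,citing_publication_date,cited_id,cited_publication_date
doi:10.37043/JURA.2011.3.1.3,2011,doi:10.3386/w9305,2004
'''


def year_of(date_text):
    """Return the leading four-digit year as an int, or None if absent."""
    head = date_text.strip()[:4]
    return int(head) if len(head) == 4 and head.isdigit() else None


def summarize(handle):
    """Return (citations per cited_id, list of citation ages in years)."""
    counts = Counter()
    ages = []
    for row in csv.DictReader(handle):
        counts[row["cited_id"]] += 1
        citing_year = year_of(row["citing_publication_date"])
        cited_year = year_of(row["cited_publication_date"])
        # Skip links where either date is missing or malformed.
        if citing_year is not None and cited_year is not None:
            ages.append(citing_year - cited_year)
    return counts, ages


counts, ages = summarize(io.StringIO(SAMPLE))
print(counts.most_common(1), ages)
```

Because the function reads rows one at a time, it should also be usable on the full file without loading all 3,126,779 links into memory at once; only the per-ID counter and the list of ages grow with the input.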
Files
| Name | Size | Checksum |
|---|---|---|
| ssoar_oc_data.zip | 204.5 MB | md5:0e2fd9d75e4246cc6dfecab0c540d671 |
Additional details
Additional titles
- Subtitle (En)
- A Dataset of Metadata and Citation Links from SSOAR produced by the OUTCITE Project
Related works
- Is referenced by
- Publication: 10.1007/s00799-024-00404-6 (DOI)
References
- Backes, T., Iurshina, A., Shahid, M. A., & Mayr, P. (2024). Comparing Free Reference Extraction Pipelines. International Journal on Digital Libraries, 25(4), 841–853. DOI: https://doi.org/10.1007/s00799-024-00404-6
- SSOAR system: https://www.gesis.org/ssoar
- OUTCITE Demo system: https://demo-outcite.gesis.org/
- GESIS search: https://search.gesis.org/
- GESIS search utilizing OUTCITE references and citation links. Demo document: https://search.gesis.org/publication/gesis-ssoar-86950
- How to produce well-formed CSV files for OpenCitations: https://github.com/opencitations/metadata/blob/master/documentation/csv_documentation-v1_1_2.pdf