Published December 6, 2024 | Version 0.99.1
Dataset
Open
Pubmed citation dataset
Description
The scripts for generating the datasets are available at https://github.com/jokergoo/citation_analysis.
There are three files in this dataset:
1. citations.tab.gz: A table with two columns:
- citing: pmid (PubMed ID) of the citing paper
- cited: pmid of the cited paper
citing | cited |
---|---|
10578099 | 10578100 |
10578100 | 10578099 |
10590126 | 10623754 |
10592169 | 10592170 |
10592169 | 10592175 |
10592169 | 10592200 |
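As a minimal sketch of how this edge list could be consumed, the snippet below streams the file and counts how often each paper is cited. It assumes the file is gzip-compressed, tab-separated, and starts with the `citing`/`cited` header row shown above; the function name `count_citations` is hypothetical and not part of the dataset's scripts.

```python
import csv
import gzip
from collections import Counter


def count_citations(path):
    """Return a Counter mapping each cited PMID to its number of citations.

    Assumption: `path` is a gzip-compressed, tab-separated file whose
    first line is the header `citing<TAB>cited`.
    """
    counts = Counter()
    # Stream row by row so the full table never has to fit in memory.
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            counts[row["cited"]] += 1
    return counts


if __name__ == "__main__":
    # Print the ten most-cited PMIDs (expects the file in the working directory).
    for pmid, n in count_citations("citations.tab.gz").most_common(10):
        print(pmid, n)
```

PMIDs are kept as strings here; convert them with `int()` if numeric keys are more convenient.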
2. pub_meta.tab.gz: A table with eight columns:
- pmid: pmid of the paper.
- journal_uid: uid of the paper's journal on PubMed.
- pub_year: publication year of the paper.
- n_authors: number of authors.
- country: identified country of the paper.
- country_type: how the country was identified. Values are `_domestic_`, `_domestic_80_`, `_international_` or `_empty_`. `_domestic_80_` means that fewer than 20% of the authors are international and that they appear in the middle of the author list.
- file_id: file id on PubMed FTP.
- n_references: number of references of the paper.
pmid | journal_uid | pub_year | n_authors | country | country_type | file_id | n_references |
---|---|---|---|---|---|---|---|
38566917 | 101213162 | 2016 | 4 | United States | _domestic_ | 1366 | 2 |
38567026 | 9886008 | 2022 | 3 | United States | _domestic_ | 1366 | 12 |
38567115 | 101562981 | 2018 | 5 | United States | _domestic_ | 1366 | 6 |
38567118 | 101668947 | 2018 | 4 | United States | _domestic_ | 1365 | 0 |
38567245 | 101283276 | 2019 | 10 | United States | _domestic_ | 1366 | 16 |
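The metadata table could be read along the following lines. This is a sketch under assumptions: the file is taken to be gzip-compressed and tab-separated (country names such as "United States" contain spaces) with the header row shown above, and empty or non-numeric cells are mapped to `None`. The helper names `read_pub_meta` and `papers_per_country_type` are hypothetical.

```python
import csv
import gzip
from collections import Counter

# Columns converted to integers; identifiers stay as strings so they
# are preserved exactly as written in the file.
INT_COLUMNS = ("pub_year", "n_authors", "n_references")


def to_int(value):
    """Convert a cell to int, or return None if it is empty or non-numeric."""
    value = (value or "").strip()
    return int(value) if value.isdigit() else None


def read_pub_meta(path):
    """Yield one dict per paper (assumes a tab-separated file with a header)."""
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            for column in INT_COLUMNS:
                row[column] = to_int(row[column])
            yield row


def papers_per_country_type(path):
    """Count papers for each value of the `country_type` column."""
    return Counter(row["country_type"] for row in read_pub_meta(path))


if __name__ == "__main__":
    print(papers_per_country_type("pub_meta.tab.gz"))
```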
3. num_cite_country_country.tab.gz: A table with three columns:
- country_cited: country of the cited papers.
- country_citing: country of the citing papers.
- citations: total number of citations from papers of country_citing to papers of country_cited.
country_cited | country_citing | citations |
---|---|---|
Afghanistan | Afghanistan | 20 |
Afghanistan | Australia | 16 |
Afghanistan | Austria | 1 |
Afghanistan | Bangladesh | 4 |
Afghanistan | Belgium | 3 |
Afghanistan | Brazil | 8 |
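One thing this country-by-country table makes easy to derive is the share of a country's received citations that come from the same country. The sketch below illustrates that calculation; it assumes a gzip-compressed, tab-separated file with the three-column header shown above, and `domestic_citation_share` is a hypothetical name, not a function from the dataset's scripts.

```python
import csv
import gzip
from collections import Counter


def domestic_citation_share(path):
    """Map each cited country to the fraction of its citations that are domestic.

    Assumption: `path` is tab-separated with the columns
    `country_cited`, `country_citing` and `citations`.
    """
    total = Counter()
    domestic = Counter()
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            n = int(row["citations"])
            total[row["country_cited"]] += n
            if row["country_cited"] == row["country_citing"]:
                domestic[row["country_cited"]] += n
    # Skip countries with zero recorded citations to avoid dividing by zero.
    return {c: domestic[c] / total[c] for c in total if total[c] > 0}


if __name__ == "__main__":
    shares = domestic_citation_share("num_cite_country_country.tab.gz")
    for country in sorted(shares, key=shares.get, reverse=True)[:10]:
        print(country, round(shares[country], 3))
```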
Files
Total size: 836.2 MB
Checksum | Size |
---|---|
md5:cdeb6debfbaaee3285ec7d197fdcad75 | 737.3 MB |
md5:323070ed73f50f8aedc213d8a39f0b31 | 94.8 kB |
md5:851c895de2b7771cc7f70aacf2294628 | 98.7 MB |
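After downloading, a file can be checked against its listed MD5 checksum. The following is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and the caller has to supply the checksum that belongs to the file being checked.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with `expected`.

    `expected` may be given with or without the `md5:` prefix used
    in the files listing.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(path, "md5:...")` returns `True` only when the downloaded file is byte-for-byte identical to the published one.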