Dataset Open Access
Thanasis Vergoulis;
Ilias Kanellos;
Serafeim Chatzopoulos;
Danae Pla Karidi;
Theodore Dalamagas
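This dataset contains impact metrics and indicators for a set of publications that are related to the COVID-19 infectious disease and the coronavirus that causes it. It is based on the COVID-19 Open Research Dataset (CORD-19) and the coronavirus literature collection of Chen et al., both cited in the publication list below.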
These data have been cleaned and integrated with data from COVID-19-TweetIDs and from other sources (e.g., PMC). The result was a dataset of 56,731 unique articles along with relevant metadata (e.g., the underlying citation network). We used this dataset to produce, for each article, the values of three impact measures: popularity, influence, and tweet count.
We provide three CSV files that contain the same information but order their entries by a different impact measure. All three files are tab-separated and share the same columns (PubMed_id, PMC_id, DOI, popularity_score, influence_score, tweets count).
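As a minimal sketch of how one of these tab-separated files could be read, the Python snippet below parses rows into dictionaries keyed by the columns listed above. Several details are assumptions rather than facts from the dataset: the function name `read_articles` is hypothetical, the key `tweets_count` is a convenience spelling of the "tweets count" column, and whether the files carry a header row is not stated here, so it is left as a parameter.

```python
import csv
from typing import Iterable, Iterator

# Column order as listed in the dataset description. "tweets_count" is a
# convenience spelling of the "tweets count" column (assumption of this sketch).
COLUMNS = [
    "PubMed_id",
    "PMC_id",
    "DOI",
    "popularity_score",
    "influence_score",
    "tweets_count",
]


def read_articles(lines: Iterable[str], has_header: bool = False) -> Iterator[dict]:
    """Yield one dict per article from an iterable of tab-separated lines.

    `has_header` is an assumption knob: set it to True if the first line
    of the file turns out to be a header row.
    """
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)  # discard the header row
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip rows that do not have exactly six fields
        record = dict(zip(COLUMNS, row))
        try:
            record["popularity_score"] = float(record["popularity_score"])
            record["influence_score"] = float(record["influence_score"])
            record["tweets_count"] = int(record["tweets_count"])
        except ValueError:
            continue  # skip rows whose numeric fields cannot be parsed
        yield record
```

To use it on a downloaded file, open the file with `open("articles_by_popularity.csv", newline="", encoding="utf-8")` and pass the file object to `read_articles`. Skipping malformed rows is a design choice of this sketch; a stricter reader might raise an error instead.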
The work is based on the following publications:
- COVID-19 Open Research Dataset (CORD-19). 2020. Version 2020-05-01. Retrieved from https://pages.semanticscholar.org/coronavirus-research. Accessed 2020-05-03. doi:10.5281/zenodo.3715506
- Chen Q, Allot A, & Lu Z. (2020) Keep up with the latest coronavirus research, Nature 579:193 (version 2020-05-03)
- L. Page, S. Brin, R. Motwani, and T. Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab.
- I. Kanellos, T. Vergoulis, D. Sacharidis, T. Dalamagas, Y. Vassiliou: Impact-Based Ranking of Scientific Publications: A Survey and Experimental Evaluation. TKDE 2019
- Rumi Ghosh, Tsung-Ting Kuo, Chun-Nan Hsu, Shou-De Lin, and Kristina Lerman. 2011. Time-Aware Ranking in Dynamic Citation Networks. In Data Mining Workshops (ICDMW). 373–380
A Web user interface that uses these data to facilitate exploration of the COVID-19 literature is also available. More details can be found in our preprint.
Terms of use: These data are provided "as is", without any warranties of any kind. The data are provided under the Creative Commons Attribution 4.0 International license.
Name | MD5 | Size |
---|---|---|
articles_by_influence.csv | 32cda1ed57fe2c7ff677b0f147b4914c | 4.1 MB |
articles_by_popularity.csv | a7de1f1fa097ab8b89a1b3f53755b23f | 4.1 MB |
articles_by_tweets.csv | 7af2aa142745e330d685904f2ba3d69c | 4.1 MB |
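The MD5 checksums listed for the three files can be used to check that a download completed intact. The following Python sketch is one way to do that, assuming the files were saved locally; the helper names `md5_of_file` and `verify` are hypothetical and chosen only for illustration.

```python
import hashlib

# Checksums copied from the file listing of this dataset version.
EXPECTED_MD5 = {
    "articles_by_influence.csv": "32cda1ed57fe2c7ff677b0f147b4914c",
    "articles_by_popularity.csv": "a7de1f1fa097ab8b89a1b3f53755b23f",
    "articles_by_tweets.csv": "7af2aa142745e330d685904f2ba3d69c",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, name: str) -> bool:
    """Compare the local file at `path` with the listed checksum for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify("articles_by_tweets.csv", "articles_by_tweets.csv")` would return `True` only if the local copy matches the listed checksum. The checksums apply to this version of the dataset only.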
Metric | All versions | This version |
---|---|---|
Views | 61,136 | 301 |
Downloads | 8,714 | 82 |
Data volume | 72.4 GB | 333.3 MB |
Unique views | 57,403 | 256 |
Unique downloads | 6,646 | 69 |