Dataset Open Access

Datasets on contributorship and bibliometric variables for the study 'Task specialization and its effects on research careers'

Robinson-Garcia, Nicolas; Costas, Rodrigo; Sugimoto, Cassidy R.; Larivière, Vincent; Nane, Gabriela F.

Datasets used in the study 'Task specialization and its effects on research careers'.

Dataset 1 (plos_contribution_data_set.csv). Seed dataset containing contribution and bibliometric data for a set of publications from PLOS journals assigned to the Medical and Life Sciences.

Dataset 2 (pub_history.csv). Dataset of author-publication combinations covering the complete publication history of 222,295 disambiguated authors, comprising 6,236,239 distinct publications.
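
Because pub_history.csv is large (855.2 MB), it may be convenient to stream it row by row rather than load it whole when checking the author and publication counts given above. The following is a minimal sketch only: the column names `author_id` and `pub_id` are hypothetical placeholders (this record does not document the file's header), and a comma delimiter and UTF-8 encoding are assumed.

```python
import csv


def count_distinct(path, author_col="author_id", pub_col="pub_id"):
    """Stream a CSV file and return (rows, distinct authors, distinct publications).

    The default column names are hypothetical; pass the real header names
    once the file has been inspected.
    """
    authors, pubs, rows = set(), set(), 0
    # Assumption: comma-delimited, UTF-8 encoded, first line is a header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = {author_col, pub_col} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f"columns not found in header: {sorted(missing)}")
        for record in reader:
            rows += 1  # one row per author-publication combination
            authors.add(record[author_col])
            pubs.add(record[pub_col])
    return rows, len(authors), len(pubs)


# Example call (column names must be replaced with the real ones):
# rows, n_authors, n_pubs = count_distinct("pub_history.csv")
```

Only the identifiers are held in memory, so the two sets stay far smaller than the file itself; the row count corresponds to the number of author-publication combinations.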

 

Summary of the paper

Research evaluation remains largely focused on individuals’ leadership and excellence, disregarding the collaborative nature of their work. We model a set of 70,694 publications and 347,136 distinct authors using Bayesian networks to predict scientists’ specific contributions on each of their publications. We predict the contributions of 222,925 authors in 6,236,239 publications, and apply an archetypal analysis to profile scientists by career stage. We divide scientific careers into four stages: junior, early-career, mid-career and late-career. Three scientific archetypes are found throughout the four career stages: 1) leader, 2) specialized, and 3) supporting. All three archetypes are encountered for the early- and mid-career stages, whereas for junior and late-career stages only two archetypes are found: specialized and supporting for junior scholars, and leader and supporting for late-career scholars. Scientists assigned to the leader and specialized archetypes tend to have longer careers than researchers who belong to the supporting archetype. There is consistent gender bias at all stages: the majority of male scientists belong to the leader archetype, while the larger proportion of women belong to the specialized archetype, especially for early- and mid-career researchers.

Files (894.2 MB)

Name                            Size      MD5
plos_contribution_data_set.csv  39.0 MB   7e3f3dd5e720f511662d0ded0e808e8e
pub_history.csv                 855.2 MB  6d75298a9ceaa2a0df1a675d2929ce1c
Statistic         All versions  This version
Views             142           142
Downloads         146           146
Data volume       35.9 GB       35.9 GB
Unique views      131           131
Unique downloads  112           112

Cite as