Published July 5, 2023 | Version v1
Conference paper Open

Author Name Disambiguation for Fair Representation of Female Scholars in Science

Creators

Description

Studies based on bibliographic data have reported that female scholars produce fewer papers and attract fewer citations than male scholars. These findings may rest on flawed data. Because none of the existing bibliographic data services consolidates author entities with different last names, a female author who changes her name is inevitably split into separate entities: one under a maiden name and another under a marital name. As a result, the publications and citations of female authors who have used different names are likely undercounted, which may lead to an under-evaluation of their scholarly productivity and impact. This study develops a machine learning method customized to consolidate female author entities in bibliographic data, with the aim of promoting fair representation and evaluation of women in science. It creates large-scale labeled data to train algorithmic models that merge entities of the same female author split under different names. It then applies the models to author entities recorded in PubMed, which indexes research papers in biomedicine, and demonstrates how correctly identifying name-changed female authors can lead to a different understanding of the research productivity and citation-based impact of female scholars in a field where almost half of the scientists are estimated to be female.
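To make the consolidation problem concrete, the following is a minimal sketch of one possible candidate-generation step: pairing author entities that share a first name but differ in last name. It is not the method developed in the paper, which trains models on labeled data. The `AuthorEntity` class, the `candidate_pairs` function, the coauthor-overlap (Jaccard) score, and the `min_overlap` threshold are all hypothetical names and choices introduced here for illustration.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class AuthorEntity:
    """Hypothetical record for one author entity in a bibliographic database."""
    entity_id: str
    first_name: str
    last_name: str
    coauthors: frozenset = field(default_factory=frozenset)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(entities, min_overlap=0.3):
    """Return (id_a, id_b, score) for entities that share a first name,
    differ in last name, and have coauthor overlap >= min_overlap.

    The overlap score is an assumed stand-in for a trained model's output.
    """
    # Group entities by first name so only plausible pairs are compared.
    by_first = {}
    for entity in entities:
        by_first.setdefault(entity.first_name.lower(), []).append(entity)

    pairs = []
    for group in by_first.values():
        for a, b in combinations(group, 2):
            # Same-last-name pairs are the ordinary disambiguation case
            # and are skipped in this sketch.
            if a.last_name.lower() == b.last_name.lower():
                continue
            score = jaccard(a.coauthors, b.coauthors)
            if score >= min_overlap:
                pairs.append((a.entity_id, b.entity_id, score))
    return pairs
```

In a real pipeline, pairs produced by a step like this would be passed to a trained classifier rather than accepted on a single overlap threshold; the threshold here only keeps the example small.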

Files

ISSI 2023 Proceedings, v2, pp 233–238.pdf

Files (898.1 kB)

md5:db1bbcc812f95f11348c6f09400b8d50