Datasets on DataCite - an Initial Bibliometric Investigation
Interest in measuring data citation and developing metrics for data is increasing. Despite this interest, basic bibliometric research into data sharing, data reuse, and data citation practices remains relatively nascent. In this research-in-progress article, we use the DataCite GraphQL API to gather data for an initial investigation into dataset sharing and reuse, and we consider the current challenges of such work. With over 8 million datasets in DataCite, we look at how datasets are distributed by publication year, discipline, number of citations, license, institutional affiliation, and language. We find some patterns emerging, such as a recent increase in dataset publishing. However, this research still faces many limitations, which we discuss. We also consider the future use of DataCite as a resource for this research, along with additional methods of analysis.