Published September 7, 2022 | Version v1
Conference paper Open

The changing role of cited papers over time: An analysis of highly cited papers based on a large full text dataset

  • 1. Dalian University of Technology
  • 2. Leiden University


In this paper, we have analyzed the changing role of cited papers over time. This was done by taking 880 highly cited papers (HCPs) as a case dataset and by collecting the full text of more than 220 thousand papers that cite those HCPs. Based on the collected full-text data, we have analyzed how different aspects of the citations to the HCPs change over time. The aspects included in our analysis are the location of the citation in the full text, cited reference type, in-text citation type, and citation sentiment. The conclusions of our analysis can be summarized as follows.

First, on average, HCPs are cited earlier in the text as they get older. The percentage of citations to HCPs located in the middle part of the full text of the citing papers changes little over the citing years, while the percentage located in the beginning part increases and the percentage located in the end part decreases. Second, many HCPs are mentioned multiple times in the full text of the citing papers in the first few citing years, but the percentage of such citations decreases as the HCPs get older. In addition, the average number of times the HCPs are mentioned in the full text of the citing papers decreased from 1.8 in 2000 to 1.4 in 2016. Third, HCPs are more likely to be cited along with other references in the same in-text citation in later citing years. This could indicate that as HCPs get older, they tend to serve more and more as general references and become less essential to the papers in which they are cited. HCPs are also cited together with a larger number of other references as they age, but there is a limit to this growth. Our last finding is that there is only a very weak increase in citation sentiment over the citing years. The largest proportion of the text of the sentences in which HCPs are cited is associated with neutral sentiment, followed by positive and then negative sentiment.
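To make the first two indicators concrete, the following is a minimal Python sketch of how location shares and the average number of mentions per citing year could be computed from in-text mention records. It is an illustration only and not the procedure used in the paper: the record layout, the function names `location_bucket` and `summarize_by_year`, the equal-thirds cut-offs for the beginning, middle, and end parts, and the toy records are all assumptions.

```python
from collections import defaultdict


def location_bucket(rel_pos):
    """Map a relative position in [0, 1] to a coarse part of the full text.

    The equal-thirds cut-offs are an assumption made for this illustration.
    """
    if not 0.0 <= rel_pos <= 1.0:
        raise ValueError("rel_pos must lie in [0, 1]")
    if rel_pos < 1 / 3:
        return "beginning"
    if rel_pos < 2 / 3:
        return "middle"
    return "end"


def summarize_by_year(mentions):
    """Summarize in-text mentions per citing year.

    mentions: iterable of (citing_year, pair_id, rel_pos) tuples, one per
    in-text mention, where pair_id identifies one (citing paper, HCP) pair
    and rel_pos is the mention's relative position in the citing paper.
    Returns {year: {"shares": {part: fraction}, "mean_mentions": float}}.
    """
    bucket_counts = defaultdict(lambda: defaultdict(int))
    mentions_per_pair = defaultdict(lambda: defaultdict(int))
    for year, pair_id, rel_pos in mentions:
        bucket_counts[year][location_bucket(rel_pos)] += 1
        mentions_per_pair[year][pair_id] += 1

    summary = {}
    for year in sorted(bucket_counts):
        total = sum(bucket_counts[year].values())
        shares = {
            part: bucket_counts[year][part] / total
            for part in ("beginning", "middle", "end")
        }
        # Average number of mentions per (citing paper, HCP) pair.
        mean_mentions = total / len(mentions_per_pair[year])
        summary[year] = {"shares": shares, "mean_mentions": mean_mentions}
    return summary


# Toy records invented for illustration; they are not data from the study.
sample = [
    (2000, "p1-h1", 0.10),
    (2000, "p1-h1", 0.50),
    (2000, "p2-h1", 0.90),
    (2016, "p3-h1", 0.05),
    (2016, "p4-h1", 0.20),
]

if __name__ == "__main__":
    for year, stats in summarize_by_year(sample).items():
        print(year, stats)
```

Counting mentions per (citing paper, HCP) pair rather than per citing paper keeps the average meaningful when one citing paper cites several HCPs. A different partition of the text, for example by section headings, would change the location shares.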

There are several limitations to this study that should be noted. First, although the number of full texts of the citing papers that we have collected is large, it still accounts for only a relatively modest share of the total number of citing papers (14.3%). Additional full text data from other sources may yield different results. Second, fields of research were not considered, which may hide some evolutionary features that may be present at a more granular field or discipline level. Third, we limited ourselves in the number of citation characteristics studied. For example, we did not attempt to analyze the change in the relatedness between citing papers and cited papers. Despite these limitations, our results can be regarded as weak evidence that the reason why papers are cited may change over time. In the future, we look forward to additional studies examining the evolutionary characteristics of citations, at more granular levels, using full-text data from multiple sources, considering different research areas, and using semantic analyses. Such studies have the potential to influence our understanding of citation theory and behavior, and to have practical impact on applications such as information search and retrieval and the accurate modeling of the structure and dynamics of science.


