There is a newer version of this record available.

Dataset Open Access

Reliance on Science

Marx, Matt; Fuegi, Aaron

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.7903131", 
  "language": "eng", 
  "title": "Reliance on Science", 
  "issued": {
    "date-parts": [
  "abstract": "<p>This dataset contains both front-page and in-text citations from patents to scientific articles, as well as <strong>patent-paper</strong>&nbsp;<strong>pairs</strong>, through 2021. &nbsp;<em>If you use the data, please cite </em>these two articles:</p>\n\n<p><strong>1. M. Marx &amp; A. Fuegi, &quot;Reliance on Science by Inventors: Hybrid Extraction of In-text Patent-to-Article Citations.&quot; </strong>&nbsp;<em>forthcoming in Journal of Economics and Management Strategy.&nbsp;</em>(<a href=\"\"></a>)</p>\n\n<p><strong>2. M. Marx, &amp; A.&nbsp;Fuegi, &quot;Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles&quot; (2020),&nbsp;<em>Strategic Management Journal 41(9):1572-1594</em>. (</strong><a href=\"\"></a><strong>)&nbsp;</strong></p>\n\n<p>&nbsp;</p>\n\n<p>The datafile containing the citations is <strong>_pcs_mag_doi_pmid.tsv.&nbsp;</strong>DOIs and PMIDs provided where available. Each citation has the&nbsp;applicant/examiner flag, confidence score&nbsp;(1-10), and&nbsp;whether the reference was a) only on the front page, b) only in the body text, or c) in both. Each paper-patent citation also includes the temporal gap&nbsp;and three related measures of self-citation (i.e., was one or more of the inventors on the citing patent also an author on the cited paper).&nbsp;<strong>_reliance_on_science.pdf</strong>&nbsp;has full details.&nbsp;<strong>bodytextknowngood.tsv</strong>&nbsp;contains the known-good references for calculating recall.</p>\n\n<p>The datafile containing the patent-paper pairs (PPPs) is <strong>_patent_paper_pairs.tsv</strong>. These are USPTO only. Each PPP has a confidence score, the count of days between the publication of the paper and the filing of the patent. (If the patent is a continuation of another patent, the filing date of the original patent is used.) 
Also, when a paper is paired with multiple patents, an indicator variable reports whether those patents are continuations or otherwise identical.&nbsp;</p>\n\n<p>The remaining files redistribute much of the *final* edition of the&nbsp;<a href=\"\">Microsoft Academic Graph</a>&nbsp;(12/20/2021). Please also cite&nbsp;Sinha, A, et al. 2015. Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW &rsquo;15 Companion). ACM, New York, NY, USA, 243-246. Note that,, and the OECD/wos-category crosswalks are derivatives of MAG and may not be updated through the end of 2021.</p>\n\n<p>These data are under an&nbsp;Open Data Commons Attribution license (ODC-By);&nbsp;use them for anything&nbsp;as long as you cite us! Source code for front-page matches is at&nbsp;;and for in-text is at Questions &amp; feedback to <a href=\"\"></a><em>.</em></p>\n\n<p><strong><em>This work is sponsored by the Alfred P. Sloan Foundation grant #G-2021-16822.</em></strong></p>", 
  "author": [
      "family": "Marx, Matt"
      "family": "Aaron Fuegi"
  "version": "v38", 
  "type": "dataset", 
  "id": "7903131"
                  All versions   This version
Views                   52,501            875
Downloads               62,812            927
Data volume           154.6 TB         2.2 TB
Unique views            42,919            788
Unique downloads        25,504            412

