There is a newer version of this record available.

Dataset Open Access

Reliance on Science in Patenting

Marx, Matt; Aaron Fuegi

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Marx, Matt</dc:creator>
  <dc:creator>Aaron Fuegi</dc:creator>
  <dc:description>This dataset contains citations from worldwide patents to scientific articles. If you use the data, please cite this paper: Marx, Matt and Aaron Fuegi, "Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles," forthcoming in Strategic Management Journal.

Matches come in two "flavors": links to the Microsoft Academic Graph (_pcs_mag.tsv) and links to PubMed (_pcs_pubmed.tsv). Each citation to science lists the patent number, the MAG or PubMed paper ID, an applicant/examiner indicator, and a confidence score (1-10); _data_description.pdf has full details.

We also have a beta release of matches from the body text of USPTO patents since patent #1 in 1836. The files _pcs_mag_bodytextbeta.tsv and _pcs_pubmed_bodytextbeta.tsv add a field indicating whether the citation appeared on the front page, in the body text, or in both.

The remaining files redistribute the Microsoft Academic Graph, carving up the original files into smaller, variable-specific files. There are also extensions including journal impact factor and high-level technical classifications. If you use them, please cite the following article: Sinha, A, et al. 2015. Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15 Companion). ACM, New York, NY, USA, 243-246.

	The PubMed linkages are publicly available without any licensing restrictions. The MAG linkages are subject to the Open Data Commons Attribution license (ODC-By), so you can use them for anything as long as you cite us.
	Questions &amp; feedback to 
	Join our listserv by sending a plain text email to with "subscribe relianceonscience-l" in the body. 
	Source code is available at

This computational work was performed on the Boston University Shared Computing Cluster.</dc:description>
  <dc:subject>innovation, patenting, science, citation</dc:subject>
  <dc:title>Reliance on Science in Patenting</dc:title>
</oai_dc:dc>
                   All versions   This version
Views              14,427         34
Downloads          28,034         28
Data volume        101.7 TB       14.3 GB
Unique views       11,376         26
Unique downloads   8,186          23
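
The dataset description above says each citation row carries a patent number, a paper ID, an applicant/examiner indicator, and a confidence score from 1 to 10. As a minimal sketch of how such a tab-separated file might be filtered by confidence, the Python below reads rows and keeps those at or above a threshold. The column names, their order, the absence of a header row, and the sample values are all assumptions made for illustration; the actual layout is documented in _data_description.pdf.

```python
import csv
import io

# Hypothetical column names and order; check _data_description.pdf for the real layout.
COLUMNS = ["patent", "paperid", "reftype", "confscore"]


def load_citations(handle, min_confidence=1):
    """Yield citation rows whose confidence score is at least min_confidence."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        score = int(row["confscore"])
        if score >= min_confidence:
            # Return a copy of the row with the score converted to an integer.
            yield {**row, "confscore": score}


# Made-up rows standing in for a few lines of a file such as _pcs_mag.tsv.
SAMPLE = (
    "US-0000001\t111\tapp\t10\n"
    "US-0000002\t222\texm\t3\n"
    "US-0000003\t333\tapp\t8\n"
)

high_confidence = list(load_citations(io.StringIO(SAMPLE), min_confidence=8))
print([row["patent"] for row in high_confidence])
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by a handle from `open(path, newline="")`. Because the function is a generator, a large file is streamed row by row rather than loaded into memory at once.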


Cite as