There is a newer version of the record available.

Published August 29, 2019 | Version v17
Dataset Open

Patent Citations to Science

  • 1. Boston University

Description

This dataset contains citations from USPTO patents granted 1947-2018 to articles captured by the Microsoft Academic Graph (MAG) from 1800-2018.

The main file, pcs.tsv, contains the resolved citations. Fields are tab-separated. Each match has the patent number, the MAG ID, the original citation from the patent, an indicator for whether the citation was supplied by the applicant, the examiner, or an unknown source, and a confidence score (1-10) indicating how likely the match is to be correct. Note that this distribution does not contain matches with a confidence of 1 or 2.
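As a rough illustration of working with the fields described above, the following Python sketch parses tab-separated rows and keeps only matches at or above a chosen confidence. The column order (patent, MAG ID, citation text, source indicator, confidence), the `Citation` and `read_citations` names, and the sample rows are all assumptions made for this example; check the actual layout of pcs.tsv before relying on it.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Citation(NamedTuple):
    """One patent-to-paper match (field names are illustrative)."""
    patent: str
    mag_id: str
    raw_citation: str
    source: str        # applicant, examiner, or unknown
    confidence: int    # 1-10, higher means more likely correct


def read_citations(stream: TextIO, min_confidence: int = 3) -> Iterator[Citation]:
    """Yield matches whose confidence is at least min_confidence.

    Assumes five tab-separated columns in the order listed in the
    description; the real column order may differ.
    """
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 5:
            continue  # skip malformed or unexpected rows
        patent, mag_id, raw, source, conf = row
        if not conf.isdigit():
            continue  # skips a header line, if one is present
        if int(conf) >= min_confidence:
            yield Citation(patent, mag_id, raw, source, int(conf))


# Made-up rows for demonstration only; they are not taken from the dataset.
sample = (
    "US1234567\t2100000001\tSmith J., Nature 1999\tapplicant\t10\n"
    "US7654321\t2100000002\tDoe A., Science 2005\texaminer\t4\n"
)
rows = list(read_citations(io.StringIO(sample), min_confidence=5))
print(rows)
```

For the real file, the same function could be given `open("pcs.tsv", encoding="utf-8", newline="")` in place of the in-memory sample, assuming the file is UTF-8 encoded.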

There is also a PubMed-specific match in pcs-pubmed.tsv.

The remaining files are a redistribution of the 1 January 2019 release of the Microsoft Academic Graph. All of these files are compressed in ZIP format under CentOS 5. The original files, documented at https://docs.microsoft.com/en-us/academic-services/graph/reference-data-schema, can be downloaded from https://aka.ms/msracad. This redistribution carves up the original files into smaller, variable-specific files that can be loaded individually. See _reliance_on_science.pdf for full details; the latest version is available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686.
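Because each variable ships as its own ZIP archive, a file can be streamed line by line without unpacking it to disk. The sketch below uses Python's standard `zipfile` module. The helper name `iter_zip_lines`, the UTF-8 encoding, and the member name `example.tsv` are assumptions for illustration, and the demo archive is built in memory rather than taken from this record.

```python
import io
import zipfile
from typing import Iterator, Optional


def iter_zip_lines(zip_source, member: Optional[str] = None,
                   encoding: str = "utf-8") -> Iterator[str]:
    """Yield text lines from one member of a ZIP archive.

    zip_source may be a path or a binary file-like object. If member is
    None, the first entry in the archive is used.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            # Wrap the binary stream so lines are decoded lazily.
            for line in io.TextIOWrapper(raw, encoding=encoding):
                yield line.rstrip("\n")


# Build a tiny in-memory archive so the example runs on its own.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("example.tsv", "123\t0.5\n456\t0.7\n")
buffer.seek(0)

demo_lines = list(iter_zip_lines(buffer))
print(demo_lines)
```

With a downloaded archive, passing its path (for example `iter_zip_lines("jif.zip")`) should work the same way, provided the archive holds a single text file in the assumed encoding.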

Source code for generating the patent citations to science in pcs.tsv is available at https://github.com/mattmarx/reliance_on_science. Source code for generating jif.zip and jcif.zip (Journal Impact Factor and Journal Commercial Impact Factor) is at https://github.com/mattmarx/jcif.

Although MAG contains authors and affiliations for each paper, it does not contain the location for affiliations. We have created a dataset of locations for affiliations appearing at least 100 times using Bing Maps and Google Maps; however, it is unclear to us whether the API licensing terms allow us to repost their data. In any case, you can download our source code for doing so here: https://github.com/ksjiaxian/api-requester-locations.

MAG extracts field keywords for each paper (paperfieldid.zip and fieldidname.zip): more than 200,000 fields in all! If you are looking to study industries or technical areas, you might find this a bit overwhelming. We mapped the MAG subjects to six OECD fields and 39 subfields, defined here: http://www.oecd.org/science/inno/38235147.pdf. Clarivate provides a crosswalk between the OECD classifications and Web of Science fields, so we include WoS fields as well. This file is magfield_oecd_wos_crosswalk.zip.

Files (45.4 GB)

Checksum                              Size
md5:75f5bf12a703348312d6d6083d3d314d  693.6 kB
md5:0917e7304059b52619782aa4a5f1f24a  2.8 GB
md5:9e35a6df4f3f6b0fe525eed10afae3d3  3.0 GB
md5:f8501b603ac284a7c168d72a1511ad36  78.9 kB
md5:a68b721d656a7be3ca6efb677d0a39b0  4.2 MB
md5:c2f351238565d2216136aeaacdf55914  5.2 MB
md5:7c66b0a4d51721179ce103ce9fdb35c9  8.1 MB
md5:4fb35d70897e46a5b3f1ac9a723c095a  1.3 MB
md5:bbe297e3f6a71b79d3b754ab00c3eba0  2.2 GB
md5:3d7dbb590fa0f834a938e3897b71f4f5  4.3 GB
md5:9705a0dc6d517b2336ecc148ba591982  3.5 GB
md5:84c293aba31f57bbb85d2e6d5f65dfce  7.8 GB
md5:cfde2972be81f7db051edc37e903ac91  448.7 MB
md5:ae6a01a43054910834667f6763c4b13e  1.3 GB
md5:78e5e3e144a42e8b22bc1f85c2b8ed3e  5.7 GB
md5:d9a425c7c183d3a12762d0bf1ced17f2  807.1 MB
md5:95c371e6e21169c13e1c5b3e6b7b8aab  6.9 GB
md5:43535c579a791b6f07d11b1c3c381c4f  1.1 GB
md5:d0067ff44ce5aee7db1be8e51398f950  620.2 MB
md5:c55c296aa57a98e543383c4b0a8b06cc  1.5 GB
md5:486dead83a2c7f1a5f6e23e58960ed0e  3.4 GB
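
Since several of these files are multiple gigabytes, it may be worth confirming a download against its listed md5 value. The following Python sketch hashes a stream in chunks so the whole file never has to fit in memory. The helper names `md5_of_stream` and `matches_listing` are hypothetical, and the demo hashes a short in-memory byte string rather than any file from this record.

```python
import hashlib
import io
from typing import BinaryIO


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listing(stream: BinaryIO, listed: str) -> bool:
    """Compare a stream's digest with a listed value such as 'md5:abc...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_stream(stream) == expected


# Demonstration on an in-memory byte string, not a file from this record.
demo_digest = md5_of_stream(io.BytesIO(b"hello world"))
print(demo_digest)
print(matches_listing(io.BytesIO(b"hello world"), "md5:" + demo_digest))
```

For a real download, the stream would come from `open(path, "rb")`, and `listed` would be the checksum shown for that file.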