Reliance on Science in Patenting
Description
This dataset contains citations from the front pages of worldwide patents to articles captured by the Microsoft Academic Graph (MAG), covering 1800 to 2018. If you use the data, please cite these two papers:
For the dataset of citations: Marx, Matt and Aaron Fuegi, "Reliance on Science: Worldwide Front-Page Patent Citations to Scientific Articles" (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686).
For the underlying dataset of papers: Sinha, Arnab, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15 Companion). ACM, New York, NY, USA, 243-246.
The main file, pcs.tsv, contains the resolved citations. Fields are tab-separated. Each match has the patent number, the MAG ID, an indicator of whether the citation was supplied by the applicant, by the examiner, or by an unknown source, and a confidence score (1-10) indicating how likely the match is to be correct. Note that this distribution does not contain matches with a confidence of 1 or 2. Non-USPTO patent numbers begin with the country abbreviation followed by a hyphen.
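As a minimal sketch of how the rows described above might be read, the following uses only the Python standard library. The column order (patent, MAG ID, source indicator, confidence) simply follows the order of the description and is an assumption, as are the names `Citation`, `read_citations`, and `split_country`, the made-up sample rows, and the placeholder indicator codes `app` and `exm`. Check the actual layout in _relianceonscience.pdf before relying on it.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO, Tuple


class Citation(NamedTuple):
    patent: str      # patent number, possibly with a country prefix
    mag_id: str      # MAG paper identifier
    source: str      # applicant / examiner / unknown indicator
    confidence: int  # match confidence score


def read_citations(handle: TextIO, min_confidence: int = 3) -> Iterator[Citation]:
    """Yield citations with confidence >= min_confidence.

    ASSUMPTION: four tab-separated columns in the order
    patent, MAG ID, source indicator, confidence.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4 or not row[3].strip().isdigit():
            continue  # skip blank lines or a header row, if present
        cit = Citation(row[0], row[1], row[2], int(row[3]))
        if cit.confidence >= min_confidence:
            yield cit


def split_country(patent: str, default: str = "US") -> Tuple[str, str]:
    """Split 'EP-1234567' into ('EP', '1234567').

    ASSUMPTION: a number without a hyphenated prefix is a USPTO patent.
    """
    country, sep, number = patent.partition("-")
    if sep:
        return country, number
    return default, patent


# Made-up sample rows; 'app' and 'exm' are placeholder indicator codes.
sample = "7654321\t2100000001\tapp\t10\nEP-1234567\t2100000002\texm\t4\n"
for c in read_citations(io.StringIO(sample), min_confidence=5):
    print(split_country(c.patent), c.mag_id, c.source, c.confidence)
```

For the real file, `read_citations(open("pcs.tsv", encoding="utf-8", newline=""))` would stream rows without loading everything into memory. Raising `min_confidence` trades recall for precision.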
A PubMed-specific set of matches is also provided in pcs-pubmed.tsv. This file is currently limited to citations from USPTO patents.
The remaining files are a redistribution of the 1 January 2019 release of the Microsoft Academic Graph. All of these files are compressed using ZIP compression under CentOS5. Original files, documented at https://docs.microsoft.com/en-us/academic-services/graph/reference-data-schema, can be downloaded from https://aka.ms/msracad; this redistribution carves up the original files into smaller, variable-specific files that can be loaded individually (see _relianceonscience.pdf for full details).
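Because several of the archives are multiple gigabytes, it can be convenient to stream rows directly from a ZIP file rather than extracting it first. The sketch below assumes each archive holds a single tab-separated text file encoded as UTF-8; that layout is an assumption, and `iter_zipped_rows` is a hypothetical helper name. The demonstration runs against a small in-memory archive with made-up contents.

```python
import csv
import io
import zipfile
from typing import Iterator, List


def iter_zipped_rows(zip_source, delimiter: str = "\t") -> Iterator[List[str]]:
    """Stream rows from the first member of a ZIP archive.

    zip_source may be a file path or a binary file-like object.
    ASSUMPTION: the first member is a UTF-8, tab-separated text file.
    """
    with zipfile.ZipFile(zip_source) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE keeps stray quote characters in titles from
            # being interpreted as CSV quoting.
            yield from csv.reader(text, delimiter=delimiter, quoting=csv.QUOTE_NONE)


# Build a tiny in-memory archive with made-up rows to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("demo.tsv", "2100000001\t1999\n2100000002\t2005\n")
buffer.seek(0)

demo_rows = list(iter_zipped_rows(buffer))
print(demo_rows)
```

With a downloaded archive, the same function could be called with a path such as `iter_zipped_rows("paperfieldid.zip")`, iterating lazily so that only one row is held in memory at a time.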
Source code for generating the patent citations to science in pcs.tsv is available at https://github.com/mattmarx/reliance_on_science. Source code for generating jif.zip and jcif.zip (Journal Impact Factor and Journal Commercial Impact Factor) is at https://github.com/mattmarx/jcif.
MAG extracts field keywords for each paper (paperfieldid.zip and fieldidname.zip), more than 200,000 fields in all. When studying industries or technical areas, you might find this a bit overwhelming, so we mapped the MAG subjects to six OECD fields and 39 subfields, defined here: http://www.oecd.org/science/inno/38235147.pdf. Clarivate provides a crosswalk between the OECD classifications and Web of Science fields, so we include WoS fields as well. The mapping is in magfield_oecd_wos_crosswalk.zip.
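To illustrate how such a crosswalk might be applied, here is a small sketch that collapses fine-grained MAG field IDs into OECD fields per paper. The data structures are hypothetical: the actual column layouts of paperfieldid.zip and magfield_oecd_wos_crosswalk.zip are not described here, so the (paper ID, field ID) pairs, the field-to-OECD dictionary, the function name `papers_to_oecd`, and all sample values are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def papers_to_oecd(
    paper_fields: Iterable[Tuple[str, str]],
    crosswalk: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Map each paper to the set of OECD fields its MAG fields fall under.

    paper_fields: (paper_id, mag_field_id) pairs   (ASSUMED layout)
    crosswalk:    mag_field_id -> OECD field name  (ASSUMED layout)
    MAG fields missing from the crosswalk are ignored.
    """
    result: Dict[str, Set[str]] = defaultdict(set)
    for paper_id, field_id in paper_fields:
        oecd = crosswalk.get(field_id)
        if oecd is not None:
            result[paper_id].add(oecd)
    return dict(result)


# Made-up identifiers; only the OECD field names are real category labels.
demo_crosswalk = {
    "f1": "Natural sciences",
    "f2": "Natural sciences",
    "f3": "Medical and health sciences",
}
demo_paper_fields = [("p1", "f1"), ("p1", "f2"), ("p2", "f3"), ("p2", "f9")]

demo_mapping = papers_to_oecd(demo_paper_fields, demo_crosswalk)
print(demo_mapping)
```

Using a set per paper means that several MAG keywords mapping to the same OECD field count only once, which is usually what is wanted when grouping papers by broad area.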
Files
Total size: 42.3 GB

| MD5 checksum | Size |
|---|---|
| 019c978ea8c12fcb78ef0d78a9520da2 | 1.1 MB |
| 0917e7304059b52619782aa4a5f1f24a | 2.8 GB |
| 9e35a6df4f3f6b0fe525eed10afae3d3 | 3.0 GB |
| f8501b603ac284a7c168d72a1511ad36 | 78.9 kB |
| a68b721d656a7be3ca6efb677d0a39b0 | 4.2 MB |
| c2f351238565d2216136aeaacdf55914 | 5.2 MB |
| 7c66b0a4d51721179ce103ce9fdb35c9 | 8.1 MB |
| 4fb35d70897e46a5b3f1ac9a723c095a | 1.3 MB |
| bbe297e3f6a71b79d3b754ab00c3eba0 | 2.2 GB |
| 3d7dbb590fa0f834a938e3897b71f4f5 | 4.3 GB |
| 9705a0dc6d517b2336ecc148ba591982 | 3.5 GB |
| 84c293aba31f57bbb85d2e6d5f65dfce | 7.8 GB |
| cfde2972be81f7db051edc37e903ac91 | 448.7 MB |
| ae6a01a43054910834667f6763c4b13e | 1.3 GB |
| 78e5e3e144a42e8b22bc1f85c2b8ed3e | 5.7 GB |
| d9a425c7c183d3a12762d0bf1ced17f2 | 807.1 MB |
| 95c371e6e21169c13e1c5b3e6b7b8aab | 6.9 GB |
| 43535c579a791b6f07d11b1c3c381c4f | 1.1 GB |
| d0067ff44ce5aee7db1be8e51398f950 | 620.2 MB |
| c55c296aa57a98e543383c4b0a8b06cc | 1.5 GB |
| c951c01569247023737e9858aae76792 | 154.4 MB |
| f6fa79f9c3f9e37f7528a0aca2263d07 | 212.4 MB |
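The md5 values listed for the files can be used to confirm that a large download arrived intact. A minimal sketch follows, using `hashlib` and reading in chunks so that multi-gigabyte files do not need to fit in memory. The helper names `md5_of_stream` and `matches_checksum` are hypothetical, and the demonstration hashes a short in-memory byte string rather than one of the dataset files.

```python
import hashlib
import io
from typing import BinaryIO


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(stream: BinaryIO, expected_md5: str) -> bool:
    """Compare a stream's MD5 with a listed checksum (case-insensitive)."""
    return md5_of_stream(stream) == expected_md5.strip().lower()


# Demonstration on in-memory bytes; the MD5 of b"hello" is well known.
print(md5_of_stream(io.BytesIO(b"hello")))
# -> 5d41402abc4b2a76b9719d911017c592
```

For a downloaded file, open it in binary mode, for example `with open(path, "rb") as fh: matches_checksum(fh, listed_md5)`, where `path` and `listed_md5` stand for the local file and the corresponding checksum from the listing.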
Additional details
References
- Marx, Matt and Aaron Fuegi, "Reliance on Science in Patenting: USPTO Front-Page Citations to Scientific Articles" (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686)
- Sinha, Arnab, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246