Dataset Open Access
This Zenodo page describes the data collection and processing behind a set of open access data files related to the text of USPTO patent documents, along with the files themselves. The file "0_Data_Description_Zenodo.pdf" provides more details. If you use the code or data, please cite the following paper:
Arts S, Hou J, Gomez JC (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Research Policy, forthcoming. (https://doi.org/10.1016/j.respol.2020.104144)
Name | MD5 checksum | Size |
---|---|---|
0_Data_Description_Zenodo.pdf | ce0332320560f80efa6a86fcdbbae986 | 535.3 kB |
1000_most_similar_patents.zip | 0660f13ff52576b824432ff8c6fbe628 | 46.0 GB |
100_most_similar_patents.zip | cbef0725269ac2185034a30b365066a9 | 5.1 GB |
cosine_similarity.zip | 025c03d1b7f32acc75e93bc4f6d5aa38 | 80.9 MB |
greek.txt | aea2752c5e38c3ed96976e9264d88a1a | 685 Bytes |
keywords.zip | b1fe1e41a8da1c7ed8948487c7a1089f | 903.6 MB |
new_bigrams.zip | 1a0268bc4a8ca3d83deb072e558990e1 | 68.5 MB |
new_keyword_comb_1980_1989.zip | ee65ce71ae3c02319db065685420f056 | 492.7 MB |
new_keyword_comb_1990_1994.zip | 166d0b81fc60b8714ae77c230b648295 | 351.3 MB |
new_keyword_comb_1995_1999.zip | 4621d1f6feaaca64f3adec600a6c624f | 866.1 MB |
new_keyword_comb_2000_2004.zip | fc14065819616aee644b01fa2971b9e7 | 774.7 MB |
new_keyword_comb_2005_2009.zip | e7167dc5b23816fbfc1ade0e3e047566 | 748.1 MB |
new_keyword_comb_2010_2018.zip | f34e22ad7a57fe8aff646c5ebb08fe12 | 557.7 MB |
new_keyword_comb_all.zip | 764c38f0e64d0bcc20d8f7d709b4cfd1 | 3.1 GB |
new_keywords.zip | ad9ee88d67e61888fae961fa29894148 | 10.0 MB |
new_trigrams.zip | ae488e2e3460afdb5df0d7ae12b5a409 | 113.5 MB |
patent txt raw.zip | 5ebdfe48395eec11e4f1a2de9490132e | 6.3 GB |
patent_text_measures.zip | 675a63980d9deb10eb2062da80a045ce | 100.3 MB |
stopwords.txt | d42922204201e14c015aecd0f0762bd2 | 395.0 kB |
symbols.txt | 5d4d932a407310fabea7e80531a9b467 | 167 Bytes |
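
Each file in the listing carries an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch of such a check, not part of the published code: the function names `md5_of_file` and `matches_checksum` are hypothetical, and the example assumes `greek.txt` has already been downloaded into the current working directory. The file is read in chunks because some archives in the listing are tens of gigabytes in size.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with a listed value, with or without the 'md5:' prefix."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumption: greek.txt was saved next to this script.
    local_file = Path("greek.txt")
    listed = "md5:aea2752c5e38c3ed96976e9264d88a1a"  # value from the file listing
    if local_file.exists():
        status = "OK" if matches_checksum(local_file, listed) else "MISMATCH"
        print(local_file, status)
    else:
        print(local_file, "not found; download it first")
```

A mismatch usually indicates an interrupted or corrupted transfer, in which case downloading the file again is the simplest remedy.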
Statistic | All versions | This version |
---|---|---|
Views | 5,309 | 5,307 |
Downloads | 6,474 | 6,474 |
Data volume | 13.7 TB | 13.7 TB |
Unique views | 4,727 | 4,725 |
Unique downloads | 4,030 | 4,030 |