Dataset Open Access
Arora Ashish; Belenzon Sharon; Sheer Lia
This database links patent data to Compustat firms. When using the data, please cite "WHY DO FIRMS INVEST IN RESEARCH?" (Arora, Belenzon and Sheer), NBER WP 23187.
To merge the data into Compustat, please follow the Stata DO files, which join on the field "gvkey". The main program, “main_do_file.do”, runs all the other do files. See the Readme file for more detail.
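For readers working outside Stata, the idea of a merge keyed on "gvkey" can be sketched in plain Python. This is only an illustration, not the logic of the distributed DO files: the column names (`fyear`, `sale`, `year`, `patent_id`), the sample values, and the helper `merge_on_gvkey` are all hypothetical and do not describe the actual file layout. One general point does carry over: the key must have the same type and formatting in both files (for example, a zero-padded string in both) or the rows will not match.

```python
from collections import defaultdict


def merge_on_gvkey(compustat_rows, patent_rows):
    """Left-join yearly patent counts onto firm-year rows by (gvkey, year).

    Hypothetical helper: field names are assumptions, not the dataset's schema.
    """
    # Count patents per (gvkey, year) pair.
    counts = defaultdict(int)
    for p in patent_rows:
        counts[(p["gvkey"], p["year"])] += 1

    # Keep every Compustat row; firm-years without patents get a count of 0.
    merged = []
    for row in compustat_rows:
        out = dict(row)
        out["n_patents"] = counts.get((row["gvkey"], row["fyear"]), 0)
        merged.append(out)
    return merged


# Made-up sample rows, for illustration only.
compustat = [
    {"gvkey": "001000", "fyear": 2010, "sale": 120.5},
    {"gvkey": "001000", "fyear": 2011, "sale": 130.0},
    {"gvkey": "002000", "fyear": 2010, "sale": 75.2},
]
patents = [
    {"patent_id": "P1", "gvkey": "001000", "year": 2010},
    {"patent_id": "P2", "gvkey": "001000", "year": 2010},
    {"patent_id": "P3", "gvkey": "002000", "year": 2010},
]

merged = merge_on_gvkey(compustat, patents)
for row in merged:
    print(row["gvkey"], row["fyear"], row["n_patents"])
```

The left join keeps all Compustat firm-years, so firms without matched patents remain in the sample with a zero count rather than being dropped.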
This project introduces a major extension and improvement of the historical NBER patent dataset, which should be valuable for all researchers working with patent data linked to firms. In extending the match between Compustat and patents to 2015, we address two major challenges: name changes and ownership changes. These challenges are central to how patents are assigned to firms over time. To be consistent over the sample period, we reconstruct the complete historical data covered in the NBER data files.
About 30% of the Compustat firms in our sample change their name at least once. Accounting for name changes improves the accuracy and scope of the matches to patents (and other assets), of the ownership structure, and of the dynamic reassignment of GVKEY codes to companies. Dynamic reassignment means that, for instance, if a sample firm merges with another firm, the patents of the merged firm are included in the stock of patents linked to the Compustat record from that point onward, but not before.
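The dynamic reassignment described above can be illustrated with a small sketch. It assumes a hypothetical ownership table that records, for each patenting entity, the GVKEY that owns it and the first and last year of that ownership; the field names, the function `patent_stock`, and the sample values are invented for illustration and are not taken from the data files or the DO files.

```python
def patent_stock(gvkey, year, ownership, patents):
    """Return the patent IDs linked to `gvkey` in `year`.

    Hypothetical sketch: an entity's patents count toward a GVKEY only in
    years when that GVKEY owns the entity, and only once they are granted.
    """
    # Entities owned by this GVKEY in the given year.
    owned = {
        o["entity"]
        for o in ownership
        if o["gvkey"] == gvkey and o["start"] <= year <= o["end"]
    }
    # Patents of those entities granted up to and including that year.
    return sorted(
        p["patent_id"]
        for p in patents
        if p["entity"] in owned and p["grant_year"] <= year
    )


# Made-up example: firm A is the Compustat record; it merges with B in 2005.
ownership = [
    {"entity": "A", "gvkey": "001000", "start": 1990, "end": 2015},
    {"entity": "B", "gvkey": "001000", "start": 2005, "end": 2015},
]
patents = [
    {"patent_id": "A1", "entity": "A", "grant_year": 1998},
    {"patent_id": "B1", "entity": "B", "grant_year": 2001},
    {"patent_id": "B2", "entity": "B", "grant_year": 2007},
]

print(patent_stock("001000", 2004, ownership, patents))
print(patent_stock("001000", 2005, ownership, patents))
print(patent_stock("001000", 2008, ownership, patents))
```

In this toy example, B's patent granted in 2001 enters the stock of GVKEY 001000 only from the 2005 merger year onward, which is the "from that point onward, but not before" behavior.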
For ownership and subsidiary data we rely on a wide range of M&A data, including SDC, historical snapshots of ORBIS files for 2002-2015, 10-K SEC filings, and NBER2006. We also perform extensive manual checks that help us uncover firms’ structure and ownership changes before proceeding to the patent match. Thus, we have extended and improved the NBER patent data. In the enclosed "Data Appendix", we document our data construction work, present several examples (“case studies”), and outline the improvements we made to existing NBER historical patent data.
"Why do firms invest in scientific research?", Ashish Arora, Sharon Belenzon and Lia Sheer, NBER WP 23187.