DISCERN 2: Duke Innovation & SCientific Enterprises Research Network
Description
The DISCERN dataset was developed to support academic research on corporate innovation by linking data on U.S. publicly listed firms from Standard & Poor’s Compustat database to their patents and scientific publications. A key feature of DISCERN is its comprehensive coverage of firms’ subsidiaries and their ownership changes over time, which is crucial for accurately mapping corporate innovation. Patents and publications may be assigned to various legal entities within a firm’s organizational structure. Subsidiaries may change ownership in M&A events. By accounting for these ownership linkages over time, DISCERN enables researchers to construct more precise measures of firms’ knowledge production and examine the factors influencing their R&D investment decisions.
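The idea of time-varying ownership can be illustrated with a minimal sketch. Everything below is hypothetical and for illustration only: the `OwnershipSpell` record, its field names, the firm and subsidiary names, and the years are assumptions, not the actual DISCERN schema or data (see the data dictionary for the real variables). The sketch only shows why a patent assigned to a subsidiary has to be attributed to whichever parent owned that subsidiary in the relevant year.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OwnershipSpell:
    """Hypothetical record: one parent owning one subsidiary for a span of years."""
    subsidiary: str
    parent: str
    start_year: int
    end_year: int  # inclusive


# Made-up example: "Sub A" changes hands in an M&A event in 2005.
SPELLS = [
    OwnershipSpell("Sub A", "Firm X", 1995, 2004),
    OwnershipSpell("Sub A", "Firm Y", 2005, 2021),
]


def owner_in_year(assignee: str, year: int,
                  spells: List[OwnershipSpell] = SPELLS) -> Optional[str]:
    """Return the parent that owned `assignee` in `year`, or None if no spell covers it."""
    for spell in spells:
        if spell.subsidiary == assignee and spell.start_year <= year <= spell.end_year:
            return spell.parent
    return None
```

Under these assumptions, a patent assigned to "Sub A" in 2000 is attributed to "Firm X", while one assigned in 2010 is attributed to "Firm Y". A static name match would credit both to the same firm.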
Version 2.0 incorporates several key improvements over the previous version of DISCERN. First, we shift to using the PatentsView database as the main source of patent data and OpenAlex as the main source of scientific publication data. PatentsView is publicly available and continuously maintained directly by the United States Patent and Trademark Office (USPTO). OpenAlex is currently the only open data source of scientific publication metadata. Using freely available data sources allows us to share both the patent and the publication datasets openly. This enhances data access, which was previously limited due to the use of proprietary data. Second, the updated dataset now covers the period from 1980 to 2021, providing an additional six years of data. Third, we transition to using Securities and Exchange Commission (SEC) filings as the primary source of subsidiary data. This allows us to trace ownership linkages further back, to the mid-1990s, and is more reliable than the Orbis data used in the original version, which had comprehensive coverage only from 2008. Finally, by transitioning to PatentsView and additional data sourced from the USPTO, we expand the scope of the dataset to include pre-grant patent applications and patent re-assignment information. This addition allows users to study patent applications regardless of grant status and to observe ownership transitions beyond those related to mergers and acquisitions.
Special thanks and appreciation go to Sanskriti Purohit and Ron Rabi for their diligent work and dedication to this effort.
The dataset is freely available under the O-UDA-1.0 License, permitting unrestricted use for research and commercial purposes. We request that users provide proper citations when utilizing the dataset. The license also allows for the creation of derivative datasets based on DISCERN, with the condition that creators ask their downstream users to cite the original authors appropriately.
If you use the data, please add these citations:
1. Arora, A., Belenzon, S., Cioaca, L., Sheer, L., Shin, H.M., & Shvadron, D. (2024). DISCERN 2.0: Duke Innovation & SCientific Enterprises Research Network [Dataset]. In Zenodo (CERN European Organization for Nuclear Research). https://doi.org/10.5281/zenodo.3594642
2. Arora, A., Belenzon, S., Cioaca, L., Sheer, L., & Shvadron, D. (2024). Back to the Future: Are Big Firms Regaining their Scientific and Technological Dominance? Evidence from DISCERN 2.0 (available soon)
Files
Data dictionary.pdf
Total size: 404.0 MB
MD5 checksum | Size |
---|---|
md5:aca75dbab29c41d7bb723446cdff104b | 179.0 kB |
md5:564cc14c112c2925c225d89495ff25b5 | 403.9 MB |
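The MD5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listed_md5` are illustrative, and the local file path is whatever name the file was saved under; neither comes from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:aca75d...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(local_path, "md5:564cc14c112c2925c225d89495ff25b5")` should return `True` for an intact copy of the 403.9 MB file, where `local_path` is the location the file was saved to.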