CSET scholarly literature metadata over OpenAlex works
Creators
- Center for Security and Emerging Technology
Contributors
Contact person:
Data collector:
Project member:
Researchers:
- Center for Security and Emerging Technology
Description
This dataset contains metadata developed at the Center for Security and Emerging Technology that augments OpenAlex works. Currently, this includes outputs of AI, Computer Vision, Robotics, Natural Language Processing, and Cybersecurity classifiers for English-language OpenAlex works published after 2009, and title and abstract-level language IDs. For works with positive predictions for AI, Computer Vision, Robotics, or Natural Language Processing, predictions by an AI Safety classifier are also available.
The attached zip file contains the JSONL files that make up the dataset. Each row conforms to this schema, with null values omitted. The dataset is a work in progress, and full documentation will be made available at a later date.
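As a minimal sketch of how the archive could be read, the Python snippet below streams rows straight from the zip without extracting it. The helper name `iter_rows` and the `.jsonl` member suffix are assumptions made for illustration; the actual file names inside the archive and the field names in each row are not documented here and should be checked against the schema.

```python
import io
import json
import zipfile
from typing import Any, Dict, Iterator


def iter_rows(zip_path: str, suffix: str = ".jsonl") -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty line of every archive member ending in `suffix`.

    The default suffix is an assumption about how the JSONL files are named.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(suffix):
                continue  # skip directories and non-JSONL members
            with archive.open(name) as raw:
                # Decode the binary member stream line by line as UTF-8.
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

Because null values are omitted from each row, a field that is absent should be read as null. Using `row.get("some_field")` (a placeholder key, not a real field name) instead of `row["some_field"]` avoids a `KeyError` for such rows.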
Files
Name | Size |
---|---|
cset_openalex.zip (md5:b4db5b0ac001db561bdaf204a21cb703) | 1.9 GB |
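The MD5 checksum listed for cset_openalex.zip can be used to confirm that a download completed without corruption. The following Python sketch computes the digest in chunks so the 1.9 GB file is never loaded into memory at once. The function names `md5_of_file` and `is_intact` are illustrative, and the local path passed in is whatever location the file was saved to.

```python
import hashlib

# Checksum as listed for cset_openalex.zip on this record.
EXPECTED_MD5 = "b4db5b0ac001db561bdaf204a21cb703"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of the file at `path`, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare the computed digest with the expected one, ignoring case."""
    return md5_of_file(path) == expected.lower()
```

A call such as `is_intact("cset_openalex.zip")` returning `False` would suggest the download is incomplete or that the archive has been updated since this checksum was recorded.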
Additional details
Related works
- Documents
- 10.51593/20220030 (DOI)
- 10.48550/arXiv.2002.07143 (DOI)
- Is referenced by
- Publication: 10.48550/arXiv.2403.09097 (DOI)
Software
- Repository URL
- https://github.com/georgetown-cset/cset_openalex
- Programming language
- Python
- Development Status
- Active