Published August 22, 2023 | Version v1
Dataset
Open
PICKLE Dataset
Description
The PICKLE dataset accompanies the paper "In a PICKLE: A gold standard entity and relation corpus for the molecular plant sciences". It is a natural language processing (NLP) dataset of scientific abstracts labeled with gold standard entities and relations. The abstracts were drawn from PubMed searches for the terms "jasmonic acid" and "gibberellic acid". The brat-formatted version (.txt/.ann files) contains 6,245 entity and 2,149 relation annotations across 250 documents. The jsonl-formatted version contains 6,164 entity and 2,094 relation annotations, because annotations that cannot be aligned to the tokenization used in the jsonl format (81 entities and 55 relations) are dropped.
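As a rough way to check the brat-side counts quoted above, the sketch below tallies entity and relation lines in brat standoff (.ann) text. It relies only on the general brat convention that entity IDs begin with `T` and relation IDs begin with `R`. The function name, the sample annotations, and the type labels `Gene`, `Hormone`, and `interacts` are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def count_brat_annotations(ann_text: str) -> Counter:
    """Count entity (T*) and relation (R*) lines in brat standoff text."""
    counts = Counter()
    for line in ann_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The annotation ID is the first tab-separated field, e.g. "T1" or "R1".
        ann_id = line.split("\t", 1)[0]
        if ann_id.startswith("T"):
            counts["entities"] += 1
        elif ann_id.startswith("R"):
            counts["relations"] += 1
    return counts


# Hypothetical sample in brat standoff layout (not real PICKLE annotations).
sample = (
    "T1\tGene 0 4\tJAZ1\n"
    "T2\tHormone 14 27\tjasmonic acid\n"
    "R1\tinteracts Arg1:T1 Arg2:T2\n"
)
print(count_brat_annotations(sample))
```

Summing these counters over every .ann file in the unzipped brat archive should, if the files follow standard brat standoff, give totals comparable to the entity and relation counts stated in the description.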
Files
brat_formatted_PICKLE_dataset_unsplit.zip
Total size: 859.2 kB

| Checksum | Size |
|---|---|
| md5:caaa214a3d6a5f6355127d2eec97c477 | 403.4 kB |
| md5:9792e4a131226228aff846fcb50cb89a | 455.8 kB |
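The MD5 checksums listed for the two files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_record` are hypothetical, and this listing does not say which checksum belongs to which file name, so the pairing has to be confirmed against the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected: str) -> bool:
    """Compare a file's MD5 with a value such as 'md5:caaa21...'."""
    # Accept the checksum with or without the 'md5:' prefix.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_record("download.zip", "md5:caaa214a3d6a5f6355127d2eec97c477")` would return `True` only if the local file (here a placeholder name) hashes to that value.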
Additional details
Funding
- U.S. National Science Foundation
  - NRT-HDR: Intersecting computational and data science to address grand challenges in plant biology (DGE-1828149)
  - TRTech-PGR: Connecting sequences to functions within and between species through computational modeling and experimental studies (IOS-2107215)
  - RESEARCH-PGR: Combining machine learning and experimental analysis to define trichome and root-specific gene regulatory networks in cultivated tomato and related Solanaceae species (IOS-2218206)
  - Assessing the connections between genetic interactions, environments, and phenotypes in Arabidopsis thaliana (MCB-2210431)
- Great Lakes Bioenergy Research Center
  - Great Lakes Bioenergy Research Center (BER DE-SC0018409)
Dates
- Accepted: 2023-11-06. Manuscript formally accepted to in silico Plants
- Available: 2023-11-07. First Zenodo upload of both dataset formats
- Available: 2023-08-22. jsonl data uploaded to Huggingface