Published May 13, 2024 | Version 1.0.0
Dataset Open

Pharmacokinetic Relation Extraction Database (PRED)

Creators

Description

Annotated data for training and evaluating relation extraction models that extract pharmacokinetic (PK) parameter estimates from scientific text.

Training, development and test files are released in JSONL format and store the annotated data for training and evaluating end-to-end relation extraction models.

Each line in the JSONL files corresponds to an annotated sentence with the following information: 

  • text: Raw sentence
  • relations: List of relations, each containing:
    • head_span: Head entity of the relation, as a dictionary containing the start character, end character and entity type label
    • child_span: Child entity of the relation
    • label: Relation type label (i.e. either C_VAL, D_VAL or RELATED)
  • spans: List of entities mentioned in the sentence, specifying: (1) the character-level boundaries and (2) the entity type label of each annotation (i.e. either PK, VALUE, UNITS, RANGE or COMPARE). This field is not strictly required to train the model since all spans are defined within the relations.
  • sentence_hash: Unique sentence ID
  • metadata: Metadata with unique article, paragraph and sentence identifiers and the article section from which the sentence was extracted
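The record layout described above can be read with the standard json module. The following is a minimal sketch, not the authors' loading code: the top-level field names come from the list above, while the span keys ("start", "end", "label"), the end-exclusive offsets, the example sentence and its annotations are all assumptions made for illustration.

```python
import json

# Illustrative record only. Top-level field names follow the description above;
# the span keys "start", "end" and "label", the end-exclusive offsets, the
# sentence and its annotations are assumptions, not copied from the dataset.
EXAMPLE_LINE = json.dumps({
    "text": "The clearance was 5.2 L/h.",
    "relations": [
        {
            "head_span": {"start": 4, "end": 13, "label": "PK"},
            "child_span": {"start": 18, "end": 21, "label": "VALUE"},
            "label": "C_VAL",
        },
        {
            "head_span": {"start": 18, "end": 21, "label": "VALUE"},
            "child_span": {"start": 22, "end": 25, "label": "UNITS"},
            "label": "RELATED",
        },
    ],
    "sentence_hash": "hypothetical-id",
    "metadata": {},
})


def relation_triples(jsonl_line):
    """Return (head text, relation label, child text) for one JSONL line."""
    record = json.loads(jsonl_line)
    text = record["text"]
    triples = []
    for rel in record["relations"]:
        head, child = rel["head_span"], rel["child_span"]
        triples.append((
            text[head["start"]:head["end"]],
            rel["label"],
            text[child["start"]:child["end"]],
        ))
    return triples


def spans_from_relations(jsonl_line):
    """Collect the unique entity spans that appear inside the relations."""
    record = json.loads(jsonl_line)
    unique = set()
    for rel in record["relations"]:
        for span in (rel["head_span"], rel["child_span"]):
            unique.add((span["start"], span["end"], span["label"]))
    return sorted(unique)


triples = relation_triples(EXAMPLE_LINE)
entity_spans = spans_from_relations(EXAMPLE_LINE)
```

The second helper reflects the note that the spans field is not strictly required: every entity taking part in a relation can be recovered from the relations list. If the released files use different key names or inclusive end offsets, the slicing needs adjusting accordingly.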

 

Files

Files (18.1 MB)

  • md5:2f6e9a6c582780228fae3ebcfb36a655 (2.4 MB)
  • md5:d5f71aa1c0ba5d6c0643efd57036f7ed (5.1 MB)
  • md5:2b4cc19c56c3bba380ca4360923b1b7b (10.6 MB)
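
A downloaded file can be checked against the MD5 checksums listed above. This is a minimal sketch using Python's hashlib; the helper names and the file path in the usage note are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum("path/to/downloaded_file.jsonl", "md5:2f6e9a6c582780228fae3ebcfb36a655") would return True only if the hypothetical local file is byte-identical to the 2.4 MB file listed above.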