Schema for corpus annotated with named entities relating to occupational substance exposures
No Additional Items. Article annotated with named entities relating to occupational substance exposures
No Additional Properties. Unique identifier for the annotated document
Main type of substance exposure discussed in document (diesel exhaust or respirable crystalline silica (RCS))
PubMed PMID for the document (if document is indexed by PubMed)
Digital Object Identifier (DOI) for the document, if available
URL where document can be accessed, if no DOI is available
Text corresponding to the subsections of the document that have been annotated, i.e., the abstract/summary, Methods section and Results section
An array of sentences within 'doc_text'
No Additional Items. Properties of individual sentences in 'doc_text'
No Additional Properties. Unique identifier for sentence
Start character offset of sentence in 'doc_text'
End character offset of sentence in 'doc_text'
Text covered by the sentence
Array of named entity annotations in document
No Additional Items. Properties of a named entity annotation
No Additional Properties. Unique identifier for the named entity annotation
Array of text spans in 'doc_text' that constitute the named entity annotation. Annotation spans may be continuous or discontinuous; discontinuous annotations consist of one or more non-contiguous text spans
No Additional Items. Properties of a span that constitutes or forms part of a named entity annotation
No Additional Properties. Start character offset of span in 'doc_text'
End character offset of span in 'doc_text'
Text covered by the named entity annotation. In the case of annotations with discontinuous spans, the value of this attribute is created by concatenating the set of non-contiguous spans, separated by spaces.
Semantic category of the named entity annotation
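
The offset and span descriptions above can be illustrated with a minimal Python sketch. Only 'doc_text' is named in the schema text here; the keys "start", "end" and "text" are hypothetical stand-ins for the real property names, and the sketch assumes that end offsets are exclusive, which the schema description does not state.

```python
from typing import Dict, List


def span_text(doc_text: str, spans: List[Dict[str, int]]) -> str:
    """Concatenate the text of each span, separated by single spaces.

    Spans are sorted by start offset so that a discontinuous annotation
    is rebuilt in document order. Key names are hypothetical.
    """
    ordered = sorted(spans, key=lambda s: s["start"])
    return " ".join(doc_text[s["start"]:s["end"]] for s in ordered)


def sentence_matches(doc_text: str, sentence: Dict) -> bool:
    """Check that a sentence's offsets select its recorded text."""
    return doc_text[sentence["start"]:sentence["end"]] == sentence["text"]


# Illustrative text, not taken from the corpus.
doc_text = "Workers were exposed to diesel and petrol exhaust."

# A sentence covering the whole text (end offset assumed exclusive).
sentence = {"start": 0, "end": len(doc_text), "text": doc_text}

# A discontinuous annotation made of two non-contiguous spans.
first = doc_text.find("diesel")
second = doc_text.find("exhaust")
spans = [
    {"start": second, "end": second + len("exhaust")},
    {"start": first, "end": first + len("diesel")},
]

annotation_text = span_text(doc_text, spans)
print(annotation_text)
print(sentence_matches(doc_text, sentence))
```

If the corpus instead uses inclusive end offsets, the slices would need `s["end"] + 1`; checking a few sentences with `sentence_matches` is a quick way to find out which convention applies.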