Published April 2023 | Version v1
Project deliverable | Open

D4.3 Update to Annotation guidelines, tools & training

  • 1. Medical University of Graz
  • 2. Averbis (Germany)
  • 3. Maastricht University
  • 4. ROR icon North Estonia Medical Centre
  • 5. B!Loba


Manual annotations of clinical narratives are crucial for the adoption and evaluation of Natural
Language Processing (NLP) tools, which support an overall AI-assisted data curation approach within
AIDAVA. For a symbolic representation of clinical entities of interest and of how they are related,
normalisations that use international standards like SNOMED CT, FHIR or LOINC are crucial. For this
deliverable, we updated the first version of the manual annotation guideline (see AIDAVA Deliverable
D4.1), where requirements for annotation tooling were formulated with respect to the AIDAVA use
cases, together with some initial annotation instructions. Based on this requirements analysis,
INCEpTION was chosen as an annotation tool after a rigorous investigation of available annotation
software. A first manual annotation schema was developed and tested, with a focus on the use of
SNOMED CT and FHIR for normalising the types of clinical entities referred to in clinical narratives
(that is, annotating them with terminology codes). Within this preparation phase, INCEpTION was
deployed at all three clinical sites (MUG, NEMC, MUMC), with a first version of a consolidated
INCEpTION layer definition. Bi-weekly “train the trainers” sessions started at the end of 2022 to
support a continuous transition into the piloting phase of the developed guideline. These sessions
analysed example narratives and how they should be annotated according to the first version of the guideline.
During the piloting phase, which lasted from January 2023 to May 2023, relevant attributes were
identified in consultation with the responsible clinicians, and a selection of them was used to
update, test and refine the annotation guideline. Annotators were recruited at all three sites, and
their feedback was taken into account in the customisation and technical setup of INCEpTION.
Alignment with Deliverable D2.1 "Reference Ontology as a Global Data Sharing Standard", which
defines the AIDAVA Reference Ontology, was identified as crucial; this deliverable was therefore
postponed by one month, from April to May 2023.
Building on the first version of the guideline delivered in early January 2023, this updated descriptive
guideline provides a comprehensive framework and detailed instructions to ensure accurate
annotation of clinical narratives. It covers crucial aspects like data standardisation and best practices
in annotation (including annotation tool, general principles, specific instructions, concrete examples,
and quality control items), ensuring consistent, interoperable, and high-quality annotations. This is
invaluable for effective knowledge graph construction, data analysis, and knowledge extraction, which
are central requirements in AIDAVA.
The updated annotation instructions form the core of this deliverable and enable the productive
phase of manual annotation to begin. Manual annotation of texts is iterative and dynamic. It is, therefore,
crucial to recognise potential updates and improvements that may arise during the productive phase.
Factors that can contribute to the modifications and enhancements of the set of annotation
instructions include active feedback from the annotation team, new insights into text phenomena that
lead to annotator disagreements, updates in data requirements from use cases, and evolution of
project objectives as a result of dissemination and communication activities during the project. To
keep the annotation work consistent, a structured feedback mechanism is established: challenges and
updates are recorded in a shared document, and meetings are held with the annotation team to
address emerging insights or challenges.


AIDAVA_101057062_D4.3_ Updated Annotation guidelines_Zenodo.pdf


AIDAVA - AI powered Data Curation & Publishing Virtual Assistant 101057062
European Commission