Published April 8, 2026 | Version v1
Dataset Open

Human-in-the-Loop Crowdsourced Annotation Dataset for Ukrainian Folk Art with Reproducible Jupyter Notebooks

  • 1. CrowdHeritage
  • 2. Datoptron
  • 3. Europeana Foundation

Description

This record contains the code and datasets needed to reuse and reproduce the findings of the pilot "Human-in-the-Loop Crowdsourced Annotation of Ukrainian Folk Art", developed by Web2Learn. The repository is intended as an open-source resource for digital humanities research, freely available for reproduction or reuse by scholars, students, and teachers. Creative reuse is also welcome.

The pilot is implemented within the framework of the AISTER project. It operationalises and analyses a human-in-the-loop (HITL) crowdsourcing framework for metadata enrichment in Europeana collections. The objective is to enhance the accessibility and discoverability of Ukrainian ethnographic heritage by improving the quality of descriptive metadata through a combination of artificial intelligence tools (natural language processing, computer vision) and human participation, while contributing to a better understanding of HITL approaches to AI-assisted metadata generation in cultural heritage.

The pilot includes a crowdsourcing campaign set up on the CrowdHeritage platform, which is maintained by Datoptron. Participants were invited to browse images from the ethnographic collection of the Krovets Online Museum of Traditional Art of Ukraine on Europeana, which includes more than 300 folk art paintings depicting scenes from everyday rural life and religious themes. By reviewing keywords generated automatically with computational methods, participants corrected terms, rejected inaccurate ones, and added new keywords by recognising scenes, objects and figures.

The repository contains the complete workflow with all activity captured, including more than 25,000 annotation marks on AI-generated and crowdsourced content, made by 70 contributors:

  1. Automatically generated annotations (description tags) for artefacts on Europeana using AI tools (natural language processing, computer vision) and Europeana APIs, and
  2. Human-in-the-loop crowdsourced annotations on the CrowdHeritage platform to validate the AI-generated content, also enabling participants to contribute additional user-generated annotations.
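The two annotation layers above can be combined in a simple validation pass. The following is a minimal sketch only, not the pilot's actual code: the `Annotation` and `Vote` record layouts, the `summarise` function, the `min_votes` threshold and the majority rule are all hypothetical and are not taken from the repository's output files, whose real schema should be checked in the README.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    item_id: str   # hypothetical identifier of a painting
    tag: str       # keyword attached to the painting
    source: str    # "ai" for generated tags, "human" for user-added tags


@dataclass(frozen=True)
class Vote:
    item_id: str
    tag: str
    contributor: str
    approve: bool  # True = contributor confirmed the tag, False = rejected it


def summarise(annotations, votes, min_votes=2):
    """Label each tag by simple majority vote (an assumed rule, for illustration)."""
    tally = defaultdict(Counter)
    for v in votes:
        tally[(v.item_id, v.tag)]["up" if v.approve else "down"] += 1

    result = {}
    for a in annotations:
        key = (a.item_id, a.tag)
        if a.source == "human":
            result[key] = "user-added"
            continue
        up, down = tally[key]["up"], tally[key]["down"]
        if up + down < min_votes or up == down:
            result[key] = "undecided"   # too few votes, or a tie
        elif up > down:
            result[key] = "accepted"
        else:
            result[key] = "rejected"
    return result


# Made-up example data, for illustration only
annotations = [
    Annotation("p1", "horse", "ai"),
    Annotation("p1", "car", "ai"),
    Annotation("p1", "church", "human"),
    Annotation("p2", "river", "ai"),
]
votes = [
    Vote("p1", "horse", "u1", True),
    Vote("p1", "horse", "u2", True),
    Vote("p1", "car", "u1", False),
    Vote("p1", "car", "u2", False),
    Vote("p2", "river", "u1", True),
]
summary = summarise(annotations, votes)
```

With this made-up data, the AI tag confirmed by two contributors is kept, the one rejected by two contributors is dropped, and the tag with a single vote stays undecided until more contributors review it.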

The full pilot is also available on GitHub in the same layout, which is designed to support reproducibility and includes helpful notes, a detailed README file, machine-readable output files and executable code provided as Jupyter Notebooks: https://github.com/Web2LearnEU/AISTER-Crowdsourcing-Pilot

Disclaimer

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Education and Culture Executive Agency (EACEA). Neither the European Union nor EACEA can be held responsible for them.

Files

README.md (11.8 kB)
md5:7cc4da424fdcf6f342eae4e9eb240b3a
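The MD5 checksum listed for README.md can be used to confirm that a downloaded copy is intact. The sketch below uses Python's standard `hashlib` module. The helper names `md5_of` and `matches_record` are illustrative, and the local file path is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record for README.md
EXPECTED_MD5 = "7cc4da424fdcf6f342eae4e9eb240b3a"


def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Check a local file against the checksum published in the record."""
    return md5_of(path) == expected.lower()
```

For example, `matches_record("README.md")` should return `True` for an unmodified download saved under that name in the working directory. MD5 is adequate here for detecting accidental corruption, but it is not a safeguard against deliberate tampering.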

Additional details

Funding

Erasmus+
AI-enabled Citizen Participation in University-driven Ukrainian Cultural Heritage Safeguarding 000290738

Dates

Submitted
2026-04-09