JARVIS D 5.4 Whitepaper on Dataset Creation
Authors/Creators
Description
This document provides a structured overview of the recommended dataset lifecycle, from data acquisition to dataset maintenance. It outlines the methodologies, challenges, and lessons learned during the JARVIS project in data acquisition, ethical compliance, storage, data preprocessing, data analysis, data augmentation, and long-term maintenance.
Drawing on practical experience from the project’s use cases, the white paper identifies key requirements for ensuring data quality, representativeness, traceability, and robustness. The document also presents good practices and recommendations for future dataset creation activities, emphasizing governance, version control, bias monitoring, and operational validity. These insights contribute to the development of reliable, transparent, and interoperable datasets aligned with SESAR objectives for trustworthy and human-centric AI systems.
The JARVIS project, a SESAR Joint Undertaking initiative (Grant 101114692), ran from 2023 to 2026 and developed three digital assistants for airborne, air traffic control, and airport operations. Within work package 5 (WP 5), lessons learned on the foundational AI challenges of dataset creation, AI design assurance, and Human-AI teaming were collected.
Files
JARVIS_D5.4_Whitepaper_on_Dataset_Creation_v01.00.pdf (899.3 kB, md5:18849f04fd9e8db73f6e47966cbf8f95)