Project deliverable Open Access
Project MARVEL will create and publicly share with academia, the industrial community, and smart cities a data pool of experimental multimodal audio-visual data, and will showcase the use of the data and its processing across several pilots. This report documents the process for collecting and analysing the experimental data. The use cases and the AI tasks required to process the audio-visual data and implement the pilots are described first; this knowledge determines how and where the audio-visual data is collected and annotated. The devices, namely microphones and cameras, and their deployment are described next, followed by the software tools that will be used in the data annotation task, which will be carried out according to the requirements of AI model training. The data is analysed to determine which parts constitute personal data, followed by a discussion of appropriate data anonymisation techniques, which should ensure the sharing of GDPR-compliant data. In addition, the data value chains are defined, including data ownership and access rights at each processing stage. The proposed datasets and AI models are matched against the use cases and any gaps are identified. The volume and velocity at which data is collected and moved from one network layer to another are estimated from the technical specifications of the devices as well as from the expected output of the processing stages. These estimates enable the initial planning of the MARVEL framework, which promises a solution for collecting and processing big data of high volume and high variety while optimising both data flow and processing at any appropriate point of the edge-fog-cloud infrastructure typical of a smart city.
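The kind of volume and velocity estimation described above can be sketched as a simple back-of-the-envelope calculation from device specifications. The following is a minimal illustrative sketch, not the project's actual methodology; all device parameters and the assumed compression factor are hypothetical examples.

```python
# Illustrative estimate of data velocity (bitrate) and daily volume
# from hypothetical microphone and camera specifications.

def video_rate_mbps(width, height, fps, bits_per_pixel=0.1):
    """Approximate compressed video bitrate in Mbit/s.
    bits_per_pixel is an assumed average for an H.264-like codec."""
    return width * height * fps * bits_per_pixel / 1e6

def audio_rate_mbps(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed PCM audio bitrate in Mbit/s."""
    return sample_rate_hz * bit_depth * channels / 1e6

def daily_volume_gb(rate_mbps, hours=24):
    """Data volume in GB generated per day at a given bitrate."""
    return rate_mbps / 8 * 3600 * hours / 1000

cam = video_rate_mbps(1920, 1080, 25)   # hypothetical Full-HD camera, 25 fps
mic = audio_rate_mbps(48_000, 16)       # hypothetical 48 kHz / 16-bit mic
print(f"camera: {cam:.2f} Mbit/s, {daily_volume_gb(cam):.1f} GB/day")
print(f"microphone: {mic:.2f} Mbit/s, {daily_volume_gb(mic):.1f} GB/day")
```

Summing such per-device estimates across a deployment gives the aggregate rates that each network layer (edge, fog, cloud) must sustain, which is the input to the capacity planning mentioned above.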