Published December 18, 2024 | Version v1
Conference paper Open

Data Ingestion and Harmonisation for the Maritime Domain

Description

The maritime industry is a massive business that connects the entire world and serves as the main means of trading essential goods. However, it faces challenges from ever-increasing traffic complexity and from demands on safety, performance, energy efficiency and automation. These challenges are driving the industry towards a digital transformation based on state-of-the-art Artificial Intelligence, Big Data and High-Performance Computing technologies. The extremely large amount of data generated by shipping makes it possible to apply these technologies to model ships and their behaviours, create digital twins of ships, model traffic patterns at sea, predict optimal routes, and so on. However, because of the vast number of actors in the maritime industry, the data they generate are highly varied, heterogeneous and complex. To use these data to train Machine Learning models and other Artificial Intelligence technologies, the data coming from the different actors must be homogenised into a single unified format. To accomplish this, the authors propose the VesselAI Data Ingestion and Harmonisation Services, a tool that enables the ingestion and harmonisation of generic maritime datasets. The tool maps a raw dataset of choice to a harmonised schema by applying Natural Language Processing algorithms, with no need to write scripts or develop code.
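To make the idea of mapping raw columns to a harmonised schema more concrete, the following is a minimal sketch in Python. It is not the VesselAI implementation: the abstract does not say which Natural Language Processing algorithms the tool uses, so plain string similarity from the standard library stands in for them here. The field list `HARMONISED_FIELDS`, the helper names `normalise` and `suggest_mapping`, the example column names and the 0.6 threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical harmonised schema fields (illustrative only, not the VesselAI schema).
HARMONISED_FIELDS = [
    "vessel_id",
    "timestamp",
    "latitude",
    "longitude",
    "speed_over_ground",
]


def normalise(name: str) -> str:
    """Split camelCase, lower-case, and collapse separators into underscores."""
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def suggest_mapping(raw_columns, target_fields=HARMONISED_FIELDS, threshold=0.6):
    """Suggest a harmonised field for each raw column, or None if no field is close enough.

    String similarity is an assumed stand-in for the NLP step described in the abstract.
    """
    mapping = {}
    for column in raw_columns:
        cleaned = normalise(column)
        # Score every target field and keep the most similar one.
        score, best = max(
            (SequenceMatcher(None, cleaned, field).ratio(), field)
            for field in target_fields
        )
        mapping[column] = best if score >= threshold else None
    return mapping


if __name__ == "__main__":
    # Hypothetical column names from a raw AIS-like dataset.
    raw = ["Latitude", "LONGITUDE", "TimeStamp", "VesselID", "cargo_notes"]
    for column, field in suggest_mapping(raw).items():
        print(f"{column} -> {field}")
```

In this sketch, columns with no sufficiently similar target field are left unmapped (`None`) rather than forced onto the nearest field, so a user could review them by hand. A real harmonisation service would likely also consider data types, units and value ranges, which are omitted here.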

Notes

The version of the paper available here is the author's Accepted Manuscript (AM).

The paper presents a generic data collection and harmonisation component that enables datasets to be collected from different sources and harmonised into common, interoperable schemas. This component was tested in a maritime scenario with different types of data, such as AIS and weather data, and it is also used in the context of AI-DAPT to harmonise data from other domains, such as manufacturing or health.

Files

contribution_239.pdf

691.5 kB, md5:477c9f4a9c38c978cf8776d0bd0186e8

Additional details

Funding

European Commission
VesselAI – ENABLING MARITIME DIGITALIZATION BY EXTREME-SCALE ANALYTICS, AI AND DIGITAL TWINS (grant 957237)
European Commission
AI-DAPT – AI-Ops Framework for Automated, Intelligent and Reliable Data/AI Pipelines Lifecycle with Humans-in-the-Loop and Coupling of Hybrid Science-Guided and AI Models (grant 101135826)