Published May 7, 2024 | Version v1
Project deliverable Open

D3.4 – Domain-adapted synthetic dataset for 3D semantic segmentation

Description

In this report, we review how artificial data can improve semantic point cloud segmentation on real-life data. Artificial data could also address the lack of annotated data in some fields, in particular for BIM models, since annotating data by hand is a very tedious task that has to be supervised in some way. The main focus is on 3D interior scans obtained with a LiDAR sensor. Our work is divided into two main parts. The first part targets artificial data generation using a simulated LiDAR sensor inside Unreal Engine 5. More annotated data is needed, and we show that generating artificial datasets is a viable alternative. Our reference is the Stanford dataset, which consists of real-world interior scans (alongside their annotations, meshes, semantics, surface normals, materials and textures). The second part consists of the evaluation of such artificially generated data. To measure the benefits of using artificial data, we use domain adaptation and fine-tuning on two pretrained models for semantic segmentation. We also discuss the need for “good” artificial data, especially for complex tasks and regardless of the model, as well as methods to generate it.

Files

D3.4 – Domain-adapted synthetic dataset for 3D semantic segmentation.pdf

Additional details

Funding

European Commission
HumanTech – Human Centered Technologies for a Safer and Greener European Construction Industry (grant 101058236)