CarDA - Car door Assembly Activities Dataset
Description
The CarDA dataset [1] (Car Door Assembly dataset) has been designed and captured to provide a comprehensive, multi-modal resource for analyzing car door assembly activities performed by trained line workers in realistic assembly lines.
It comprises a set of time-synchronized multi-camera RGB-D videos and human motion capture data acquired during car door assembly activities performed by real line workers in a real manufacturing environment.
Deployment environment:
The use-case scenario concerns a real-world assembly line workplace in the automotive manufacturing industry. In this context, line workers simulate the real car door assembly workflow using the prompts, sequences, and tools under ergonomic and environmental conditions very similar to those on existing factory shop floors.
The assembly line involves a conveyor belt divided into three virtually separated work areas that correspond to three assembly workstations (WS10, WS20, and WS30). The belt moves at a low, constant speed and carries cart-mounted car doors and material storage. One line worker is assigned to each workstation and assembles car doors as the belt moves. At each workstation, the worker completes a workstation-specific set of assembly actions, referred to as a task cycle. Upon successful completion of the task cycle, the cart travels to the virtually defined area of the subsequent workstation, where another line worker continues the assembly process in a new task cycle. Each task cycle lasts approximately 4 minutes and is repeated continuously throughout the worker's shift.
Data acquisition:
Data acquisition involves low-cost, passive RGB-D camera sensors that are installed at stationary locations alongside the car door assembly line and a motion
capture system for capturing time-synchronized sequences of images and motion capture data during car door assembly activities performed by real line workers.
Two stationary StereoLabs ZED2 stereo cameras were installed in each of the three workstations of the car door assembly line. The two stationary, workstation-specific cameras are located at bilateral positions on the two sides of the conveyor belt at the center of the area concerning that specific workstation.
Each pair of RGB-D sensors was used to acquire stereo color and depth image sequences during car door task cycle executions. Each recording comprises time-synchronized RGB (color) and depth image sequences captured throughout a task cycle execution at 30 frames per second (fps).
At the same time, the line worker used a wearable XSens MVN Link suit during work activities to acquire time-synced 3D motion capture data at 60 fps.
Note: Time synchronization between the two RGB-D (.svo) recordings of a pair (captured simultaneously during an assembly task cycle by the inXX and outXX cameras installed at workstation wsXX) is guaranteed and relies on the StereoLabs ZED SDK acquisition software. Time synchronization between the RGB-D and mp4 videos (30 fps) and the acquired motion capture data (60 fps) was performed manually, using the starting frame/time of the video as the reference time. In some recordings, we have observed time discrepancies between samples of the two modalities after the first 40-50 seconds.
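The frame-rate relationship described in the note can be illustrated with a short sketch. It assumes that the video and the motion capture stream start at the same instant (the manual alignment mentioned above) and run at exactly 30 and 60 fps. The function name and the `offset_s` parameter are hypothetical and are not part of any dataset tooling.

```python
VIDEO_FPS = 30.0  # RGB-D / mp4 frame rate stated in the description
MOCAP_FPS = 60.0  # motion capture frame rate stated in the description


def video_to_mocap_frame(video_frame: int, offset_s: float = 0.0) -> int:
    """Return the mocap frame index closest in time to a video frame.

    offset_s is a hypothetical manual correction in seconds, added to the
    video timestamp to compensate for a discrepancy seen in a recording.
    """
    if video_frame < 0:
        raise ValueError("video_frame must be non-negative")
    timestamp_s = video_frame / VIDEO_FPS + offset_s
    # Clamp at zero so a negative offset never yields a negative index.
    return max(0, round(timestamp_s * MOCAP_FPS))


# Video frame 30 is one second into the recording, i.e. mocap frame 60.
one_second_in = video_to_mocap_frame(30)
```

Since discrepancies have been observed after the first 40-50 seconds of some recordings, a fixed mapping of this kind is most reliable near the start of a task cycle; later frames may need a per-recording correction.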
CarDA Dataset:
The dataset has been split into two subsets, A and B.
Each comprises data acquired at different periods using the same multicamera system in the same manufacturing environment.
Subset A contains recordings of RGB-D videos, mp4 videos, and 3D human motion capture data (using the XSens MVN Link suit) acquired during car door assembly activities in all three workstations.
Subset B contains recordings of RGB-D videos and mp4 videos acquired during car door assembly activities in all three workstations.
CarDA subset A
It contains:
- RGB-D videos acquired using StereoLabs ZED 2 sensors, in .svo format.
- mp4 videos (30 fps) extracted from the .svo files (using the left camera of each stereo pair).
- 3D human pose data (ground truth) captured using the Movella Xsens MVN Link motion capture system (60 fps), in .bvh format.
- Annotation data (xls file format):
  - Ground truth for the temporal segmentation and classification of car door assembly actions (subgoals) during task cycle executions, annotated by personnel working directly on the assembly line for the CarDA dataset.
  - Ground truth data on the duration of basic ergonomic postures based on the EAWS ergonomic screening tool, annotated manually by two experts in manufacturing and ergonomics.
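As a rough illustration of how the motion capture files listed above might be inspected, the sketch below reads the frame count and frame time from the MOTION header of a BVH file. It relies only on the standard BVH text layout (`Frames:` and `Frame Time:` lines after the `MOTION` keyword); the function name and the toy sample are hypothetical, and the sketch has not been run against the CarDA files themselves.

```python
import io
import re
from typing import TextIO, Tuple

# Toy BVH content used only for demonstration; it is not CarDA data.
SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 3 Xposition Yposition Zposition
  End Site
  {
    OFFSET 0.0 1.0 0.0
  }
}
MOTION
Frames: 2
Frame Time: 0.016667
0.0 0.0 0.0
0.1 0.0 0.0
"""


def read_bvh_timing(stream: TextIO) -> Tuple[int, float]:
    """Return (frame_count, frame_time_in_seconds) from a BVH MOTION header."""
    frames = None
    frame_time = None
    in_motion = False
    for line in stream:
        text = line.strip()
        if text == "MOTION":
            in_motion = True
            continue
        if not in_motion:
            continue  # skip the skeleton hierarchy
        match = re.match(r"Frames:\s*(\d+)", text)
        if match:
            frames = int(match.group(1))
            continue
        match = re.match(r"Frame Time:\s*([0-9.eE+-]+)", text)
        if match:
            frame_time = float(match.group(1))
            break  # motion samples follow; the header is complete
    if frames is None or frame_time is None:
        raise ValueError("BVH MOTION header not found")
    return frames, frame_time


frames, frame_time = read_bvh_timing(io.StringIO(SAMPLE_BVH))
print(f"{frames} frames at about {round(1.0 / frame_time)} fps")
```

For a real recording, the same function can be given an open file handle instead of the in-memory sample; a frame time close to 1/60 s would be consistent with the 60 fps capture rate stated above.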
CarDA subset A files:
- ws10 - svo - mp4 - bvh.rar
  Five assembly task cycle executions recorded in WS10, each containing a pair of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens MVN Link system. Annotation data are also available.
- ws20 - svo - mp4 - bvh.rar
  Four assembly task cycle executions recorded in WS20, each containing a pair of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens MVN Link system. Annotation data are also available.
- ws30 - svo - mp4 - bvh.rar
  Four assembly task cycle executions recorded in WS30, each containing a pair of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens MVN Link system. Annotation data are also available.
CarDA subset B
It contains:
- RGB-D videos acquired using StereoLabs ZED 2 sensors, in .svo format.
- mp4 videos (30 fps) extracted from the .svo files (using the left camera of each stereo pair).
- Annotation data (xls file format):
  - Ground truth for the temporal segmentation and classification of car door assembly actions (subgoals) during task cycle executions, annotated by personnel working directly on the assembly line for the CarDA dataset.
  - Ground truth data on the duration of basic ergonomic postures based on the EAWS ergonomic screening tool, annotated manually by two experts in manufacturing and ergonomics.
CarDA subset B files:
- ws10 - svo - mp4.rar
  Three pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace.
- ws20 - svo - mp4.rar
  Six pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace.
- ws30 - svo - mp4.rar
  Three pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace.
Contact:
Konstantinos Papoutsakis, PhD: papoutsa@ics.forth.gr
Maria Pateraki: mpateraki@mail.ntua.gr
Assistant Professor | National Technical University of Athens
Affiliated Researcher | Institute of Computer Science | FORTH
References:
[1] Konstantinos Papoutsakis, Nikolaos Bakalos, Konstantinos Fragkoulis, Athena Zacharia, Georgia Kapetadimitri, and Maria Pateraki. A vision-based framework for human behavior understanding in industrial assembly lines. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops - T-CAP 2024 Towards a Complete Analysis of People: Fine-grained Understanding for Real-World Applications, 2024.
Files
ECCV_2024_TCAP_HBU.pdf
Files (19.6 GB)
| MD5 checksum | Size |
|---|---|
| 43cb9a6c7f95e2b4fedb936c5a54f577 | 658.6 kB |
| f07c809495b325b42de2002c7eb9b2fb | 3.4 MB |
| 39b17872b17bb2b7e7585f96887dc470 | 2.1 GB |
| 0ca1ffc001232527e454862c89f747f9 | 3.1 GB |
| b0925175c4584f5b43e454c4a1298d81 | 2.1 GB |
| fde6afa0852e64d71bea606cd79968ed | 7.4 GB |
| 0fd6d73b2b89eea5c27490fe06763352 | 921.8 MB |
| d6b158d50df4614426b3c7359d98f4a9 | 4.0 GB |
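The MD5 checksums listed above can be used to check that a downloaded archive arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and the file is read in chunks because several archives are multiple gigabytes in size.

```python
import hashlib


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected: str) -> bool:
    """Compare a file's MD5 digest with a listed value.

    The expected value may carry an optional "md5:" prefix and may be
    written in upper or lower case.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_checksum("downloaded_archive.rar", "md5:...")`, where the file name is a placeholder for a local download and the second argument is the corresponding checksum from the listing, returns `True` when the digests agree.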