MatSeg: Material State Segmentation Dataset and Benchmark
Creators
Description
MatSeg Dataset and benchmark for zero-shot material state segmentation.
MatSeg Benchmark containing 1220 real-world images and their annotations is available at MatSeg_Benchmark.zip the file contains documentation and Python readers.
MatSeg dataset containing synthetic images with infused natural images patterns is available at MatSeg3D_part_*.zip and MatSeg3D_part_*.zip (* stand for number).
MatSeg3D_part_*.zip: contain synthethc 3D scenes
MatSeg2D_part_*.zip: contain syntethc 2D scenes
Readers and documentation for the synthetic data are available at: Dataset_Documentation_And_Readers.zip
Readers and documentation for the real-images benchmark are available at: MatSeg_Benchmark.zip
The Code used to generate the MatSeg Dataset is available at: https://zenodo.org/records/11401072
Additional permanent sources for downloading the dataset and metadata: 1, 2
Evaluation scripts for the Benchmark are now available at:
https://zenodo.org/records/13402003 and https://e.pcloud.link/publink/show?code=XZsP8PZbT7AJzG98tV1gnVoEsxKRbBl8awX
Description
Materials and their states form a vast array of patterns and textures that define the physical and visual world. Minerals in rocks, sediment in soil, dust on surfaces, infection on leaves, stains on fruits, and foam in liquids are some of these almost infinite numbers of states and patterns.
Image segmentation of materials and their states is fundamental to the understanding of the world and is essential for a wide range of tasks, from cooking and cleaning to construction, agriculture, and chemistry laboratory work.
The MatSeg dataset focuses on zero-shot segmentation of materials and their states, meaning identifying the region of an image belonging to a specific material type of state, without previous knowledge or training of the material type, states, or environment.
The dataset contains a large set of (100k) synthetic images and benchmarks of 1220 real-world images for testing.
Benchmark
The benchmark contains 1220 real-world images with a wide range of material states and settings. For example: food states (cooked/burned..), plants (infected/dry.) to rocks/soil (minerals/sediment), construction/metals (rusted, worn), liquids (foam/sediment), and many other states in without being limited to a set of classes or environment. The goal is to evaluate the segmentation of material materials without knowledge or pretraining on the material or setting. The focus is on materials with complex scattered boundaries, and gradual transition (like the level of wetness of the surface).
Evaluation scripts for the Benchmark are now available at: 1 and 2.
Synthetic Dataset
The synthetic dataset is composed of synthetic scenes rendered in 2d and 3d using a blender. The synthetic data is infused with patterns, materials, and textures automatically extracted from real images allowing it to capture the complexity and diversity of the real world while maintaining the precision and scale of synthetic data. 100k images and their annotation are available to download.
License
This dataset, including all its components, is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. To the extent possible under law, the authors have dedicated all copyright and related and neighboring rights to this dataset to the public domain worldwide. This dedication applies to the dataset and all derivative works.
The MatSeg 2D and 3D synthetic were generated using the open-images dataset which is licensed under the https://www.apache.org/licenses/LICENSE-2.0. For these components, you must comply with the terms of the Apache License. In addition, the MatSege3D dataset uses Shapenet 3D assets with GNU license.
Example Usage:
An Example of a training and evaluation code for a net trained on the dataset and evaluated on the benchmark is given at these urls: 1, 2
This include an evaluation script on the MatSeg benchmark.
Training script using the MatSeg dataset.
And weights of a trained model
License
This dataset, including all its components, is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. To the extent possible under law, the authors have dedicated all copyright and related and neighboring rights to this dataset to the public domain worldwide. This dedication applies to the dataset and all derivative works.
The MatSeg 2D and 3D synthetic were generated using the open-images dataset which is licensed under the https://www.apache.org/licenses/LICENSE-2.0. For these components, you must comply with the terms of the Apache License. In addition, the MatSege3D dataset uses Shapenet 3D assets with GNU license.
Croissant metadata and additional sources for downloading the dataset are available at 1,2
Files
Dataset_Documentation_And_Readers.zip
Files
(44.5 GB)
Name | Size | Download all |
---|---|---|
md5:39382fd7b526c47aa8332ce19f248710
|
18.8 MB | Preview Download |
md5:526722ae52daa2a6650dc742c2b87b50
|
3.8 GB | Preview Download |
md5:0e8973fc0c979ebee2852ae808bf3aa3
|
3.8 GB | Preview Download |
md5:2e50084772fee4a3bb98acab8ffb1d66
|
3.8 GB | Preview Download |
md5:ea122e014d9119bef6063325fca6275c
|
3.7 GB | Preview Download |
md5:632d47a91550b4d37e7c956ddcbafb26
|
3.8 GB | Preview Download |
md5:bb68f84aaa18174c3990f562ff84bfbd
|
3.9 GB | Preview Download |
md5:3bc9440d935dbcab31efd6d185742993
|
3.8 GB | Preview Download |
md5:f8ea9578bd44addcf4ef4684a6ae5a09
|
3.9 GB | Preview Download |
md5:4900fb5c908b8f3f5e52c42a5f018061
|
3.8 GB | Preview Download |
md5:1605d010dea2c9f47e17813ccbcb0f24
|
4.0 GB | Preview Download |
md5:b6a4142eea46747f6ffc5b310b533efc
|
6.3 GB | Preview Download |