concrete_patch_classification
Description
This dataset is based on the original dataset I3DCP introduced in Rill-García, R., Dokladalova, E., Dokládal, P., Caron, J.-F., Mesnil, R., Margerit, P., & Charrier, M. (2022). Inline monitoring of 3D concrete printing using computer vision. Additive Manufacturing, 60, 103175. https://doi.org/10.1016/j.addma.2022.103175
The original dataset includes raw images of cement-based material deposition, segmentation masks of interstitial lines, and texture classification patches. In particular, our work focuses on the texture classification patches. This dataset thus provides three complementary resources:
- A reorganized version of the original 111 patches with 5-fold splits.
- An extended set of 426 expert-annotated patches with an additional geometric defect class (Crushed in English, Écrasé in French).
- A collection of synthetic patches generated with StyleGAN3, covering all five classes.
Sub-dataset 1: Original annotated texture windows
- Content: 111 labeled grayscale texture windows with a fixed width of 200, extracted from 24 raw images, with 5-fold cross-validation splits.
- Original classes:
  - Fluid (24 images, proportion 21.62%)
  - Good (27 images, proportion 24.32%)
  - Dry (24 images, proportion 21.62%)
  - Tearing (36 images, proportion 32.43%)
- Labels: texture_windows-labels.csv.
- Model weights fine-tuned on Sub-dataset 1 with synthetic images from Sub-dataset 3: the baseline model introduced by Rill-García et al. (2022), the EfficientFormer model introduced by Li et al. (2022), and the proposed Multimodal Dual-Branch model. (*.pth: model weights; *.txt: normalization parameters for images; *.npy: normalization parameters for the texture descriptor vector)
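The 5-fold splits shipped with the dataset are the ones to use for reproducing results. Purely as an illustration of what a class-balanced 5-fold split over these four classes looks like, here is a minimal sketch. The helper `stratified_folds`, the round-robin assignment, and the seed are assumptions made for this example, not the authors' procedure; only the class counts come from the list above.

```python
import random
from collections import Counter, defaultdict

# Class counts for Sub-dataset 1, as listed above.
CLASS_COUNTS = {"Fluid": 24, "Good": 27, "Dry": 24, "Tearing": 36}


def stratified_folds(labels, n_folds=5, seed=0):
    """Map each sample index to a fold, keeping class ratios roughly equal.

    Hypothetical helper: shuffles the indices of each class, then deals
    them out to the folds round-robin.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    fold_of = {}
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of


# Stand-in label list with the documented class counts (no real files read).
labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]
fold_of = stratified_folds(labels)

# Per-fold class histogram, useful for checking the balance.
fold_histograms = [Counter() for _ in range(5)]
for idx, fold in fold_of.items():
    fold_histograms[fold][labels[idx]] += 1
```

With this scheme, the per-class counts of any two folds differ by at most one sample.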
Sub-dataset 2: Extended expert-annotated texture windows
- Content: 426 labeled grayscale texture windows (extended set) with a fixed width of 200, extracted from 24 raw images, with 5-fold cross-validation splits.
- Classes:
  - Fluid (84 images, proportion 19.72%)
  - Good (127 images, proportion 29.81%)
  - Dry (68 images, proportion 15.96%)
  - Tearing (61 images, proportion 14.32%)
  - Geometric defect Écrasé (French) / Crushed (English) (86 images, proportion 20.19%)
- Labels: patch_labels(426extension).csv
- Model weights fine-tuned on Sub-dataset 2 with synthetic images from Sub-dataset 3: the baseline model introduced by Rill-García et al. (2022), the EfficientFormer model introduced by Li et al. (2022), and the proposed Multimodal Dual-Branch model. (*.pth: model weights; *.txt: normalization parameters for images; *.npy: normalization parameters for the texture descriptor vector)
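The class proportions listed for the extended set follow directly from the per-class counts. The short check below recomputes them. The dictionary keys are shorthand chosen for this example and are not necessarily the label strings used in the CSV file.

```python
# Per-class counts for Sub-dataset 2, as listed above.
COUNTS = {"Fluid": 84, "Good": 127, "Dry": 68, "Tearing": 61, "Crushed": 86}

total = sum(COUNTS.values())

# Percentage of each class, rounded to two decimals as in the list above.
proportions = {name: round(100 * n / total, 2) for name, n in COUNTS.items()}

for name, pct in proportions.items():
    print(f"{name}: {COUNTS[name]} images, {pct:.2f}%")
```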
Sub-dataset 3: Synthetic images (StyleGAN3 generated)
- Content: Synthetic grayscale texture windows generated by five separate pretrained generative models.
- Classes:
  - Fluid (1200 images)
  - Good (1200 images)
  - Dry (1200 images)
  - Tearing (1200 images)
  - Geometric defect Écrasé (French) / Crushed (English) (1200 images)
- Labels: ./images_generees(d1)/patch_labels(426extension+stylegan3).csv (for use with Sub-dataset 2) and ./images_generees(d2)/texture_windows-labels(stylegan3_d2).csv (for use with Sub-dataset 1).
- Model weights trained for generation: four category-specific StyleGAN3 models (fluid, good, dry, tearing), each of which can generate only one category, and one joint StyleGAN3 model that generates all five categories (fluid, good, dry, tearing, écrasé/crushed).
For details on using the dataset, please refer to the GitHub repository: https://github.com/frankxm/concrete-patch-texture-classification.git
Updates (compared to Version 1.0.0)
The models were re-trained under an updated training configuration, resulting in reduced overfitting compared to Version 1.0.0. In addition, the inference procedure has been upgraded from a single-model setup to a 5-model ensemble strategy based on logit averaging.
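Logit averaging can be sketched in a few lines: the raw class scores of the five models are averaged per class, and the softmax and argmax are applied to the averaged scores. The snippet below is a minimal, framework-free illustration. The function names and the example logits are made up for this sketch and do not come from the repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average logits across models, then return (class index, probabilities)."""
    n_models = len(per_model_logits)
    n_classes = len(per_model_logits[0])
    mean_logits = [
        sum(model[c] for model in per_model_logits) / n_models
        for c in range(n_classes)
    ]
    probs = softmax(mean_logits)
    predicted = max(range(n_classes), key=probs.__getitem__)
    return predicted, probs


# Made-up logits from five models for a single patch (four classes).
example_logits = [
    [2.0, 0.5, -1.0, 0.1],
    [1.5, 0.7, -0.5, 0.0],
    [0.2, 1.9, -0.8, 0.3],
    [1.8, 0.4, -1.2, 0.2],
    [1.1, 0.9, -0.9, 0.1],
]
predicted_class, class_probs = ensemble_predict(example_logits)
```

In this example the third model alone would pick class 1, but the averaged logits favor class 0. Averaging before the softmax lets a confident model outweigh a hesitant one, which differs from majority voting on the individual predictions.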
Synthetic Data Extension (SubDataset3)
The synthetic image dataset has been expanded. In Version 1.0.0, synthetic images were generated exclusively by a generator trained on the Original dataset. The current version adds synthetic images produced by a generator trained on the Re-annotated dataset. The corresponding label CSV files are also provided to facilitate data augmentation during training.
To avoid data leakage between datasets, a cross-dataset generation strategy is adopted:
- Synthetic images generated by the generator trained on the Original Dataset (Dataset1) are used exclusively for augmentation of the Re-annotated Dataset (Dataset2).
- Conversely, synthetic images generated by the generator trained on the Re-annotated Dataset (Dataset2) are used exclusively for augmentation of the Original Dataset (Dataset1).
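The cross-dataset rule above amounts to a simple lookup: when training on one dataset, only synthetic images from the generator trained on the other dataset are admitted. A minimal sketch follows. The dataset keys, the `(filename, generator_source)` pair format, and the file names are hypothetical and chosen only for illustration.

```python
# Which generator's output may augment which real dataset.
# Key: dataset being trained on. Value: dataset the generator was trained on.
GENERATOR_SOURCE_FOR = {
    "dataset1": "dataset2",
    "dataset2": "dataset1",
}


def select_synthetic(target_dataset, synthetic_samples):
    """Keep only synthetic samples whose generator never saw the target dataset.

    synthetic_samples: iterable of (filename, generator_source) pairs,
    a hypothetical in-memory format used only for this sketch.
    """
    allowed_source = GENERATOR_SOURCE_FOR[target_dataset]
    return [name for name, source in synthetic_samples if source == allowed_source]


# Made-up pool of synthetic samples tagged with their generator's training set.
synthetic_pool = [
    ("syn_a.png", "dataset1"),
    ("syn_b.png", "dataset2"),
    ("syn_c.png", "dataset1"),
]

for_dataset2 = select_synthetic("dataset2", synthetic_pool)
for_dataset1 = select_synthetic("dataset1", synthetic_pool)
```

Encoding the rule as an explicit mapping makes it hard to mix a generator's output back into the dataset that generator was trained on.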
License
This dataset is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/
It is derived from the I3DCP dataset, which is released under the same license (CC BY-NC-SA 4.0). The additional annotations and processing were created by us and are released under the same CC BY-NC-SA 4.0 license.
Files
Total size: 37.0 GB. The record includes LICENSE.md.
| MD5 checksum | Size |
|---|---|
| 4a206eed80a3482b8fbf26350ead4538 | 19.1 kB |
| db5fcc5870d5321baeabc037163411a0 | 17.3 GB |
| d07f25276483df5d87e80f4f72d33d9a | 17.3 GB |
| e5607888c340cabb862564e27d0c223b | 16.9 MB |
| 299c939d77060c3a7bd1e344d5301b0e | 65.3 MB |
| 124257bd2d24344cb6be840efb632b5d | 2.4 GB |
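Since several of the archives are large, it can be worth checking a download against its published MD5 checksum. The following is a minimal sketch using only the Python standard library. The helper names are made up for this example, and the demonstration runs on a temporary file rather than on a real archive.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected_md5.lower()


# Self-contained demonstration on a temporary file (not a dataset archive).
payload = b"example payload"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

demo_ok = verify_download(demo_path, hashlib.md5(payload).hexdigest())
os.remove(demo_path)
```

For a real check, pass the path of the downloaded file together with the matching checksum from the file list, for example `verify_download(local_path, "db5fcc5870d5321baeabc037163411a0")`, where `local_path` is wherever the corresponding 17.3 GB file was saved.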
Additional details
Funding
- Agence Nationale de la Recherche
- ANR-JCJC SmartAMP ANR-24-CE10-6002-01
Dates
- Created: 2025-10-17 (Dataset creation)
- Updated: 2026-06-10 (Dataset Version 1.0.1 update)
Software
- Repository URL: https://github.com/frankxm/concrete-patch-texture-classification.git
- Programming language: Python
- Development status: Active
References
- Rill-García R, Dokladalova E, Dokládal P, et al. Inline monitoring of 3D concrete printing using computer vision[J]. Additive Manufacturing, 2022, 60: 103175. https://doi.org/10.1016/j.addma.2022.103175