Published October 6, 2022 | Version v1.0
Dataset Open

Datasets for a data-centric image classification benchmark for noisy and ambiguous label estimation

Description

This is the official data repository of the Data-Centric Image Classification (DCIC) Benchmark. The goal of this benchmark is to measure the impact of tuning the dataset instead of the model for a variety of image classification datasets. Full details about the collection process, the structure and automatic download at

Paper: https://arxiv.org/abs/2207.06214

Source Code: https://github.com/Emprime/dcic

The license information is given below as download.

Citation

Please cite as

@article{schmarje2022benchmark,
    author = {Schmarje, Lars and Grossmann, Vasco and Zelenka, Claudius and Dippel, Sabine and Kiko, Rainer and Oszust, Mariusz and Pastell, Matti and Stracke, Jenny and Valros, Anna and Volkmann, Nina and Koch, Reinahrd},
    journal = {36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks},
    title = {{Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation}},
    year = {2022}
}

Please see the full details about the used datasets below, which should also be cited as part of the license.

@article{schoening2020Megafauna,
author = {Schoening, T and Purser, A and Langenk{\"{a}}mper, D and Suck, I and Taylor, J and Cuvelier, D and Lins, L and Simon-Lled{\'{o}}, E and Marcon, Y and Jones, D O B and Nattkemper, T and K{\"{o}}ser, K and Zurowietz, M and Greinert, J and Gomes-Pereira, J},
doi = {10.5194/bg-17-3115-2020},
journal = {Biogeosciences},
number = {12},
pages = {3115--3133},
title = {{Megafauna community assessment of polymetallic-nodule fields with cameras: platform and methodology comparison}},
volume = {17},
year = {2020}
}

@article{Langenkamper2020GearStudy,
author = {Langenk{\"{a}}mper, Daniel and van Kevelaer, Robin and Purser, Autun and Nattkemper, Tim W},
doi = {10.3389/fmars.2020.00506},
issn = {2296-7745},
journal = {Frontiers in Marine Science},
title = {{Gear-Induced Concept Drift in Marine Images and Its Effect on Deep Learning Classification}},
volume = {7},
year = {2020}
}


@article{peterson2019cifar10h,
author = {Peterson, Joshua and Battleday, Ruairidh and Griffiths, Thomas and Russakovsky, Olga},
doi = {10.1109/ICCV.2019.00971},
issn = {15505499},
journal = {Proceedings of the IEEE International Conference on Computer Vision},
pages = {9616--9625},
title = {{Human uncertainty makes classification more robust}},
volume = {2019-Octob},
year = {2019}
}

@article{schmarje2019,
author = {Schmarje, Lars and Zelenka, Claudius and Geisen, Ulf and Gl{\"{u}}er, Claus-C. and Koch, Reinhard},
doi = {10.1007/978-3-030-33676-9_26},
issn = {23318422},
journal = {DAGM German Conference of Pattern Regocnition},
number = {November},
pages = {374--386},
publisher = {Springer},
title = {{2D and 3D Segmentation of uncertain local collagen fiber orientations in SHG microscopy}},
volume = {11824 LNCS},
year = {2019}
}

@article{schmarje2021foc,
author = {Schmarje, Lars and Br{\"{u}}nger, Johannes and Santarossa, Monty and Schr{\"{o}}der, Simon-Martin and Kiko, Rainer and Koch, Reinhard},
doi = {10.3390/s21196661},
issn = {1424-8220},
journal = {Sensors},
number = {19},
pages = {6661},
title = {{Fuzzy Overclustering: Semi-Supervised Classification of Fuzzy Labels with Overclustering and Inverse Cross-Entropy}},
volume = {21},
year = {2021}
}

@article{schmarje2022dc3,
author = {Schmarje, Lars and Santarossa, Monty and Schr{\"{o}}der, Simon-Martin and Zelenka, Claudius and Kiko, Rainer and Stracke, Jenny and Volkmann, Nina and Koch, Reinhard},
journal = {Proceedings of the European Conference on Computer Vision (ECCV)},
title = {{A data-centric approach for improving ambiguous labels with combined semi-supervised classification and clustering}},
year = {2022}
}


@article{obuchowicz2020qualityMRI,
author = {Obuchowicz, Rafal and Oszust, Mariusz and Piorkowski, Adam},
doi = {10.1186/s12880-020-00505-z},
issn = {1471-2342},
journal = {BMC Medical Imaging},
number = {1},
pages = {109},
title = {{Interobserver variability in quality assessment of magnetic resonance images}},
volume = {20},
year = {2020}
}


@article{stepien2021cnnQuality,
author = {St{\c{e}}pie{\'{n}}, Igor and Obuchowicz, Rafa{\l} and Pi{\'{o}}rkowski, Adam and Oszust, Mariusz},
doi = {10.3390/s21041043},
issn = {1424-8220},
journal = {Sensors},
number = {4},
title = {{Fusion of Deep Convolutional Neural Networks for No-Reference Magnetic Resonance Image Quality Assessment}},
volume = {21},
year = {2021}
}

@article{volkmann2021turkeys,
author = {Volkmann, Nina and Br{\"{u}}nger, Johannes and Stracke, Jenny and Zelenka, Claudius and Koch, Reinhard and Kemper, Nicole and Spindler, Birgit},
doi = {10.3390/ani11092655},
journal = {Animals 2021},
pages = {1--13},
title = {{Learn to train: Improving training data for a neural network to detect pecking injuries in turkeys}},
volume = {11},
year = {2021}
}

@article{volkmann2022keypoint,
author = {Volkmann, Nina and Zelenka, Claudius and Devaraju, Archana Malavalli and Br{\"{u}}nger, Johannes and Stracke, Jenny and Spindler, Birgit and Kemper, Nicole and Koch, Reinhard},
doi = {10.3390/s22145188},
issn = {1424-8220},
journal = {Sensors},
number = {14},
pages = {5188},
title = {{Keypoint Detection for Injury Identification during Turkey Husbandry Using Neural Networks}},
volume = {22},
year = {2022}
}

Addition: This repository also contains the original data from the paper "Annotating Ambiguous Images" (https://arxiv.org/abs/2306.12189). The data is created based on the original datasets and license from https://osf.io/t98fz/ and https://osf.io/nqjyw/

Files

Benthic.zip

Files (4.4 GB)

Name Size Download all
md5:68f7dd22258bf2a2f135bb747671c40b
148.2 MB Preview Download
md5:7a08ea63fca3597a01bd34aae829ec15
26.9 MB Preview Download
md5:231bf98ef362ad2fca76654c55aa1dc5
16.2 kB Preview Download
md5:8a4026c22f07373f022d9ab4818089ec
680.5 MB Preview Download
md5:bc3faf200c0842a1ba7c946c81ebdc4b
144.6 MB Preview Download
md5:07418167fcda8fa0bc9143cdf6bd2d06
24.3 MB Preview Download
md5:0ba6cbed0228673da9677021e41229fc
21.7 MB Preview Download
md5:6091ae8b8166e1b2aec5068604d8bab3
34.0 MB Preview Download
md5:d786cf1c6e2d18d919ae383727877476
1.3 GB Preview Download
md5:e7b77e309a21fe50017b881a27c57db6
1.4 GB Preview Download
md5:699af66aa841bcf6607fd74b63426763
552.9 MB Preview Download
md5:5802eac308309824957278d151055bbb
75.7 MB Preview Download
md5:7a0a6d70171925d86e18dfa67058548f
12.0 MB Preview Download