SCLIC - Semantic Changes in Learning Based Image Compression
Description
Dataset: SCLIC
This dataset is released as part of the publication "Challenging Cases of Neural Image Compression: A Dataset of Visually Compelling Yet Semantically Incorrect Reconstructions," accepted at the 33rd ACM International Conference on Multimedia (ACM MM '25).
Paper: https://fileshare.uibk.ac.at/f/bddcbc8d359742bd9a05/
Appendix: https://fileshare.uibk.ac.at/f/37d4f9a33d57473ba647/
Overview
The dataset is a collection of human-annotated miscompressions found in images that were compressed with different neural compression codecs at different quality settings. It consists of two CSV files:
- sclic_annotations.csv
- sclic_images.csv
and a collection of images provided in the following format:
- miscomp_uncompressed.zip (1 archive of 1563 uncompressed images)
- miscomp_<codec>_<quality>.zip (12 archives of 1563 compressed images)
There are six codecs, each with two quality settings. The compressed archives contain both the compressed and the reconstructed images, except for the CDC and STF archives, which contain reconstructions only.
This dataset is also described using the Croissant metadata standard. See croissant.json for machine-readable metadata.
Download instructions
The following commands create a directory for the dataset, download both CSV files and all archive files into it, and extract the archives.
# Create a directory for the dataset and move into it
mkdir sclic_dataset && cd sclic_dataset
# Download the annotation and image lists and all dataset archives from Zenodo
wget -i - <<EOF
https://zenodo.org/records/16780952/files/sclic_annotations.csv
https://zenodo.org/records/16780952/files/sclic_images.csv
https://zenodo.org/records/16780952/files/miscomp_uncompressed.zip
https://zenodo.org/records/16780952/files/miscomp_cdc-0512x09.zip
https://zenodo.org/records/16780952/files/miscomp_cdc-2048x09.zip
https://zenodo.org/records/16780952/files/miscomp_hific-hi.zip
https://zenodo.org/records/16780952/files/miscomp_hific-lo.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-mse3.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-mse7.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-msssim025.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-msssim075.zip
https://zenodo.org/records/16780952/files/miscomp_jpegai-025hoff.zip
https://zenodo.org/records/16780952/files/miscomp_jpegai-075hoff.zip
https://zenodo.org/records/16780952/files/miscomp_stf-0067.zip
https://zenodo.org/records/16780952/files/miscomp_stf-0250.zip
EOF
# Extract the archives
for file in *.zip; do
    unzip -o "$file"
done
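With roughly 80 GB to transfer, an interrupted download can easily leave an archive missing. The following is a minimal sketch of a completeness check, run from inside the dataset directory. The function names `sclic_archives` and `missing_archives` are hypothetical helpers introduced here; the archive names are copied from the download list above.

```shell
# Print the 13 archive names used in the download list, one per line.
sclic_archives() {
  echo "miscomp_uncompressed.zip"
  for id in cdc-0512x09 cdc-2048x09 hific-hi hific-lo \
            hyper-mse3 hyper-mse7 hyper-msssim025 hyper-msssim075 \
            jpegai-025hoff jpegai-075hoff stf-0067 stf-0250; do
    echo "miscomp_${id}.zip"
  done
}

# Print every expected archive that is not present in the current directory.
missing_archives() {
  sclic_archives | while read -r name; do
    [ -f "$name" ] || echo "$name"
  done
}
```

Calling `missing_archives` prints nothing when all archives are present. Since `wget -c` resumes partial downloads, re-running the download with that flag may be enough to fetch whatever is reported as missing.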
Usage
To facilitate reproducibility, we provide a notebook with several cropping features on GitHub: https://github.com/NoraH2004/SCLIC
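The column layout of the two CSV files is not described on this page, so a quick look at the header row can help before loading them elsewhere. The sketch below assumes that each file starts with a header row and that no header field contains a quoted comma; `csv_columns` is a hypothetical helper name.

```shell
# Print the column names of a CSV file, one per line.
# Assumption: the first line is a header without quoted commas.
csv_columns() {
  head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}
```

For example, `csv_columns sclic_annotations.csv` lists the annotation columns and `csv_columns sclic_images.csv` lists the image columns.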
License
The dataset is distributed under the terms of the CC BY 4.0 licence.
Citation
If you use this dataset in your research, please cite:
@inproceedings{hofer2025chall,
title = {Challenging cases of neural image compression: A dataset of visually compelling yet semantically incorrect reconstructions},
author = {Nora Hofer and Rainer Böhme},
booktitle = {ACM Multimedia},
year = {2025}
}
Thank You
A big THANK YOU goes to our labelers Leny Berry, Valerie Huter, and Max Ninow for the many hours of concentrated work! ❤️
Contact
For questions regarding the dataset, do not hesitate to contact:
Nora Hofer, University of Innsbruck, Austria, nora.hofer@uibk.ac.at
Funding
We gratefully acknowledge funding by the state of Tyrol (F.50541/6-2024).
Computational results were achieved using the LEO HPC infrastructure at the University of Innsbruck.
Files
Total size: 79.7 GB
| MD5 checksum | Size |
|---|---|
| c7997a33ffc6b484eecb261dc1db44bd | 9.8 kB |
| 26e18ed96bf9abc6ea03b732cb4bcfdb | 7.2 GB |
| cb71a54899a2b8a4b76171a27b81fc18 | 7.0 GB |
| 8ffa7220f3ddc90b4ccc014e0d067d28 | 7.0 GB |
| 676a1f53de2ac20f8bc922c895c0da17 | 7.0 GB |
| 29c603aec1104d8e0689e621da144e15 | 5.2 GB |
| 090f2eb6c07db00a677b576808d73ae4 | 6.3 GB |
| cb25639a29b7870313eee5f0ff11b2f5 | 5.0 GB |
| 6856640149532f6c4e600c81b601c53e | 6.0 GB |
| c15925ce99a874757753dfd225736279 | 5.9 GB |
| 7e25a5e6b89f15e00b59d7135c28e2e6 | 6.8 GB |
| db53f37c7dec94bfac18f25c1fd2829f | 4.6 GB |
| 731541589860d89790f7f4645840ab9b | 5.1 GB |
| 573ba986b1f025d1d11f1be8f85815b9 | 6.8 GB |
| 201099320f42b5008055a369ce018295 | 2.1 MB |
| 83f4db69a65fc898f46627801c622dc3 | 2.5 MB |
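The MD5 checksums in the file listing can be used to confirm that a download arrived intact. A minimal sketch follows, assuming GNU coreutils' `md5sum` is installed; `check_md5` is a hypothetical helper name. The expected digest for each file has to be taken from that file's entry on the Zenodo record.

```shell
# Compare a file's MD5 digest with an expected hex digest.
# Usage: check_md5 <file> <expected-md5>
# Returns 0 on a match and 1 otherwise. Assumes GNU md5sum is available.
check_md5() {
  actual=$(md5sum "$1" | cut -d ' ' -f 1)
  if [ "$actual" = "$2" ]; then
    echo "OK       $1"
  else
    echo "MISMATCH $1 (got $actual)"
    return 1
  fi
}
```

A call such as `check_md5 sclic_images.csv <expected-md5>` prints `OK` followed by the file name on success, where `<expected-md5>` is a placeholder for the digest copied from the record.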
Additional details
Related works
- Cites
- Conference proceeding: 10.1109/WIFS61860.2024.10810704 (DOI)