Published August 17, 2025 | Version 1.0.0.
Dataset Open

SCLIC - Semantic Changes in Learning Based Image Compression

  • 1. Universität Innsbruck

Description

Dataset: SCLIC

This dataset is released as part of the publication "Challenging Cases of Neural Image Compression: A Dataset of Visually Compelling Yet Semantically Incorrect Reconstructions," accepted at the 33rd ACM International Conference on Multimedia'25.

Paper: https://fileshare.uibk.ac.at/f/bddcbc8d359742bd9a05/
Appendix: https://fileshare.uibk.ac.at/f/37d4f9a33d57473ba647/

Overview

The dataset is a collection of human-annotated miscompressions found in images that were compressed with different neural compression codecs at different quality settings. It consists of two CSV files

  • sclic_annotations.csv
  • sclic_images.csv

and a collection of images provided in the following format:

  • miscomp_uncompressed.zip (1 archive of 1563 uncompressed images)
  • miscomp_<codec>-<quality>.zip (12 archives of 1563 compressed images each)

There are six codecs, each with two quality settings. The compressed archives contain both the compressed and the reconstructed images. The exceptions are the CDC and STF archives, which contain reconstructions only.
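As a rough sketch of the naming scheme, the archive names can be enumerated from the codec and quality tags that appear in the download URLs of this record. The function names `sclic_archives` and `sclic_missing_archives` are illustrative only and are not part of the dataset tooling.

```shell
# Print the 13 archive names (1 uncompressed + 12 compressed).
# The tags are copied from the download URLs of this record.
sclic_archives() {
    echo "miscomp_uncompressed.zip"
    for tag in cdc-0512x09 cdc-2048x09 hific-hi hific-lo \
               hyper-mse3 hyper-mse7 hyper-msssim025 hyper-msssim075 \
               jpegai-025hoff jpegai-075hoff stf-0067 stf-0250; do
        echo "miscomp_${tag}.zip"
    done
}

# Print every archive name that is not present in the given
# directory (defaults to the current directory).
sclic_missing_archives() {
    dir="${1:-.}"
    sclic_archives | while read -r name; do
        [ -f "${dir}/${name}" ] || echo "${name}"
    done
}

sclic_missing_archives .
```

Running `sclic_missing_archives` in the download directory prints nothing once all archives are in place.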

This dataset is also described using the Croissant metadata standard. See croissant.json for machine-readable metadata.

Download instructions

The following commands download both CSV files and all archive files, then extract the archives.

# Create a directory for the dataset and move into it
mkdir sclic_dataset && cd sclic_dataset

# Download the annotation and images lists and all dataset archives from Zenodo
wget -i - <<EOF
https://zenodo.org/records/16780952/files/sclic_annotations.csv
https://zenodo.org/records/16780952/files/sclic_images.csv
https://zenodo.org/records/16780952/files/miscomp_uncompressed.zip
https://zenodo.org/records/16780952/files/miscomp_cdc-0512x09.zip
https://zenodo.org/records/16780952/files/miscomp_cdc-2048x09.zip
https://zenodo.org/records/16780952/files/miscomp_hific-hi.zip
https://zenodo.org/records/16780952/files/miscomp_hific-lo.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-mse3.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-mse7.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-msssim025.zip
https://zenodo.org/records/16780952/files/miscomp_hyper-msssim075.zip
https://zenodo.org/records/16780952/files/miscomp_jpegai-025hoff.zip
https://zenodo.org/records/16780952/files/miscomp_jpegai-075hoff.zip
https://zenodo.org/records/16780952/files/miscomp_stf-0067.zip
https://zenodo.org/records/16780952/files/miscomp_stf-0250.zip
EOF

# Extract the archives
for file in *.zip; do
    unzip -o "$file"
done
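
Because the archives are several gigabytes each, it may be worth checking the downloads against the MD5 checksums that Zenodo lists for each file on the record page. The following is a minimal sketch, assuming GNU `md5sum` is available (on macOS the equivalent tool is `md5`). The helper name `verify_md5` is hypothetical.

```shell
# Compare a file's MD5 digest with an expected value.
# Usage: verify_md5 <file> <expected-hex-digest>
verify_md5() {
    file="$1"
    expected="${2#md5:}"   # accept values copied with the "md5:" prefix
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}
```

Call it once per downloaded file, passing the checksum shown for that file on the record page, for example `verify_md5 sclic_images.csv md5:<checksum>`.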

Usage

To facilitate reproducibility, we provide a notebook with several cropping features on GitHub: https://github.com/NoraH2004/SCLIC
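
Before opening the notebook, it can help to check which columns the two CSV files provide. The sketch below only prints the header row of each file and makes no assumption about the column names. It does assume a comma-separated header without quoted commas, and the helper name `csv_columns` is hypothetical.

```shell
# Print the column names of a CSV file, one per line.
# Assumes a comma-separated header row without quoted commas.
csv_columns() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# Show the columns of both dataset CSV files if they are present.
for f in sclic_annotations.csv sclic_images.csv; do
    if [ -f "$f" ]; then
        echo "== $f =="
        csv_columns "$f"
    fi
done
```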

License

The dataset is distributed under the terms of the CC BY 4.0 licence.

Citation

If you use this dataset in your research, please cite:

@inproceedings{hofer2025chall,
       title     = {Challenging cases of neural image compression: A dataset of visually compelling yet semantically incorrect reconstructions},
       author    = {Nora Hofer and Rainer Böhme},
       booktitle = {ACM Multimedia},
       year      = {2025}
}


Thank You

A big THANK YOU goes to our labelers Leny Berry, Valerie Huter, and Max Ninow for the many hours of concentrated work! ❤️

Contact

For questions regarding the dataset, do not hesitate to contact:
Nora Hofer, University of Innsbruck, Austria, nora.hofer@uibk.ac.at

Funding

We gratefully acknowledge funding by the state of Tyrol (F.50541/6-2024).
Computational results were achieved using the LEO HPC infrastructure at the University of Innsbruck.

 

Files

Files (79.7 GB)

Checksum                              Size
md5:c7997a33ffc6b484eecb261dc1db44bd  9.8 kB
md5:26e18ed96bf9abc6ea03b732cb4bcfdb  7.2 GB
md5:cb71a54899a2b8a4b76171a27b81fc18  7.0 GB
md5:8ffa7220f3ddc90b4ccc014e0d067d28  7.0 GB
md5:676a1f53de2ac20f8bc922c895c0da17  7.0 GB
md5:29c603aec1104d8e0689e621da144e15  5.2 GB
md5:090f2eb6c07db00a677b576808d73ae4  6.3 GB
md5:cb25639a29b7870313eee5f0ff11b2f5  5.0 GB
md5:6856640149532f6c4e600c81b601c53e  6.0 GB
md5:c15925ce99a874757753dfd225736279  5.9 GB
md5:7e25a5e6b89f15e00b59d7135c28e2e6  6.8 GB
md5:db53f37c7dec94bfac18f25c1fd2829f  4.6 GB
md5:731541589860d89790f7f4645840ab9b  5.1 GB
md5:573ba986b1f025d1d11f1be8f85815b9  6.8 GB
md5:201099320f42b5008055a369ce018295  2.1 MB
md5:83f4db69a65fc898f46627801c622dc3  2.5 MB

Additional details

Related works

Cites
Conference proceeding: 10.1109/WIFS61860.2024.10810704 (DOI)

Funding

Landes Tirols
Tiroler Nachwuchsforscher*innenförderung F.50541/6-2024