Published January 20, 2026 | Version v1
Dataset Open

When the Codec Hallucinates: User Perceptions of Miscompressed Images - Userstudy Dataset

  • 1. University of Innsbruck
  • 2. Universität Innsbruck

Description

This dataset is released as part of the publication "When the Codec Hallucinates: User Perceptions of Miscompressed Images", accepted at the Conference on Human Factors in Computing Systems 2026 (ACM CHI) in Barcelona, Spain.

Paper: https://informationsecurity.uibk.ac.at/pdfs/HB2026_CHI.pdf

For better reproducibility, we provide both the dataset of images used for the study and the detailed responses of the survey.

image_dataset.csv

Provides detailed information for every image used in the study. Refer to Figure 4 in the paper for an overview; each box in the figure reflects one row in this table.

Columns

  • ID: Unique identifier for each image sample, consisting of [version|name|group]
    • version:
      • J - JPEG-compressed
      • N - Neural-compressed
      • M - Miscompressed
      • U - Uncompressed
    • name: First five digits of the image name
    • group: Subject group that saw the image. [1...4], or 0 if all groups saw the image ("anchor image")
  • ORIGNAME: Full name of the scene (original image)
  • CROP: Side length of the square crop in pixels
  • GROUP: Subject group that saw the image. [1...4], or 0 if all groups saw the image ("anchor image")
  • ORDER: Position in the instrument [1...12]
  • UNCOMP: "1" if the image is uncompressed
  • JPEG: "1" if the image is JPEG-compressed
  • JPEGQ: JPEG quality factor for JPEG compressed images
  • NC: "1" if the image is neurally compressed
  • NCQ: Compression codec for neurally compressed images
  • MISCOMP: "1" if the image contains a miscompression
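
The ID scheme described above can be split into its parts programmatically. The following is a minimal sketch, not part of the dataset itself. It assumes the three parts appear in the order version, name, group as one letter, five digits, and one group digit, optionally separated by a character such as `|`, `_`, or `-` (the exact separator is not stated in this description). The names `parse_image_id`, `ImageID`, and `is_anchor` are hypothetical helpers.

```python
import re
from typing import NamedTuple

# Version letters as documented in the column description.
VERSIONS = {
    "J": "JPEG-compressed",
    "N": "Neural-compressed",
    "M": "Miscompressed",
    "U": "Uncompressed",
}


class ImageID(NamedTuple):
    version: str  # one of J, N, M, U
    name: str     # first five digits of the image name (kept as text)
    group: int    # 1..4, or 0 for anchor images seen by all groups


# ASSUMPTION: parts may be directly concatenated or separated by |, _ or -.
_ID_RE = re.compile(r"^([JNMU])[|_\-]?(\d{5})[|_\-]?([0-4])$")


def parse_image_id(raw: str) -> ImageID:
    """Split an ID string into version, name, and group."""
    match = _ID_RE.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised image ID: {raw!r}")
    return ImageID(match.group(1), match.group(2), int(match.group(3)))


def is_anchor(image_id: ImageID) -> bool:
    """Anchor images are those shown to all subject groups (group 0)."""
    return image_id.group == 0
```

If the real IDs use a different layout, only the regular expression needs to change; the rest of the sketch depends solely on the three documented parts.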

 

survey_responses.csv

For the corresponding questions, please refer to Appendix B.2 of the paper. Columns represent questions in chronological order, followed by timings.

Columns

  • DEM- age, sex D1 - D3: demographic questions
  • [version|name|group|question]: Image comparison
    • version:
      • J - JPEG-compressed
      • N - Neural-compressed
      • M - Miscompressed
      • U - Uncompressed
    • name: First five digits of the image name
    • group: Subject group that saw the image [1...4], or 0 if all groups saw the image ("anchor image")
    • question:
      • dif - S1: difference
      • mis - S2: misunderstandings [1...6], 1 = "certainly"
      • mod - S3(1): intentional editing [1...6], 1 = "certainly"
      • com - S3(2): uncontrolled distortion [1...6], 1 = "certainly"
  • THE* C1(1 - 8): Theoretical knowledge
  • PRA* C2(1 - 8): Practical experience
  • VER* C3(1 - 6): Verification practices
  • EXP* C4(1 - 2): Experience with image distribution
  • interviewtime: Time to complete the whole survey [s]
  • age, group, vis -Time: Demographics Time [s]
  • [version|name|group|question] -Time: Image comparison time [s]
    • ...
    • question
      • dif
      • mis
      • modcom: mod and com combined, since both questions were shown on the same screen
  • the, pra, ver, exp -Time: Time spent on the control variables [s]
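
The image-comparison column names described above can be decoded in a similar way. The following is a rough sketch under stated assumptions: the parts are taken to appear in the order version, name, group, question, optionally separated by `|`, `_`, or `-`, and timing columns are taken to end in `-Time`. The exact spelling in the CSV header may differ. `parse_survey_column` and `SurveyColumn` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SurveyColumn:
    version: str     # J, N, M or U
    name: str        # first five digits of the image name
    group: int       # 1..4, or 0 for anchor images
    question: str    # dif, mis, mod, com, or modcom (timing columns only)
    is_timing: bool  # True if the column holds a time in seconds


# ASSUMPTION: optional separators between parts and a "-Time" suffix for timings.
_COL_RE = re.compile(
    r"^([JNMU])[|_\-]?(\d{5})[|_\-]?([0-4])[|_\-]?"
    r"(modcom|dif|mis|mod|com)\s*(-Time)?$"
)


def parse_survey_column(column: str) -> Optional[SurveyColumn]:
    """Decode an image-comparison column name; return None for other columns."""
    match = _COL_RE.match(column.strip())
    if match is None:
        return None
    version, name, group, question, suffix = match.groups()
    return SurveyColumn(version, name, int(group), question, suffix is not None)
```

Columns that do not follow the image-comparison pattern, such as `interviewtime`, yield `None`, so the function can be used to filter a header row.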

Files

Files (91.8 kB)

Checksum Size
md5:29013c5cfe9a470b94826e259374500d 1.7 kB
md5:5a77883eb47463019f7bfebc51894ce8 90.1 kB
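
The MD5 checksums listed above can be used to check that a downloaded file is intact. The following is a minimal sketch. Since the listing here does not state which checksum belongs to which file, it only tests whether a file's digest matches either listed value. `md5_of_file` and `matches_listing` are hypothetical helper names.

```python
import hashlib

# Checksums as listed in the Files section (without the "md5:" prefix).
LISTED_MD5 = {
    "29013c5cfe9a470b94826e259374500d",
    "5a77883eb47463019f7bfebc51894ce8",
}


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str) -> bool:
    """True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `matches_listing("image_dataset.csv")` would be expected to return `True` for an unmodified download, assuming the file was saved under that name in the working directory.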

Additional details

Related works

Is supplement to
Publication: 10.1145/3772318.3790293 (DOI)

Funding

Landes Tirols
Tiroler Nachwuchsforscher*innenförderung F.50541/6-2024