Published February 19, 2019 | Version 2.0
Dataset Open

3D IQ Test Task (3D-IQTT) - A Dataset for Quantitative Evaluation of 3D Reconstruction from 2D Images

Description

3D reconstruction is mostly evaluated qualitatively. With this dataset, we introduce a new, difficult quantitative task: the 3D IQ test task (3D-IQTT).

It is designed to be similar to the mental rotation questions found in some IQ tests. Each element in the dataset consists of 4 images: a reference object and answers 1-3. One of the answers shows the reference object, randomly rotated. For every question, dataset users have to use their model to pick the rotated reference object out of the 3 possible answers.
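As a rough illustration of the task protocol, the sketch below picks an answer for one question and scores a set of predictions. The `similarity` callable is a hypothetical stand-in for whatever a model uses to compare the reference image with a candidate; it is not part of the dataset or the demo scripts. Indices are 0-based here, matching the [0,1,2] encoding of the stored answers.

```python
from typing import Callable, Sequence


def pick_answer(reference, candidates: Sequence, similarity: Callable) -> int:
    """Return the 0-based index of the candidate most similar to the reference.

    `similarity` is a hypothetical, user-supplied function: a higher value
    means the candidate is more likely the rotated reference object.
    """
    scores = [similarity(reference, candidate) for candidate in candidates]
    # On ties, the first candidate with the top score is returned.
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(predictions: Sequence[int], answers: Sequence[int]) -> float:
    """Fraction of questions answered correctly."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if len(answers) == 0:
        raise ValueError("need at least one question")
    correct = sum(int(p == a) for p, a in zip(predictions, answers))
    return correct / len(answers)


# With 3 answers per question, uniform random guessing scores about 1/3.
CHANCE_LEVEL = 1 / 3
```

Comparing a model's accuracy against `CHANCE_LEVEL` gives a quick sanity check that it has learned anything at all.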

The dataset encourages semi-supervised or unsupervised 3D reconstruction because it contains a large corpus of unlabeled data and only a small set of labeled data where the correct answer is known.

All the images are of blocky 3D shapes floating in space in front of a black background.

Demo scripts for loading/processing the dataset can be found at https://github.com/fgolemo/3D-IQTT

The dataset consists of:

  • 3diqtt-v2-train.h5 (XZ-compressed)
    (Training Dataset)
    • /labeled
      • /questions
        format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
      • /answers
        format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]
    • /unlabeled
      • /questions
        format: [100,000 x 4 x 128 x 128 x 3], corresponding to (100k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
  • 3diqtt-v2-test.h5
    (Test Dataset)
    • /questions
      format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1].
      Important: This is the split you have to evaluate your model on. We have the correct answers, but they are not public.
  • 3diqtt-v2-val.h5
    (Validation Dataset)
    • /questions
      format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
    • /answers
      format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]
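The layout above maps onto HDF5 dataset paths. A minimal sketch of reading one file in batches follows, assuming the third-party h5py package is installed; `batch_slices` and `read_batches` are illustrative helper names, not functions from the demo repository. Reading slices rather than whole arrays matters because a single 10,000-item question array holds roughly 7.9 GB of float32 data (10,000 x 4 x 128 x 128 x 3 values at 4 bytes each).

```python
from typing import Iterator


def batch_slices(n_items: int, batch_size: int) -> Iterator[slice]:
    """Yield consecutive slices that together cover range(n_items)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_items, batch_size):
        yield slice(start, min(start + batch_size, n_items))


def read_batches(path, questions_key, answers_key=None, batch_size=32):
    """Yield (questions, answers) batches from one 3D-IQTT HDF5 file.

    questions has shape (batch, 4, 128, 128, 3). answers is None when no
    answers_key is given (the test file and /unlabeled have no public labels).
    """
    import h5py  # third-party; imported here so batch_slices works without it

    with h5py.File(path, "r") as f:
        questions = f[questions_key]
        answers = f[answers_key] if answers_key is not None else None
        for sl in batch_slices(questions.shape[0], batch_size):
            # Slicing an h5py dataset reads only that range from disk.
            q = questions[sl]
            a = answers[sl] if answers is not None else None
            yield q, a
```

For example, `read_batches("3diqtt-v2-train.h5", "labeled/questions", "labeled/answers")` would iterate over the labeled training data, and `read_batches("3diqtt-v2-test.h5", "questions")` over the test questions. Following the (reference + 3 answers) order in the format description, `q[:, 0]` would be the reference image and `q[:, 1:]` the three answers.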


Important: Before use, the main training dataset (3diqtt-v2-train.h5.xz) needs to be decompressed. On Unix machines, the command for decompression is "unxz 3diqtt-v2-train.h5.xz". This can take up to 24 hours depending on your hardware, and the uncompressed file has a size of ~74 GB. The file was compressed because of a restriction on the size of individual files. We apologize for any inconvenience this causes.
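On systems without unxz, the same step can be sketched with Python's standard-library lzma module. This is an alternative that assumes the download is a regular .xz stream; the function name and chunk size are illustrative choices. Unlike unxz with default options, which removes the compressed file after success, this sketch leaves it in place, so both files need to fit on disk.

```python
import lzma
import shutil


def decompress_xz(src_path: str, dst_path: str,
                  chunk_size: int = 16 * 1024 * 1024) -> None:
    """Stream-decompress an .xz file to dst_path without loading it into memory."""
    with lzma.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Copy in fixed-size chunks so memory use stays bounded.
        shutil.copyfileobj(src, dst, chunk_size)
```

A call such as `decompress_xz("3diqtt-v2-train.h5.xz", "3diqtt-v2-train.h5")` would produce the uncompressed training file.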

If you use this dataset, please cite it.

Notes

This project was funded in part by the CHIST-ERA project "IGLU" through the Agence Nationale de la Recherche (ANR) and in part by the Canadian Institute for Advanced Research (CIFAR).

Files

Files (61.7 GB)

md5:c22b4dc9e7c3eba9cbd38051f4e995de (7.9 GB)
md5:e1162bab0eae078524e3f0c32ff87101 (46.0 GB)
md5:ece1705bf8b7425058f630c20855615f (7.9 GB)
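Given the file sizes, it may be worth checking a download against the MD5 checksums listed above before spending hours on decompression. The sketch below uses Python's standard-library hashlib; the helper names are illustrative. The checksums presumably refer to the files as downloaded, so the training file would be checked in its compressed form. The listing here does not say which checksum belongs to which file, so that pairing has to come from the download page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A `False` result from `matches_listed_md5` suggests a corrupted or incomplete download, or that the checksum belongs to a different file.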

Additional details

Funding

European Commission
CHIST-ERA - European Coordinated Research on Long-term Challenges in Information and Communication Sciences and Technologies 248663