Published 2012 | Version v1
Dataset Open

Challenging Data for Stereo and Optical Flow

Description

Abstract

Eleven selected sequences for stereo disparity and optical flow estimation, each containing challenges that are still unsolved.

Introduction

Currently, only a few test sequences for optical flow and stereo are available. Most of these show highly controlled indoor scenes and do not contain the complexity that is commonly encountered in outdoor environments. Our aim is to provide new, challenging outdoor data to stimulate research in computer vision.
We acquired several million frames with a carefully devised stereo camera system. The recorded scenes cover a wide variety of weather conditions, motions and depth layers; they contain city and countryside situations and were acquired both by day and at night. From this large quantity of data we selected eleven scenes, each containing a different challenge and highlighting problems that occur regularly.
We estimated optical flow and stereo on 10,000 manually selected frames and found that state-of-the-art algorithms frequently fail to estimate reliable correspondences in the situations summarized by the selected scenes. We observed that these situations fundamentally violate common model assumptions such as brightness constancy and a single motion per pixel. By providing access to this highly challenging data, we would also like to encourage alternative approaches that open up new ways to deal with the occurring problems.
On this webpage, we describe the challenges in these frames and provide links for downloading the sequences. To keep the data manageable, each scene contains about 30 frames, and we name the keyframes in which the described phenomenon is most apparent. Furthermore, we provide a detailed description of the recording and calibration procedures and supply code to simplify working with the data and visualizing the results.
 

This dataset is part of the Robust Vision Challenge.

Methods

Scene Description and Visualization of Correspondences on Reference Frame

Each scene is named with keywords that give a short hint on its principal content. In this section, a more detailed description of the challenge in the scenes is provided together with a reference frame where the problem is most evident.
Additionally, we provide exemplary results on the reference frame to visualize some of the problems. The reference stereo results were computed with our own implementation of a variant of the SGM method using the rank filter as proposed by Hirschmüller and Scharstein. Optical flow results were computed with Deqing Sun's implementation of Black and Anandan's method and with his own nonlocal method, both described in his paper and available as public MATLAB code. The parameters used are available upon request.
The results we show are based on the output of the respective algorithm with the same standard parameters across all scenes. As the data is highly challenging and the parameters are not tuned, the results should not be expected to be the best these methods can achieve.
Please also note that these methods were not developed for such difficult scenes; we merely used them because, to the best of our knowledge, better or more specialized ones do not exist. Detailed information on the visualization method we used can be found here.

Blinking Arrow (March)
Mainly interesting for: Optical Flow
Challenge: Driving on a road with guard rails, which are prone to temporal aliasing. A flashing light signal near the horizon violates the brightness constancy assumption.
Reference Frame: 3
Car Truck (March)
Mainly interesting for: Optical Flow, Stereo
Challenge: Accelerating on the approach to an Autobahn, with large rotational egomotion. A crossing car-transport truck causes complex two-layer occlusions and large motion differences relative to the background.
Reference Frame: 5
Crossing Cars (May)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: Harsh lighting conditions with many saturated highlights and strong scene reflections are quite common in outdoor scenes. Additional challenges in this scene are the complex occlusions between crossing cars and the transparency of the car windows.
Reference Frame: 9
Flying Snow (December)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: Thick snowflakes fall heavily from the sky and obscure part of a city driving scene. A preceding car throws up partly transparent fountains of snow and mud.
Reference Frame: 4
Night and Snow (December)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: In this driving scene the windshield wiper blocks part of one stereo camera. Additionally, the windshield is wet, causing the headlights of an approaching car to flare. Due to darkness the aperture is wide open and the image has only a shallow depth of field.
Reference Frame: 17
Rain Blur (August)
Mainly interesting for: Stereo
Challenge: Driving during rain causes differently blurred left and right views.
Reference Frame: 11
Rain Flares (June)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: Driving during rain causes many kinds of lens flares from other vehicles' headlights. Additionally, water on the windshield blurs the view.
Reference Frame: 21
Reflecting Car (June)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: Strong reflections on glossy surfaces such as cars and cast shadows often induce spurious motions.
Reference Frame: 28
Shadow on Truck (June) 
Mainly interesting for: Optical Flow, Feature Matching
Challenge: Following a truck on the Autobahn, this scene is geometrically quite simple. However, cast shadows dancing on the truck and the street violate assumptions that are made by many algorithms.
Reference Frame: 13
Sunflare (March)
Mainly interesting for: Optical Flow, Stereo, Feature Matching
Challenge: In a common driving situation with considerable egomotion, the sun shines directly into the cameras, causing severe lens flares.
Reference Frame: 24
Wet Autobahn (December)
Mainly interesting for: Optical Flow, Feature Matching
Challenge: This sequence summarizes multiple effects of adverse weather conditions. A film of water makes flat surfaces such as the street reflective. Additionally, spray raised by passing cars is semitransparent, and motion estimation on both the spray and the occluded objects is hard.
Reference Frame: 20
 

Data Generation

We recorded the data with a high-speed, high-resolution stereo camera system. A publication with the motivation and a detailed description of the recording procedure is to be published in February 2012. Meanwhile, a full technical report can be found here.

Data Processing

We recorded the sequences with a resolution of 1312x1082@12bit and a framerate of 100Hz. The baseline distance was around 30cm. All on-camera-preprocessing was turned off.
The following operations have been performed on the images:

  • We computed a radiometric calibration using an Ulbricht-Sphere (integrating sphere) and estimated a quadratic response curve for each pixel (two parameters).
  • Before each recording session we estimated the dark current by averaging 100 frames with a closed lens cap.
  • Before each recording session both cameras were calibrated based on a method similar to https://doi.org/10.1117/12.279802. This means that we have removed spherical distortions and aligned the images with respect to their epipolar geometry (horizontal lines in the first image correspond to the same horizontal lines in the second image). The camera parameters are available as an extra download below.
  • The dark current image was subtracted from each camera frame and the inverse of each response curve was used to linearize the intensities. Hot and defective pixels were identified based on a heuristic (significant local intensity deviations throughout all sequences) and removed by applying a 3x3 median filter.
  • Each frame (both left and right) was rectified based on a lookup-table which was created with the camera calibration result.
  • The resulting (both radiometrically and geometrically rectified) image pair was downscaled to a size of 656x541@12bit by averaging each block of 2x2 pixels into a single pixel. We used this simple method explicitly to conserve the natural noise in the images as much as possible. Every fourth image (to reach an effective framerate of 25Hz) was stored as a PGM file.
  • Saturated pixels are not processed.
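
The pixel-level steps listed above can be sketched roughly as follows. This is only an illustration in plain Python under stated assumptions, not the code used to produce the dataset: images are lists of rows of numbers, all function names are hypothetical, clamping at zero and the border handling are our own choices, and the inverse response step is left out because the exact form of the per-pixel quadratic response curve is not given here.

```python
from statistics import median


def subtract_dark(frame, dark):
    """Subtract the dark current image pixel by pixel (clamping at 0 is an assumption)."""
    return [[max(p - d, 0) for p, d in zip(frow, drow)]
            for frow, drow in zip(frame, dark)]


def replace_defects(frame, defects):
    """Replace each flagged (row, col) pixel by the median of its 3x3 neighbourhood.

    At the image border only the neighbours that exist are used (assumption).
    """
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for r, c in defects:
        window = [frame[rr][cc]
                  for rr in range(max(r - 1, 0), min(r + 2, height))
                  for cc in range(max(c - 1, 0), min(c + 2, width))]
        out[r][c] = median(window)
    return out


def bin_2x2(frame):
    """Average each 2x2 block into one pixel, halving width and height."""
    return [[(frame[r][c] + frame[r][c + 1]
              + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
             for c in range(0, len(frame[0]) - 1, 2)]
            for r in range(0, len(frame) - 1, 2)]


def keep_every_fourth(frames):
    """Temporal subsampling: a 100 Hz recording becomes a 25 Hz sequence."""
    return frames[::4]
```

Applied to a 1312x1082 frame, `bin_2x2` yields the 656x541 size stated above, and `keep_every_fourth` reduces 100 recorded frames to 25.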

Privacy Protection

The recorded data shows challenging traffic scenes as they occur in real life and therefore contains license plates and pedestrians. To protect the privacy of the traffic participants we removed all information that can be traced back to individuals. As we attach great importance to the protection of privacy and the compliance with national privacy laws we chose to accept this interference with the raw data.
To keep the interference as small as possible, we manually labeled all pedestrians and license plates in the scenes and removed only the high-frequency data that contains the private information with a Gaussian blur filter. We chose a large variance to make reverse engineering of the original data impossible.
To prevent the introduction of spurious gradients at the transition between filtered and original regions, we defined a boundary region in which filtered and unfiltered parts are blended together smoothly. With this procedure we ensure that neither new high-frequency content nor new image gradients are introduced.
A practical comparison of optical flow and stereo correspondences estimated on both the original and the modified images shows that the removal of private information does not interfere with the performance of the tested algorithms.
Along with the modified images we offer binary masks for download in which modified pixels are marked with a non-zero value.
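
The blur-and-blend idea described in this section might look roughly like the sketch below. It is an assumption-laden illustration, not the procedure actually applied to the data: the function names, the two sigma values, the edge handling, and the way the soft boundary weight is derived from the binary mask (blurring the mask, doubling, and clipping at 1) are all hypothetical choices.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, repeating edge pixels at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(k * row[min(max(c + i - radius, 0), width - 1)]
                 for i, k in enumerate(kernel))
             for c in range(width)]
            for row in image]


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def anonymize(image, mask, blur_sigma=8.0, feather_sigma=2.0):
    """Blend a strongly blurred copy into the image where mask is non-zero."""
    blurred = gaussian_blur(image, blur_sigma)
    binary = [[1.0 if m else 0.0 for m in row] for row in mask]
    soft = gaussian_blur(binary, feather_sigma)
    # Assumed weighting: 1 inside large labeled regions, smooth ramp to 0 outside.
    weight = [[min(1.0, 2.0 * w) for w in row] for row in soft]
    return [[w * b + (1.0 - w) * p for p, b, w in zip(prow, brow, wrow)]
            for prow, brow, wrow in zip(image, blurred, weight)]
```

Pixels far from any labeled region keep their original value exactly, because their blending weight is zero. Labeled regions much smaller than the feather width would not be fully covered by this particular weighting.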

Technical info

System Configuration

Code

We provide different packages of C++ and MATLAB code that we used to produce some of the results on this page. When using this code please note the disclaimer provided with the software.

  • In this project we work with 12-bit grayscale .pgm images. Unfortunately, MATLAB's imread() function does not support 12-bit images. You can use this function as a replacement, maintaining the syntax and options of imread().
  • Results on this page were displayed with our visualization code. To compare your results easily to the ones given here, you can use the same visualization methods, provided as MATLAB and C++ sources.
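
For readers working outside MATLAB, a 12-bit PGM can also be parsed by hand. The following is a minimal sketch, assuming the files are binary (P5) PGMs that follow the Netpbm convention of storing samples above 8 bits as two bytes with the most significant byte first; whether the files in this dataset follow that byte order is not stated here and should be checked against the provided reader. The function name `read_pgm` is hypothetical.

```python
import struct


def read_pgm(data):
    """Parse a binary (P5) PGM from bytes; return (width, height, maxval, rows)."""
    pos = 0
    tokens = []
    while len(tokens) < 4:
        # Skip whitespace and '#' comments between header tokens.
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated PGM header")
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates header and raster
    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM (P5) file")
    width, height, maxval = (int(t) for t in tokens[1:])
    count = width * height
    if maxval > 255:
        # Two bytes per sample, most significant byte first (Netpbm convention).
        pixels = struct.unpack(">%dH" % count, data[pos:pos + 2 * count])
    else:
        pixels = struct.unpack("%dB" % count, data[pos:pos + count])
    rows = [list(pixels[r * width:(r + 1) * width]) for r in range(height)]
    return width, height, maxval, rows
```

A file would be read with `read_pgm(open(path, "rb").read())`; for a 12-bit image the returned `maxval` is expected to be 4095.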

Other

Related Work

A well-known set of short image sequences and stereo pairs with ground truth has been made public via the Middlebury Benchmark Website, which is maintained by Prof. Daniel Scharstein, Middlebury College, USA.
The first dataset with long image sequences, partly augmented with ground truth for stereo as well as optical flow estimation, was the .enpeda.-Database created by the group of Prof. Reinhard Klette in Auckland, New Zealand.
More recently, Andreas Geiger and colleagues in the groups of Prof. Christoph Stiller (Karlsruhe Institute of Technology) and Prof. Raquel Urtasun (Toyota Technological Institute, USA) published the KITTI Vision Benchmark Suite.
If you have published a similar dataset, please let us know, so we can include it in this article.

 

Notes

Usage of the Data

This data must be used for research purposes only.

In case you would like to publish work based on this data, please cite the following article:
@article{meister2012outdoor,
title={Outdoor stereo camera system for the generation of real-world benchmark data sets},
author={Meister, S. and J{\"a}hne, B. and Kondermann, D.},
journal={Optical Engineering},
volume={51},
number={02},
pages={021107},
year={2012}
}

Acknowledgements

The present data was acquired and processed by
Daniel Kondermann (HCI),
Stephan Meister (HCI),
Paul-Sebastian Lauer (Bosch Corporate Research) and
Anita Sellent (Bosch Corporate Research)
in collaboration with the Robert Bosch GmbH.

The work was backed up by
Bernd Jähne (HCI), 
Wolfgang Niehsen (Bosch Corporate Research) and
Jochen Wingbermühle (Bosch Corporate Research).

Furthermore, our HCI student research assistants
Annika Berger,
Julian Coordts,
Tobias Preatsch and
Christoph Koke
spent countless hours assisting with the preparation of the data.

Files

Animation.gif

Files (4.5 GB)

Checksum, Size
md5:f82729b870ba3c00d181e8961c59d23d, 546.1 kB
md5:217b5b077c0b702d7b7b0c48daa29531, 731.0 kB
md5:579248c4914420f1688f8b5f4c0d971d, 7.5 kB
md5:2ec78b0839694393e6493e475932413b, 556.9 kB
md5:c560b191d0add715f9e4606b33dbdc78, 4.5 GB
md5:bbc9d53bf4f82781fb5bdc7b53520ab4, 795.4 kB
md5:6074ddcda4a73d19dac94fff50e39942, 818.6 kB
md5:554531d13f66a743ebb875d09312d19e, 44.5 kB
md5:80a9a805120a4f53abf6da9a9867d78b, 743.9 kB
md5:0ded7ab1bd84a8d7013f949a09202db9, 810.8 kB
md5:93fa1b006d2db22b8d94bbc2e096daa6, 736.7 kB
md5:66a1f1adc548b5836197363fb1a1723c, 702.4 kB
md5:2ae1176dc8e2ff7b3100134207be1f0c, 693.8 kB
md5:7a2511fe8d4242f003ecdb3374744547, 1.7 kB
md5:b1932abdfc80eafc2f25720064736562, 810.3 kB
md5:cf12686f532d4d9257853b964c0850eb, 719.4 kB
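
A downloaded file can be checked against the MD5 checksums listed above. The following is a small sketch using Python's standard `hashlib` module; the helper names and the example file name are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listing entry such as 'md5:0123...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("sequences.zip", "md5:c560b191d0add715f9e4606b33dbdc78")` would check the 4.5 GB file, assuming it was saved under that (made-up) name. Reading in chunks keeps memory use small even for the large archive.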

Additional details

Related works

Software

Repository URL
https://zenodo.org/doi/10.5281/zenodo.8033172
Programming language
MATLAB, C++
Development Status
Inactive