Challenging Data for Stereo and Optical Flow
Creators
- Meister, Stephan (Data collector)1, 2
- Kondermann, Daniel (Data collector)1
- Lauer, Paul-Sebastian (Data collector)3
- Sellent, Anita (Data collector)3
- Robert Bosch (Germany) (Data collector)
- Jähne, Bernd (Other)1
- Niehsen, Wolfgang (Other)3
- Wingbermühle, Jochen (Other)3
- Berger, Annika (Other)1
- Coordts, Julian (Other)1
- Preatsch, Tobias1
- Koke, Christoph (Other)1
Description
Abstract
Selected scenes for stereo disparity and optical flow estimation containing yet unsolved challenges.
Dataset containing 11 challenging sequences for stereo and optical flow estimation.
Introduction
Currently, only a few test sequences for optical flow and stereo are available. Most of these show highly controlled indoor scenes and do not contain the complexity that is commonly encountered in outdoor environments. Our aim is to provide new, challenging outdoor data to stimulate research in computer vision.
We acquired several million frames with a carefully devised stereo camera system. The recorded scenes cover a wide variety of weather conditions, motions and depth layers; they contain city and countryside situations and were acquired by night and by day. From this large quantity of data we selected eleven scenes, each containing a different challenge, highlighting problems that occur regularly.
We estimated optical flow and stereo on 10,000 manually selected frames and found that state-of-the-art algorithms frequently fail to estimate reliable correspondences in the situations summarized in the selected scenes. We observed that these situations fundamentally violate common model assumptions such as brightness constancy and a single motion per pixel. By providing access to this highly challenging data, we would also like to encourage alternative approaches that open up new ways to deal with the occurring problems.
On this webpage, the challenges in these frames are described and links for downloading the sequences are provided. To keep the data manageable, each scene contains about 30 frames, and we name the keyframes in which the described phenomenon is most explicit. Furthermore, we provide a detailed description of the recording and calibration procedures and supply code to simplify working with the data and visualizing the results.
This dataset is part of the Robust Vision Challenge.
Methods
Scene Description and Visualization of Correspondences on Reference Frame
Each scene is named with keywords that give a short hint on its principal content. In this section, a more detailed description of the challenge in the scenes is provided together with a reference frame where the problem is most evident.
Additionally, we provide exemplary results on the reference frame to visualize some of the problems. The reference stereo results were computed with our own implementation of a variant of the SGM method using the rank filter as proposed by Hirschmüller and Scharstein. Optical flow results were computed with Deqing Sun's implementation of Black and Anandan's method and with his own nonlocal method, both described in his paper with public MATLAB code. The parameters used are available upon request.
The results we show are based on the respective algorithm output with standard parameters across all scenes. As the data is highly challenging and the parameters are not tuned, the results shown are unlikely to be the best these methods can achieve.
Please also note that these methods have not been developed for such difficult scenes; we merely used these algorithms because, to the best of our knowledge, better or specialized ones do not exist. Detailed information on the visualization method we used can be found here.
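For readers unfamiliar with the rank filter mentioned above, the following minimal Python sketch illustrates the general idea: each pixel is replaced by the number of pixels in its neighbourhood that are darker than it, so the result does not change under monotonically increasing brightness changes. This is a generic illustration and not the implementation used for the reference results; the function name `rank_transform`, the 3x3 window and the border handling are assumptions.

```python
def rank_transform(image, radius=1):
    """Replace each pixel by the count of darker pixels in its window.

    `image` is a list of rows of intensities. Window positions outside
    the image are simply skipped (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < height and 0 <= xx < width
                    if inside and image[yy][xx] < center:
                        count += 1
            out[y][x] = count
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# A gain and offset change, as it might occur between two camera views.
brighter = [[2 * p + 100 for p in row] for row in image]

print(rank_transform(image))  # [[0, 1, 1], [2, 4, 3], [2, 4, 3]]
print(rank_transform(image) == rank_transform(brighter))  # True
```

Because only the ordering of intensities inside the window matters, matching on rank-transformed images tolerates exposure differences between the two cameras, though not effects such as flares or reflections that change the local ordering.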
Scene | Mainly interesting for | Challenge | Reference frame |
---|---|---|---|
Blinking Arrow (March) | Optical Flow | Driving on a road with guard railing; the railings are prone to temporal aliasing. A light signal system near the horizon flashes and violates the constant brightness assumption. | 3 |
Car Truck (March) | Optical Flow, Stereo | Accelerating at the approach to an Autobahn, large rotational egomotion is present. A car-transport truck crossing provides complex two-layer occlusion and high motion differences to the background. | 5 |
Crossing Cars (May) | Optical Flow, Stereo, Feature Matching | Harsh lighting conditions with many saturated highlights and strong scene reflections are quite common in outdoor scenes. Additional challenges in this scene are the complex occlusions between crossing cars and the transparency of the car windows. | 9 |
Flying Snow (December) | Optical Flow, Stereo, Feature Matching | Thick snowflakes fall heavily from the sky and obscure part of a city driving scene. A preceding car hurls up partly transparent fountains of snow and mud. | 4 |
Night and Snow (December) | Optical Flow, Stereo, Feature Matching | In this driving scene the windshield wiper blocks part of one stereo camera. Additionally, the windshield is wet, causing the headlights of an approaching car to flare. Due to darkness the aperture is wide open and the image has only a shallow depth of field. | 17 |
Rain Blur (August) | Stereo | Driving during rain causes differently blurred left and right views. | 11 |
Rain Flares (June) | Optical Flow, Stereo, Feature Matching | Driving during rain causes many kinds of lens flares from other vehicles' headlights. Additionally, water on the windshield blurs the view. | 21 |
Reflecting Car (June) | Optical Flow, Stereo, Feature Matching | Strong reflections on glossy surfaces such as cars, as well as cast shadows, often induce spurious motions. | 28 |
Shadow on Truck (June) | Optical Flow, Feature Matching | Following a truck on the Autobahn, this scene is geometrically quite simple. However, cast shadows dancing on the truck and the street violate assumptions that are made by many algorithms. | 13 |
Sunflare (March) | Optical Flow, Stereo, Feature Matching | In a common driving situation with considerable egomotion, the sun shines directly into the cameras, causing severe lens flares. | 24 |
Wet Autobahn (December) | Optical Flow, Feature Matching | This sequence summarizes multiple effects of adverse weather conditions. A film of water makes flat surfaces such as the street reflective. Additionally, spray raised by passing cars is semitransparent, and motion estimation on both the spray and the occluded objects is hard. | 20 |
Data Generation
We recorded the data with a high-speed, high-resolution stereo camera system. A publication with the motivation and a detailed description of the recording procedure is to be published in February 2012. Meanwhile, a full technical report can be found here.
Data Processing
We recorded the sequences with a resolution of 1312x1082@12bit and a framerate of 100Hz. The baseline distance was around 30cm. All on-camera-preprocessing was turned off.
The following operations have been performed on the images:
- We computed a radiometric calibration using an Ulbricht-Sphere (integrating sphere) and estimated a quadratic response curve for each pixel (two parameters).
- Before each recording session we estimated the dark current by averaging 100 frames with a closed lens cap.
- Before each recording session both cameras were calibrated based on a method similar to https://doi.org/10.1117/12.279802. This means that we have removed spherical distortions and aligned the images with respect to their epipolar geometry (horizontal lines in the first image correspond to the same horizontal lines in the second image). The camera parameters are available as an extra download below.
- The dark current image was subtracted from each camera frame and the inverse of each response curve was used to linearize the intensities. Hot and defective pixels were identified based on a heuristic (significant local intensity deviations throughout all sequences) and removed by applying a 3x3 median filter.
- Each frame (both left and right) was rectified based on a lookup-table which was created with the camera calibration result.
- The resulting image pair (rectified both radiometrically and geometrically) is downscaled to 656x541@12bit by averaging each 2x2 block of four pixels into a single pixel. We deliberately used this simple method to preserve the natural image noise as much as possible. Every fourth image is stored as a PGM file, giving an effective framerate of 25Hz.
- Saturated pixels are not processed.
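Three of the steps listed above (dark current subtraction, 2x2 averaging and temporal subsampling) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the dataset: the function names are hypothetical, negative values after dark subtraction are clamped to zero, the averaged values are left unrounded, and the response-curve linearization, hot-pixel filtering and rectification steps are omitted.

```python
def subtract_dark(frame, dark):
    """Subtract the averaged dark-current image from one raw frame.

    Clamping negative results to zero is an assumption of this sketch.
    """
    return [
        [max(pixel - offset, 0) for pixel, offset in zip(frame_row, dark_row)]
        for frame_row, dark_row in zip(frame, dark)
    ]


def downscale_2x2(frame):
    """Average each non-overlapping 2x2 block into a single pixel."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def subsample_frames(frames, step=4):
    """Keep every `step`-th frame, e.g. 100 Hz -> 25 Hz for step=4."""
    return frames[::step]


raw = [[10, 12, 20, 22],
       [14, 16, 24, 26]]
dark = [[2, 2, 2, 2],
        [2, 2, 2, 2]]
print(downscale_2x2(subtract_dark(raw, dark)))  # [[11.0, 21.0]]
print(subsample_frames(list(range(12))))        # [0, 4, 8]
```

Plain block averaging applies no additional smoothing kernel, which is consistent with the stated goal of keeping the natural sensor noise; applied to a 1312x1082 frame it yields the 656x541 output size.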
Privacy Protection
The recorded data shows challenging traffic scenes as they occur in real life and therefore contains license plates and pedestrians. To protect the privacy of the traffic participants we removed all information that can be traced back to individuals. As we attach great importance to the protection of privacy and the compliance with national privacy laws we chose to accept this interference with the raw data.
To keep the interference as small as possible, we manually labeled all pedestrians and license plates in the scenes and removed only the high-frequency data that contains the private information with a Gaussian blur filter. We chose a large variance to render reverse engineering of the original data impossible.
To prevent the introduction of spurious gradients at the transition between filtered and original regions, we define a boundary region where filtered and unfiltered parts are blended together smoothly. With this procedure we ensure that neither new high-frequency content nor new image gradients are introduced.
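The blending idea can be illustrated on a single scanline. The following Python sketch is not the procedure used for the dataset: the kernel parameters, the linear ramp and the placement of the ramp around the labeled region are assumptions, and all function names are hypothetical. It blurs the row with a Gaussian kernel and then mixes blurred and original values with a weight that rises smoothly from 0 to 1.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Filter one scanline with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    last = len(row) - 1
    return [
        sum(k * row[min(max(x + i - radius, 0), last)]
            for i, k in enumerate(kernel))
        for x in range(len(row))
    ]


def blend_row(original, blurred, alpha):
    """alpha = 1 keeps the blurred value, alpha = 0 keeps the original."""
    return [a * b + (1.0 - a) * o
            for o, b, a in zip(original, blurred, alpha)]


row = [0, 0, 0, 0, 100, 0, 100, 0, 0, 0, 0, 0]
# Weight 1 inside the labeled region, a short ramp in the boundary region.
alpha = [0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5, 0.0, 0.0, 0.0]

blurred = blur_row(row, gaussian_kernel(sigma=1.5, radius=3))
result = blend_row(row, blurred, alpha)
print([round(v, 1) for v in result])
```

Outside the ramp the original values are returned unchanged, and inside the labeled region only blurred values remain, so the transition itself does not add a hard edge.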
A practical comparison of optical flow and stereo correspondences estimated on both the original and the modified images shows that the removal of private information does not interfere with the performance of the tested algorithms.
Along with the modified images we offer binary masks for download in which modified pixels are marked with a non-zero value.
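One possible use of these masks is to exclude the modified pixels when computing statistics on an estimated disparity or flow field. The Python sketch below assumes that a mask has already been loaded as a list of rows with the same size as the result; both helper names are hypothetical.

```python
def unmodified_values(values, mask):
    """Collect the values at pixels whose mask entry is zero (not modified)."""
    return [
        v
        for value_row, mask_row in zip(values, mask)
        for v, m in zip(value_row, mask_row)
        if m == 0
    ]


def modified_fraction(mask):
    """Fraction of pixels marked as modified (non-zero) in the mask."""
    total = sum(len(row) for row in mask)
    modified = sum(1 for row in mask for m in row if m != 0)
    return modified / total


disparity = [[1.0, 2.0, 3.0],
             [4.0, 5.0, 6.0]]
mask = [[0, 255, 0],
        [0, 0, 255]]

kept = unmodified_values(disparity, mask)
print(kept)                     # [1.0, 3.0, 4.0, 5.0]
print(modified_fraction(mask))  # one third of the pixels are modified
```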
Technical info
System Configuration
- Machine: Intel Desktop Board DQ45CB, Intel Core 2 Duo E7300 @ 2.66 GHz, 8 GB RAM, Windows 7 64-bit
- HD: Hitachi HTE723212L9A360(111GB); Intel ICH8R/ICH9R/ICH10R/DO/5 Series/3400 Series SATA RAID
- Framegrabber: Silicon Software microEnable IV VD4-CL
- Camera: 2 x Photon Focus MV1-D1312-160-CL-12
- Optics: Linos Mevis-C 25mm/1.6
Code
We provide different packages of C++ and MATLAB code that we used to produce some of the results on this page. When using this code please note the disclaimer provided with the software.
- In this project we work with 12-bit grayscale .pgm images. Unfortunately, MATLAB's imread() function does not support 12-bit images. You can use this function as a replacement, maintaining the syntax and options of imread().
- Results on this page were displayed with our visualization code. To compare your results easily to the ones given here, you can use the same visualization methods, provided as MATLAB and C++ sources.
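Outside MATLAB, the images can be read with a few lines of code. The following Python sketch assumes that the files are binary (P5) PGM files with a maximum value above 255, which the Netpbm format stores as two bytes per sample, most significant byte first. `read_pgm` is a hypothetical helper and not part of the provided code packages. If the decoded values exceed the maximum value given in the header, the byte order of the files may differ from this assumption.

```python
import os
import struct
import tempfile


def read_pgm(path):
    """Read a binary (P5) PGM file; returns (width, height, maxval, rows)."""
    with open(path, "rb") as f:
        data = f.read()
    tokens = []
    pos = 0
    # Header: magic number, width, height and maxval, separated by
    # whitespace; a '#' starts a comment that runs to the end of the line.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # a single whitespace byte separates header and raster
    if tokens[0] != b"P5":
        raise ValueError("only binary P5 PGM files are handled in this sketch")
    width, height, maxval = (int(t) for t in tokens[1:])
    bytes_per_sample = 2 if maxval > 255 else 1
    count = width * height
    raster = data[pos:pos + count * bytes_per_sample]
    if len(raster) != count * bytes_per_sample:
        raise ValueError("raster is shorter than the header announces")
    if bytes_per_sample == 2:
        # Big-endian unsigned 16-bit samples.
        samples = struct.unpack(">%dH" % count, raster)
    else:
        samples = tuple(raster)
    rows = [list(samples[y * width:(y + 1) * width]) for y in range(height)]
    return width, height, maxval, rows


# Round trip on a tiny synthetic 2x2 image with 12-bit values.
header = b"P5\n# tiny test image\n2 2\n4095\n"
pixels = struct.pack(">4H", 0, 1, 2048, 4095)
with tempfile.NamedTemporaryFile(suffix=".pgm", delete=False) as tmp:
    tmp.write(header + pixels)
print(read_pgm(tmp.name))  # (2, 2, 4095, [[0, 1], [2048, 4095]])
os.remove(tmp.name)
```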
Other
Related Work
A well-known set of short image sequences and stereo pairs with ground truth has been made public via the Middlebury Benchmark Website which is being maintained by Prof. Daniel Scharstein, Middlebury College, USA.
The first dataset with long image sequences, partly augmented with ground truth for stereo as well as optical flow estimation, was the .enpeda.-Database created by the group of Prof. Reinhard Klette in Auckland, New Zealand.
More recently, Andreas Geiger and colleagues in the groups of Prof. Christoph Stiller (Karlsruhe Institute of Technology) and Prof. Raquel Urtasun (Toyota Technological Institute, USA) published the KITTI Vision Benchmark Suite.
If you have published a similar dataset, please let us know, so we can include it in this article.
Files
Animation.gif
Files (4.5 GB)
MD5 checksum | Size |
---|---|
md5:f82729b870ba3c00d181e8961c59d23d | 546.1 kB |
md5:217b5b077c0b702d7b7b0c48daa29531 | 731.0 kB |
md5:579248c4914420f1688f8b5f4c0d971d | 7.5 kB |
md5:2ec78b0839694393e6493e475932413b | 556.9 kB |
md5:c560b191d0add715f9e4606b33dbdc78 | 4.5 GB |
md5:bbc9d53bf4f82781fb5bdc7b53520ab4 | 795.4 kB |
md5:6074ddcda4a73d19dac94fff50e39942 | 818.6 kB |
md5:554531d13f66a743ebb875d09312d19e | 44.5 kB |
md5:80a9a805120a4f53abf6da9a9867d78b | 743.9 kB |
md5:0ded7ab1bd84a8d7013f949a09202db9 | 810.8 kB |
md5:93fa1b006d2db22b8d94bbc2e096daa6 | 736.7 kB |
md5:66a1f1adc548b5836197363fb1a1723c | 702.4 kB |
md5:2ae1176dc8e2ff7b3100134207be1f0c | 693.8 kB |
md5:7a2511fe8d4242f003ecdb3374744547 | 1.7 kB |
md5:b1932abdfc80eafc2f25720064736562 | 810.3 kB |
md5:cf12686f532d4d9257853b964c0850eb | 719.4 kB |
Additional details
Related works
- Is described by
- Journal article: 10.1117/1.OE.51.2.021107 (DOI)
- Obsoletes
- Dataset: https://hci.iwr.uni-heidelberg.de/benchmarks/Challenging_Data_for_Stereo_and_Optical_Flow (URL)
- Requires
- Software: 10.5281/zenodo.12635127 (DOI)
Software
- Repository URL
- https://zenodo.org/doi/10.5281/zenodo.8033172
- Programming language
- MATLAB, C++
- Development Status
- Inactive