Published October 21, 2022 | Version 0.1.0
Dataset Open

Cross-Camera View-Overlap Recognition

  • Queen Mary University of London

Description

Data accompanying the paper titled Cross-Camera View-Overlap Recognition, published in the proceedings of the European Conference on Computer Vision Workshops and used for the evaluation of the framework presented in the publication.

The dataset consists of image sequence pairs from four scenarios: two scenarios that were collected with both hand-held and chest-mounted cameras – gate and backyard of four sequences each – and two publicly available datasets – TUM-RGB-D SLAM and courtyard from CoSLAM – for a total of ∼28,000 frames (∼25 minutes).

The data consist of images, annotations, and scripts to process the existing public sequences.

Image sequences are provided for the collected scenarios gate and backyard. We sub-sampled backyard from 30 to 10 fps for annotation purposes.

Image sequences for the scenario office can be found at TUM RGB-D SLAM (fr1_desk, fr1_desk2, fr1_room). Scripts to process these sequences as used in the work are provided.

The courtyard scenario consists of four sequences. We sub-sampled courtyard from 50 to 25 fps for annotation purposes. Original sequences are available at CoSLAM project website.

For all scenarios, we provide i) the annotation of angular distances, Euclidean distances, and overlap ratios of each view pair across camera sequences; ii) the annotation of the calibration (intrinsic) parameters; and iii) the annotation of the camera poses over time for each camera sequence, as automatically reconstructed with the structure-from-motion pipeline COLMAP, or by exploiting the depth data for the office scenario.
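As an illustration of how such view-pair annotations relate to the camera poses, the sketch below computes the angular distance between two rotations and the Euclidean distance between two camera positions. This is a hedged example, not the dataset's annotation script: it assumes poses are given as rotation matrices R and camera positions t in a common world frame (the exact convention is documented with the annotations).

```python
import numpy as np

def pose_distances(R1, t1, R2, t2):
    """Angular distance (degrees) between two rotations and Euclidean
    distance between two camera positions, given poses (R, t)."""
    # Relative rotation between the two camera orientations
    R_rel = R1.T @ R2
    # Rotation angle from the trace of the relative rotation,
    # clipped to avoid numerical issues with arccos
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    angular_deg = np.degrees(np.arccos(cos_angle))
    euclid = np.linalg.norm(t1 - t2)
    return angular_deg, euclid

# Example: identity pose vs a 90-degree rotation about the z-axis,
# translated by one unit along x (hypothetical values)
R1, t1 = np.eye(3), np.zeros(3)
R2 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
t2 = np.array([1., 0., 0.])
angular_deg, euclid = pose_distances(R1, t1, R2, t2)
```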


Camera poses are saved as a .txt file for each sequence using the KITTI format. The pose of each frame is represented as a 3x4 matrix (12 parameters) that is converted into a vector by horizontally concatenating the rows of the matrix:
[r11 r12 r13 tx
r21 r22 r23 ty    =>  [r11 r12 r13 tx r21 r22 r23 ty r31 r32 r33 tz]
r31 r32 r33 tz]

Parameter values are saved as floating-point numbers in exponential notation with six-digit precision.
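A minimal sketch of reading one such pose line back into a 3x4 [R | t] matrix (the line below uses hypothetical values: identity rotation and a unit translation along x):

```python
import numpy as np

def parse_kitti_pose_line(line):
    """Convert one KITTI-format pose line (12 space-separated values)
    back into a 3x4 [R | t] matrix."""
    values = np.array(line.split(), dtype=float)
    if values.size != 12:
        raise ValueError("expected 12 parameters per pose line")
    return values.reshape(3, 4)

# Hypothetical pose line in six-digit exponential notation
line = ("1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 "
        "0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 "
        "0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00")
pose = parse_kitti_pose_line(line)
R, t = pose[:, :3], pose[:, 3]
```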


Along with the dataset, we also provide the global features computed by using DeepBit [code] and NetVLAD [code] for each image of all camera sequences.


If you use the data, please cite:

A. Xompero and A. Cavallaro, Cross-camera view-overlap recognition, International Workshop on Distributed Smart Cameras (IWDSC), European Conference on Computer Vision Workshops, 24 October 2022.

ArXiv: https://arxiv.org/abs/2208.11661
Webpage: http://www.eecs.qmul.ac.uk/~ax300/xview/

Files (7.9 GB)

backyard.zip

md5:d8df9a35c90b716f049a79c8b91c5c8e (5.5 GB)
md5:95a0bb0967723b150d5f8f86cc773f10 (170.1 MB)
md5:c0c7b606c273c0469c2e0eda2986d1e3 (288.5 MB)
md5:51d36a139e851defd0d7efc945904699 (1.9 GB)
md5:cad75d85090e87ace83fbff6c6f0e4da (8.9 MB)

Additional details

Related works

Is derived from
Preprint: 10.48550/arXiv.2208.11661 (DOI)