TPT-Bench: A Large-Scale, Long-Term and Robot-Egocentric Dataset for Benchmarking Target Person Tracking
Contributors
Project leader:
Project manager:
Project member (6):
Supervisor:
Description
TPT-Bench is a large-scale, long-term, and robot-egocentric dataset designed for benchmarking Target Person Tracking (TPT) in real-world environments.
It comprises 5.3 hours of multimodal recordings, including robot odometry, 3D LiDAR, panoramic images, RGB-D data, and IMU streams. In addition to the raw sensor data, the dataset provides 571,982 frame-level 2D bounding box annotations of the target person.
The recordings cover 48 sequences captured in densely populated everyday environments (such as schools, food stores, markets, plazas, and metro stations), enabling extended tracking episodes with an average duration of 397.2 seconds per sequence. These long-horizon scenarios expose realistic challenges such as frequent target disappearance and complex crowd interactions. TPT-Bench is designed to support research on long-term, robust, and socially aware person-following and target-tracking tasks in real-world robot deployments.
To support evaluation and reproducibility, we provide an accompanying software toolkit (https://github.com/MedlarTea/TPT-BENCH-TOOLS) for benchmarking TPT algorithms on our dataset. To facilitate the use of multimodal data, we also release calibration parameters and a collection of utility scripts, including functions for loading calibration files, synchronizing and storing ROS messages, extracting and visualizing odometry, projecting point clouds onto pinhole and panoramic cameras, handling panoramic projection/unprojection and distortion, and tracking the target person in 2.5D space.
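As an illustration of the panoramic projection step mentioned above, the following is a minimal sketch of mapping a 3D point to pixel coordinates in an equirectangular image. It is not the toolkit's API: the function name `project_to_equirect`, the axis convention (x forward, y left, z up), and the choice of placing the forward direction at the image center are all assumptions made for this example. The released calibration files and utility scripts define the actual conventions and should be used for real data.

```python
import math


def project_to_equirect(x, y, z, width, height):
    """Map a 3D point to (u, v) pixel coordinates in an equirectangular image.

    Assumed convention (hypothetical, not taken from the TPT-Bench toolkit):
    x points forward, y points left, z points up, and the forward direction
    lands at the horizontal center of the image.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the origin has no defined viewing direction")

    azimuth = math.atan2(y, x)      # in [-pi, pi], positive towards the left
    elevation = math.asin(z / r)    # in [-pi/2, pi/2], positive upwards

    # Azimuth spans the full image width; elevation spans the full height.
    u = (0.5 - azimuth / (2.0 * math.pi)) * width
    v = (0.5 - elevation / math.pi) * height

    # Wrap horizontally (the panorama is periodic) and clamp vertically.
    u = u % width
    v = min(max(v, 0.0), height - 1.0)
    return u, v
```

Under these assumptions, a point straight ahead, such as `project_to_equirect(1.0, 0.0, 0.0, 1920, 960)`, maps to the image center `(960.0, 480.0)`, and a point directly to the left maps to a quarter of the image width.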
More detailed baseline results of state-of-the-art TPT algorithms, methodological discussions, and related works are available in our preprint (https://arxiv.org/pdf/2505.07446).
For information about the dataset’s origin, structure, content organization, labeling procedures, and processing pipeline, please refer to the preprint or to “data_report.pdf”, which is included in this upload.
The dataset is organized as follows:
TPT-Bench/
├── panoramic_images/
│   └── <seq_id>/             # e.g., 0000, 0002, ...
│       └── <timestamp>.jpg   # Equirectangular projection images
├── GTs/
│   └── <seq_id>.json         # Target 2D bounding box annotations
├── rosbags/
│   └── <seq_id>.bag          # Raw sensor data (LiDAR, IMU, Images)
├── descriptions/
│   └── <seq_id>.txt          # Natural language description of the sequence
├── quickview_videos/
│   └── <seq_id>.mp4          # Low-res visualization for quick inspection
└── evaluation_results/
    └── <seq_id>/
        └── <baseline>.json   # Output from baseline tracker
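Given the layout above, a small script can collect the files that belong to each sequence. The sketch below is only an illustration: the function name `index_sequences` is hypothetical, and it relies solely on the directory and file naming shown in the tree. It does not parse the annotation JSON, whose schema is documented in the preprint and in data_report.pdf. Because this record ships the full rosbag and panoramic images for sequence 0015 only, missing files are reported as `None` or an empty list rather than raising an error.

```python
from pathlib import Path


def index_sequences(root):
    """Return {seq_id: {...paths...}} for a TPT-Bench directory tree.

    Sequence IDs are taken from the file names in GTs/. Entries that are
    not present on disk are set to None (or [] for the image list).
    """
    root = Path(root)

    def existing(path):
        # Keep the path only if the file is actually there.
        return path if path.is_file() else None

    index = {}
    for gt_file in sorted((root / "GTs").glob("*.json")):
        seq_id = gt_file.stem
        image_dir = root / "panoramic_images" / seq_id
        index[seq_id] = {
            "gt": gt_file,
            # Image files are named by timestamp; sort them by name.
            "images": sorted(image_dir.glob("*.jpg")) if image_dir.is_dir() else [],
            "rosbag": existing(root / "rosbags" / f"{seq_id}.bag"),
            "description": existing(root / "descriptions" / f"{seq_id}.txt"),
            "video": existing(root / "quickview_videos" / f"{seq_id}.mp4"),
        }
    return index
```

A typical use would be `index = index_sequences("TPT-Bench")`, followed by a loop over `index.items()` that skips sequences whose `"rosbag"` entry is `None`.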
Due to Zenodo’s upload size limitation (50 GB per upload), this record includes the complete dataset except for the full contents of the “rosbags” and “panoramic_images” directories. For these two directories, we provide the full 0015 sequence as a representative subset. The complete versions of “rosbags” and “panoramic_images” (≈1 TB total) are hosted on our project’s software webpage for open access: https://github.com/MedlarTea/TPT-BENCH-TOOLS. All other files in this record are included in their entirety.
Files
data_report.pdf
Total size: 21.8 GB
| File (MD5 checksum) | Size |
|---|---|
| md5:405ba81241c014faecfde214f297a827 | 339.6 kB |
| md5:d66a113abb386aa4bfa8bfb1dcf56840 | 28.3 kB |
| md5:4570a1bef24abe988ec7ca5108ebf871 | 1.1 GB |
| md5:043d51ac1b4a5a3ce87ff7d9ec44f4dc | 7.6 MB |
| md5:3f716c90b719986d6279aca5e08dca58 | 2.3 GB |
| md5:cd166aec05c7a81bcf6c1ffa885c82c7 | 7.4 GB |
| md5:50c17ba7fdb96467123af32575d0eb5d | 11.1 GB |
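Since several of these files are multiple gigabytes in size, it can be worth checking a download against its listed MD5 checksum. The following is a minimal sketch using Python's standard `hashlib` module. The function names are hypothetical, and the file path in the usage note is a placeholder to be replaced with the actual downloaded file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum such as 'md5:405ba8...'.

    The optional 'md5:' prefix used in the file list is stripped first.
    """
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("path/to/downloaded_file", "md5:50c17ba7fdb96467123af32575d0eb5d")` returns `True` only if the file at that placeholder path is an intact copy of the 11.1 GB file in the list.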
Additional details
Related works
- Is supplement to
- Preprint: arXiv:2505.07446 (arXiv)
Dates
- Available
- 2025-11-26 (publication date)
Software
- Repository URL
- https://github.com/MedlarTea/TPT-BENCH-TOOLS
- Programming language
- Python
- Development Status
- Active