Towards Complete Input Representations for Vehicle Trajectory Prediction Models
Authors/Creators
Description
Predicting accurate trajectories for traffic participants is a crucial task in the autonomous driving field.
Especially in complex scenarios like urban intersections, we need to anticipate the intentions and positions of
surrounding agents in order to plan a safe and smooth maneuver for an ego-vehicle. Trajectory prediction
remains a challenging task due to a diverse set of influencing factors like the topology of the street, traffic
rules and physical dynamics but also stochastic intentions of the agents and their mutual influences. The
performance of prediction models depends on both the input representation, i.e. the encoding of the scene,
and the prediction, i.e. the decoding of trajectories. Recent works have mainly focused on the decoding
part. Conversely, in this work we investigate the input representations of trajectory prediction models.
They can roughly be divided into raster- and vector-based methods. Raster-based methods represent
the traffic scene as bird’s-eye-view images whereas vector-based methods represent all elements as sets
of points. We explain the pros and cons of both methods and carry out preliminary experiments to
enhance the representations by 1) enriching the over-simplistic vector representation and 2) combining
both representations. Our results on the popular NuScenes benchmark indicate that very simple
representations are actually sufficient for state-of-the-art results. This is in contrast to real-world driving
where more information influences driver decisions. Consequently, this may call for more diverse
and representative datasets and/or more expressive models.
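
To make the distinction between the two input representations concrete, the following minimal sketch encodes one lane centerline both ways. It is an illustration only, not the encoding used in the paper: the function names `vectorize` and `rasterize`, the 40 m x 40 m ego-centered grid, the 1 m cell size, and the sample lane coordinates are all assumptions chosen for the example.

```python
import numpy as np


def vectorize(polyline):
    """Vector form: keep the ordered points, one (x, y) row per point."""
    return np.asarray(polyline, dtype=float)


def rasterize(polyline, extent=20.0, resolution=1.0, samples_per_segment=50):
    """Raster form: mark the bird's-eye-view cells the polyline passes through.

    The grid is centered on the ego-vehicle and spans `extent` metres in each
    direction (hypothetical values, not taken from the paper).
    """
    pts = np.asarray(polyline, dtype=float)
    size = int(round(2 * extent / resolution))
    grid = np.zeros((size, size), dtype=np.uint8)
    t = np.linspace(0.0, 1.0, samples_per_segment)[:, None]
    for start, end in zip(pts[:-1], pts[1:]):
        # Sample points densely along the segment, then map them to cells.
        samples = start + t * (end - start)
        cols = np.floor((samples[:, 0] + extent) / resolution).astype(int)
        rows = np.floor((extent - samples[:, 1]) / resolution).astype(int)
        inside = (rows >= 0) & (rows < size) & (cols >= 0) & (cols < size)
        grid[rows[inside], cols[inside]] = 1
    return grid


# Hypothetical lane centerline in ego-centered coordinates (metres).
lane = [(-10.0, 0.0), (0.0, 0.0), (10.0, 5.0)]
vector_input = vectorize(lane)   # array of shape (3, 2)
raster_input = rasterize(lane)   # binary image of shape (40, 40)
```

The vector form stores three points exactly, while the raster form spends 1,600 cells and loses sub-cell precision; in exchange, the image form is a fixed-size input that convolutional encoders can consume directly.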
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 2.2 MB | md5:9825b61a2febc913da3c399a335c226d |
Additional details
Dates
- Accepted: 2024-09-18