Single-Camera Basketball Tracker through Pose and Semantic Feature Fusion
Description
Tracking sports players is a challenging task, especially in
single-feed videos recorded on tight courts, where clutter and
occlusions cannot be avoided. This paper presents an analysis of
several geometric and semantic visual features for detecting and
tracking basketball players. An ablation study shows that a robust
tracker can be built with deep learning features alone, without
extracting contextual features such as proximity or color similarity,
and without applying camera stabilization techniques. The presented
tracker consists of: (1) a detection step, which uses a pretrained
deep learning model to estimate the players' poses, followed by (2) a
tracking step, which leverages pose and semantic information taken
from the output of a convolutional layer in a VGG network. Its
performance is analyzed in terms of MOTA (Multiple Object Tracking
Accuracy) over a basketball dataset with more than 10k instances.
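
For readers unfamiliar with the metric, the following is a minimal sketch of the standard MOTA definition, MOTA = 1 - (FN + FP + IDSW) / GT, where the error counts and the number of ground-truth instances are summed over all frames. The function name `mota` and the per-frame counts in the usage lines are hypothetical and only illustrate the formula; they are not taken from the paper's evaluation code or results.

```python
from typing import Sequence


def mota(
    false_negatives: Sequence[int],
    false_positives: Sequence[int],
    id_switches: Sequence[int],
    num_ground_truth: int,
) -> float:
    """Standard MOTA: 1 - (FN + FP + IDSW) / GT, with sums over all frames.

    Each sequence holds one count per frame; num_ground_truth is the total
    number of ground-truth instances across the whole sequence.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = sum(false_negatives) + sum(false_positives) + sum(id_switches)
    return 1.0 - errors / num_ground_truth


# Hypothetical counts for a three-frame clip (illustration only).
score = mota(
    false_negatives=[1, 0, 2],
    false_positives=[0, 1, 0],
    id_switches=[0, 0, 1],
    num_ground_truth=50,
)
print(f"MOTA = {score:.3f}")
```

Note that MOTA can be negative when the total number of errors exceeds the number of ground-truth instances.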
Files
10010623.pdf