Broadcast is all you need: Robust Multiplayer Tracking in Ice Hockey using Monocular Videos
Date
2025-01-22
Authors
Advisor
Clausi, David
Zelek, John
Publisher
University of Waterloo
Abstract
Multi-object tracking (MOT) in ice hockey combines two tasks: detecting players and associating them across a video sequence so that each player keeps a consistent identity. Tracking players in sports using monocular broadcast videos is an important computer vision problem that enables several downstream analytics and enhances the viewing experience. However, existing tracking approaches struggle with the occlusions, blur, camera pan-tilt-zoom effects, and dynamic player movements prevalent in telecast feeds. These challenges are exacerbated in fast-paced sports such as ice hockey, where existing trackers fail to maintain identity consistency because of players' sudden, non-linear motion patterns. In this thesis, acknowledging the fundamental role of quality datasets, we first present two hockey tracking datasets: our previously developed HTD-1 and a newly curated, open-source dataset called HTD-2, annotated from broadcast NHL games. Based on this new dataset, we establish a reference benchmark by evaluating six state-of-the-art (SOTA) tracking methods to enable performance comparisons in hockey MOT. A detailed study of each algorithm is conducted to understand its merits and drawbacks in tracking players. Next, to address the present limitations, we propose a novel tracking model that formulates MOT as a bipartite graph matching problem cued with homography inputs. Specifically, we disambiguate the positional representation of occluded players as viewed through broadcast footage by warping their positions onto a view-invariant overhead rink template and encoding the transformations into the graph message passing network. This ensures reliable spatial context for identity-preserving track prediction. Experimental results demonstrate that our model achieves a tenfold reduction in identity switches (IDsw) and a 32.45% improvement in IDF1 score compared to the existing baseline on HTD-1, establishing a new SOTA.
The proposed model also exhibits strong generalization capabilities, achieving 92.8% IDF1 and only 60 IDsw during cross-validation on HTD-2. Finally, ablation studies are presented to validate the model's performance and substantiate our approach.
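
The geometric idea behind the homography cue can be illustrated with a small sketch. This is not the thesis's model, which encodes the transformations into a graph message passing network. It only shows, under simplifying assumptions, how bounding boxes might be warped onto an overhead rink template and then matched to existing tracks by distance. The function names (`warp_point`, `foot_point`, `match_tracks`), the example homography, and the use of the box's bottom-centre as the player's ice position are all hypothetical choices made for illustration. The brute-force search stands in for a proper assignment solver such as `scipy.optimize.linear_sum_assignment` and is only practical for a handful of players.

```python
from itertools import permutations
from math import hypot


def warp_point(H, x, y):
    """Map an image point (x, y) to rink-template coordinates using a 3x3 homography H."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # divide by w to leave homogeneous coordinates


def foot_point(box):
    """Bottom-centre of an (x1, y1, x2, y2) box: an assumed proxy for the player's position on the ice."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)


def match_tracks(track_pts, det_pts):
    """Minimum-cost one-to-one assignment of tracks to detections.

    Cost is Euclidean distance on the rink template. Brute force over
    permutations; assumes len(track_pts) <= len(det_pts).
    """
    best_cost, best = float("inf"), ()
    for perm in permutations(range(len(det_pts)), len(track_pts)):
        cost = sum(
            hypot(track_pts[i][0] - det_pts[j][0], track_pts[i][1] - det_pts[j][1])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(enumerate(best)), best_cost


# Hypothetical homography: a pure 0.5x scaling from image pixels to template units.
H = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]

# Last known track positions on the template, and new detections in the image.
tracks = [(50.0, 100.0), (250.0, 150.0)]
boxes = [(480, 200, 520, 300), (90, 100, 110, 200)]

detections = [warp_point(H, *foot_point(b)) for b in boxes]
pairs, cost = match_tracks(tracks, detections)
print(pairs)  # [(0, 1), (1, 0)]: track 0 takes detection 1, track 1 takes detection 0
```

Matching in template coordinates rather than image coordinates means the distances do not change as the broadcast camera pans, tilts, or zooms, which is the intuition behind calling the overhead template view-invariant.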