Repository logo
About
Deposit
Communities & Collections
All of UWSpace
  • English
  • Čeština
  • Deutsch
  • Español
  • Français
  • Gàidhlig
  • Latviešu
  • Magyar
  • Nederlands
  • Português
  • Português do Brasil
  • Suomi
  • Svenska
  • Türkçe
  • Қазақ
  • বাংলা
  • हिंदी
  • Ελληνικά
Log In
Have you forgotten your password?
  1. Home
  2. Browse by Author

Browsing by Author "Balaji, Bavesh"

Filter results by typing the first few letters
Now showing 1 - 1 of 1
  • Results Per Page
  • Sort Options
  • Loading...
    Thumbnail Image
    Item
    Language Guided Out-of-Bounding Box Pose Estimation for Robust Ice Hockey Analysis
    (University of Waterloo, 2024-08-27) Balaji, Bavesh; Clausi, David; Rambhatla, Sirisha
    Accurate estimation of human pose and the pose of interacting objects, such as hockey sticks, is fundamental in vision-driven hockey analytics and crucial for tasks like action recognition and player assessment. Estimating 2D keypoints from monocular video is challenging, particularly in fast-paced sports such as ice hockey, where motion blur, occlusions, bulky equipment, color similarities, and constant camera panning complicate accurate pose prediction. This thesis addresses these challenges with contributions on three fronts. First, recognizing the lack of an existing benchmark, we present a comparative study of four state-of-the-art human pose estimation approaches using a real-world ice hockey dataset. This analysis aims to understand the impact of each model on ice hockey pose estimation and investigate their respective advantages and disadvantages. Building on insights from this comparative study, we develop an ensemble model for jointly predicting player and stick poses. The ensemble comprises two networks: one trained from scratch to predict all keypoints, and another utilizing a unique transfer learning paradigm to incorporate knowledge from large-scale human pose datasets. Despite achieving promising results, we observe that these top-down approaches yield suboptimal outcomes due to constraints such as requiring all keypoints to be within a bounding box and accommodating only one player per bounding box. To overcome these issues, we introduce an image and text based multi-modal solution called TokenCLIPose, which predicts stick keypoints without encapsulating them within a bounding box. By focusing on capturing only the player in a bounding box and treating their stick as missing, our model predicts out-of-bounding box keypoints. To incorporate the context of the missing keypoints, we use keypoint-specific text prompts to leverage the rich semantic representations provided by language. This dissertation’s findings advance the state-of-the-art in 2D pose estimation for ice hockey, outperforming existing methods by 2.6% on our dataset, and provide a robust foundation for further developments in vision-driven sports analytics.

DSpace software copyright © 2002-2025 LYRASIS

  • Privacy policy
  • End User Agreement
  • Send Feedback