Human Pose Estimation for Sports and Health: Action Recognition in Sports Analysis and Sitting Posture Correction
Abstract
Human pose estimation and action recognition are pivotal tasks in computer vision with extensive applications in sports analysis and health. This dissertation addresses several key research areas to advance these fields. First, it introduces a cost-efficient method to enhance and adapt generative models, enabling broader application and improved performance in diverse contexts. Second, it proposes a comprehensive dataset, VideoBadminton, designed specifically for fine-grained action recognition in badminton. This dataset includes detailed annotations and a wide variety of action categories, setting a new standard for future research in sports analysis. Third, it adapts the CLIP model for action recognition on the VideoBadminton dataset, demonstrating how multimodal models can be used to improve the accuracy and robustness of action recognition in sports videos. Finally, the dissertation applies human posture estimation techniques to sitting posture analysis, contributing to lower back pain research and offering potential benefits in clinical settings.

Collectively, these studies leverage deep learning techniques, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, and multimodal models, to advance the fields of human pose estimation and action recognition. The findings have significant implications for sports analysis, coaching, and health, providing valuable insights into athletic performance, training program effectiveness, and injury prevention. The proposed methodologies also have broader applications in areas such as healthcare, gaming, and robotics, where accurate human pose estimation and action recognition are critical.