Athlete pose estimation from single-view TV broadcast footage.
Fastovets, Mykyta (2017) Athlete pose estimation from single-view TV broadcast footage. Doctoral thesis, University of Surrey.
Text: thesis.pdf - Version of Record. Available under License Creative Commons Attribution Non-commercial Share Alike (147MB).
Abstract
This thesis presents work on athlete pose estimation in single-view broadcast videos. Human pose estimation is an important problem in computer vision and has received much interest in the research community due to its wide range of applications. This thesis presents a novel framework for the semi-automatic estimation of human pose in television-quality sports footage. The focus is on achieving accurate pose estimation results on sports video sequences, with the assistance of a human operator in a broadcast studio setting, that can be used to drive post-action analysis and graphical overlays. A method for extracting and tracking off-the-shelf scale-invariant features on athletes is tested. Evaluation shows that such features are ill-suited for tracking articulated motion due to drift, data association, and a general lack of stable features to track. A keyframe-driven approach, inspired by the Pictorial Structures model, is developed for estimating the 2D pose of athletes in sports sequences. This approach models the human body as a tree of loosely linked parts and introduces a temporal smoothness term aimed at ensuring temporal consistency of pose throughout the sequence. The evaluation demonstrates that such an approach is able to extract human pose in such videos, but requires a significant amount of manual interaction to do so with the accuracy required for broadcast settings. A novel non-sequential method for maximising the benefit from manually annotated keyframe poses using minimum spanning trees is developed. The developed algorithm serves two purposes: keyframe selection and keyframe information propagation. Optimal keyframes are automatically selected and suggested to the operator for labelling. Once labelled, information from these keyframes is propagated throughout the sequence, and automatically generated keyframes are created in visually similar frames.
Qualitative and quantitative evaluation demonstrates an increase in accuracy and a decrease in the number of required keyframes. Finally, a geometric method for converting 2D poses into 3D is developed. The algorithm assumes a weak perspective projection for the video sequence and known relative limb lengths for the athlete, and is able to recover the relative scale given at least three labelled keyframes by solving a continuous optimisation problem. Evaluation against a baseline geometric method shows improved stability and lower residual error.
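The weak perspective assumption mentioned in the abstract can be illustrated with the standard foreshortening relation: if image coordinates are the 3D coordinates multiplied by a scale s, a limb of known length l whose joints project to (u1, v1) and (u2, v2) has a depth offset of magnitude sqrt(l^2 - ((u1 - u2)^2 + (v1 - v2)^2) / s^2). The Python sketch below shows only this textbook relation and is not the optimisation developed in the thesis; the function names `relative_depth` and `minimum_scale` and the input layout are assumptions made for illustration.

```python
import math


def relative_depth(p1, p2, limb_length, scale):
    """Magnitude of the depth offset between two joints under weak perspective.

    p1, p2: (u, v) image coordinates of the limb's two joints.
    limb_length: known (relative) 3D length of the limb.
    scale: weak perspective scale factor s, where (u, v) = s * (X, Y).
    The sign of the offset (towards or away from the camera) is not
    recoverable from a single limb and is left to the caller.
    """
    du = p1[0] - p2[0]
    dv = p1[1] - p2[1]
    squared = limb_length ** 2 - (du ** 2 + dv ** 2) / scale ** 2
    if squared < -1e-9:
        raise ValueError("scale is too small for this limb's projected length")
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


def minimum_scale(limbs):
    """Smallest scale for which every limb has a real-valued depth offset.

    limbs: iterable of (p1, p2, limb_length) tuples.
    """
    return max(
        math.hypot(p1[0] - p2[0], p1[1] - p2[1]) / length
        for p1, p2, length in limbs
    )
```

For example, `minimum_scale([((0, 0), (3, 4), 5), ((0, 0), (0, 2), 1)])` gives the lower bound on s imposed by the most foreshortening-constrained limb, and `relative_depth` can then be evaluated per limb at any scale at or above that bound. Each limb still carries a two-way sign ambiguity, which is one reason purely geometric lifting needs additional constraints such as labelled keyframes.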
Item Type: | Thesis (Doctoral) |
---|---|
Subjects: | computer vision, human pose estimation, optimisation, feature tracking |
Divisions: | Theses |
Authors: | Fastovets, Mykyta |
Date: | 28 February 2017 |
Funders: | BBC R&D, University of Surrey |
Depositing User: | Mykyta Fastovets |
Date Deposited: | 09 Mar 2017 11:25 |
Last Modified: | 09 Nov 2018 16:40 |
URI: | http://epubs.surrey.ac.uk/id/eprint/813522 |