University of Surrey


Hierarchical decision making for semantic analysis and summarisation of sports videos.

Jaser, Edward. (2005) Hierarchical decision making for semantic analysis and summarisation of sports videos. Doctoral thesis, University of Surrey (United Kingdom).

Available under License Creative Commons Attribution Non-commercial Share Alike.



Video comprises several modalities and, as the richest of all types of media, is a very important tool and a powerful medium of communication. It is used extensively for presenting information and for expressing and communicating ideas. A huge amount of video material is generated every day, covering a wide variety of subjects. Without efficient and flexible tools, the usability of this material is quite restricted. In recent years there has been much research addressing the problem of automatic video analysis and retrieval. In this thesis, the problem of automatic video annotation is considered. We develop a multistage decision making system tailored to the domain of sports videos. The first stage is concerned with reaching a compact yet efficient representation of raw video material. One popular approach to this problem is a representation in terms of low-level features. A major limitation of that approach is that the stored indexing features are too low-level; they relate directly to the properties of the data. For this stage we instead opted for a representation in terms of cues. Cues are the result of processing that associates the feature measurements with real-world objects or events. An additional advantage of this approach is that the cues from different types of features are presented in a homogeneous way. The second stage of the system is concerned with the classification of video shots. The set of classes considered relates to some characteristic views that occur frequently in sports videos. The decision making mechanism in this stage is a boosted decision tree, which generates hypotheses concerning the semantics of the sports video content given the cue annotation. In contrast to many shot classifiers reported in the literature, the proposed one decomposes the global complex classification problem into a number of simpler tasks. It has the flexibility of choosing different subsets of features (cues in our case) to solve those tasks, thus eliminating unnecessary computations.
The final stage of the system is designed to correct the misclassifications made in earlier stages by exploiting temporal context. Misclassification can be due to errors in the cue extraction or in the shot classifier, or it can be the consequence of a genuine ambiguity, as the same visual content may be attributed to different sport categories depending on the context. The functionality of this stage is realised by a Hidden Markov Model system, which bridges the gap between the semantic content categorisation defined by the user and the actual visual content categories. This stage also addresses the grouping of shots into scenes. Experimental results on a database comprising video material from six different events demonstrate that the proposed system performs well.
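
As a rough illustration of the temporal-context idea in the final stage, the following minimal sketch decodes the most likely sequence of semantic categories from per-shot classifier labels using the standard Viterbi algorithm for Hidden Markov Models. The state names (`play`, `break`), the shot labels (`wide`, `close_up`) and all probabilities are hypothetical values chosen for the example. They are not taken from the thesis, and the model used in the thesis may differ.

```python
import math


def log_table(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely hidden state sequence for the observations."""
    if not obs:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at shot t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at shot t + 1
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical semantic categories (hidden states) and visual shot classes.
states = ["play", "break"]
log_start = {"play": math.log(0.5), "break": math.log(0.5)}
# Assumed "sticky" transitions: a category tends to persist across shots.
log_trans = log_table({
    "play": {"play": 0.9, "break": 0.1},
    "break": {"play": 0.1, "break": 0.9},
})
# Assumed link between semantic category and the shot classifier's output.
log_emit = log_table({
    "play": {"wide": 0.8, "close_up": 0.2},
    "break": {"wide": 0.2, "close_up": 0.8},
})

shot_labels = ["wide", "wide", "close_up", "wide", "wide"]
smoothed = viterbi(shot_labels, states, log_start, log_trans, log_emit)
print(smoothed)
```

With these assumed numbers, the isolated `close_up` shot is assigned to `play` along with its neighbours, whereas a decision made on that shot alone would favour `break`. This is the sense in which temporal context can override a locally ambiguous classification.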

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Jaser, Edward
Date : 2005
Contributors :
Depositing User : EPrints Services
Date Deposited : 09 Nov 2017 12:17
Last Modified : 20 Jun 2018 11:38


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800