University of Surrey


Combining Corpus Co-Occurrence and Attention With Visual Features for Object Recognition.

Mountstephens, James. (2007) Combining Corpus Co-Occurrence and Attention With Visual Features for Object Recognition. Doctoral thesis, University of Surrey (United Kingdom).

Text: 27696203.pdf (6MB)
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

The automatic recognition of objects in visual scenes, both dynamic and static, is an important and highly challenging computational task. Typical approaches function purely on patterns found in visual features extracted directly from the scene image (Marr, 1982; Serre et al., 2007). Object recognition in people, however, may be influenced by prior contextual knowledge of co-occurrence patterns between real-world objects present in scenes (Bar, 2004), and the derivation and application of this type of knowledge is the primary motivation of this work. In a novel fashion, knowledge of object co-occurrence is derived from patterns of co-occurring words in linguistic corpora (Sinclair, 1991) and used to extend the hybrid video annotation architecture of Hoogs et al. (2003b), itself based around the hierarchical linguistic knowledge base WordNet (Fellbaum, 1998). In addition to extending Hoogs' architecture with corpus-derived contextual knowledge, recognition in multi-object scenes is guided by the model of selective visual attention of Itti et al. (1998), which is used to sequentially build an object-level context for a given scene. Two systems, LHACCOR (Linguistic Hierarchy Applying Corpus Context to Object Recognition) and VHACCOR (Visual Hierarchy Applying Corpus Context to Object Recognition), are presented to explore these ideas. LHACCOR retains Hoogs' use of the WordNet noun hierarchy. VHACCOR extends Hoogs' architecture further by adapting a class hierarchy to the capability of the available visual processing, rather than using a preset structure such as WordNet, which is not explicitly designed around visual distinctions between classes. For VHACCOR, whatever classes are confused at a visual level must be distinguished by context. These two systems have been tested with natural, uncontrived scenes and display some improvement over a purely visual approach.
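
The general idea in the abstract, using word co-occurrence from a corpus as context to separate visually confusable classes, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the sentence-level co-occurrence window, the use of pointwise mutual information as the association measure, the additive `weight`, and the assumption that regions arrive already ordered by an attention model are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count words and unordered word pairs that share a sentence (assumed window)."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts, n_sentences):
    """Pointwise mutual information of two words; 0.0 if never seen together."""
    joint = pair_counts[frozenset((a, b))]
    if joint == 0:
        return 0.0
    return math.log((joint * n_sentences) / (word_counts[a] * word_counts[b]))


def rerank(visual_scores, context, word_counts, pair_counts, n_sentences, weight=1.0):
    """Add a corpus-context bonus to each candidate label's visual score."""
    combined = {}
    for label, score in visual_scores.items():
        bonus = sum(pmi(label, c, word_counts, pair_counts, n_sentences) for c in context)
        combined[label] = score + weight * bonus
    return max(combined, key=combined.get), combined


def recognise_scene(regions, word_counts, pair_counts, n_sentences):
    """Label regions in the order given (assumed to be attention order),
    adding each chosen label to the context used for later regions."""
    context = []
    for visual_scores in regions:
        label, _ = rerank(visual_scores, context, word_counts, pair_counts, n_sentences)
        context.append(label)
    return context
```

With a toy corpus in which "keyboard" co-occurs with "monitor" but "piano" does not, a region whose visual scores slightly favour "piano" is relabelled "keyboard" once "monitor" has been recognised earlier in the sequence. A purely visual classifier would correspond to `weight=0.0`.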

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Mountstephens, James.
Date : 2007
Additional Information : Thesis (Ph.D.)--University of Surrey (United Kingdom), 2007.
Depositing User : EPrints Services
Date Deposited : 06 May 2020 14:15
Last Modified : 06 May 2020 14:18
URI: http://epubs.surrey.ac.uk/id/eprint/856088

