University of Surrey

Audio-Visual Localization with Hierarchical Topographic Maps: Modeling the Superior Colliculus

Casey, MC, Pavlou, A and Timotheou, A (2012) Audio-Visual Localization with Hierarchical Topographic Maps: Modeling the Superior Colliculus. Neurocomputing, 97, pp. 344-356.

Available under License : See the attached licence file.


A key attribute of the brain is its ability to seamlessly integrate sensory information to form a multisensory representation of the world. In early perceptual processing, the superior colliculus (SC) takes a leading role in integrating visual, auditory and somatosensory stimuli in order to direct eye movements. The SC forms a representation of multisensory space through a layering of retinotopic maps which are sensitive to different types of stimuli. These eye-centered topographic maps can adapt to crossmodal stimuli so that the SC can automatically shift our gaze, moderated by cortical feedback. In this paper we describe a neural network model of the SC consisting of a hierarchy of nine topographic maps that combine to form a multisensory retinotopic representation of audio-visual space. Our motivation is to evaluate whether a biologically plausible model of the SC can localize audio-visual inputs live from a camera and two microphones. We use spatial contrast and a novel form of temporal contrast for visual sensitivity, and interaural level difference for auditory sensitivity. Results are comparable with the performance observed in cats, where coincident stimuli are accurately localized, while presentation of disparate stimuli causes a significant drop in performance. The benefit of crossmodal localization is shown by adding increasing amounts of noise to the visual stimuli to the point where audio-visual localization significantly outperforms visual-only localization. This work demonstrates how a novel, biologically motivated model of low-level multisensory processing can be applied to practical, real-world input in real time, while maintaining its comparability with biology.
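
For readers unfamiliar with the auditory cue named in the abstract, the following is a minimal sketch of an interaural level difference (ILD) computed from two microphone channels. It is not the authors' implementation: the function names `rms` and `interaural_level_difference`, the RMS-based level estimate, the `eps` guard, and the synthetic 440 Hz tone are all assumptions made here for illustration only.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def interaural_level_difference(left, right, eps=1e-12):
    """ILD in decibels; positive when the left channel is louder.

    eps is an assumed small constant that avoids log(0) on silent input.
    """
    return 20.0 * math.log10((rms(left) + eps) / (rms(right) + eps))


# Hypothetical input: a 440 Hz tone that is twice as loud at the left microphone.
rate = 16000  # assumed sample rate in Hz
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
left = [2.0 * s for s in tone]
right = tone

ild_db = interaural_level_difference(left, right)
print(f"ILD: {ild_db:.2f} dB")
```

A doubling of amplitude corresponds to 20·log10(2) dB, so the sign and size of the value give a coarse indication of which side the source is on. How the paper maps such a cue onto its topographic maps is not stated in this record.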

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Computing Science
Authors :
Casey, MC
Pavlou, A
Timotheou, A
Date : 15 November 2012
DOI : 10.1016/j.neucom.2012.05.015
Uncontrolled Keywords : Audio–visual localization, Multisensory integration, Topographic maps, Superior colliculus, Computational neuroscience
Additional Information : NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, 97, November 2012, DOI 10.1016/j.neucom.2012.05.015.
Depositing User : Symplectic Elements
Date Deposited : 03 Aug 2012 10:42
Last Modified : 31 Oct 2017 14:37


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800