University of Surrey


An Audio-Visual System for Object-Based Audio: From Recording to Listening

Coleman, Philip, Franck, A, Francombe, Jon, Liu, Qingju, de Campos, Teofilo, Hughes, R, Menzies, D, Simon Galvez, M, Tang, Y, Woodcock, J, Jackson, Philip, Melchior, F, Pike, C, Fazi, F, Cox, T and Hilton, Adrian (2018) An Audio-Visual System for Object-Based Audio: From Recording to Listening. IEEE Transactions on Multimedia, 20 (8). pp. 1919-1931.

MM-008211-R1-acceptedversion.pdf - Accepted version Manuscript



Object-based audio is an emerging representation for audio content, where content is represented in a reproduction-format-agnostic way and thus produced once for consumption on many different kinds of devices. This affords new opportunities for immersive, personalized, and interactive listening experiences. This article introduces an end-to-end object-based spatial audio pipeline, from sound recording to listening. A high-level system architecture is proposed, which includes novel audio-visual interfaces to support object-based capture and listener-tracked rendering, and incorporates a proposed component for objectification, i.e., recording content directly into an object-based form. Text-based and extensible metadata enable communication between the system components. An open architecture for object rendering is also proposed. The system's capabilities are evaluated in two parts. First, listener-tracked reproduction of metadata automatically estimated from two moving talkers is evaluated using an objective binaural localization model. Second, object-based scene capture with audio extracted using blind source separation (to remix between two talkers) and beamforming (to remix a recording of a jazz group) is evaluated with perceptually motivated objective and subjective experiments. These experiments demonstrate that the novel components of the system add capabilities beyond the state of the art. Finally, we discuss challenges and future perspectives for object-based audio workflows.
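The abstract notes that text-based, extensible metadata carries information between the system's components. As a rough illustration of what such metadata might look like, the sketch below serializes a single audio object to JSON; the field names and values here are illustrative assumptions, not the schema defined in the paper.

```python
import json

# Hypothetical metadata for one audio object: an identifier,
# a 3-D position, and a gain. A text-based encoding like JSON
# is easy to extend with new fields and to pass between
# capture, objectification, and rendering components.
audio_object = {
    "id": "talker_1",
    "start_time_s": 0.0,
    "position": {"azimuth_deg": 30.0, "elevation_deg": 0.0, "distance_m": 2.0},
    "gain_db": -3.0,
}

# Serialize to text and read it back, as a downstream renderer might.
encoded = json.dumps(audio_object)
decoded = json.loads(encoded)
print(decoded["position"]["azimuth_deg"])  # → 30.0
```

In a real pipeline each component would read only the fields it understands and ignore the rest, which is what makes a text-based format straightforward to extend.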

Item Type: Article
Divisions : Faculty of Arts and Social Sciences > Department of Music and Media
Authors :
Franck, A
de Campos, T
Hughes, R
Menzies, D
Simon Galvez, M
Tang, Y
Woodcock, J
Melchior, F
Pike, C
Fazi, F
Cox, T
Date : 17 January 2018
Funders : EPSRC
DOI : 10.1109/TMM.2018.2794780
Copyright Disclaimer : Copyright 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords : Audio systems, Audio-visual systems
Additional Information : Dataset @ DOI: 10.15126/surreydata.00845514
Depositing User : Melanie Hughes
Date Deposited : 03 Jan 2018 10:19
Last Modified : 05 Mar 2019 15:51



