Visual Sentences for Pose Retrieval over Low-resolution Cross-media Dance Collections
Ren, R and Collomosse, J (2012) Visual Sentences for Pose Retrieval over Low-resolution Cross-media Dance Collections. IEEE Transactions on Multimedia. ISSN 1520-9210
Ren-TMM-2012.pdf - Accepted Version
Available under License: See the attached licence file.
We describe a system for matching human posture (pose) across a large cross-media archive of dance footage spanning nearly 100 years, comprising digitized photographs and videos of rehearsals and performances. This footage presents unique challenges due to its age, quality and diversity. We propose a forest-like pose representation combining visual structure (self-similarity) descriptors over multiple scales, without explicitly detecting limb positions, which would be infeasible for our data. We explore two complementary multi-scale representations, applying passage retrieval and latent Dirichlet allocation (LDA) techniques inspired by the text retrieval domain to the problem of pose matching. The result is a robust system capable of quickly searching large cross-media collections for similarity to a visually specified query pose. We evaluate over a cross-section of the digital dance archives of the UK National Research Centre for Dance (UK-NRCD) and Siobhan Davies Replay (SDR), using visual queries supplied by dance professionals. We demonstrate significant performance improvements over two baselines: classical single- and multi-scale Bag of Visual Words (BoVW) and spatial pyramid kernel (SPK) matching.
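For readers unfamiliar with the Bag of Visual Words baseline named in the abstract, the following is a minimal single-scale sketch of the general idea, not the authors' implementation. It assumes that local descriptors and a codebook of visual words are already available; the function names, the toy data, and the choice of cosine similarity for ranking are all illustrative assumptions that do not come from the paper.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest (squared Euclidean) to a descriptor."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bovw_histogram(descriptors, codebook):
    """L1-normalised histogram of visual-word occurrences for one image."""
    counts = [0.0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def cosine_similarity(a, b):
    """Cosine of the angle between two histograms (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_pose_similarity(query_descriptors, archive, codebook):
    """Return archive keys ordered from most to least similar to the query."""
    query_hist = bovw_histogram(query_descriptors, codebook)
    scores = {
        name: cosine_similarity(query_hist, bovw_histogram(descs, codebook))
        for name, descs in archive.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 2-D "descriptors" and a three-word codebook.
TOY_CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
TOY_ARCHIVE = {
    "frame_a": [[0.9, 0.1], [1.1, -0.1], [0.1, 0.0]],
    "frame_b": [[0.0, 0.9], [0.1, 1.1], [-0.1, 1.0]],
}
TOY_QUERY = [[1.0, 0.1], [0.9, 0.0], [0.0, 0.1]]

ranking = rank_by_pose_similarity(TOY_QUERY, TOY_ARCHIVE, TOY_CODEBOOK)
```

In this toy setting `ranking` places `frame_a` ahead of `frame_b`, because the query and `frame_a` fall on the same visual words. A plain histogram like this discards all spatial layout, which is the limitation that multi-scale and spatial pyramid variants are generally meant to address.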
|Additional Information:||© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.|
|Divisions:||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Depositing User:||Symplectic Elements|
|Date Deposited:||31 May 2012 12:05|
|Last Modified:||23 Sep 2013 19:30|