University of Surrey


Gated Variational AutoEncoders: Incorporating Weak Supervision to Encourage Disentanglement

Vowels, Matthew, Camgöz, Necati Cihan and Bowden, Richard (2020) Gated Variational AutoEncoders: Incorporating Weak Supervision to Encourage Disentanglement. In: 15th IEEE International Conference on Automatic Face and Gesture Recognition, 18-22 May 2020, Buenos Aires, Argentina.

Text
PID6374309[1].pdf - Accepted version Manuscript


Abstract

Variational AutoEncoders (VAEs) provide a means to generate representational latent embeddings. Previous research has highlighted the benefits of achieving representations that are disentangled, particularly for downstream tasks. However, there is some debate about how to encourage disentanglement with VAEs, and evidence indicates that existing implementations do not achieve disentanglement consistently. How well a VAE’s latent space has been disentangled is often evaluated against our subjective expectations of which attributes should be disentangled for a given problem. Therefore, by definition, we already have domain knowledge of what should be achieved, and yet we use unsupervised approaches to achieve it. We propose a weakly supervised approach that incorporates any available domain knowledge into the training process to form a Gated-VAE. The process involves partitioning the representational embedding and gating backpropagation. All partitions are utilised on the forward pass, but gradients are backpropagated through different partitions according to selected image/target pairings. The approach can be used to modify existing VAE models such as beta-VAE, InfoVAE and DIP-VAE-II. Experiments demonstrate that, with gated backpropagation, latent factors are represented in their intended partition. The approach is applied to images of faces for the purpose of disentangling head-pose from facial expression. Quantitative metrics show that using Gated-VAE improves average disentanglement, completeness and informativeness compared with un-gated implementations. Qualitative assessment of latent traversals demonstrates its disentanglement of head-pose from expression, even when only weak/noisy supervision is available.
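The gating idea in the abstract (all partitions used on the forward pass, gradients passed back through only one partition per image/target pairing) can be illustrated with a toy linear autoencoder. This is a minimal sketch and not the authors' implementation: the linear encoder and decoder, the squared-error loss, the partition names `pose` and `expression`, and the helper `gated_encoder_grad` are all assumptions made for illustration, and the sketch only shows the encoder gradient.

```python
def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gated_encoder_grad(W_enc, W_dec, x, target, open_partition, partitions):
    """Return dLoss/dW_enc with the gradient blocked outside one partition.

    Hypothetical helper: loss is 0.5 * ||W_dec (W_enc x) - target||^2.
    """
    # Forward pass: every latent dimension contributes to the output.
    z = matvec(W_enc, x)
    y = matvec(W_dec, z)
    err = [yi - ti for yi, ti in zip(y, target)]  # dLoss/dy

    # Backpropagate to the latent code: g_z = W_dec^T err.
    g_z = [sum(W_dec[i][j] * err[i] for i in range(len(err)))
           for j in range(len(z))]

    # Gate: keep the gradient only for the partition tied to this pairing
    # (assumed here to be a simple 0/1 mask on the latent gradient).
    lo, hi = partitions[open_partition]
    g_z = [g if lo <= j < hi else 0.0 for j, g in enumerate(g_z)]

    # Encoder gradient is the outer product of the gated g_z with x.
    return [[g * xi for xi in x] for g in g_z]


# Hypothetical 4-dimensional latent split into two partitions.
partitions = {"pose": (0, 2), "expression": (2, 4)}
W_enc = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
W_dec = [[0.5] * 4 for _ in range(3)]
x = [1.0, 2.0, 3.0]
target = [0.0, 0.0, 0.0]

grad = gated_encoder_grad(W_enc, W_dec, x, target, "pose", partitions)
for row in grad:
    print(row)
```

With the `pose` partition open, only the first two rows of the encoder gradient are non-zero, so an update step would change only the encoder weights that feed the pose dimensions. In an automatic-differentiation framework a comparable effect is usually obtained by stopping gradients on the closed partitions instead of masking by hand.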

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name | Email | ORCID
Vowels, Matthew | m.vowels@surrey.ac.uk |
Camgöz, Necati Cihan | n.camgoz@surrey.ac.uk |
Bowden, Richard | R.Bowden@surrey.ac.uk |
Date : 18 February 2020
Funders : EPSRC
Grant Title : EPSRC Grant
Projects : ExTOL
Depositing User : James Marshall
Date Deposited : 28 Feb 2020 16:36
Last Modified : 28 Feb 2020 16:36
URI: http://epubs.surrey.ac.uk/id/eprint/853850


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800