University of Surrey


Sign Language Recognition: Generalising to More Complex Corpora.

Cooper, HM Sign Language Recognition: Generalising to More Complex Corpora. UNSPECIFIED thesis, University of Surrey.

Full text not available from this repository.

Abstract

The aim of this thesis is to find new approaches to Sign Language Recognition (SLR) which are suited to working with the limited corpora currently available. Data available for SLR is of limited quality; low resolution and low frame rates make the task of recognition even more complex. The content is rarely natural, concentrating on isolated signs filmed under laboratory conditions. In addition, the amount of accurately labelled data is minimal. To this end, several contributions are made. Tracking the hands is eschewed in favour of detection-based techniques that are more robust to noise, and classifiers for both signs and linguistically motivated sign sub-units are investigated to make best use of limited data sets. Finally, an algorithm is proposed to learn signs from the inset signers on TV, with the aid of the accompanying subtitles, thus increasing the corpus of data available.
Tracking fast-moving hands under laboratory conditions is a complex task; move this to real-world data and the challenge is even greater. When tracked data is used as a base for SLR, errors in the tracking are compounded at the classification stage. Proposed instead is a novel sign detection method, which views space-time as a 3D volume and the sign within it as an object to be located. Features are combined into strong classifiers using a novel boosting implementation designed to create optimal classifiers over sparse datasets. Using boosted volumetric features on a robust frame-differenced input, average classification rates reach 71% on seen signers and 66% on a mixture of seen and unseen signers, with individual sign classification rates reaching 95%.
Using a classifier-per-sign approach to SLR means that data sets need to contain numerous examples of the signs to be learnt. Instead, this thesis proposes learnt classifiers to detect the common sub-units of sign. The responses of these classifiers can then be combined for recognition at the sign level.
This approach requires fewer examples per sign to be learnt, since the sub-unit detectors are trained on data from multiple signs. It is also faster at detection time since there are fewer classifiers to consult, their number being limited by the linguistics of sign and not by the number of signs being detected. For this method, appearance-based boosted classifiers are introduced to distinguish the sub-units of sign. Results show that, when combined with temporal models, these novel sub-unit classifiers can outperform similar classifiers learnt on tracked results. As an added side effect, since the sub-units are linguistically derived, they can be used independently to help linguistic annotators.
Since sign language data sets are costly to collect and annotate, few are publicly available. Those which are tend to be constrained in content and are often recorded under laboratory conditions. However, in the UK, the British Broadcasting Corporation (BBC) regularly produces programmes with an inset signer and corresponding subtitles. This provides a natural signer, covering a wide range of topics, in real-world conditions. While it has no ground truth, it is proposed that the translated subtitles can provide weak labels for learning signs. The final contributions of this thesis lead to an innovative approach to learning signs from these co-occurring streams of data. Using a unique, temporally constrained version of the Apriori mining algorithm, similar sections of video are identified as possible sign locations. These estimates are improved upon by introducing the concept of contextual negatives, removing contextually similar noise. Combined with an iterative honing process to enhance the localisation of the target sign, 23 word/sign combinations are learnt from a 30-minute news broadcast, providing a novel method for automatic data set creation.
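As a rough illustration of how subtitles might act as weak labels for mining, the sketch below runs a plain level-wise (Apriori-style) search over video windows whose subtitle contains a target word, then discards feature sets that also occur in windows without that word. It is not the thesis's method: the temporal constraint is not modelled, the negative filter only loosely stands in for the idea of removing contextually similar noise, and the window format, feature names, thresholds and function names are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(bags, min_support):
    """Level-wise (Apriori-style) search: return {itemset: support} for every
    itemset contained in at least `min_support` of the given bags."""
    bags = [frozenset(b) for b in bags]

    def support(itemset):
        return sum(1 for b in bags if itemset <= b)

    # Level 1: single items that are frequent on their own.
    items = sorted({i for b in bags for i in b})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)
        k += 1
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset must already be frequent.
        current = [c for c in candidates
                   if all(frozenset(sub) in result
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support]
    return result


def mine_weakly_labelled(windows, word, min_support, max_negative_support):
    """`windows` is a list of (subtitle_text, feature_set) pairs (hypothetical
    format). Windows whose subtitle contains `word` are weak positives; the
    rest are negatives used to filter out non-specific feature sets."""
    positives, negatives = [], []
    for subtitle, feats in windows:
        target = positives if word in subtitle.lower().split() else negatives
        target.append(frozenset(feats))
    kept = {}
    for itemset, sup in frequent_itemsets(positives, min_support).items():
        neg = sum(1 for n in negatives if itemset <= n)
        if neg <= max_negative_support:
            kept[itemset] = sup
    return kept


# Toy data: feature ids "a".."e" are placeholders for quantised video features.
windows = [
    ("the weather today", {"a", "b", "c"}),
    ("weather warning issued", {"a", "b", "d"}),
    ("more weather later", {"a", "b", "e"}),
    ("sports news now", {"a", "e"}),
    ("election results", {"a", "c"}),
]
kept = mine_weakly_labelled(windows, "weather",
                            min_support=3, max_negative_support=0)
```

In this toy data, feature "a" appears in every window regardless of subtitle, so the negative filter drops it on its own, while sets containing "b" occur only alongside the target word and are kept as candidate sign evidence.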

Item Type: Thesis (UNSPECIFIED)
Authors :
Name | Email | ORCID
Cooper, HM | helen.cooper@surrey.ac.uk | UNSPECIFIED
Contributors :
Contribution | Name | Email | ORCID
thesis_supervisor | Bowden, R | r.bowden@surrey.ac.uk | UNSPECIFIED
Uncontrolled Keywords : Contextual Negatives, Temporally Constrained Apriori Mining, Data Mining, Weakly Supervised Learning, Viseme Detection, Boosting, Integral Volume, Volumetric Features, Sign Language Recognition
Depositing User : Symplectic Elements
Date Deposited : 17 May 2017 12:13
Last Modified : 17 May 2017 15:02
URI: http://epubs.surrey.ac.uk/id/eprint/834485


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800