University of Surrey


Computational approaches for verbal deception detection.

Vartapetiance, Anna (2015) Computational approaches for verbal deception detection. Doctoral thesis, University of Surrey.

Text (Thesis)
Vartapetiance-2014-PhD-thesis-submitted.pdf - Thesis (version of record)
Restricted to Repository staff only until 30 January 2017.
Available under License Creative Commons Attribution Non-commercial Share Alike.

Text (Embargo to restrict access to this thesis)
Anna Vartapetiance Thesis embargo 18Dec2014.pdf - Other
Available under License Creative Commons Attribution Non-commercial Share Alike.

Text (Deposit Agreement)
Vartapetiance-2014_Author_Deposit_Agreement.docx - Other
Available under License Creative Commons Attribution Non-commercial Share Alike.



Deception exists in all aspects of life and is particularly evident on the Web. Deception includes child sexual predators grooming victims online, medical news headlines with little medical evidence or scientific rigour, individuals claiming others’ work as their own, and systematic deception of company shareholders and institutional investors leading to corporate collapses. This thesis explores the potential for automatic detection of deception. We investigate the nature of deception and the related cues, focusing in particular on Verbal Cues, and concluding that they cannot be readily generalised. We demonstrate how deception-specific features, based on sound hypotheses, can overcome related limitations by presenting approaches for three different examples of deception – namely Child Sexual Predator Detection (SPD), Authorship Identification (AI) and Intrinsic Plagiarism Detection (IPD). We further show how our approaches result in competitive levels of reliability. For SPD we develop our approach largely based on the commonality of requests for key personal information. To address AI, we introduce approaches based on a frequency-mean-variance and a frequency-only framework in order to detect strong associations between co-occurring patterns of a limited number of stopwords. Our IPD approaches are based on simple commonality of words at document level and usage of proper nouns; document sections lacking commonality can be identified as plagiarised. The frameworks of the International Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN) competitions provided an independent evaluation of the approaches. The SPD approach obtained an F1 score of 0.48. F1 scores of 0.47, 0.53 and 0.57 were achieved in AI tasks for PAN2012, 2013 and 2014 respectively. IPD yielded an overall accuracy of 91%. 
Through post-competition adaptations, we also show how the approaches and their scores can be improved. We further demonstrate the importance of suitable datasets and show that most approaches are not easily transferable between different types of deception.
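
As an illustration only, the idea of flagging document sections that lack word commonality with the rest of the document might be sketched as follows. This is not the thesis's implementation: the function names, the `top_k` vocabulary size, and the `threshold` value are hypothetical choices made for this sketch, and the proper-noun component mentioned in the abstract is omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def commonality_scores(sections, top_k=50):
    """Score each section by the share of its tokens that belong to the
    top_k most frequent words of the whole document (assumed measure)."""
    doc_counts = Counter(tok for s in sections for tok in tokenize(s))
    common = {word for word, _ in doc_counts.most_common(top_k)}
    scores = []
    for section in sections:
        tokens = tokenize(section)
        if not tokens:
            scores.append(0.0)
            continue
        scores.append(sum(t in common for t in tokens) / len(tokens))
    return scores


def flag_sections(sections, top_k=50, threshold=0.5):
    """Return indices of sections whose commonality falls below threshold.
    The threshold is an arbitrary value chosen for this sketch."""
    scores = commonality_scores(sections, top_k)
    return [i for i, score in enumerate(scores) if score < threshold]
```

A section written in a markedly different vocabulary receives a low score and is returned by `flag_sections` as a candidate for closer inspection; on real documents the vocabulary size and threshold would need tuning against a suitable dataset.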

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Vartapetiance, Anna
Date : 30 January 2015
Funders : Department of Computing, University of Surrey
Contributors :
Thesis supervisor: Gillam,
Uncontrolled Keywords : Deception Detection, Natural Language Processing, Child Sexual Predator Detection, Authorship Identification, Intrinsic Plagiarism Detection
Depositing User : Anna Vartapetiance Salmasi
Date Deposited : 13 Feb 2015 09:57
Last Modified : 05 Jan 2016 02:08


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800