Quite Simple Approaches for Authorship Attribution, Intrinsic Plagiarism Detection and Sexual Predator Identification
Gillam, L and Vartapetiance, A (2012) Quite Simple Approaches for Authorship Attribution, Intrinsic Plagiarism Detection and Sexual Predator Identification. In: 4th PAN workshop; CLEF 2012, 2012-09-17 - 2012-09-20, Rome, Italy.
Available under License : See the attached licence file.
Tasks such as Authorship Attribution, Intrinsic Plagiarism Detection and Sexual Predator Identification are representative of attempts to deceive. In the first two, authors try to convince others that the presented work is theirs, and in the third there is an attempt to convince readers to take actions based on false beliefs or ill-perceived risks. In this paper, we discuss our approaches to these tasks in the Author Identification track at PAN2012, which represents our first proper attempt at any of them. Our initial intention was to determine whether cues of deception, documented in the literature, might be relevant to such tasks. However, it quickly became apparent that such cues would not be readily useful, and we discuss the results achieved using some simple but relatively novel approaches: for the Traditional Authorship Attribution task, we show how a mean-variance framework using just 10 stopwords detects 42.8% and could obtain 52.12% using fewer; for Intrinsic Plagiarism Detection, frequent words achieved 91.1% overall; and for Sexual Predator Identification, we used just a few features covering requests for personal information, with mixed results.
|Item Type:||Conference or Workshop Item (Conference Paper)|
|Divisions:||Faculty of Engineering and Physical Sciences > Computing Science|
|Date:||20 September 2012|
|Additional Information:||The original publication is available at http://www.springerlink.com|
|Depositing User:||Symplectic Elements|
|Date Deposited:||12 Jun 2013 14:32|
|Last Modified:||23 Sep 2013 20:05|