A high-performance plagiarism detection system
Cooke, N, Gillam, L, Wrobel, P, Cooke, H and Al-Obaidli, F (2011) A high-performance plagiarism detection system. In: PAN at CLEF 2011, 2011-09-19 - 2011-09-22, Amsterdam.
Official URL: http://clef2011.org/index.php?page=pages/proceedin...
Abstract
In this paper we report on our high-performance plagiarism detection system, which processes the PAN plagiarism corpus for the external plagiarism detection task in relatively short timescales compared with previously reported state-of-the-art systems, while still producing reasonable detection performance (PAN11, 4th place, PlagDet=0.2467329, Recall=0.1500480, Precision=0.7106536, Granularity=1.0058894). At the core of our system is a simple method that avoids hash-type approaches; we are unable to disclose many details of it because a patent application is in progress. We optimised our performance using the PAN10 collection, and used the best parameters found for the final submission. We anticipated broadly similar performance at PAN11, modulo changes to the plagiarism cases, and 4th place this year put us between the participants who had placed 5th and 6th at PAN10.
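The four scores quoted in the abstract are related: PlagDet is a combination of precision, recall and granularity. The sketch below assumes the standard PAN evaluation definition (the harmonic mean of precision and recall, divided by log2(1 + granularity)), which this record does not itself state. The function name `plagdet` is illustrative and is not taken from the paper. Under that assumption, the reported precision, recall and granularity reproduce the reported PlagDet to within rounding.

```python
import math


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """Combine the three scores, assuming the standard PAN definition:
    F1(precision, recall) / log2(1 + granularity)."""
    if precision + recall == 0:
        return 0.0
    f1 = 2 * precision * recall / (precision + recall)
    # Granularity is 1 when every plagiarism case is detected exactly once,
    # so the divisor is log2(2) = 1 and the score reduces to plain F1.
    return f1 / math.log2(1 + granularity)


# Values quoted in the abstract (PAN11 submission).
score = plagdet(precision=0.7106536, recall=0.1500480, granularity=1.0058894)
print(f"PlagDet = {score:.7f}")
```

Because granularity here is very close to 1, the score is essentially the F1 of a high-precision, low-recall detector, which is why the overall figure sits much nearer the recall value than the precision value.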
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Divisions: | Faculty of Engineering and Physical Sciences > Computing Science |
| ID Code: | 124486 |
| Deposited By: | Symplectic Elements |
| Deposited On: | 23 Mar 2012 14:23 |
| Last Modified: | 24 Jan 2013 09:25 |