A high-performance plagiarism detection system
Cooke, N, Gillam, L, Wrobel, P, Cooke, H and Al-Obaidli, F (2011) A high-performance plagiarism detection system. In: PAN at CLEF 2011, 2011-09-19 - 2011-09-22, Amsterdam.
Available under licence: see the attached licence file.
Official URL: http://clef2011.org/index.php?page=pages/proceedin...
In this paper we report on our high-performance plagiarism detection system, which processes the PAN plagiarism corpus for the external plagiarism detection task in relatively short timescales compared with the previously reported state of the art, while still achieving a reasonable level of detection performance (PAN 11: 4th place, PlagDet = 0.2467329, Recall = 0.1500480, Precision = 0.7106536, Granularity = 1.0058894). At the core of our system is a simple method that avoids hash-type approaches; we are unable to disclose many of its details because a patent application is in progress. We optimised performance on the PAN 10 collection and used the best parameters found there for the final submission. We anticipated broadly similar performance at PAN 11, modulo changes to the plagiarism cases, and 4th place this year put us between participants who had been 5th and 6th at PAN 10.
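The four scores quoted above are related: assuming the standard PAN definition of the overall measure (the harmonic mean of precision and recall, divided by log2(1 + granularity)), the reported PlagDet can be recomputed from the other three values. The sketch below is only an illustration of that arithmetic; the function name `plagdet` is hypothetical, and nothing here reflects the undisclosed detection method itself.

```python
import math


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """Overall PAN score, assuming plagdet = F1 / log2(1 + granularity)."""
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    # Granularity of 1.0 (each case detected exactly once) leaves F1 unchanged.
    return f1 / math.log2(1 + granularity)


if __name__ == "__main__":
    # Values quoted in the abstract for the PAN 11 submission.
    score = plagdet(precision=0.7106536, recall=0.1500480, granularity=1.0058894)
    print(f"{score:.7f}")  # agrees with the reported 0.2467329 to about 1e-7
```

Because granularity is very close to 1, the score is almost entirely the F1 value, which is in turn dominated by the low recall rather than the high precision.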
Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Engineering and Physical Sciences > Computing Science
Deposited By: Symplectic Elements
Deposited On: 23 Mar 2012 14:23
Last Modified: 24 Jan 2013 09:25