A high-performance plagiarism detection system
Cooke, N, Gillam, L, Wrobel, P, Cooke, H and Al-Obaidli, F (2011) A high-performance plagiarism detection system. In: PAN at CLEF 2011, 2011-09-19 - 2011-09-22, Amsterdam.
Available under License: See the attached licence file.
In this paper we report on our high-performance plagiarism detection system, which processes the PAN plagiarism corpus for the external plagiarism detection task in relatively short timescales compared with previously reported state-of-the-art systems, while still producing a reasonable level of detection performance (PAN 11, 4th place: PlagDet=0.2467329, Recall=0.1500480, Precision=0.7106536, Granularity=1.0058894). At the core of our system is a simple method that avoids hash-type approaches; we are unable to disclose many details of it because a patent application is in progress. We optimised performance using the PAN 10 collection and used the best parameters found there for the final submission. We anticipated broadly similar performance at PAN 11, modulo changes to the plagiarism cases, and 4th place this year put us between the participants who had been 5th and 6th in PAN 10.
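The four figures quoted above are related: in the PAN evaluation framework, PlagDet is generally defined as the F1 score (the harmonic mean of precision and recall) divided by log2(1 + granularity). The following minimal sketch illustrates that relationship using the reported values. It is not part of the authors' system, and the function names `f1_score` and `plagdet` are hypothetical names chosen for this illustration.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """PlagDet as commonly defined for PAN: F1 / log2(1 + granularity).

    Granularity is at least 1; a value of exactly 1 means each plagiarism
    case was reported as a single detection, so PlagDet equals F1.
    """
    if granularity < 1:
        raise ValueError("granularity must be >= 1")
    return f1_score(precision, recall) / math.log2(1 + granularity)


# Values reported in the abstract for the PAN 11 submission.
score = plagdet(precision=0.7106536, recall=0.1500480, granularity=1.0058894)
print(f"PlagDet = {score:.7f}")
```

Under this definition, the reported precision, recall and granularity reproduce the reported PlagDet score to within rounding. The score is held down mainly by the low recall, since the granularity close to 1 contributes only a small penalty.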
|Item Type:|Conference or Workshop Item (Paper)|
|Divisions:|Faculty of Engineering and Physical Sciences > Computing Science|
|Depositing User:|Symplectic Elements|
|Date Deposited:|23 Mar 2012 14:23|
|Last Modified:|23 Sep 2013 19:02|