Authors: | Stamatatos E. |
---|
Title: | Intrinsic Plagiarism Detection Using Character n-gram Profiles |
---|
Conference: | 3rd Int. Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN-09) |
---|
Editors: | |
---|
Ed: | No |
---|
Eds: | No |
---|
Pages: | |
---|
To appear: | No |
---|
Month: | |
---|
Year: | 2009 |
---|
Location: | |
---|
Publisher: | |
---|
Link: | |
---|
File name: | |
---|
Abstract: | The task of intrinsic plagiarism detection deals with cases where no reference corpus
is available and detection is based exclusively on stylistic changes or inconsistencies within a given
document. In this paper a new method is presented that attempts to quantify the style variation
within a document using character n-gram profiles and a style change function based on an
appropriate dissimilarity measure originally proposed for author identification. In addition, we
propose a set of heuristic rules that attempt to detect plagiarism-free documents and
plagiarized passages, as well as to reduce the effect of irrelevant style changes within a
document. The proposed approach is evaluated on the recently available corpus of the 1st Int.
Competition on Plagiarism Detection with promising results. |
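
The idea of a style change function built on character n-gram profiles can be illustrated with a minimal Python sketch. The abstract does not give the dissimilarity formula, the n-gram order, or the window settings, so everything concrete here is an assumption made for illustration: the function names, the choice of n = 3, the window length and step, and the particular normalized, asymmetric dissimilarity. The heuristic rules mentioned in the abstract are not specified there and are not modeled.

```python
from collections import Counter


def ngram_profile(text: str, n: int = 3) -> dict:
    """Return the normalized frequency of each character n-gram in text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}


def dissimilarity(window_profile: dict, doc_profile: dict) -> float:
    """Assumed dissimilarity in [0, 1], summed over the window's n-grams only.

    0 means the window matches the document's n-gram frequencies exactly;
    1 means none of the window's n-grams occur in the document profile.
    """
    if not window_profile:
        return 0.0
    total = 0.0
    for gram, fw in window_profile.items():
        fd = doc_profile.get(gram, 0.0)
        total += (2 * (fw - fd) / (fw + fd)) ** 2
    return total / (4 * len(window_profile))


def style_change(text: str, n: int = 3, window: int = 1000, step: int = 200) -> list:
    """Slide a window over text and score each position against the whole document.

    Returns a list of (start_offset, dissimilarity) pairs. The window and
    step sizes are illustrative defaults, not values taken from the abstract.
    """
    doc_profile = ngram_profile(text, n)
    scores = []
    last_start = max(len(text) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = text[start:start + window]
        scores.append((start, dissimilarity(ngram_profile(chunk, n), doc_profile)))
    return scores
```

Under these assumptions, windows whose scores stand well above the rest of the document would be the candidates for stylistically inconsistent passages; how such peaks are thresholded is left open here.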