Testing of Support Tools for Plagiarism Detection
Testing of Support Tools for Plagiarism Detection

Version 1.0; 11th of February 2020

Tomáš Foltýnek*, Mendel University in Brno, Czechia and University of Wuppertal, Germany, [email protected]
Dita Dlabolová, Mendel University in Brno, Czechia, [email protected]
Alla Anohina-Naumeca, Riga Technical University, Latvia, [email protected]
Salim Razı, Canakkale Onsekiz Mart University, Turkey, [email protected]
Július Kravjar, Slovak Centre for Scientific and Technical Information, [email protected]
Laima Kamzola, Riga Technical University, Latvia, [email protected]
Jean Guerrero-Dib, Universidad de Monterrey, Mexico, [email protected]
Özgür Çelik, Balikesir University, Turkey, [email protected]
Debora Weber-Wulff, HTW Berlin, Germany, [email protected]

* Corresponding author

Abstract

There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a widespread expectation that it must be simple for software to determine whether a text is plagiarized or not. Software cannot determine plagiarism, but it can work as a support tool for identifying some text similarity that may constitute plagiarism. But how well do the various systems work? This paper reports on a collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected. The test was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents. A usability examination was also performed. The sobering results show that although some systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.

Keywords

text-matching software, software testing, plagiarism, plagiarism detection tools, usability testing

Declarations

Availability of data and materials

Data and materials used in this project are publicly available from http://www.academicintegrity.eu/wp/wg-testing/

Competing interests

Several authors are involved in the organization of the regular conferences Plagiarism across Europe and Beyond, which receive funding from Turnitin, Urkund, PlagScan and StrikePlagiarism.com. One team member received one of the “Turnitin Global Innovation Awards” in 2015. These facts did not influence the research in any phase.

Funding

This research did not receive any external funding. HTW Berlin provided funding for openly publishing the data and materials.

Authors' contributions

TF managed the project and performed the overall coverage evaluation. DD communicated with the companies that provide the systems. AAN and LK wrote the survey of related work. SR and ÖÇ wrote the discussion and conclusion. LK and DWW performed the usability evaluation. DWW designed the methodology, made all the phone calls, and improved the language of the final paper. All authors meticulously evaluated the similarity reports of the systems and contributed to the whole project. All authors read and approved the final manuscript. The contributions of others who are not authors are listed in the acknowledgements.
Acknowledgments

We are deeply indebted to the contributions made to this investigation by the following persons:

● Gökhan Koyuncu and Nil Duman from the Canakkale Onsekiz Mart University (Turkey) uploaded many of the test documents to the various systems;
● Jan Mudra from Mendel University in Brno (Czechia) contributed to the usability testing and performed the testing of the Czech language set;
● Caitlin Lim from the University of Konstanz (Germany) contributed to the literature review;
● Pavel Turčínek from Mendel University in Brno (Czechia) prepared the Czech language set;
● Esra Şimşek from the Canakkale Onsekiz Mart University (Turkey) helped in preparing the English language set;
● Maira Chiera from the University of Calabria (Italy) prepared the Italian language set;
● Styliani Kleanthous Loizou from the University of Nicosia (Cyprus) contributed to the methodology design.

We wish to especially thank the software companies that provided us access to their systems free of charge and patiently extended our access as the testing took much more time than originally anticipated.

We also wish to thank the companies that sent us feedback on an earlier version of this report. We are not able to respond to every issue raised, but are grateful for them pointing out areas that were not clear.

1. Introduction

Teddi Fishman, former director of the International Centre for Academic Integrity, has proposed the following definition of plagiarism: “Plagiarism occurs when someone uses words, ideas, or work products, attributable to another identifiable person or source, without attributing the work to the source from which it was obtained, in a situation in which there is a legitimate expectation of original authorship, in order to obtain some benefit, credit, or gain which need not be monetary” (Fishman, 2009, p. 5).

Plagiarism constitutes a severe form of academic misconduct. In research, plagiarism is one of the three “cardinal sins”, FFP: fabrication, falsification, and plagiarism. According to Bouter, Tijdink, Axelsen, Martinson, & ter Riet (2016), plagiarism is one of the most frequent forms of research misconduct. Plagiarism constitutes a threat to the educational process because students may receive credit for someone else’s work or complete courses without actually achieving the desired learning outcomes. Similar to the student situation, academics may be rewarded for work which is not their own. Plagiarism may also distort meta-studies, which draw conclusions based on the number or percentage of papers that confirm or refute a certain phenomenon. If these papers are plagiarized, then the number of actual experiments is lower and the conclusions of the meta-study may be incorrect.

There can also be other serious consequences for the plagiarist. The cases of politicians who had to resign in the aftermath of a publicly documented plagiarism case are well known, not only in Germany (Weber-Wulff, 2014) and Romania (Abbott, 2012), but also in other countries. Scandals involving such high-profile persons undermine citizens’ confidence in democratic institutions and trust in academia (Tudoroiu, 2017). Thus, it is of great interest to academic institutions to invest effort both in plagiarism prevention and in its detection.

Foltýnek, Meuschke, & Gipp (2019) identify three important concerns in addressing plagiarism:

1. Similarity detection methods that, for a given suspicious document, are expected to identify possible source document(s) in a (large) repository (a minimal sketch of this kind of matching follows the list);
2. Text-matching systems that maintain a database of potential sources, employ various detection methods, and provide an interface to users;
3. Plagiarism policies that are used for defining institutional rules and processes to prevent plagiarism or to handle cases that have been identified.
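To make the first of these concerns concrete, here is a minimal sketch of the kind of word n-gram overlap that similarity detection methods build on. This is our own illustration under simplifying assumptions, not the algorithm of any system tested in this paper; the n-gram length, the Jaccard score, and all names in the code are ours.

```python
# A minimal sketch (our illustration, not any tested system's algorithm):
# ranking candidate sources by word n-gram overlap. The n-gram length and
# the choice of Jaccard similarity are simplifying assumptions.

def ngrams(text: str, n: int = 5) -> set:
    """Return the set of lower-cased word n-grams of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: 0.0 = no shared n-grams, 1.0 = identical sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(suspicious: str, repository: dict) -> list:
    """Score every repository document against the suspicious document."""
    query = ngrams(suspicious)
    scores = [(doc_id, jaccard(query, ngrams(text)))
              for doc_id, text in repository.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    repo = {
        "source-1": "the quick brown fox jumps over the lazy dog near the river bank",
        "source-2": "completely unrelated text about testing support tools for detection",
    }
    suspect = "a quick brown fox jumps over the lazy dog near the river"
    for doc_id, score in rank_candidates(suspect, repo):
        print(f"{doc_id}: {score:.2f}")  # source-1 scores high, source-2 scores 0.00
```

Real systems differ from this sketch mainly in scale and robustness: they index very large repositories and must deal with paraphrased or translated text, which plain n-gram overlap does not catch.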
This paper focuses on the second concern. Users and policymakers expect what they call plagiarism detection software, but which would more accurately be called text-matching software, to use state-of-the-art similarity detection methods. The expected output is a report in which all the passages that are identical or similar to other documents are highlighted, together with links to and information about the potential sources. To determine how the source was changed and whether a particular case constitutes plagiarism or not, an evaluation by a human being is always needed, as there are many inconclusive or problematic results reported. The output of such a system is often used as evidence in a disciplinary procedure. Therefore, both the clarity of the report and the trustworthiness of its content are important for the efficiency and effectiveness of institutional processes.

There are dozens of such systems available on the market, both free and paid services. Some can be used online, while others need to be downloaded and used locally. Academics around the globe are naturally interested in the question: How far can these systems go in detecting text similarities, and to what extent are they successful? In this study, we look at state-of-the-art text-matching software with a focus on non-English languages and provide a comparison based on specific criteria, following a systematic methodology.

The investigation was conducted by nine members of the European Network for Academic Integrity (ENAI) in the working group TeSToP, Testing of Support Tools for Plagiarism Detection. There was no external funding available; the access to the various systems was provided to the research group free of charge by the companies marketing the support tools.

The paper is organized as follows. Section 2 provides a detailed survey of related work. Section 3 specifies the methodology used to carry out the research. Section 4 describes the systems used in the research. Section 5 reports the results acquired. The discussion and conclusions are given at the end of the paper.

2. Survey of Related Work

Since the beginning of this century, considerable attention has been paid not only to the problem of plagiarism, but also to text-matching software that is widely used to help find potentially plagiarized fragments in a text. There are plenty of scientific papers that postulate in their titles that they offer a classification, a comparative study, an overview, a