A Quantitative Evaluation of Turnitin from an L2 Science and Engineering Perspective
CALL-EJ, 17(1), 1-18

Kathryn Oghigian
Waseda University, Japan

Michael Rayner
Waseda University, Japan

Kiyomi Chujo
Nihon University, Japan

Abstract

The purpose of this paper is to investigate the functionality and accuracy of Turnitin results as applied to 68 science and engineering research papers, and the potential use of the software in a second language context. Results showed Turnitin found “similar matching” in 99% of papers; however, an analysis eliminating false positives and categorizing actual plagiarism events as outright, paraphrase and patchwork plagiarism, or stealing an apt term showed only 29% featured plagiarized material, and in most cases, evidence suggested no intent to deceive. Findings indicate that Turnitin can be useful, particularly as a pedagogical rather than policing tool, but “similarity” percentages can be misleading and careful user evaluation of the entire paper shown with flagged highlighting is necessary in order to fairly assess student intent.

Keywords: plagiarism detection, originality, second language learning, Turnitin

Introduction

From a review of the huge number of studies in the literature, it is immediately apparent that plagiarism and the use of plagiarism detection software are challenging issues. A great deal of research has been done on plagiarism (for an excellent overview, see Flowerdew & Li, 2007:162); however, it is also clear that there is no agreed upon definition or nomenclature, particularly when applied to the reproduction of text, as opposed to ideas (see Pecorari, 2001). Suggested terms include textual plagiarism (Flowerdew & Li, 2007), textual borrowing (Shi, 2006), textual and prototypical plagiarism (Pecorari, 2001), and plagiphrasing (Whitaker, 1993).
Other researchers have suggested that plagiarism is a Western standard that dates back to the appearance of the printing press in the Middle Ages (Flowerdew & Li, 2007) and is not applicable in other cultures, particularly those in Asia and the Middle East (Click, 2012; Sowden, 2005).

The use of plagiarism software for student writing is also an issue. Turnitin (http://turnitin.com/), arguably the most commonly used plagiarism detection software, has been banned at several universities, most notably Yale, Harvard, Princeton (The Daily Princetonian, 2006; Hanrahan, 2008; Bretag & Mahmud, 2009), and others, and its mandatory use has been legally challenged at other academic institutions (CBC, 2004; The McGill Daily, 2005).

Second Language Learning and Plagiarism

In a second language (L2) context, there seem to be several important issues concerning plagiarism. These include whether students fully grasp the concepts of similarity (i.e., how strings of words can be matched as “similar” to words in a database by software), originality and plagiarism; what role cultural expectations may play; and whether the practice of plagiarism may simply be a developmental issue that disappears as students increase their productive vocabulary and learn how to take better notes, summarize, paraphrase and quote sources.

In their detailed qualitative examination of Turnitin through student questionnaire responses on its use, Bensal, Miraflores and Tan (2013) explored the first of these issues and concluded that the students in their study tended to miss the point of originality; in other words, they focused their efforts on “rephrasing to evade high percentages of Similarity Index ratings more than trying to explain the real concepts in the paper guided by a real sense of academic integrity” (p. 17).
Some international students may not intend to deceive, but rather engage in various forms of plagiarism because they believe memorization and recitation to be acceptable and valuable, or they may have difficulty understanding that someone can own an idea (Sutherland-Smith, 2005; Trudeau & Sevier, undated). Shi (2006) investigated plagiarism (“textual appropriation”) and found that students did not understand what needed citing and what did not, and that non-Western students viewed the idea of plagiarism as “foreign and unacceptable” (p. 264). Jones and Freeman (2003) describe a “benign learning process” in which students learn to copy format, segments, and phrases, resulting in the perception of copying as a valid form of learning. L2 students are often given “sentence templates,” or taught to search out clusters in corpora to incorporate into writing (Hunston, 2002), and the introduction of corpus pedagogy to find and use common expressions or formulaic writing may contribute to learner uncertainty, i.e., confusion between what is common usage and allowable, and what is unique and not allowable. Flowerdew and Li (2007) summarize the use of plagiarism as a student survival strategy based on the belief that some copying is acceptable to combat task overload and pressure to pass assessments, and to compensate for a lack of confidence in using the target language (p. 169).

These studies offer a glimpse into why students might plagiarize; however, few studies have evaluated the functionality and accuracy of Turnitin in an L2 context. Stapleton (2012) evaluated whether or not Turnitin was a deterrent for plagiarizing in a study on writing produced by L1 graduate students. He found that students who were aware their work would be checked had lower percentages, but also noted that the software was not necessarily accurate and should be used with caution.
Walker (2010) used Turnitin to assess the frequency, nature and extent of plagiarism in university business student writing, but did not assess its accuracy or provide suggestions for more effective use.

Purpose of the Study

Turnitin was recently adopted by a Japanese university and made available to English academic writing teachers (two of the authors). The purpose of this paper is to investigate how the software could be best employed in an L2 science and engineering context by evaluating the functionality and accuracy of the program and by suggesting modifications that could make it a more effective tool for L2 students and teachers.

In this study, two important assumptions are made. First, it is recognized that Turnitin was not created as a teaching tool for L2 writing students. This investigation is not a criticism of the software but rather an exploration of how it functions in an L2 environment and how it might be adapted to be more effective. Second, Turnitin was originally created and marketed as plagiarism detection software (Barrie & Presti, 2000), and in the perceptions of many users remains so (Shi, 2006; Stapleton, 2012; Vie, undated). Turnitin provides an initial percentage of “similar” or “matching” text and flags questionable (“similar”) passages; this study is an evaluation of this percentage and of specific flagged strings of words in order to more clearly understand whether or not these results can, in fact, be viewed as plagiarism.

Method

Operational Definitions

Through its algorithms, Turnitin is only able to identify language in the form of sequential words or strings that match against its database; plagiarism in the sense of using another’s ideas without citation is not measured or evaluated. Thus an evaluation of the software is only possible in terms of how effectively it identifies the extent to which students use phrasing identical to that of source texts or other students’ work.
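To make the distinction between matching strings and borrowed ideas concrete, the following is a minimal sketch of word-sequence matching. It is not Turnitin's actual algorithm, which is not described here; the function names (`tokenize`, `word_ngrams`, `similarity_percentage`), the three-word window, and the way the percentage is computed are all assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_ngrams(words, n):
    """Return the set of consecutive n-word sequences in a token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity_percentage(submission, sources, n=3):
    """Percentage of submission words that fall inside an n-word
    sequence also found in at least one source text.

    Assumption: n=3 is an arbitrary illustrative window size.
    """
    words = tokenize(submission)
    if len(words) < n:
        return 0.0

    # Pool every n-word sequence that occurs in any source.
    source_ngrams = set()
    for source in sources:
        source_ngrams |= word_ngrams(tokenize(source), n)

    # Flag each submission word covered by a matching sequence.
    flagged = [False] * len(words)
    for i in range(len(words) - n + 1):
        if tuple(words[i:i + n]) in source_ngrams:
            for j in range(i, i + n):
                flagged[j] = True

    return 100.0 * sum(flagged) / len(words)


# A formulaic phrase shared with a source is flagged ...
formulaic = "The results are shown in Figure 1 and we propose a new method"
print(similarity_percentage(formulaic, ["The results are shown in Figure 1."]))

# ... while a reworded idea with no shared three-word sequence is not.
reworded = "Our approach improves battery life"
print(similarity_percentage(
    reworded, ["The proposed technique extends how long the cell lasts"]))
```

Under these assumptions the first call reports about 53.8 (7 of 13 words flagged) even though only a stock phrase is shared, and the second reports 0.0 even if the idea itself were taken from the source. A string-matching percentage of this kind measures overlap in wording, not the use of ideas.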
Participants

Research papers written by 68 third- and fourth-year undergraduate students as the final task of an academic writing course were screened for plagiarism using Turnitin. The students belonged to various departments in the faculty of science and engineering. The students had a wide range of English language abilities based on teacher observation, and self-reported Test of English for International Communication (TOEIC) scores ranging from 390 to 950. They were enrolled in a weekly 90-minute, one-semester (14-week) elective writing course with a goal of producing a 2,000-word research paper on a topic of their choice in their technical fields, formatted in IEEE style (http://www.ieee.org/documents/stylemanual.pdf). Students were given instruction in research writing, including how to cite and reference sources and how to take notes in their own words and quote, paraphrase and summarize. University policy governing plagiarism was provided both in the L1 (Japanese) and L2 (English), and students were repeatedly reminded “not to copy” because “plagiarized papers will fail.” Participants signed a consent form to allow their work to be used for research purposes.

Procedure

The university provides Turnitin for teachers’ use, and the interface is found on the university network. To access Turnitin, a teacher creates a folder into which each student submits a paper. A settings option for Turnitin is available to the teacher so that all papers will be automatically checked when uploaded by the students. A percentage denoting the “Plagiarism Detection Results” and a coloured box representing the scale of the problem appear next to each student’s name once the paper completes its run through the program. In Figure 1, the first result shows 28% with a yellow code; subsequent results are shown with various percentages and green codes. The figure is truncated to conceal identifying information about the students.

Figure 1.
A partial class list with colour-coded “Plagiarism Detection Results”.

Functionality

For the first analysis, the original colour-coded “plagiarism detection” percentage shown in the folder class list was noted for each student paper. Turnitin filters allow users to identify and exclude quotes and bibliography from the “Plagiarism Detection Results” percentage, and these filters were then activated so that quotes and bibliography [hereafter referred to as references] would be excluded. The new percentage was noted. (This had to be done manually for each paper because of this particular system set-up.)

Accuracy

By clicking on the underlined percentage next to “plagiarism detection results” in the class list, a document viewer is loaded.