Open-Source Tools and Benchmarks for Code-Clone Detection: Past, Present, and Future Trends


Andrew Walker, Tomas Cerny, Eungee Song
Computer Science, ECS, Baylor University
One Bear Place #97141, Waco, TX 76798

Copyright is held by the authors. This work is based on an earlier work: RACS'19 Proceedings of the 2019 ACM Research in Adaptive and Convergent Systems, Copyright 2019 ACM 978-1-4503-6843-8. http://doi.acm.org/10.1145/3338840.3355654

APPLIED COMPUTING REVIEW, DEC. 2019, VOL. 19, NO. 4

ABSTRACT
A fragment of source code that is identical or similar to another is a code-clone. Code-clones make it difficult to maintain applications as they create multiple points within the code where bugs must be fixed, new rules enforced, or design decisions imposed. As applications grow larger and larger, the pervasiveness of code-clones likewise grows. To face the code-clone related issues, many tools and algorithms have been proposed to find and document code-clones within an application. In this paper, we present the historical trends in code-clone detection tools to show how we arrived at the current implementations. We then present our results from a systematic mapping study on current (2009-2019) code-clone detection tools with regards to technique, open-source nature, and language coverage. Lastly, we propose future directions for code-clone detection tools. This paper provides the essentials to understanding the code-clone detection process and the current state-of-the-art solutions.

CCS Concepts
• Software and its engineering → Formal software verification; Software maintenance tools; Software verification and validation; Parsers;

Keywords
Code Clone, Clone Detection, Mapping Study, Survey

1. INTRODUCTION
Code-clone detection is the process of finding exact or similar pieces of code, known as code clones, within an application. Code-clones are introduced in a multitude of ways, one of the most significant being code reuse by developers. This involves a developer copying either pre-existing fragments, coding style, or both. Another way is through repeated computation using duplicated functions with slight changes and variations in the variables or data structures used. This is also done for the purposes of enhancements or customization [10]. These changes, so often used for modification and performance enhancement of a software system, lead to code clones. Despite some initial surface-level benefits of code-clones, they in fact make the source files very hard to modify consistently. For instance, consider a software system that has several cloned subsystems created by duplication with slight modification. When a fault is found in one subsystem, caution must be used to modify all other subsystems [44] or risk the persistence of the bug into deployment. If the existence of clones has been documented and maintained properly, the modification would be relatively easy; however, keeping all clone information is a generally expensive process, especially for a large and complex system.

In this paper, we provide the roadmap to existing research and state-of-the-art code clone detection. We searched the IEEE Xplore, ACM Digital Library, and SpringerLink indexers to identify existing work since 2009. From 3,056 found papers, we recognized and reviewed 67 papers that provide tools and benchmarks that could be used to detect code-clones. We identify and classify the techniques used by modern clone detection tools. We also observe and identify open-source tools and provide references. Finally, we determine the coverage by modern tools across existing programming languages. From our assessment, we compile future trends. The reader of this paper will understand recent research in this area with respect to tools and benchmarks and thus be able to work on top of existing artifacts instead of reinventing the wheel.

The rest of the paper is organized as follows. Section 2 presents the background on code clones and clone types, as well as an overview of the code-clone detection process. Section 3 presents our process and results from a mapping study on current trends in code-clone detection tools. Section 4 discusses future trends for code-clone detection tools, as found through our comprehensive study. This is followed by threats to validity and our conclusion.

2. BACKGROUND
In this section, we present a background on code-clones and their detection. A large issue within the field of code-clone detection is the definition of what constitutes a code-clone. Firstly, we cover the generally accepted types of code-clones. We then present a summary of historical trends in code-clone detection. While this section does not fully cover every tool, we believe the tools we cover give a good representation of the historical trends. Lastly, we cover the benchmarks that are used to test code-clone detection tools.

2.1 Basic Definitions
Throughout this paper, we use the following well-accepted basic definitions from the previous surveys on code clones [11, 66, 75]:

Code Fragment: A continuous segment of the source code, specified by (l, s, e), including the source file l, the line the fragment starts on, s, and the line it ends on, e.

Clone Pair: A pair of code fragments that are similar, specified by (f1, f2, φ), including the similar code fragments f1 and f2, and their clone type φ.

Clone Class: A set of code fragments that are similar, specified by the tuple (f1, f2, ..., fn, φ). Each pair of distinct fragments is a clone pair: (fi, fj, φ), i, j ∈ 1...n, i ≠ j.

Code Block: A sequence of code statements within braces.

2.2 Types of Clones
The high-level recognized clone classification is broken into two categories: syntactic and semantic clones. Syntactic clones are two code fragments that are similar based on their text [67, 3], while semantic clones are two code fragments that are similar based on their functions [21]. From a more detailed perspective, there are four types of code clones, where the first three types fit the syntactic clone category and the fourth one fits the semantic clone category.

Type-1: A type-1 code-clone is one in which the two fragments are exactly identical. However, the two code fragments do not need to be precisely the same with regards to whitespace, blanks, and comments, as these are generally removed for the code-clone detection process.

Type-2: A type-2 code-clone is one in which two code fragments are similar except for the renaming of some unique identifiers such as function/class names and variable identifiers. In a seminal paper on type-2 clones, Baker identifies the replacement of these unique identifiers as "parameterizing" the code fragment [8, 6].

Type-3: A type-3 code-clone is essentially a type-2 code-clone; however, the fragments may be modified. This includes adding and removing portions of the code from the two fragments or reordering statements within a code block.

[...]

[...] source code that have no bearing on the comparison process. Second, it transforms source code into units by dividing it into separate fragments such as classes, functions, begin-end blocks, or statements. This is done in a variety of ways, including lexical or Abstract Syntax Tree (AST) analysis. These units are used to check for the existence of direct code-clone relations. Last, this process will define the comparison units. For instance, source units can be divided into tokens.

Transformation: This process transforms the source code into a corresponding Intermediate Representation (IR) for comparison. There are many types of representations that can be constructed from the source code, such as token streams, in which each line of source code is converted into a sequence of tokens. Another common construct is the AST, in which all of the parsed source code is transformed into an abstract syntax tree or parse tree for sub-tree comparisons. Additionally, source code can be extracted into a Program Dependency Graph (PDG), which is used to represent control and data dependencies. A PDG is usually built from the source code using semantics-aware techniques, for sub-graph comparison.

Match Detection: This process compares every transformed fragment of code to all other fragments to find similar source code fragments. The output is a set of similar code fragments, either as a clone pair list such as (c1, c2) or as a set of combined clone pairs in one class or one group such as (c1, c2, c3).

2.4 Historical Trends
In the next section of this paper, we will present our findings from a mapping study on modern code-clone detection tools. This study is limited to papers within the past decade (2009-2019) to focus exclusively on modern trends. However, there were many established, well-known tools before 2009 that are worth discussing. We discuss those tools briefly below, although they fall outside the scope of our mapping study.

Some of the earliest and most seminal work on code-clone detection was done in the early 1990s. In 1992, Baker proposed a code-clone detection tool [5] that was based on line-by-line comparison. For the purposes of comparison, whitespace and comments were removed from source-code files. In 1995, this algorithm was updated into a tool called Dup [7], which used the idea of "parameterization" to allow the discovery of type-1 and type-2 clones. To "parameterize" the source code, all unique identifiers (e.g., variable names, [...]
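The basic definitions of Section 2.1 (code fragment, clone pair, clone class) map naturally onto small record types. The following is a minimal sketch rather than anything taken from the paper: the class names, field names, file names, and the helper `pairs_of_class` are all hypothetical, and clone pairs are treated as unordered.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class CodeFragment:
    """A fragment (l, s, e): source file, start line, end line."""
    file: str
    start: int
    end: int


@dataclass(frozen=True)
class ClonePair:
    """A clone pair (f1, f2, phi): two similar fragments and their clone type."""
    f1: CodeFragment
    f2: CodeFragment
    clone_type: int  # 1 to 4, following the Type-1 ... Type-4 classification


def pairs_of_class(fragments, clone_type):
    """Expand a clone class (f1, ..., fn, phi) into its clone pairs (fi, fj, phi), i != j."""
    return [ClonePair(a, b, clone_type) for a, b in combinations(fragments, 2)]


# Hypothetical clone class with three fragments of type 2.
clone_class = [
    CodeFragment("billing.py", 10, 24),
    CodeFragment("invoice.py", 40, 54),
    CodeFragment("report.py", 5, 19),
]
pairs = pairs_of_class(clone_class, clone_type=2)
print(len(pairs))  # 3 unordered pairs for n = 3
```

A clone class of n fragments therefore stands for n(n-1)/2 unordered clone pairs, which is why tools often report classes rather than pair lists for large systems.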
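The pre-processing, transformation, and match-detection steps described in the background section, together with the idea of "parameterizing" identifiers, can be sketched on a toy C-like language. This is an illustrative assumption-laden sketch, not Baker's Dup algorithm or any surveyed tool: the keyword set, the regular expressions, the positional placeholders `P0, P1, ...`, and the function names are all hypothetical, only `//` comments are handled, and only consistently renamed identifiers are recognized as type-2 clones.

```python
import re
from itertools import combinations

# Assumed toy keyword set; everything else that looks like a name is an identifier.
KEYWORDS = {"int", "return", "if", "else", "for", "while"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def preprocess(source):
    """Pre-processing: drop // comments, surrounding whitespace, and blank lines."""
    lines = (re.sub(r"//.*", "", line).strip() for line in source.splitlines())
    return [line for line in lines if line]


def transform(lines, parameterize):
    """Transformation: turn the lines into one token stream.

    With parameterize=True, each identifier is replaced by a positional
    placeholder (P0, P1, ...), so consistent renaming leaves the stream unchanged.
    """
    names = {}
    out = []
    for line in lines:
        for tok in TOKEN.findall(line):
            is_ident = re.fullmatch(r"[A-Za-z_]\w*", tok) and tok not in KEYWORDS
            if parameterize and is_ident:
                tok = names.setdefault(tok, f"P{len(names)}")
            out.append(tok)
    return tuple(out)


def detect(fragments):
    """Match detection: compare every fragment with every other fragment."""
    clones = []
    for (n1, s1), (n2, s2) in combinations(fragments.items(), 2):
        l1, l2 = preprocess(s1), preprocess(s2)
        if transform(l1, False) == transform(l2, False):
            clones.append((n1, n2, 1))  # identical up to layout and comments
        elif transform(l1, True) == transform(l2, True):
            clones.append((n1, n2, 2))  # identical up to identifier renaming
    return clones


fragments = {
    "a": "int sum(int x, int y) {\n  return x + y;  // add\n}",
    "b": "int sum(int x, int y) {\n\n  return x + y;\n}",
    "c": "int add(int p, int q) {\n  return p + q;\n}",
}
print(detect(fragments))
# [('a', 'b', 1), ('a', 'c', 2), ('b', 'c', 2)]
```

The all-pairs comparison in `detect` is quadratic in the number of fragments; it is used here only because it mirrors the description of match detection directly, and a practical tool would typically hash or index the transformed fragments instead.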