Design and Implementation of Semantic Patch Support for the Spoon Java Transformation Engine


Degree project in Computer Science and Engineering, second cycle, 30 credits
Stockholm, Sweden 2021

Mikael Forsberg
KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science

Master in Computer Science
Date: January 26, 2021
Supervisor: Nicolas Yves Maurice Harrand
Examiner: Martin Monperrus
Swedish title: Design och implementering av stöd för semantiska patchar för Javatransformeringsmotorn Spoon

Abstract

Software development is more often than not a collaborative process, creating a need for tools and file formats that enable developers to create and share succinct representations of changes to source code in order to facilitate efficient communication. Standard POSIX diffs and patches have long been important parts of the toolkit, but their lack of support for the syntax and semantics of specific programming languages results in limited expressiveness. The Semantic Patch Language (SmPL), introduced in 2006 together with the tool Coccinelle, increases the expressiveness of POSIX-style patches for the C programming language by leveraging support for the syntax and semantics of C. For example, an SmPL patch can specify changes to source code using metavariables that bind arbitrary program variable names, allowing for the specification of transformations involving variable references regardless of what specific variable names appear in programs targeted by the patch.

A recent development is Coccinelle4J, a prototype modification of Coccinelle targeting the Java programming language. Coccinelle4J remains based on a toolkit designed for the parsing and modeling of C, adapted to operate on Java source code. The language mismatch of the base toolkit gives rise to limitations. Despite this, Coccinelle4J remains the state of the art for an SmPL targeting Java.

In this thesis we lay the foundations for an SmPL for Java based on Spoon, a robust Java metaprogramming toolkit. We qualitatively investigate to what extent the features of SmPL and Coccinelle are generalizable to a Java context, and we implement and evaluate SPOON-SMPL, a prototype SmPL tool for Java based on Spoon. We base the core design of SPOON-SMPL on temporal logic and model checking, heavily inspired by the design of Coccinelle. We find that the majority of the identified SmPL features generalize to Java.

We quantitatively evaluate SPOON-SMPL by comparing its running time performance to that of Coccinelle4J over a set of six semantic patches with associated real-world project code bases used in an API migration case study originally performed by the authors of Coccinelle4J. Additionally, we compare the running times of SPOON-SMPL to the average build time of each associated project. We find that SPOON-SMPL performs worse than Coccinelle4J, but that the performance remains in a range acceptable for a single developer using inexpensive hardware.

Finally, we provide two proposed designs for extensions to SPOON-SMPL along with a set of suggestions for future work. The proposals show that our prototype offers a strong potential to leverage the capabilities of the Spoon library, particularly in providing improved and robust support for certain aspects of Java for which Coccinelle4J provides only limited support.

Sammanfattning (Summary)

Software development is often a collaborative process in need of efficient communication. A central element of this communication is the ability of developers to create and share among themselves concise summaries of source code changes. The POSIX-standardized tools diff and patch have long been an important part of the toolkit, but their lack of support for the syntax and semantics of specific programming languages gives rise to limited expressiveness. The Semantic Patch Language (SmPL), introduced in 2006 together with the tool Coccinelle, offers increased expressiveness in POSIX-like patches for the C programming language. Among other things, an SmPL patch can use metavariables, logical variable names that bind arbitrary program variables, to specify transformations involving variable references regardless of which variable names occur in the target program.

Coccinelle4J, a modification of Coccinelle, is a recently developed prototype of an SmPL tool for the Java programming language. Coccinelle4J is based on a technical foundation designed for the parsing and processing of C that has been adapted to process Java. Language differences make a comprehensive adaptation difficult, which leads to limited support for certain features of Java. Despite this, Coccinelle4J is currently the leading solution for SmPL for Java.

In this thesis we take the first steps toward an SmPL for Java based on Spoon, a robust metaprogramming library for Java. We qualitatively investigate which features of SmPL and Coccinelle can be generalized to Java, and we implement and evaluate SPOON-SMPL, a prototype SmPL tool for Java based on Spoon. The design of SPOON-SMPL is heavily inspired by Coccinelle and is based on temporal logic and model checking. We find that a clear majority of the features we identified in SmPL and Coccinelle can be generalized to Java.

We quantitatively evaluate SPOON-SMPL by comparing its running time performance against Coccinelle4J over six semantic patches with associated project code bases that were originally used in an API migration case study performed by the team behind Coccinelle4J. We also compare the running time performance against the build time of each project. We find that the running time performance of SPOON-SMPL is worse than that of Coccinelle4J, but that it nevertheless falls within a range that is acceptable for an individual software developer with a simple personal computer.

Finally, we present two detailed proposals for extensions to SPOON-SMPL together with a set of suggestions for future work. We thereby show that our prototype has strong potential for extensions that make use of the features available in Spoon, in particular regarding improved and robust support for certain features of Java for which Coccinelle4J offers only limited support.

Acknowledgements

I would like to thank:

• Prof. Martin Monperrus, my examiner. Martin suggested the project and gave me the opportunity to pursue it. Martin also helped establish the research methodology and formulate the formal research questions, gave regular feedback on the structure of the thesis and my approaches to various aspects of the work, suggested many papers on related works, and provided tips on the use of Spoon.

• Nicolas Yves Maurice Harrand, my supervisor. Like Martin, Nicolas provided feedback on the methodology and the structure of the thesis, and also helped me with a couple of difficult choices in the implementation. Nicolas also provided detailed feedback on the full text, introduced me to a set of useful tools and ideas for improving the text, provided papers on the subtleties involved in benchmarking the performance of Java programs, and helped eliminate a formal research question for which the results were overly speculative.

• Ann Bengtsson, degree project coordinator at KTH EECS. Ann greatly helped me solve the complications surrounding my formal admittance to the degree project course.

Finally, I would like to jointly thank Martin and Nicolas for their sympathy and patience throughout the project in general, and in particular surrounding the passing of my father.
To my father. I’m sorry I took too long. Thank you for everything.

Contents

1 Introduction
  1.1 Problem statement
  1.2 Research questions
  1.3 Contributions
  1.4 Intended audience
  1.5 Outline of the thesis
2 Background
  2.1 Text file differencing
    2.1.1 diff
    2.1.2 patch
  2.2 Formal logics for the modeling of computer programs
    2.2.1 Computation Tree Logic
    2.2.2 CTL with free variables
    2.2.3 CTL with quantified variables
    2.2.4 CTL with variables and witnesses
  2.3 Program analysis and transformation
    2.3.1 Spoon
    2.3.2 Semantic Patch Language
3 Related work
  3.1 Semantic patching
    3.1.1 Coccinelle
    3.1.2 Coccinelle4J
  3.2 Program transformation using temporal logic
  3.3 Other approaches to Java source code transformation
  3.4 API migration
4 Design of spoon-smpl
  4.1 Design goals
  4.2 Core engine
  4.3 Parsing SmPL
  4.4 Formula language
  4.5 Formula compilation
  4.6 Batch processing
  4.7 Use of Spoon
5 Evaluation methodology
  5.1 Analytical methodology
    5.1.1 RQ1: Generalizable features
    5.1.2 RQ2: Non-generalizable features
  5.2 Experimental methodology
    5.2.1 RQ3: Patch application performance
    5.2.2 RQ4: Project build times
6 Evaluation results
  6.1 Analytical results
    6.1.1 Coccinelle feature catalog
    6.1.2 RQ1: Generalizable features
    6.1.3 RQ2: Non-generalizable features
  6.2 Experimental results
    6.2.1 Hardware and software
    6.2.2 RQ3: Patch application performance
    6.2.3 RQ4: Project build times
7 Discussion
  7.1 Limitations
  7.2 Threats to validity
  7.3 Extension proposals
    7.3.1 Improving name resolution
    7.3.2 Improving sub-typing
  7.4 Future work
    7.4.1 Support for more simple Java constructs
    7.4.2 Support for looping constructs
    7.4.3 Support for isomorphisms
    7.4.4 Model checker optimizations
    7.4.5 Using Spoon sniper mode
    7.4.6 Using spoon.pattern
    7.4.7 Target-embedded parsing of the semantic patch
  7.5 Ethical considerations
8 Conclusions
Bibliography
A Full semantic patches
  A.1 Semantic patch 4: should_vibrate
    A.1.1 Original version
    A.1.2 Modified version
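
The abstract describes metavariables as binding arbitrary program variable names, so that a single patch applies regardless of the names used in the targeted program. A minimal sketch of that idea in Java is given here. It is a toy token matcher written purely for illustration and is not the approach taken by Coccinelle or SPOON-SMPL, whose core design the abstract says is based on temporal logic and model checking. The class name `MetavariableSketch`, its methods, the metavariable name `x`, and the whitespace-separated token format are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Toy illustration of metavariable binding (hypothetical, not the thesis implementation). */
public class MetavariableSketch {

    /**
     * Matches a token pattern against a token sequence of equal length.
     * A token listed in metavars binds to whatever concrete token it meets;
     * a metavariable occurring twice must meet the same token both times.
     */
    public static Optional<Map<String, String>> match(
            List<String> pattern, List<String> code, Set<String> metavars) {
        if (pattern.size() != code.size()) {
            return Optional.empty();
        }
        Map<String, String> bindings = new HashMap<>();
        for (int i = 0; i < pattern.size(); i++) {
            String p = pattern.get(i);
            String c = code.get(i);
            if (metavars.contains(p)) {
                String previous = bindings.putIfAbsent(p, c);
                if (previous != null && !previous.equals(c)) {
                    return Optional.empty(); // same metavariable, different names
                }
            } else if (!p.equals(c)) {
                return Optional.empty(); // literal token mismatch
            }
        }
        return Optional.of(bindings);
    }

    /** Builds the replacement by substituting bound metavariables into a template. */
    public static List<String> instantiate(List<String> template, Map<String, String> bindings) {
        List<String> out = new ArrayList<>();
        for (String t : template) {
            out.add(bindings.getOrDefault(t, t));
        }
        return out;
    }

    /** Rewrites "x = x + 1 ;" to "x ++ ;" for any name x; other statements are returned unchanged. */
    public static String rewrite(String statement) {
        Set<String> metavars = Set.of("x");
        List<String> pattern = List.of("x", "=", "x", "+", "1", ";");
        List<String> template = List.of("x", "++", ";");
        List<String> code = List.of(statement.trim().split("\\s+"));
        return match(pattern, code, metavars)
                .map(b -> String.join(" ", instantiate(template, b)))
                .orElse(statement);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("count = count + 1 ;")); // prints "count ++ ;"
        System.out.println(rewrite("total = count + 1 ;")); // prints "total = count + 1 ;"
    }
}
```

The second call is left unchanged because the metavariable `x` would have to bind both `total` and `count`. That consistency requirement is what lets a pattern refer to "the same variable" without naming it.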