Janne Uitto

TAKING CODE ANALYSIS INTO USE FOR EXISTING CODEBASE

Faculty of Information Technology and Communication Sciences
Master's Thesis
April 2020


ABSTRACT

Janne Uitto: Taking code analysis into use for existing codebase
Master's Thesis
Tampere University
Software Engineering
April 2020

The amount of code in applications increases across industries. The number of vulnerabilities has also increased, which emphasises the importance of quality assurance. Any kind of quality assurance increases the probability of finding issues in programs. One method to detect security vulnerabilities is to run code analysis. Code analysis tools can automatically detect potential security vulnerabilities in source code, including overruns, use of uninitialised memory, null pointer dereferences, and memory leaks.

This thesis shows how a code analysis tool can be included as part of a company's existing process so that new issues can be found as early as possible. Code analysis tools might report many findings from codebases that have not been analysed previously. Existing issues need to be fixed so that new issues can be identified easily. The effort spent on fixing existing issues was compared to the severity of the found issues, which gave the cost-effectiveness of the fixing effort.

A code analysis tool was taken into use at M-Files. The tool reported a total of 3,064 existing issues from the company's codebase. The found issues were categorised into three severity levels: 3,054 low, 3 high, and 7 critical issues were found. The found issues were fixed, which took 60.5 work hours.

A CI pipeline is used to automate integration steps for new code. After the existing issues had been fixed, the code analysis tool was set to monitor all new changes in the company's CI pipeline. Currently, a clean analysis result is mandatory before new changes can be merged into the main code branch. This reduces the number of issues, including vulnerabilities, in released software.

It is not mandatory to have a code analysis tool verifying the codebase from the beginning of a project. This thesis shows that taking code analysis into use for the existing codebase required a tolerable effort. Setting up code analysis as part of the development process as early as possible is a recommended action for all organisations.

Keywords: code analysis, analyser, static, deployment, information security

The originality of this thesis has been checked using the Turnitin OriginalityCheck service.

TIIVISTELMÄ

Janne Uitto: Koodianalysaattorin käyttöönotto jo käytössä olevaan koodikantaan
Diplomityö
Tampereen yliopisto
Ohjelmistotuotanto
Huhtikuu 2020

Koodin määrä kasvaa eri sovelluksissa toimialasta riippumatta. Tällöin myös haavoittuvuuksien määrä kasvaa, mikä korostaa laadunvarmistuksen tärkeyttä. Kaikentyyppinen laadunvarmistus parantaa virheiden löytymistä ohjelmista. Yksi tapa löytää tietoturvahaavoittuvuuksia on koodianalyysin ajaminen. Koodianalyysityökalut voivat automaattisesti tunnistaa mahdollisia tietoturvahaavoittuvuuksia aiheuttavia virheitä lähdekoodista. Tällaisia virheitä ovat esimerkiksi ylivuoto, alustamattoman muistin käyttö, nollaosoittimen käyttäminen ja muistivuoto.

Tämä työ selvittää, kuinka koodianalyysityökalu voidaan ottaa osaksi yrityksen olemassa olevaa ohjelmistokehitysprosessia niin, että uudet virheet havaitaan mahdollisimman ajoissa. Tiedetty ongelma on, että koodikannasta, jota ei aikaisemmin ole analysoitu, saattaa löytyä todella paljon virheitä. Nämä virheet on korjattava, ennen kuin uuden koodin virheet voidaan helposti havaita. Olemassa olevien virheiden korjaamiseen kuluu aikaa, joten osana työtä oli tarkoitus selvittää korjauksien kustannustehokkuus, joka voidaan määrittää virheiden vakavuudesta ja niiden korjaamiseen kuluneesta ajasta.

Koodianalyysityökalu otettiin käyttöön M-Filesissa. Koodianalyysi löysi yrityksen koodikannasta 3 064 virhettä. Virheet jaettiin kolmeen vakavuustasoon: matalan tason virheitä löytyi 3 054 kappaletta, korkeita 3 ja kriittisiä 7. Löydetyt virheet korjattiin, ja niiden korjaamiseen kului yhteensä 60,5 tuntia.

CI-putkea käytetään automaattisten tarkistusten tekemiseen uudelle koodille. Löydettyjen virheiden korjaamisen jälkeen koodianalyysityökalu ajetaan osana yrityksen CI-putkea. Koodianalyysin käyttöönoton jälkeen uusi koodi hyväksytään osaksi pääkoodihaaraa vasta, kun analyysistä on saatu virheetön tulos. Tämä vähentää virheiden ja sitä kautta haavoittuvuuksien määrää asiakkaalle menevässä tuotteessa.

Koodianalyysin käyttöönottaminen ei ole pakollista heti projektin alusta alkaen. Tämä työ näytti, että koodianalyysi on mahdollista ottaa käyttöön jo olemassa olevaan koodikantaan. Koodianalyysin käyttöönottaminen mahdollisimman ajoissa on suositeltavaa jokaiselle organisaatiolle.

Avainsanat: koodianalysaattori, koodianalyysi, staattinen, käyttöönotto, tietoturva

Tämän julkaisun alkuperäisyys on tarkastettu Turnitin OriginalityCheck -ohjelmalla.

PREFACE

I want to thank my supervisor Mika Hirvonen, who enabled me to start master's studies alongside work by ensuring a good balance between work and studies.

My instructors Ossi Nykänen, Chief Research Engineer at M-Files, and Marko Helenius, University Instructor at Tampere University, guided me through the thesis process with excellent expertise. They provided helpful comments and guidance throughout the process.

Thanks to my family, friends, and colleagues who have advised me with my stud- ies and this thesis.

In Tampere, Finland, on 15 April 2020

Janne Uitto

CONTENTS

1. INTRODUCTION
2. SECURE DEVELOPMENT
   2.1 Secure design principles
      2.1.1 Threat modeling
   2.2 Secure programming
      2.2.1 Cryptography
      2.2.2 Code reviews
      2.2.3 Developer training
   2.3 Testing
   2.4 Code analysis
      2.4.1 Code annotation
      2.4.2 Compiler warnings
3. CASE M-FILES
   3.1 M-Files company
   3.2 M-Files product
   3.3 Current status of code analysis
   3.4 Goals for analysis
4. RESOLVING ANALYSIS WARNINGS
   4.1 CI pipeline in GitLab
   4.2 Investigate analysis warnings
   4.3 Fix warnings - Case M-Files
   4.4 Mandatory code analysis job
   4.5 Fixing details, time and severity
   4.6 Training for developers
5. RESULTS
   5.1 Future development
      5.1.1 Increased rule set level
      5.1.2 Usage of annotations
      5.1.3 More projects into the analysis
      5.1.4 Other technologies
6. CONCLUSIONS
REFERENCES

LIST OF FIGURES

Figure 1. Microsoft Secure Development Lifecycle (SDL) process [11].
Figure 2. Design, implementation and verification phases with code analysis.
Figure 3. Waterfall process compared to agile [14].
Figure 4. Test automation pyramid [18].
Figure 5. True positive, false positive, true negative, false negative.
Figure 6. Code analysis from project properties in Visual Studio 2019.
Figure 7. Running code analysis once in Visual Studio 2019.
Figure 8. ECM Technology Value Matrix 2019 - Nucleus Research [31].
Figure 9. Overview of components in M-Files product.
Figure 10. Code analysis jobs running in CI pipeline.

LIST OF SYMBOLS AND ABBREVIATIONS

API       Application programming interface
ATL       Active Template Library
CI        Continuous Integration
COM       Component Object Model
MFServer  M-Files Server
MSBuild   Microsoft Build Engine
OWASP     Open Web Application Security Project
SDL       Secure Development Lifecycle


1. INTRODUCTION

The amount of code in applications increases constantly across industries, including industries whose products previously contained no software at all, as those applications are digitalised [1]. The number of vulnerabilities has increased year over year [2]. The growing amount of code emphasises the importance of quality assurance. Any kind of quality assurance increases the probability of finding issues in programs. Different verification methods find different types of issues; therefore, it is important in software development to use several methods to verify programs. The range of issues can vary from incomplete visualisation to severe vulnerabilities that can be exploited to leak confidential data. Some vulnerabilities are reachable only in corner cases, which means it might be difficult to find them by testing the running application, yet skilled attackers can still exploit them. The cause of fatal results is not necessarily an attacker exploiting a fault in a program; it can also be an engineering mistake, such as when NASA's Mars Climate Orbiter failed due to a unit conversion issue [3]. For these reasons, it is important to detect and act on all vulnerabilities.

Detecting issues as early as possible in the development process is valuable. If an issue is found after the software is released to customers, fixing it can be very expensive compared to a situation where the issue is found and fixed before the release. As an example, in its security bounty program, Apple is willing to pay up to one million dollars for reporting a security issue [4]. One method to detect security vulnerabilities is code analysis. Having a code analysis tool check the code as early as possible in the software development process can prevent vulnerabilities from ending up in the released software. [5]

A known issue is that analysis tools report many findings from codebases that have not been analysed previously [6]. This thesis was made for a company whose codebase is nearly 20 years old. The first goal of this thesis was to find out how much effort it takes to fix those existing issues to a level where the analysis

result is clean. The effort was measured in hours spent on fixing the issues found by the code analysis tool. After a clean analysis result has been achieved, new issues can be detected easily if the analysis is part of the development process. Therefore, the second goal of this thesis was to find out how the code analysis tool can be included as part of the existing software development process so that new issues can be found as early as possible.

When a code analysis tool is taken into use, fixing the existing issues is a one-time operation, as forthcoming issues are fixed immediately if the development process does not allow new issues from the analysis. This one-time operation would not be needed if code analysis had been used from the beginning of the program's lifetime. The effort spent on fixing existing issues was compared to the severity of the found issues, which gave the cost-effectiveness of the fixing effort. This was used as the main measurement of this thesis.

This thesis does not consider how code analysis tools technically analyse code; it investigates how to take an analysis tool into use and measures the effort spent in that process. This thesis uses case study research to examine utilising code analysis as part of the development process of one company. Processes, methods, and results might vary if a similar utilisation were applied to some other company. As this case study gives results from one company, it would be interesting to compare a similar utilisation and its results at some other company.

Research around the code analysis topic has been made around the world. Much of that research consists of comparisons between different analysis tools [7] [8]. In addition to tool comparison research, Toyota InfoTechnology Center implemented a test suite for benchmarking different analysis tools [9]. Tampere University Library's database was used to find related research. The keywords used for searching were code analysis, static, testing, security, secure, quality assurance, and development process. Combinations of these keywords were also used.

In Finnish research, three code analysis tools were compared against intentionally implemented security errors [7]. These tools were Fortify Source Code Analyzer (SCA), Frama-C, and Splint. The tools were selected because they introduce different approaches to analysing code. Fortify SCA is a non-annotation-based

heuristic analyser, Splint is an annotation-based heuristic analyser, and Frama-C is an annotation-based correct (non-heuristic) analyser. Annotation-based analysers use different annotation languages, and both annotation languages were used to annotate the demonstration code. The intentionally implemented security errors contain the following types of issues:

• buffer overflows
• memory handling errors
• null pointer dereferencing
• control flow errors and missing input validation.

Another research compared sixteen different static analysis tools used for analysing C and C++ code [8]. These tools were not executed against a codebase. Instead, the comparison was made based on the capabilities found in the manuals and specifications of the tools. The capabilities of the tools were also identified from other tool comparison research. The research categorised its evaluation criteria into four main categories:

• variable and pointer checks
• injection and memory checks
• input, loop and function checks
• some other checks.

Each of the above main categories contains 6 to 10 different subitems; the total subitem count was 28. The tools were checked against these subitems. If a tool detected the issue introduced in a subitem (true positive), the tool got one point. If a tool did not detect the issue (false negative), no points were given. The tools were sorted based on the total number of points they received. None of the tools got a perfect result. The subitems selected for that research indicate that the best tool was CSTAT with 21 points out of the maximum 28. However, some other tools are good for special cases, for example safety-critical systems, but those tools might miss some basic cases.


Toyota collected a test suite for benchmarking different static code analysers. The test suite contains intentionally created defects in code. The tests contain a total of 638 different test cases, grouped into the following nine categories:

• static memory
• stack-related
• resource management
• pointer-related
• numerical
• misc
• inappropriate code
• dynamic memory
• concurrency.

The above main categories are divided into a total of 51 different subitems. Each of the 638 test cases has a test case pair without the defect. The cases without a defect were used to detect whether static code analysers report false alarms (false positives). [9]

One more related research with a categorisation of code analysis issues includes the following categories: memory leak, uninitialised variable, null pointer, array access violation and buffer overflow [10]. All the mentioned research have similarities between the categories that code analysis tools can detect, like memory handling and pointer checks. These are typical issues in C++-like languages. Other programming languages have different types of issues; therefore, categorisation varies depending on the programming language used.

The categorisation in this thesis was based on the issues found in the company's codebase. These categories were invalid format parameter count, invalid format parameter type, null pointer dereference, buffer overflow, and function specification violation. The null pointer and buffer overflow categories are very similar to those in related research. The function specification violation, invalid format parameter count, and invalid format parameter type categories are mostly related to the function checks or inappropriate code categories of other research.


The first result of this thesis is that, to keep the codebase clean from issues, it is mandatory to have a clean analysis result before a new implementation can be merged into the main code branch. Code analysis is run as part of the existing continuous integration (CI) pipeline. To achieve a state where the codebase gives clean results from code analysis, all existing issues were fixed during the initialisation of code analysis. The effort spent on fixing existing issues was measured during the fixing process. Severities were also defined, and the found issues were categorised. The second result was that a total of 60.5 hours was spent on fixing the 3,064 existing issues that the code analysis reported. Three severity levels were defined, and the found issues were categorised as 3,054 low, 3 high, and 7 critical issues.

The chapter division of this thesis is as follows. Chapter 2 describes secure development processes and methods, including code analysis. Chapter 3 introduces the company and the product that form the operational environment of this thesis. Chapter 4 contains the case study, which describes the initialisation of the code analysis tool and the process of fixing the existing issues. Chapter 5 presents the results of this thesis and suggests future development to better utilise code analysis. Chapter 6 contains the conclusions of this thesis. Appendix 1 includes a full list of rules in the Microsoft Native Minimum rule set.


2. SECURE DEVELOPMENT

Microsoft has defined a secure development process and practices for organisations to apply (Figure 1). The process is designed to detect and mitigate security vulnerabilities as early as possible in the development process.

Figure 1. Microsoft Secure Development Lifecycle (SDL) process [11].

This thesis focuses on the design, implementation and verification phases of Microsoft's secure development process. Designing software should follow secure design principles. By following these principles when designing systems, the majority of security flaws can be avoided. These principles should be trained to all developers within organisations. Threat modeling is used to identify plausible threats, which can then be managed with design choices or in the implementation phase. [11]

The implementation phase contains the actual code writing part of the software development process. This phase can unintentionally introduce hidden bugs that can be difficult to investigate and could end up in the actual release. Those bugs could lead to harmful scenarios in production systems. Therefore, an implementation made by a developer should be verified with code reviews by other developers. Code analysis is a great help in automating code reviews, but it is not comprehensive enough to replace review by other developers. Developer training also has an important role in detecting and preventing typical flaws in the implementation phase. [12]


Lastly, secure development needs to be verified. Automated tests keep the risk of regression low for existing functionality, while manual testing covers newly developed features. Automated tests should be implemented to cover the majority of a new feature to limit the risk of regression when developing later features for a product. Creating automated test cases is also part of the implementation phase. Test automation in this definition covers multiple levels of test automation. Unit tests should be implemented during the implementation phase to keep the unit as regression-free as possible when implementing the next functionalities of the product. Integration and UI testing can be considered part of the verification phase. All test automation should run regularly as continuous integration (CI). [12]

Figure 2 visualises the mentioned ways to improve security in each phase of the secure development process. Sections 2.1, 2.2, and 2.3 introduce each of these phases in more depth.

Figure 2. Design, implementation and verification phases with code analysis.

The waterfall method of developing software used to be common practice. Waterfall means that the phases of development are done sequentially: defining requirements, design, implementation, verification, release, and maintenance. Each of these steps finishes before the next one can begin. The product is ready and usable only at the end of the process, which does not enable quick iterations of the product. Nowadays the agile method of software development is more common. In agile, these phases repeat multiple times during product development. The product is usable after every iteration and is improved cumulatively. Figure 3

shows the phases of the waterfall and agile methods on a timeline. Regardless of the method used, security must be integrated into the processes. [13]

Figure 3. Waterfall process compared to agile [14].

2.1 Secure design principles

Having security considerations already in the design phase means that security will be embedded into the design of the software. This is called the built-in security principle. The other principle is add-on security. In the add-on security principle, security is considered when the software is already implemented; security measures are then implemented by adding new components and features to protect the software. New components usually add complexity and integration problems. Security measures added at the end might also be accidentally left out due to the deadlines of the software release. Built-in security is considered better for these reasons. [12]

The Open Web Application Security Project (OWASP) is an online community that helps to improve the security of software. OWASP has collected security design principles to protect against common vulnerabilities. Following these design principles in the design phase of software development reduces the risk of vulnerabilities being successfully exploited. [15]


Minimise the attack surface
The attack surface is the part of the system that can be accessed from outside, for example through the network. The essence of this principle is to expose as little as possible, to limit the risk of someone attacking the system. Usually some countermeasures can be taken, for example allowing network traffic only from localhost if that does not otherwise limit the functionality of the system. Limiting access points to authorised users only also reduces the attack surface. A limited attack surface also simplifies and eases testing scenarios and therefore has the benefit of saving testing costs. When the functionality of a program increases, the attack surface usually grows. Removing unused functionalities minimises the attack surface.

Establish secure defaults
The default values of the configuration should be secure. Whoever changes configuration values is responsible for potentially making the system more insecure. There should be clear instructions explaining whether a configuration change compromises security. For example, firewalls should deny all traffic by default, and opening ports for specific traffic is the responsibility of the one who changes the configuration. Ports should be opened only after considering the consequences of that action.

Separation of duties
The flow of an approval process should go through different roles. This could mean that travel expenses are not approved by the same person who submits them. Separation of duties can also happen between different components of the system. For example, the audit logs of a system should be stored in a different place, so that if the system gets tampered with, the audit logs remain intact.

Give minimum privileges
Applications, as well as personnel, should have as few privileges as possible. This prevents attackers from exploiting vulnerabilities in the system through high-privilege applications. If some operation in the application needs higher privileges for its task, that operation can be separated into a different component that has sufficient privileges. Writing to the hard drive could be one example where

higher privileges are needed while the rest of the application does not need those privileges. Then that operation should be done in a separate component.

Defence in depth
Having multiple layers of security controls hardens the security of the entire system. Different kinds of security methods have different strengths. A firewall is the first block for an attacker; if the attacker finds a way to bypass the firewall, the next defence could be encrypting data on the servers. A real-world example of defence in depth is a house with a safe box containing valuables: a front door with a proper lock is a good defence against burglars. Constructing a separate door with a different lock does not improve the security of the valuables; it actually decreases it, as only one of the door locks needs to be bypassed. But adding an alarm system does increase defence in depth: if a burglar bypasses the front door, the motion sensors of the alarm system report the action.

Fail securely
Having secure values and safe functionality in case of failure is important. Such values and functionality need to be decided case by case, as the ways of secure handling vary. If exceptions are thrown, some important sections of the code might not get executed, which might leave variables with insecure values. Failures that an attacker might cause include, for example, a network failure or an unexpected software crash. Overall, developers should prepare for failures in a program.

Do not trust services
Programs usually use third-party components or services to do subtasks. Interaction with those components should be handled as if the external component could provide malicious data. Even a called internal component should be treated carefully if there is a potential attack surface where an attacker could mimic that component. Failing securely is also important in interaction with external services, as those might fail or be unavailable.

Sanitise input
In many cases some user input is needed for a program to be meaningful. In all cases that input needs to be validated to contain only acceptable data. Hackers

can exploit unvalidated input to gain access to the system or cause data corruption. All data coming from outside the system needs to be validated. The system in this case means any component or module that accepts data from outside itself. See also the design principle "Do not trust services". Even if data is encrypted, it needs to be validated if its origin is not totally under your control. [12]

Avoid security by obscurity or secrecy
A cryptographic key or password stored in source code is one example of security by secrecy. Keeping secrets is a good idea, but a secret should not be the only security control, so that a leak of the secret is not catastrophic. There should be a possibility to change secrets in case of a leak. Keeping design choices and source code secret can provide additional security, but it should never be the only security measure.

Keep it simple
Keeping the system as simple as possible lowers the probability of vulnerabilities. The software is then easier to review and understand, and in case of a vulnerability, it should be easier to fix. Simplicity is related to minimising the attack surface, as it leads to fewer places where things can go wrong. A complex configuration can introduce a situation where the security of every combination is difficult to verify.

Prepare to fix security issues correctly
When a new vulnerability is found, it needs to be fixed or otherwise addressed. If test automation cases are written to cover a newly found issue, they ensure that the case will not break again due to some other change in the future. Finding a proper fix is essential, while at the same time ensuring that no side effects are introduced. Planned processes help organisations act quickly and correctly in case of emergency issues.

2.1.1 Threat modeling

Threat modeling is a process or technique used to identify threats, attacks, vulnerabilities and countermeasures, and to evaluate the security of the system. Threat modeling can be used to guide changes to the design of the software,

to meet the objectives of the security organisation, and to reduce the risk of vulnerabilities. Organisations should embed this process as part of their development process. The modeling process consists of five major steps, defined by Microsoft as follows:

• defining security requirements
• creating an application diagram
• identifying threats
• mitigating threats
• validating that threats have been mitigated.

These steps should be repeated iteratively in a cycle, because new technologies may be taken into use within the organisation, new risks might arise, changes in business may expose new risks, previously harmless issues may become more significant, or the resources to mitigate issues may have changed. Repeating these steps is also useful because identifying all threats at once can be difficult. [11]

The STRIDE model can be used to identify threats. STRIDE is a mnemonic formed from the first letters of these six categories [16]:

• Spoofing
• Tampering
• Repudiation
• Information Disclosure
• Denial of Service
• Elevation of Privilege

Spoofing means that an attacker illegally identifies itself as another party to gain access to the data available to that party. This violates the authenticity property. Tampering is unauthorised data manipulation, either of stored data or of data during transfer; it violates the integrity of the data. Repudiation concerns the evidence stored of performed actions, like money transfers, accessing data or modifying data. The user who performed an action cannot deny it; if the user could deny it, non-repudiation would be violated. Information disclosure means that information is accessible to those who are not supposed to access it, in other words a data leak or privacy breach; the confidentiality of the data is then violated. Denial of service means that information cannot be accessed by

authorised parties, for example when a service is down or unresponsive. The availability of the data is violated. Elevation of privilege occurs when a user gains access to modify or see more data than the user is normally allowed to. An example of this is a normal user illegally gaining administrative permissions. This violates the authorisation property.

2.2 Secure programming

Programming is the phase where the design of the application is transformed into the application. Even if the design has a good security basis, it is very likely that new vulnerabilities are introduced in this phase. Every program has bugs, and some bugs are categorised as vulnerabilities. Writing code that does not have any bugs is very difficult. This problem is well known, which means that solutions are already available in the form of guidance, practices, training and tools.

2.2.1 Cryptography

Almost all applications need cryptography functions if confidential or personal information is handled, users are authenticated, or similar functionalities exist. Inventing one's own cryptography method is not a recommended practice; industry standard cryptography methods should be used. Even when good cryptography functions are used, they can accidentally be used in an insecure way, for example without strong random keys. That is why the usage of cryptography functions needs to be implemented securely. [17]

2.2.2 Code reviews

Code review is the part of the development process where a peer developer reviews code before it is accepted as part of the main code branch. Several different areas can be verified in this review. The reviewer checks that the guidelines for commenting and good variable naming have been applied. Bugs and violations of secure programming principles can be detected during this review process. For reasons of code maintainability (the security design principle "Keep it simple") or performance, better solutions for implementing the same logic can be suggested in code review. The review is also a good practice for sharing detailed information across the development organisation, as more than one developer then becomes familiar with the implementation. [12]

Security audits can also include code review as part of the audit. Auditors can review security-critical parts of the code, and if the code is poorly documented or otherwise unreadable, that is a plausible signal of vulnerable code.

2.2.3 Developer training

Secure development includes developer training, covering design principles, secure programming practices and threat modeling. Secure development practices must be known by all employees, not only by a couple of security-oriented developers. That way everyone is able to spot security deviations and to think about what an attacker could try to achieve. The training should apply to all employees so that everyone is familiar with the practices within the organisation. Code reviews are one good way to share information among developers about secure programming practices. Because software and processes evolve, training should be an ongoing process, not just a one-time event. [11]

2.3 Testing

The testing phase of secure development verifies the quality of the program before it is released to customers. In addition to functional requirements, non-functional requirements such as security must also be tested. With functional issues it might be acceptable to release the program, but even a single security vulnerability can be very harmful. Finding all vulnerabilities is a major challenge in the testing phase of secure development. [12]

Testing starts with planning test cases that verify the behavior and security of the program. Test planning can be based on requirements. A benefit of manual testing is that a tester can do something differently each time, so different kinds of testing happen even when the same test cases are executed. Exploratory testing emphasises this variation between test executions. The variation between test executions applies only to manual testing.


Automating test cases is valuable if test cases need to be run multiple times. Automation reduces manual work in repetitive tasks. With automation, high testing coverage can be achieved without laborious manual testing. Test automation can be part of a continuous delivery process to verify the quality of the program before releasing it. Multiple different methods exist to implement test automation. Figure 4 shows the test automation pyramid, which illustrates that there are fewer tests at higher abstraction levels. [18]

Figure 4. Test automation pyramid [18].

Unit tests are usually developed by the developers. Unit tests only test small units of the program, for example one function with all possible parameter combinations. Usually unit tests do not depend on any external resources, or those external dependencies can be mocked; therefore the environment to run these tests is easy to set up. Unit test coverage should be comprehensive to detect issues in those units. Automating integration or UI testing should be considered when possible. Those kinds of tests cover how the program works as a whole, rather than just individual units or modules. Automating them is valuable for test cases that are critical for the program to work perfectly. Setting up the environment for these tests might be more difficult, as dependencies on other systems can cause extra hassle for the environment. As software evolves over time, test automation needs to be updated accordingly.
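As a concrete illustration of a unit test, consider the following minimal sketch (the function and test below are hypothetical examples, not code from the M-Files codebase):

```cpp
#include <cassert>

// Hypothetical unit under test: clamps a value into the range [lo, hi].
int Clamp( int value, int lo, int hi )
{
	if( value < lo ) return lo;
	if( value > hi ) return hi;
	return value;
}

// A unit test runs the unit with several parameter combinations,
// including the boundary values, and needs no external resources.
void TestClamp()
{
	assert( Clamp( 5, 0, 10 ) == 5 );    // inside the range
	assert( Clamp( -1, 0, 10 ) == 0 );   // below the lower bound
	assert( Clamp( 11, 0, 10 ) == 10 );  // above the upper bound
	assert( Clamp( 0, 0, 10 ) == 0 );    // exactly on a boundary
}
```

Because the unit has no external dependencies, this test can run in any environment as part of an automated test suite.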


2.4 Code analysis

Code analysis tools can automatically detect possible defects from source code. The errors identified vary a lot between different analysis tools and languages. Here are a couple of examples of what kinds of issues code analysis tools try to detect. In C++, common errors found by these tools include overruns, use of uninitialised memory, null pointer dereferences, memory leaks, unreachable code and violations of secure coding policies. These may be potential security vulnerabilities. For the C# language, code analysis tools can detect performance issues and anti-patterns. JavaScript code analysers can detect problems that could manifest at runtime, like function argument mismatches, infinite loops, code smells, unused variables, and code style issues. With code annotations, code analysis can also detect mismatches between developer intention and actual implementation. [19] [20]
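For illustration, here is a short hypothetical C++ fragment showing one of the defect classes listed above (the commented-out line is the defect; the warning description matches Microsoft's analyser, but the exact number reported can vary):

```cpp
// Hypothetical example of an overrun defect: valid indices for 'values'
// are 0..3, so the commented-out line writes past the end of the array
// and would be flagged by a code analysis tool.
int LastElement()
{
	int values[ 4 ] = { 0, 0, 0, 0 };
	// values[ 4 ] = 1;  // write overrun -- reported by code analysis
	values[ 3 ] = 1;     // corrected index
	return values[ 3 ];
}
```

A compiler typically accepts the defective line without complaint, which is why the deeper path analysis of an analysis tool is valuable here.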

Compilers for compiled languages can warn about many simple defects in code, but code analysis tools can perform deeper analysis of possible execution paths to determine what could happen in corner cases. Testing is one way of finding defects from software, but testing is usually limited to some input combinations, whereas code analysis can check all possible combinations automatically. Especially in manual testing it can be difficult to reach the program states that cause these defects. Test automation can be configured to cover value ranges with boundary-value analysis, but in some cases the boundaries can be difficult to determine accurately, at least without the source code. Different tools and methods increase the probability of detecting issues before the software is released; code analysis is one tool in the toolkit for detecting these issues. Static code analysis means that the analysis is performed without executing the code; it is done by analysing source code or object code. Dynamic code analysis happens by executing the software and observing its behavior. [6] [20] [21]

In addition to threat modeling, developer training, developer code review and testing, code analysis acts as one additional layer for catching defects before the software is released. In that way code analysis adds defence in depth against security vulnerabilities. Defects found by code analysis tools could be exploited by attackers. Because running code analysis against a codebase can easily detect these defects, it is worth fixing these findings. Running code analysis as part of the development process and immediately fixing found issues are the recommended actions [12].

If code analysis warns about a case where the code contains a defect, that is called a true positive. The analysis can give a faulty warning about a place where no real defect exists; that is called a false positive. If the analysis cannot detect a real defect in the code, it is called a false negative. A false negative can result from the analysis not being able to cover all possible combinations of variable values and execution paths. A true negative is when the analysis tool does not warn and there is no defect. [21] Preferred results are marked green and avoidable cases are marked red in Figure 5.

Figure 5. True positive, false positive, true negative, false negative.
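A true positive and a candidate false positive can be sketched with a small hypothetical C++ example (the warning behavior described in the comments is indicative, not a guarantee of what a particular analyser reports):

```cpp
// True positive: this dereference really can fault, so a warning about
// dereferencing a possibly-null pointer here is correct.
int Length( const int* p )
{
	return *p;  // real defect when a caller passes a null pointer
}

// Candidate false positive: if every caller guarantees that p is valid
// whenever 'valid' is true, a warning on *p here is a false alarm that
// the analyser cannot rule out from this function alone.
int GuardedLength( bool valid, const int* p )
{
	return valid ? *p : 0;
}
```

The second function shows why false positives occur: the invariant that makes the code safe lives outside the analysed function.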

Many different code analysis tools exist. Some tools can handle multiple programming languages, like SonarQube [19]. A tool like Cppcheck focuses on C/C++ analysis [22]. Microsoft is one of the largest software companies in the world [23], and this thesis focuses on Microsoft's code analysis tool. Microsoft has integrated a C++ and C# code analysis tool into Visual Studio. The code analysis happens during build time if the analysis option is turned on for the project. That option can be enabled with the checkbox "Enable Code Analysis on Build" in project properties (Figure 6). Another way of running code analysis is as a one-time operation from the Analyze menu.


From that menu, code analysis can be run for a single file, one project, or the entire solution (Figure 7). Visual Studio lists analysis warnings in the same Error List dialog where warnings and errors from a normal build operation are shown. Code analysis in Visual Studio also contains code metrics for managed code (C#). These metrics include the following measurements: Maintainability Index, Cyclomatic Complexity, Depth of Inheritance, Class Coupling, Lines of Source code and Lines of Executable code [24]. Code metrics can be used to identify potential risks, understand the current state of the project and track progress during development. At the time of writing this thesis, code analysis in Visual Studio does not include these metrics for native (C++) code.

Figure 6. Code analysis from project properties in Visual Studio 2019.

Figure 7. Running code analysis once in Visual Studio 2019.

Microsoft's code analysis has different rule sets, which define the rules to apply during code analysis. With larger rule sets there is a risk of more false positive warnings. Microsoft has implemented two main rule sets for native code: "Microsoft Native Minimum Rules" contains a total of 136 different warning types (see Appendix 1), and "Microsoft Native Recommended Rules" contains 221 warning types, including all warnings from the Native Minimum rule set. For managed code there are multiple different rule sets, including the security-related rule set "Security Rules". The rule set to apply can be configured separately for each project. [25]

2.4.1 Code annotation

Code annotation means that the developer adds analyser-specific markings to the code. With annotations, code analysis can also detect mismatches between developer intention and actual implementation. The annotations add information on how the code is intended to behave. By comparing annotations and actual source code, the analyser detects whether the implementation follows the annotations. Functionality can be difficult to annotate in every scenario. As an example, consider a function that should calculate the area of a rectangle based on the input parameters length and width. Instead of multiplying the parameters, the developer's implementation adds length and width together. That scenario is impossible or difficult to annotate, so the code analyser does not know or understand the issue in that function. [26]
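The rectangle example above can be written out as a short hypothetical sketch:

```cpp
// The intent (area = length * width) lives only in the function's name,
// which annotations cannot express, so the analyser has nothing to
// compare the implementation against and cannot flag the wrong operator.
int RectangleArea( int length, int width )
{
	return length + width;  // defect: should be length * width
}
```

The function is perfectly valid code for the compiler and the analyser; only a human reader, or a test case with known expected values, can notice the defect.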

Annotations can reduce the number of false positives and false negatives. With annotations, the developer can suppress a false positive result of the analyser if the developer is sure the warning is false. False negatives can be decreased by adding annotations that define the intention of the developer. If there is a conflict between the intention defined by the annotations and the actual implementation, the analysis can detect and warn about it. One example of expressing developer intention is to mark a function's parameters as input or output parameters. If the function is then called with an uninitialised input parameter, or a parameter otherwise does not adhere to the pre-state defined in the annotation, a warning is given. An input-only annotation also produces a warning if that parameter is modified inside the function. Output parameters must be initialised inside the function, and those parameters must contain a valid post-state. [27]
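Input and output parameter annotations can be sketched with Microsoft's SAL macros as follows (a hypothetical example; the fallback macro definitions are only there so the sketch compiles outside MSVC, where the real definitions come from Microsoft's headers):

```cpp
#ifdef _MSC_VER
#include <sal.h>         // Microsoft's source code annotation language
#else                    // fallback so the sketch compiles elsewhere
#define _In_
#define _Out_
#endif

// _In_ promises that the caller initialises 'celsius' and that the
// function does not modify it; _Out_ promises that the function
// initialises '*fahrenheit' to a valid value before returning.
void ToFahrenheit( _In_ const double celsius, _Out_ double* fahrenheit )
{
	*fahrenheit = celsius * 9.0 / 5.0 + 32.0;
}
```

With these annotations the analyser can warn, for example, if a caller passes an uninitialised value for celsius, or if some execution path leaves *fahrenheit uninitialised.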


Locking behavior (such as a mutex, critical section, or semaphore) can be described to code analysis with annotations. The behavior of a lock can otherwise be difficult for analysers to deduce from source code alone. A function's post-state can be to increment or decrease the lock count of a lock object. A variable can be annotated to require the lock before the variable can be accessed. Code analysis will warn if it detects an execution path that can lead to accessing the variable without acquiring the lock that guards it. [27]
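A guarded-variable annotation might look like the following hypothetical sketch (the empty fallback macro exists only so the sketch compiles outside MSVC; the real definition ships in Microsoft's concurrency annotation headers):

```cpp
#include <mutex>

#ifndef _Guarded_by_     // fallback for compilers without Microsoft's headers
#define _Guarded_by_( lock )
#endif

// The annotation tells the analyser that 'count' may only be accessed
// while 'lock' is held; an access path without the lock is then reported.
class SafeCounter
{
public:
	int Increment()
	{
		std::lock_guard< std::mutex > guard( lock );  // acquire before access
		return ++count;
	}

private:
	std::mutex lock;
	_Guarded_by_( lock ) int count = 0;
};
```

If a new method later touched count without taking the lock, the annotation would let the analyser flag that path even though the code compiles cleanly.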

Annotations in header files give additional information also in cases where the actual implementation is not available, for example annotations on an interface, proxy or facade pattern where the analyser does not know the source code of the implementations. In these cases annotations help the analyser to check whether the usage of those interfaces is according to the annotations. [27]

Readability can be improved with annotations. In Program 1, the function memcpy takes three parameters, and the behavior of the function is deducible to humans based on the naming of the parameters: "Copies count bytes of src to dest." That is also stated in the documentation of memcpy. The exceptions and the limitations cannot be deduced from the signature of memcpy; they must be checked from the documentation: [27]

"If the source and destination overlap, the behavior of memcpy is undefined. Use memmove to handle overlapping regions. Security Note: sure that the des- tination buffer is the same size or larger than the source buffer. For more infor- mation, see Avoiding Buffer Overruns." [27]

void * memcpy( void *dest, const void *src, size_t count );

Program 1. Function without annotation.

Different code analysers have their own annotations. This thesis uses Microsoft's solution. Microsoft has a source code annotation language (SAL), which is used in Program 2. As the code analyser cannot read documentation, adding annotations improves the analyser's knowledge of what the function's implementation is doing. Adding annotations to functions also improves readability for humans, as additional documentation is then not needed to know the details of parameter requirements and the behavior inside the function. Annotations are added to the memcpy function in Program 2.

void * memcpy(
    _Out_writes_bytes_all_(count) void *dest,
    _In_reads_bytes_(count) const void *src,
    size_t count );

Program 2. Function with annotations.

With annotations in the memcpy function, the code analyser can warn if the implementation of the function accidentally differs from the annotations. The annotations provide more information about how the parameters are handled inside the function, which also forces calls to the function to be correct. [27]

2.4.2 Compiler warnings

Compilers can give warnings, but compiler warnings do not prevent the code from compiling. Compiler warnings usually indicate questionable code. Compiler errors prevent the compilation from completing; errors usually violate the rules of the programming language. Like code analysis warnings, compiler warnings can indicate issues that are difficult to find by testing. Even though compiler warnings do not prevent compilation, the warnings should be resolved as soon as possible. Therefore, compilers can be configured to fail compilation if any warning is found. This configuration forces compiler warnings to be fixed. [28]

Fixing issues as early as possible is a good mindset for a software developer. Any tooling that helps to find issues early should be used, including compiler warnings and code analysis. The warning level of the compiler is configurable, and the way to define it depends on the compiler. Setting the strictest warning level is the recommended action, as it helps to identify potential issues. [29]
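As a hypothetical example of the kind of issue a strict warning level surfaces, comparing a signed loop index against an unsigned container size typically triggers a signed/unsigned mismatch warning at a strict level (for example MSVC's /W4); the corrected index type compiles cleanly:

```cpp
#include <cstddef>
#include <vector>

// At a strict warning level a signed loop index compared against the
// unsigned values.size() produces a mismatch warning; using std::size_t
// for the index keeps the build warning-free.
int SumAll( const std::vector< int >& values )
{
	int sum = 0;
	for( std::size_t i = 0; i < values.size(); ++i )  // was: int i
		sum += values[ i ];
	return sum;
}
```

The behavior of the loop is unchanged; the fix only removes the implicit signed/unsigned conversion that the warning points at.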


3. CASE M-FILES

This chapter first contains a brief introduction to the company and its product, and after that a description of the current status of code analysis in the company and the company's goals for code analysis utilisation.

3.1 M-Files company

M-Files Oy is a technology company that develops and sells intelligent information management solutions. The company is based in Tampere, Finland, where the majority of product development is done. Other office locations in Finland are in Espoo and Lappeenranta. Worldwide office locations are in the United States, United Kingdom, Sweden, Germany, France, Canada and Australia. M-Files has over 5,000 customers in over 100 countries. M-Files employed approximately 550 personnel at the end of 2019. The company was founded in 1987, and the development of the current information management solution started in the early 2000s. [30]

3.2 M-Files product

The M-Files product is intelligent information management software for enterprises. The product helps to find the correct information based on what it is, not where it is. This is achieved by organising information based on metadata instead of the traditional folder structure. Information stored in M-Files can be anything, like documents, projects, processes, customers, contracts, and contact persons. M-Files can be flexibly configured to store any information, and it can also store relationships between pieces of information so that all related information is easy to see. Modification history is stored to enable rolling back to previous versions, seeing what modifications have been made, and comparing different versions. M-Files connects information from different sources like network folders, SharePoint, Salesforce, databases and other information stores. All this information can be accessed through a Windows desktop application, mobile applications and a web browser. This is achieved with M-Files' intelligent metadata layer, which connects information from these different locations to M-Files. The intelligent metadata layer can also automatically classify documents and make metadata suggestions. [30]

The technology research firm Nucleus Research released a value matrix for enterprise content management (ECM) in April 2019. M-Files is recognised as a "Leader" in this matrix. Vendors in the matrix (Figure 8) are positioned based on usability and functionality, and M-Files is rated the best in both categories. [31]

Figure 8. ECM Technology Value Matrix 2019 - Nucleus Research [31].

M-Files' architecture follows the client-server model, the server being the center of operations. All information creation, reading, modification and deletion happen through M-Files Server. These operations can be made with M-Files Desktop, M-Files Mobile, M-Files Web or M-Files API. With M-Files Admin, users can modify the structure of the data stored in M-Files and perform other administrative operations, like creating new connections to external information sources. A simplified overview of the different components is presented in Figure 9. [30]

Figure 9. Overview of components in M-Files product.

3.3 Current status of code analysis

At the beginning of this thesis, an internal meeting with experts was held about the current status of code analysis within the company and how it could be improved. The company has had a couple of small projects where the analysis was run previously, but unfortunately code analysis was later turned off for those projects. The analysis was done using Visual Studio's own analysis tools and was executed every time a build was made of those projects. The analysis was turned off because it increases the build time of those projects, which was not acceptable for overall productivity. Finding an acceptable mechanism to automatically run code analysis was therefore included in this thesis. Taking code analysis into use for some major projects within the product was decided to be the main task.

The M-Files product consists of multiple different modules, libraries and executables, in other words projects. Each project can have different compiler settings, including the option to run code analysis for that project. The company has not previously analysed code analysis results from the M-Files Server project. As shown in chapter 3.2, the M-Files Server is a central part of the product's operation and therefore it must work flawlessly. For those reasons, it was decided to include the M-Files Server project in the code analysis runs. As the M-Files Server is a big portion of the entire codebase, it could give some indication of how many errors there might be in other projects. Other components in the product have different characteristics, so different kinds of issues might arise in them. The majority of the M-Files Server project is written in C++. The product contains several different technologies, but within the company it was decided to concentrate on the C++ areas of the product.

Microsoft is one of the largest software companies in the world [23]. M-Files already uses many products from Microsoft, and Microsoft's Visual Studio contains a built-in code analysis tool for C++; therefore it was an obvious choice for code analysis within the company.

3.4 Goals for analysis

In the initial phase of this thesis, ad-hoc questionnaires about code analysis were conducted, mainly about what kinds of errors developers would like code analysis tools to find. The developers who responded had different levels of knowledge of code analysis; however, all were very experienced software developers. The developers responded that they would hope code analysis to find the following issues:

• Concurrency: checks lock usage for guarded variables
• Function return values are used, not discarded by accident
• Usage of function parameters: input parameters must be used within a function but not changed; output parameters must be initialised within the function.

The interest of the company was to get a good balance between the effort spent and the severity of the found issues. There was also interest in finding a balance between false positive and true positive results, so that developers' time is not wasted on investigating false positive results instead of fixing real issues.


4. RESOLVING ANALYSIS WARNINGS

This chapter contains the case study of utilising code analysis for the existing codebase. The chapter describes the process of integrating code analysis as part of the development process, fixing the existing issues, and developer training for the new process.

4.1 CI pipeline in GitLab

The company uses GitLab as its source code management system. GitLab's continuous integration (CI) is also utilised within the company. Continuous integration is the practice of integrating code into a shared repository and building and testing each change automatically, as early as possible. A CI pipeline is used to automate the integration steps for new code. Automated steps reduce the manual work needed to verify the quality of new changes. The pipeline automatically provides a fast feedback loop to developers about errors in their implementation, so developers can quickly fix issues found by the automation. [32]

Any new development happens in a new version control branch. Once the developer is ready with the new implementation, a merge request can be made from the branch. The company has a manual trigger to start the pipeline for merge requests. Running the CI pipeline is mandatory before the merge request can be merged to the main code branch. The company's existing CI pipeline contains jobs to execute unit tests, run integration tests, build all projects, and produce installation packages of the product. Running code analysis in the CI pipeline fits the purpose of the pipeline. Details and preferences for setting up a CI pipeline might differ from company to company, as there are different needs and environments to run the analysis.

The company has had code analysis turned on for a couple of smaller projects. Those were later turned off by default to improve productivity, because build time with code analysis is double the build time without it.


Different alternatives were investigated for how and where to run code analysis. To get a fast feedback loop from code analysis results, it would be best to run it already on the developer's machine during implementation. The downside of that is always having slow build times. The other extreme would be to run code analysis once a year and fix the analysis issues then. The company makes releases once a month, so this would let vulnerabilities into the released product. A more acceptable time period to run the analysis would be once a month, before a release, or once a day. This option was agreed to cause too long a feedback loop for the developers. In addition to the long feedback loop, that solution would have needed some mechanism to assign a developer to fix the issues found by the analysis, to keep the codebase clean of analysis warnings. Based on these deficiencies in the other solutions, it was decided to implement a CI pipeline step that runs the analysis in each developer branch before it is merged to the main code branch. This forces developers to fix the issues but does not unnecessarily increase build time elsewhere. To get an even faster feedback loop from the analysis results, developers must be able to run code analysis already before the CI pipeline does.

The first step of getting code analysis into the CI pipeline was to set it up as a separate optional step. This allows every developer to run code analysis for their changes. The optional step also allows testing different ways to run the analysis in the CI environment and measuring the duration of the analysis in that environment. This step was configured already before all analysis warnings were fixed. To allow adding more projects into code analysis later, the setup contains a solution file that collects the projects for analysis. That solution file is then built in the pipeline job to get the analysis done. Building the M-Files Server project with code analysis takes 35-60 minutes on the CI machines, depending on the workload.
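An optional, manually triggered analysis job of this kind might be sketched in GitLab's pipeline configuration roughly as follows (a hypothetical fragment; the job name, stage, solution file name and build command are illustrative assumptions, not the company's actual configuration):

```yaml
# Sketch of an optional code analysis job in .gitlab-ci.yml.
code-analysis:
  stage: test
  when: manual            # optional step: triggered on demand per branch
  script:
    - msbuild CodeAnalysis.sln /m /p:RunCodeAnalysis=true
```

Marking the job manual keeps the default pipeline fast while still letting any developer run the analysis for their branch.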

Project files are configured not to run code analysis by default; this way the build time is much faster, as running code analysis roughly doubles the build time. The option not to run code analysis, which is set inside the project file, is overridden by a command line parameter called RunCodeAnalysis when the pipeline job builds the solution. Project files must turn on the option CodeAnalysisTreatWarningsAsErrors to treat code analysis warnings as errors, which causes the build to fail if any analysis warnings are found. This way the CI pipeline job will also fail and prevent the merge to the main code branch. The rule set used in the analysis is defined in the project file, so each project can have an independent level of rules to apply.
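The project-file properties described above might look roughly like the following fragment (a sketch; the exact property placement and rule set file name can differ per project):

```xml
<!-- Sketch: analysis is off by default; the pipeline overrides
     RunCodeAnalysis from the command line, and analysis warnings
     are treated as errors so that the pipeline job fails on them. -->
<PropertyGroup>
  <RunCodeAnalysis>false</RunCodeAnalysis>
  <CodeAnalysisTreatWarningsAsErrors>true</CodeAnalysisTreatWarningsAsErrors>
  <CodeAnalysisRuleSet>NativeMinimumRules.ruleset</CodeAnalysisRuleSet>
</PropertyGroup>
```

Keeping the rule set reference in the project file is what allows each project to apply an independent level of rules.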

4.2 Investigate analysis warnings

To investigate code analysis warnings, the analysis can be run locally from Visual Studio. Code analysis can be a one-time operation through the Analyze menu, or it can be turned on for every build operation through the project properties dialog. When enabled, Visual Studio outputs analysis warnings to the output log of the build and to the Error List dialog, both of which also contain warnings and errors from the actual build phase.

As Microsoft has implemented different rule set levels, they were investigated by running different rule sets against the company's codebase. The Native Minimum rule set focuses on the most critical issues in native code (C++), including security issues and application crashes. Therefore, all issues reported when analysing the codebase with this rule set should be addressed. To get familiar with the kinds of warning types Microsoft's code analyser tries to find from source code, here are example rules from the Native Minimum rule set: Using Uninitialized Memory, Dereferencing Null Pointer, Missing String Argument To Format Function, Returning Uninitialized Memory, Index Exceeds Buffer Maximum, Write Overrun, Invalid Parameter Value, Buffer Size Exceeds Array Size, Unexpected Annotation Expression Error, Illegal Reference To Non-Static Member, and many more. The full content of the Native Minimum rule set is in Appendix 1. As the Native Minimum rule set already finds over 3,000 warnings from the M-Files Server project, it was decided to use that rule set level as the target within this thesis. The Native Minimum rule set content fulfills the initial code analysis utilisation for the company's codebase. An increased rule set level would also increase the possibility of false positive results. [33]


Because the M-Files Server project, MFServer, is large, executing a full build and analysis using Visual Studio's build tool, MSBuild, is very time-consuming. The company uses the IncrediBuild tool to distribute compilation work to idle machines within the company [34]. That tool significantly reduces build times. Table 1 lists the build times for the debug version of the MFServer project with and without code analysis when using Visual Studio's MSBuild and the IncrediBuild tool. These build times were measured in the developer environment. IncrediBuild can reduce build times by more than half compared to MSBuild.

Table 1. Build time with and without code analysis.

              Without code analysis   With code analysis
MSBuild       28-29 min               74-75 min
IncrediBuild  13-15 min               29-32 min

The different rule set levels do not change the build times. Unfortunately, version 9.4.4 of IncrediBuild does not output build and analysis warnings and errors to Visual Studio's Error List dialog like MSBuild does. From the Error List dialog it would have been easy to transfer the data to Microsoft Excel. IncrediBuild has its own build output log where errors and warnings are listed. To get the code analysis results into Excel, a script was made to migrate the IncrediBuild output to CSV format, which can then be imported into Excel. In Excel, the data can be further analysed, manipulated, sorted and grouped, for example based on the warning code.

Running the code analysis tool against the company's codebase reported a total of 3,064 warnings. 3,016 warnings (98.4 %) were of warning type C6284, and the rest of the warnings were spread over 10 different warning types. Code analysis warning C6284 was reported for erroneous cases where an object was passed as a parameter when a string is required. Warning type C6328 has a different number of warnings depending on the build target architecture. Table 2 contains the number of occurrences (21) from the 64-bit build; when building for the 32-bit architecture, the company's codebase had 5 occurrences. That warning indicates a size mismatch between the parameter given and the parameter required by the called function. The sizes of variables might differ between architectures. [33] Table 2 lists the number of found warnings by type.

Table 2. The number of warnings by warning type.

Warning type                                                        Number of warnings
C28182: Dereferencing NULL pointer. The pointer contains
        the same NULL value as another pointer did.                 1
C6011: Dereferencing Null Pointer                                   1
C6063: Missing String Argument To Format Function                   3
C6064: Missing Integer Argument To Format Function                  2
C6067: Missing String Pointer Argument To Format Function           1
C6271: Extra Argument To Format Function                            7
C6273: Non-Integer Argument To Format Function                      4
C6284: Invalid Object Argument To Format Function                   3,016
C6328: Potential Argument Type Mismatch                             21
C6386: Write Overrun                                                5
C6387: Invalid Parameter Value                                      3

The code example in Program 3 is used to introduce some terms used later in this thesis. The code example uses the printf function to compose and print a string that contains values from local variables. The code does not produce warnings with the used code analysis rule set.

const int i = 1, j = 5;
printf( "Print integers %d and %d", i, j );

Program 3. Formatting function example.

Format functions (like printf, fprintf, sprintf, or CString::Format) take a format string as the first parameter. The format string contains format specifiers that are marked with the symbol % followed by a specifier type (d in this code example). The format specifiers are filled with the given additional parameters (i and j in this code example). [35]

The headings below describe all found warnings and categorise them into the categories used in this thesis. The categories were created based on the found issues.

Invalid format parameter count As additional parameters and format string are two separate things, this exposes situation where parameter count, that is given to format function, might differ from format specifier count found from the format string. Different warnings in this cat- egory contain additional information about missing format type or whether there is an extra parameter. Behavior with this kind of warnings was that all specifiers in the format string were not filled with parameters or some of the parameters are not used at all. Both cases upset the intention of the programmer. The following warnings were included in this category:

• C6063: Missing String Argument To Format Function
• C6064: Missing Integer Argument To Format Function
• C6271: Extra Argument To Format Function.
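The mismatch described above can be sketched in plain C++. Plain snprintf stands in here for the company's CString::Format, and the function name is invented for illustration:

```cpp
#include <cstdio>
#include <string>

// Builds a message the safe way: the argument count matches the format
// specifier count. The commented-out calls illustrate the mismatches that
// C6064 (missing argument) and C6271 (extra argument) would flag:
//   std::snprintf( buffer, sizeof buffer, "Values %d and %d", first );        // C6064
//   std::snprintf( buffer, sizeof buffer, "Values %d", first, second );       // C6271
std::string formatValues( int first, int second )
{
    char buffer[64];
    std::snprintf( buffer, sizeof buffer, "Values %d and %d", first, second );
    return buffer;
}
```

The faulty variants still compile, which is why a static analyser is needed to catch them.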

Invalid format parameter type

This is another category of format function parameter incompatibility with the format string. In this category the count of the parameters matches the count of format specifiers in the format string, but the types of the parameters and specifiers differ. The different warnings contain additional information about the found specifiers and parameters. These errors can produce an incorrect output string or cause a program fault and termination. Warnings of these types could also have come from functions other than format functions; in this thesis they were all from format function calls.

The code analyser indicated 3,016 instances of warning type C6284. Most of those were cases where a CString object was passed as a format function parameter when a string was required. A few instances of CComBSTR objects passed as parameters were also found; the warning type was the same for both cases. The company has had a habit of composing strings with the CString::Format function and giving CString objects as parameters, which is why so many instances of this warning type were found in the analysis. Passing CString as a parameter is

undefined behavior. As this use case is common, the compiler developers have implemented CString in a way that works as expected in this faulty case [36]. This still allows the compilers to change the behavior at any time, which is why it was valuable to fix these analysis warnings. In Program 4 a CString object is given as a parameter to the CString::Format function, and therefore code analysis produces warning C6284.

CString szErrorCode = GetLatestErrorCode();
CString szErrorMessage;
szErrorMessage.Format( L"Error code is %s.", szErrorCode );

Program 4. Format function with warning C6284.

The following warnings were included in this category:

• C6067: Missing String Pointer Argument To Format Function
• C6273: Non-Integer Argument To Format Function
• C6284: Invalid Object Argument To Format Function
• C6328: Potential Argument Type Mismatch.

Null pointer dereference

Dereferencing a pointer means accessing the data it points to. Dereferencing a null pointer therefore usually leads to a runtime error or a crash, as the pointer does not point anywhere. Code analysis tools cannot always trace the origins of pointers, which can make these warnings false positives. In the case of a false positive, annotations offer multiple ways to tell the analyser which pointers are valid and which are not. If the pointer can be null, the developer should check for null before dereferencing. The C28182 warning additionally reports that some other pointer might contain the same null value. The following warnings were included in this category:

• C6011: Dereferencing Null Pointer
• C28182: Dereferencing NULL pointer. The pointer contains the same NULL value as another pointer did.
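The fix the analyser asks for can be sketched in a few lines (the function name and fallback parameter are invented for illustration):

```cpp
#include <cstddef>

// Returns the pointed-to value, or a caller-supplied fallback when the
// pointer is null. Checking before dereferencing removes the execution
// path that C6011/C28182 warn about.
int valueOrDefault( const int* ptr, int fallback )
{
    if ( ptr == nullptr )  // guard: the null path now has defined behavior
        return fallback;
    return *ptr;           // safe: ptr is known to be non-null here
}
```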

Buffer overflow

Buffer overflow means that the program writes more data than the size of the buffer. The term buffer overrun is used when the program reads data outside of the

specified buffer. Both are errors in the program and should be corrected. Buffer overflows can be exploited to corrupt data in memory, gain access to memory, or crash the program. The warning also displays the size of the buffer and how many bytes may be written to it. The rule set contains a separate warning for read overrun (C6385). The following warning was included in this category:

• C6386: Write Overrun.
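A minimal sketch of a write overrun and its fix (the function is hypothetical):

```cpp
// Faulty version (a C6386-style write overrun): the loop bound is off by
// one, so the last iteration writes past the end of the buffer.
//   int buffer[4];
//   for ( int i = 0; i <= 4; ++i ) buffer[i] = i;   // writes buffer[4]
//
// Fixed version: the loop bound matches the buffer size.
void fillBuffer( int ( &buffer )[4] )
{
    for ( int i = 0; i < 4; ++i )
        buffer[i] = i;
}
```

The faulty loop works most of the time in practice because the extra write often lands in padding or another variable, which is exactly why static analysis is useful here.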

Function specification violation

As functions can have annotations to indicate their intended usage, this category contains warnings about usage that violates those annotations. The program might work as expected in most cases, but some input might end up violating the annotation of the function. Because code analysis checks all kinds of execution paths, these cases should be fixed to ensure the expected functionality in all cases. The following warning was included in this category:

• C6387: Invalid Parameter Value.

4.3 Fix warnings - Case M-Files

Before this thesis was made, the company had a few existing compiler warnings when building the MFServer project. The warning level for the compiler was, and still is, level 4 [28]. Before fixing code analysis warnings, those compiler warnings were fixed. Two of the warning types came from external code that is included in the MFServer project.

From MFServer:

• C4267: The compiler detected a conversion from size_t to a smaller type.
• C4018: Comparing a signed and unsigned number required the compiler to convert the signed value to unsigned.
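Both MFServer warning types can be reproduced in a short sketch. The function is hypothetical; the fix, as in MFServer, is changing the used variable type:

```cpp
#include <cstddef>
#include <vector>

// C4267-style: assigning vector.size() (size_t) to an int narrows on 64-bit.
// C4018-style: comparing a signed int index against an unsigned size.
// Both disappear when the index and the counter use size_t.
std::size_t countPositives( const std::vector<int>& values )
{
    std::size_t count = 0;                              // size_t, not int (avoids C4267)
    for ( std::size_t i = 0; i < values.size(); ++i )   // same signedness (avoids C4018)
    {
        if ( values[i] > 0 )
            ++count;
    }
    return count;
}
```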

From external code:

• C4715: The specified function can potentially not return a value.
• C4800: Implicit conversion from 'type' to bool. Possible information loss.


To get rid of the warnings from external code, those warning types can be disabled in the places where the external code is included. This way the warnings still apply to the company's own codebase, but faulty external code does not cause warnings in the build process. [28] These warnings were reported to the parties that provide that code.
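The mechanism for this is MSVC's #pragma warning directive. A minimal sketch, assuming a hypothetical external header; on other compilers the unknown pragmas are ignored:

```cpp
// Suppress specific warning types only around the external include, then
// restore them so the company's own code is still fully checked.
// 4715 and 4800 are the external-code warnings named above.
#pragma warning( push )
#pragma warning( disable : 4715 4800 )
// #include "external_library.h"   // hypothetical external header
#pragma warning( pop )

// Code after the pop is compiled with the full warning set again.
int ownCode() { return 1; }
```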

Compiler warnings found in the MFServer code were fixed according to the instructions on the documentation page for each warning type. In this case fixing was trivial, as the warnings were solved by changing the used variable type. After fixing those warnings, it was possible to turn on Visual Studio's project file option to treat compiler warnings as errors. The option was turned on for every combination of build configuration, including 32- and 64-bit platforms and debug and release versions. This option prevents compilation from completing if there are any warnings, so each developer must resolve any new compiler warning before they can continue development. This means the codebase will remain free from compiler warnings. [28]

After having a warning-free compilation, the next phase was to fix the warnings from code analysis. Microsoft's Visual Studio documentation contains detailed information about each code analysis warning type, including the reasons for the warning's existence, and many of the documentation pages also contain examples of faulty code and how it should be corrected. Fixing the warnings was done in multiple incremental phases. Similar warnings were fixed together, so the peer developer who reviewed the fixes could concentrate on one kind of issue and validate the proposed fix for it. While fixing these warnings, the time spent was measured.

Fixes were done in small batches for a couple of reasons: mainly to keep the merges a tolerable size to review, and also to be able to refer easily to the specific merge request that fixes one kind of issue. These references to the merge requests were collected into a knowledge bank, which developers can later use when they face these code analysis warnings in their changes. Due to the coding style within the company, warning types similar to the ones fixed during this process are likely to appear again. That is one more reason to collect the knowledge bank: to aid developers by providing already solved solutions.


The knowledge bank is written in GitLab's Markdown syntax, which is light enough for everyone to improve, fits well into the company's existing documentation format, and is easy to read in the GitLab web UI. [37] Changes were also made to the style guide documentation to spread the information to new employees and to avoid common issues already when writing code.

Following these practices is verified in the code analysis step of the CI pipeline. Of course, new kinds of warning types can arise. If the fix for a new warning is not obvious, Microsoft's documentation for that warning type should be consulted to find the correct solution. It is good practice to add a reference to the knowledge bank when a new kind of issue is faced, to keep the knowledge bank updated.

The first phase fixed all warnings other than C6284, which was initially disabled for the entire codebase. Disabling that warning made the code analysis result warning-free, and at that point it was possible to set up code analysis to run in the CI pipeline. Once the CI pipeline monitored the rest of the warnings, it was time to concentrate on fixing the 3,016 warnings for C6284. The faulty example code for this warning type was shown in chapter 4.2 (Invalid format parameter type). It can be corrected by changing the parameter given to the format function in either of the ways shown in Program 5.

szErrorMessage.Format( L"Error code is %s.", szErrorCode.GetString() );

szErrorMessage.Format( L"Error code is %s.", ( LPCTSTR )szErrorCode );

Program 5. Format function with the fix for warning C6284.

The fix for this warning type is to use either CString's GetString function or the LPCTSTR operator to return a pointer to the character string that the CString object holds [33]. Fixing that many instances was done in several smaller batches, so that the changes were tolerable for the peer developer to review. As the company has had a habit of composing strings in that way, developers within the company needed to be informed about the change of coding style for this case. Before setting this warning type to be monitored in the CI pipeline, the developers were

informed about this change. The developers were then aware of the upcoming monitoring of this warning type and were therefore able to make the needed changes in their ongoing development branches. This kept the main code branch warning-free during this utilisation.

4.4 Mandatory code analysis job

Once all code analysis warnings were fixed or suppressed, it was possible to make code analysis a mandatory step in the CI pipeline for all development branches before they can be merged to the main code branch. Code analysis was made a parallel job to the integration tests. This decision was made because the company's integration tests take longer than the code analysis, so code analysis does not add any time to running the entire pipeline. As a new job in the pipeline, other developers needed to be informed about it, so that they were aware of errors they might face with this job. Instructions collected during the fixing of existing analysis warnings were shared with all developers, including the knowledge bank collected during the fixing process. The markdown file for the knowledge bank also contains instructions for adding new projects to the code analysis runs in the pipeline.

The issues fixed during this process contained one warning type (C6328) that produced different instances depending on whether the 32-bit or 64-bit build was analysed. Due to these differences, both architecture versions had to be run in the CI pipeline. To cover this, a second job was created; the difference between the jobs is a parameter in the build command for the target platform architecture. Running two jobs increases the needed build resources. Figure 10 shows these two parallel jobs running in the CI pipeline. The rest of the pipeline jobs are for other parts of the company's integration process.

Figure 10. Code analysis jobs running in CI pipeline.

Through the GitLab API it is possible to fetch failed jobs. A script was made to get failed code analysis jobs, making monitoring easier than through GitLab's user

interface. Based on this monitoring it was possible to see what kinds of errors the analysis finds in the developers' branches. During the first 6 months of code analysis usage, all found warnings have been of types that were already found during the fixing process. The majority of these were fixable based on knowledge bank references.

4.5 Fixing details, time and severity

Found issues were collected into categories (Invalid format parameter count, Invalid format parameter type, Null pointer dereference, Buffer overflow, and Function specification violation) based on their type. This chapter categorises the issues based on their severity, grouped into three levels: critical, high, and low. Critical means that the issue is considered a potential security vulnerability that could have been used to expose unwanted data. The high category contains issues that could have led to unwanted behavior, such as an unexpected error, without exposing sensitive data. Low category issues are mainly cosmetic or otherwise do not affect real-world usage of the product.

When code analysis gives a warning, the code can still work as expected with some combinations of inputs. Because code analysis tries to check every possible combination of the application's execution paths, some corner case could cause issues, and therefore the analysis raises a warning. While implementing the software, the developer verifies the functionality of the new code with a limited number of different inputs, and the code appears to work as expected. Fixing analysis warnings improves the corner cases where the program would otherwise crash or cause some other unwanted behavior.

Hours spent was one of the metrics used in this thesis. The hours spent on fixing issues do not include learning general knowledge of how to run code analysis, warning collection, warning categorisation, or the general initial investigation and analysis of warnings found in the codebase. The hours also do not include running the company's mandatory CI pipeline for these fixes or other developers' effort in reviewing the changes. The hours consist of fixing time and the warning type investigations needed to resolve warnings. This is the work that could have been avoided if these warnings had been fixed when they initially occurred. Fixing them at that time

would have been possible if code analysis had been run from the beginning of the project. Table 3 summarises the number of warnings in each category.

Table 3. The number of warnings by categories.

Category Number of warnings

Invalid format parameter count 12

Invalid format parameter type 3,042

Null pointer dereference 2

Buffer overflow 5

Function specification violation 3

Invalid format parameter count

The following warnings were included in this category:

• C6063: Missing String Argument To Format Function
• C6064: Missing Integer Argument To Format Function
• C6271: Extra Argument To Format Function.

The missing arguments to format functions were due to issues in one macro that hid one parameter when forwarding a variable number of arguments to the next function. That one macro caused a total of five warnings of types C6063 and C6064. The seven instances of extra arguments (C6271) occurred either when the developer accidentally left the needed format specifier out of the format string or when actual extra arguments were left over from refactoring of the format string. All these issues caused some additional information to be missing from debug or log messages. The severity of all of these was low, as their extent was limited to missing additional information that would help in figuring out erroneous situations from debug or log messages. Fixing the issues in this category took 3.5 hours.
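A macro of the kind described above can be sketched as follows. The macro and helper are a hypothetical reconstruction, not the company's actual code:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Variadic helper that formats into a string (stands in for the real
// logging function).
std::string logImpl( const char* fmt, ... )
{
    char buffer[64];
    va_list args;
    va_start( args, fmt );
    std::vsnprintf( buffer, sizeof buffer, fmt, args );
    va_end( args );
    return buffer;
}

// Faulty macro of the kind described above: it consumes the first variadic
// argument itself, so one format specifier is left without a value
// (C6063/C6064 at every call site):
//   #define LOG( fmt, first, ... ) logImpl( fmt, __VA_ARGS__ )
//
// Fixed: every argument is forwarded unchanged.
#define LOG( fmt, ... ) logImpl( fmt, __VA_ARGS__ )
```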


Invalid format parameter type

The following warnings were included in this category:

• C6067: Missing String Pointer Argument To Format Function
• C6273: Non-Integer Argument To Format Function
• C6284: Invalid Object Argument To Format Function
• C6328: Potential Argument Type Mismatch.

Most of the issues in this category (3,016 of the total of 3,042) were cases where an object is passed as a parameter when a string is required, warning type C6284. The large number of warnings of this type was due to the widely used habit of composing strings using the CString::Format function with CString objects as parameters. Those instances were fixed, and developers within the company were informed about this change so they can act accordingly hereafter. The C6284 case has previously worked as expected because the used compiler implements CString in a way that works even when a CString object is passed as a parameter where a string is required [36]. But as the compiler's implementation might change over time, that could cause unknown scenarios.

Warning types C6067, C6273 and C6328 caused a total of 26 warnings. These include issues like passing an enumeration as a parameter when the related format specifier expects a string, passing WinAPI's FILETIME as a parameter when the format specifier is for a 64-bit integer, or a similar format function parameter mismatch with the format specifier. As FILETIME and a 64-bit integer are binary compatible, no real issue exists in that case, but it is good to define the conversion explicitly before passing FILETIME as a format function parameter. Other instances of these warnings could have caused obscure error and log messages where there is binary incompatibility between the passed and expected parameter types, or an integer overflow could happen.
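The explicit FILETIME conversion can be sketched as below. A portable stand-in struct is used here so the snippet compiles anywhere; on Windows the real FILETIME from <windows.h> would be used, and the helper name is invented for illustration:

```cpp
#include <cstdint>

// Portable stand-in for the WinAPI type discussed above.
struct FILETIME
{
    std::uint32_t dwLowDateTime;
    std::uint32_t dwHighDateTime;
};

// Explicit conversion of FILETIME to a 64-bit integer before passing it
// to a format function, instead of relying on binary compatibility.
std::uint64_t fileTimeToUInt64( const FILETIME& ft )
{
    return ( static_cast<std::uint64_t>( ft.dwHighDateTime ) << 32 )
           | ft.dwLowDateTime;
}

// Usage (sketch): swprintf( buf, n, L"Timestamp: %llu", fileTimeToUInt64( ft ) );
```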

All of the warnings in this category were considered as low severity. The total time spent on fixing these warnings was 53 hours.


Null pointer dereference

The following warnings were included in this category:

• C6011: Dereferencing Null Pointer
• C28182: Dereferencing Null pointer. The pointer contains the same Null value as another pointer did.

Dereferencing a null pointer is undefined behavior by the C++ standard. This means that the program might crash, throw an exception, or do something else that the compiler developers have decided, and that behavior can change when the compiler version is updated. [38] For these reasons it was valuable to implement the program so that dereferencing a null pointer cannot happen. This is done by verifying that the pointer is valid before dereferencing it. To catch situations where the pointer is null, debug asserts were added, giving developers the option to debug such cases. The difference between the two warning types in this category is just a bit more information in warning type C28182: that the pointer contains the same null value as another pointer did.
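The combined fix pattern (debug assert plus runtime check) can be sketched as follows; the standard assert macro is used here as a stand-in for the MFC-style ASSERT, and the function is invented for illustration:

```cpp
#include <cassert>

// A debug assert catches the null case during development, while the
// runtime check gives release builds a defined path instead of the
// undefined behavior of a null dereference.
int readValue( const int* ptr )
{
    assert( ptr != nullptr );  // debug builds stop here for investigation
    if ( ptr == nullptr )      // release builds take a defined fallback path
        return 0;
    return *ptr;
}
```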

As dereferencing a null pointer is undefined behavior and can cause the program to crash, these issues were categorised as critical severity. The fixing time for these two instances was 1 hour.

Buffer overflow

The following warning was included in this category:

• C6386: Write Overrun.

All five warnings of this type came from external code. Four are from Microsoft's ATL (Active Template Library), a library that provides COM (Component Object Model) objects. One instance is from another external library that is used for search-related functionalities within the M-Files Server. As these are not from the codebase that the company maintains, the only option is to disable this warning in the places where these external libraries are included in the project. Disabling the warning type only at those places keeps it able to warn about the company's own codebase. Reporting the found issues to the parties that maintain those codebases is also recommended, so that they are aware of the scenario.


The severity of the buffer overflow issues was considered critical, as such issues expose serious vulnerabilities. The time to disable these external code warnings was 1 hour.

Function specification violation

The following warning was included in this category:

• C6387: Invalid Parameter Value.

Microsoft libraries contain the functions WaitForSingleObject and CloseHandle, whose parameters are annotated as input parameters (Microsoft annotation language: _In_). That annotation does not allow a null value for the parameter. Generally these instances are corner-case situations: the code works properly in many cases (the input parameter contains a valid value), but as code analysis checks all possible combinations of execution paths, the parameter can be null when it is passed to those functions. Null checks before calling these functions solve these cases.
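The guard pattern can be sketched as follows. std::fclose is used here as a portable stand-in for CloseHandle (closing a null handle is likewise undefined), and the wrapper name is invented for illustration:

```cpp
#include <cstdio>

// Check the handle for null before calling a function whose parameter is
// annotated as a non-null input. The guard removes the execution path
// that C6387 warns about.
bool closeIfOpen( std::FILE* handle )
{
    if ( handle == nullptr )
        return false;
    return std::fclose( handle ) == 0;
}
```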

One instance of this warning type came from a situation where CString's operator += could receive a zero as a parameter, which does not adhere to the annotation of the function parameter. A zero parameter value could occur in some corner cases if previous code execution happened erroneously.

These warnings arise because parameters could contain values that do not adhere to the function specification, which means unknown behavior for those functions, plausibly resulting in an error message. The severity of these warnings was categorised as high, with a total fixing time of 2 hours.

4.6 Training for developers

All developers within the company are affected by code analysis being a mandatory part of the development process. It was important to share information and best practices on how to investigate and fix issues found by analysis in the development branches before changes can be merged into the main code branch. The company had a security-related workshop and training event at the end of 2019. In addition to previous knowledge sharing by email, that event was a good place to demonstrate and present more details of the recent code analysis work. A presentation was held at the security

workshop to share this information, and further instructions were shared that could be revisited later. The instructions contained information about how a failed code analysis is displayed in the CI pipeline and how to act in such a scenario, including fixing and verifying the fix.

While setting up the code analysis process, active communication was a key part of a smooth transition and prevented easily avoidable issues. Assistance in analysing and fixing issues found in the developers' branches was also given to enable quick fixes.


5. RESULTS

Code analysis found a total of 3,064 issues in the M-Files Server project. Extrapolating the number of found issues from the M-Files Server portion (19.7 %) of source code lines to the entire codebase gives a total of 15,550 issues. As other parts of the product have different characteristics, different kinds of issues might exist there, and the number of issues can vary. All warnings that code analysis reported from the M-Files Server project were fixed, which improves the security of the product, and in many cases error logs are now more valuable with correctly formatted data. Some warnings came from external code; those were disabled to get the analysis warning-free, and the issues were reported to the parties that maintain those codebases. A total of 60.5 hours was used to fix the existing issues. The fixing time and severity for each category are collected in Table 4. In chapter 4.5 the fixing time was defined and the severity was categorised into three levels: critical, high and low.

Table 4. Fixing time and severity by warning categories.

Category Fixing time (h) Severity Count

Invalid format parameter count 3.5 Low 12

Invalid format parameter type 53 Low 3,042

Null pointer dereference 1 Critical 2

Buffer overflow 1 Critical 5

Function specification violation 2 High 3

All found issues were narrow, meaning the scope of the warnings was very limited. An issue's root cause and its fix can be seen in one function or code scope; typically changing one line fixes the warning. One exception to this scope is the "Function specification violation" issues, where the reason for the warning can be seen in the declaration of the function and the fix is implemented at the place where that function is called. The annotations in the function

declaration could also reveal issues in the function definition, but this was not the case in this thesis because the internal codebase does not contain annotations.

The collected knowledge bank contains 9 references to merge requests that fix different types of code analysis issues. Some of these merge requests fix multiple warning types; in those cases the issues are from the same category. Not all merge requests made during the fixing process are in the knowledge bank, as they would not bring additional value compared to the already existing example fixes. Fixes were made in small batches to keep the size of the merge requests tolerable.

Code analysis is now part of the company's development process and therefore prevents these issues from reaching the main code branch. An acceptable frequency for running code analysis was when the developer is ready with the changes in the development branch, before those changes are merged into the main code branch. At that point the other verifications in the CI pipeline, including test automation, are also executed for that branch. The future benefit can be difficult to determine, but it keeps the product in better shape, as from now on issues are fixed immediately.

5.1 Future development

This thesis was mainly a kick-off for code-analysis-type checks within the company. There are different areas for improving the utilisation of analysis in the future.

5.1.1 Increased rule set level

The goal of this thesis was to get code analysis as part of the development process. Microsoft's analysis tool contains multiple different levels of rule sets. For this thesis it was decided to use the Native Minimum rule set, to get at least some checks with code analysis. A possible improvement would be to increase the rule set level. More rules will find more issues, but the possibility of false positive results also increases. Initial fixing of existing findings is required before new rules can be turned on in the main code branch. Investigations of the next rule set level, Native Recommended, were made against the MFServer codebase. In addition to the findings listed previously in this thesis, the increased level of analysis finds 398 new warnings from 13 new warning types. Like the warnings from the Native


Minimum rule set, all found warnings are narrow issues; no architectural issues were found by the increased rule set level.

If the increased rule set level cannot be turned on for everyone, it could be run occasionally, for example before releasing a new software version. Found warnings could be fixed at that time, or if the increased rule set contains a lot of false positives, those can be suppressed. In this way the quality of the software increases without imposing on all developers the additional burden of figuring out whether a given warning is a false positive or not.

The content of the rule sets varies between Visual Studio versions. The company's projects are occasionally upgraded to newer versions, and additional benefits will be gained with improved code analysis. This can act as one additional reason to upgrade projects to newer Visual Studio versions more actively.

5.1.2 Usage of annotations

Implementing source code annotations in the company's codebase would find more issues. Currently only external code, mainly Microsoft's, contains annotations, from which the company's codebase also benefits. The annotations could be utilised in different places within the company's codebase; some investigation should be done to find the best effort-benefit ratio. The used rule set already contains rules that utilise these annotations, so adding annotations would improve the accuracy of existing rules, producing fewer false positives and false negatives while providing more true positives. New areas that could be covered with annotations, as indicated by interviews with the senior developers, are function annotation and concurrent locking behavior. Annotating source code could also improve readability. The company's style guide could be extended to include annotations, so that new code will contain annotations and benefit from code analysis.
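A sketch of what annotating the company's own code could look like, using Microsoft's SAL macros. The function and its contract are invented for illustration, and the no-op fallback defines only let the snippet compile without <sal.h>; with the Visual Studio analyser the real definitions would apply:

```cpp
#include <cstring>

// Fallbacks so the snippet compiles outside MSVC; on Windows these come
// from <sal.h>.
#ifndef _In_
#define _In_
#define _Out_writes_z_( size )
#endif

// The annotations state that 'source' must be a valid input string and
// that 'dest' receives a null-terminated string of at most 'destSize'
// characters, which lets the analyser verify callers against the contract.
void copyName( _Out_writes_z_( destSize ) char* dest,
               std::size_t destSize,
               _In_ const char* source )
{
    std::strncpy( dest, source, destSize - 1 );
    dest[destSize - 1] = '\0';  // always terminate, as the annotation promises
}
```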

5.1.3 More projects into the analysis

One part of this thesis's goal was to run code analysis on some part of the codebase. Having more projects as part of the code analysis should be considered. Currently code analysis monitors 19.7 % of source code lines, which is the MFServer project's portion of the entire codebase. Some common

codebase is monitored through the MFServer project, so other projects within the product already benefit from this analysis.

Increasing the number of projects that are part of code analysis also means an initial fixing effort for those projects. The collected knowledge bank contains instructions for adding new projects. If more projects are added, some additional investigation should be done into the build resources used for this analysis. This is important to keep the analysis feedback loop time tolerable, so that fast corrections can be implemented based on analysis results.

5.1.4 Other technologies

The currently implemented code analysis checks only C++ code. Microsoft's code analyser can also check C# code, which is also used in the company, and dynamic languages like JavaScript are used within the company as well. It would be worth investigating how those other technologies could benefit from code analysis. As different languages have different weaknesses, the results might vary compared to this thesis.


6. CONCLUSIONS

The goals of this thesis were to find out how code analysis can be run as part of the development process and how much effort it takes to fix existing code analysis issues compared to the severity of those findings. Code analysis is now part of the company's development process, as a mandatory step in the CI pipeline before new implementation is merged into the main code branch. That will keep the codebase error-free, from the analysis point of view, in the future.

The total fixing effort was 60.5 hours. Even though the majority of the issues were in the low category, fixing those warnings improves corner-case situations where the error messages were faulty or contained some other minor cosmetic issue. Setting up the CI pipeline and getting familiar with code analysis was a bigger portion of the work. That setup work is needed whether code analysis is taken into use at the very beginning of a project or later, as in this thesis. From that perspective it was a tolerable effort to take code analysis into use for the existing codebase.

A quote from one of the related studies:

"Depending on the age of the code, the engineer's programming style, and the paradigms used, applying static analysis to existing code can range from difficult to nearly impossible if a disciplined approach isn't followed. Many legacy projects have approached static analysis only to abandon it when the first run of the tool generates 100,000 or more warnings. With legacy code, it's often not practical to remove all statically detectable faults." [6]

Comparing that quote to the findings of this thesis, many warnings were reported from the company's existing codebase. The majority of the warnings were caused by the company's habit of passing CString objects to format functions, which produced over 3,000 warnings. In this case study it was desirable to fix all found warnings and to have code analysis as part of the development process. The company has a strong style guide for all developers to follow, and

all new development goes through the merge review process. These ways of working have kept the codebase in good condition.

As seen, it is possible to set up code analysis in the development process, even for an existing codebase. Setting up the process as soon as product development of a new project starts would have spread the fixing effort over a longer time period; the actual setup process would most likely not have been any different from setting it up later in the project's lifetime. Setting up code analysis as part of the development process is a recommended action for all organisations: machines detect potential vulnerabilities, and the quality of the software improves as those findings are fixed. The price of the future benefit of code analysis work is difficult to determine, but organisations should pay attention to security issues more carefully, and code analysis is one tool for that. Taking code analysis into use as soon as possible increases its value, as the time period during which a potential security vulnerability exists in the product is limited.

Multiple senior developers were interviewed about the plausible benefits of code analysis. The interviews indicated the following code analysis use cases: concurrency, usage of function return values, and usage of function parameters. During this initial utilisation of code analysis, those cases were not fully covered. Function parameter usage was the only case that produced warnings from the existing codebase; those occurred where the called function came from external code and contained annotations for its parameters. As described in chapter 5.1.2, adding annotations would enable the detection of all the mentioned use cases.


REFERENCES

[1] P. Gandhi, S. Khanna, and S. Ramaswamy, Which Industries Are the Most Digital (and Why)?, Harvard Business Review, 1 April 2016. [Online]. Available: https://hbr.org/2016/04/a-chart-that-shows-which-industries-are-the-most-digital-and-why. [Accessed 20 February 2020].
[2] D. Bekerman, The State of Vulnerabilities in 2019, Security Boulevard, 23 January 2020. [Online]. Available: https://securityboulevard.com/2020/01/the-state-of-vulnerabilities-in-2019/. [Accessed 20 February 2020].
[3] N. Harley, 11 of the most costly software errors in history, Raygun, 29 May 2018. [Online]. Available: https://raygun.com/blog/costly-software-errors-history/. [Accessed 2 March 2020].
[4] Apple, Apple Security Bounty, 2020. [Online]. Available: https://developer.apple.com/security-bounty/. [Accessed 12 December 2019].
[5] P. Godefroid, P. de Halleux, A. V. Nori, S. K. Rajamani, W. Schulte, N. Tillmann, and M. Y. Levin, Automating Software Testing Using Program Analysis, IEEE Software, pp. 30-37, 19 August 2008.
[6] W. Schilling and M. Alam, Integrate static analysis into a software development process -- These tools will give you higher reliability and improved quality for your embedded software, Embedded Systems Design, vol. 19, no. 11, pp. 57-66, 2006.
[7] M. Mantere, I. Uusitalo, and J. Roning, Comparison of Static Code Analysis Tools, 2009 Third International Conference on Emerging Security Information, Systems and Technologies, pp. 15-22, 18 June 2009.
[8] A. Fatima, S. Bibi, and R. Hanif, Comparative study on static code analysis tools for C/C++, 2018 15th International Bhurban Conference on Applied Sciences and Technology (IBCAST), pp. 465-469, 9 January 2018.
[9] S. Shiraishi, V. Mohan, and H. Marimuthu, Test suites for benchmarks of static analysis tools, 2015 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 12-15, 2 November 2015.
[10] W. Wei, M. Yunxiu, H. Lilong, and B. He, From source code analysis to static software testing, 2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), pp. 1280-1283, 2014.
[11] Microsoft, Security Development Lifecycle, 2019. [Online]. Available: https://www.microsoft.com/en-us/securityengineering/sdl/. [Accessed 12 August 2019].
[12] National Cyber Security Centre Finland (NCSC-FI), Secure development: Towards approval, Helsinki: Viestintävirasto, 2018.


[13] White Source, How to Balance Between Security and Agile Development the Right Way, 23 March 2016. [Online]. Available: https://resources.whitesourcesoftware.com/blog-whitesource/how-to-balance-between-security-and-agile-development-the-right-way. [Accessed 2 February 2020].
[14] Visual Paradigm, Scrum vs Waterfall vs Agile vs Lean vs Kanban, [Online]. Available: https://www.visual-paradigm.com/scrum/scrum-vs-waterfall-vs-agile-vs-lean-vs. [Accessed 2 February 2020].
[15] OWASP, Security by Design Principles, 3 August 2016. [Online]. Available: https://www.owasp.org/index.php/Security_by_Design_Principles. [Accessed 12 August 2019].
[16] Microsoft, Threat Modeling Tool threats, 17 August 2017. [Online]. Available: https://docs.microsoft.com/en-us/azure/security/develop/threat-modeling-tool-threats. [Accessed 4 December 2019].
[17] J. Ahola, C. Frühwirth, M. Helenius, L. Kutvonen, J. Myllylahti, T. Nyberg, A. Pietikäinen, P. Pietikäinen, J. Röning, S. Ruohomaa, C. Särs, T. Siiskonen, A. Vähä-Sipilä and Y. Ville, Handbook of the Secure Agile Software Development Life Cycle, Oulu: University of Oulu, 2014.
[18] Smartbear, What is Automated Testing?, 2019. [Online]. Available: https://smartbear.com/learn/automated-testing/what-is-automated-testing/. [Accessed 12 December 2019].
[19] SonarQube, 2020. [Online]. Available: https://www.sonarqube.org/. [Accessed 10 January 2020].
[20] B. Chess, Secure Programming with Static Analysis, Addison-Wesley Professional, 2007.
[21] P. Emanuelsson and U. Nilsson, A Comparative Study of Industrial Static Analysis Tools, Electronic Notes in Theoretical Computer Science, vol. 217, no. 21, pp. 5-21, 2008.
[22] Cppcheck, [Online]. Available: http://cppcheck.sourceforge.net. [Accessed 10 January 2020].
[23] Investopedia, World's Top 10 Software Companies, 5 May 2019. [Online]. Available: https://www.investopedia.com/articles/personal-finance/121714/worlds-top-10-software-companies.asp. [Accessed 16 February 2020].
[24] Microsoft, Code metrics values, 11 February 2018. [Online]. Available: https://docs.microsoft.com/en-us/visualstudio/code-quality/code-metrics-values?view=vs-2019. [Accessed 13 December 2019].
[25] Microsoft, Code analysis rule set reference, 4 April 2018. [Online]. Available: https://docs.microsoft.com/en-us/visualstudio/code-quality/rule-set-reference?view=vs-2019. [Accessed 20 August 2019].
[26] R. Bellairs, What Is Static Analysis (Static Code Analysis)?, 10 February 2020. [Online]. Available: https://www.perforce.com/blog/sca/what-static-analysis. [Accessed 15 March 2020].


[27] Microsoft, Using SAL Annotations to Reduce C/C++ Code Defects, 11 April 2016. [Online]. Available: https://docs.microsoft.com/en-us/cpp/code-quality/using-sal-annotations-to-reduce-c-cpp-code-defects?view=vs-2019. [Accessed 14 December 2019].
[28] Microsoft, Microsoft Docs - Warning level, 31 January 2020. [Online]. Available: https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019. [Accessed 15 March 2020].
[29] LearnCpp, Configuring your compiler: Warning and error levels, 19 September 2018. [Online]. Available: https://www.learncpp.com/cpp-tutorial/configuring-your-compiler-warning-and-error-levels/. [Accessed 14 August 2019].
[30] M-Files, 2020. [Online]. Available: https://www.m-files.com. [Accessed 10 January 2020].
[31] Nucleus Research, ECM Technology Value Matrix 2019, 5 April 2019. [Online]. Available: https://nucleusresearch.com/research/single/ecm-technology-value-matrix-2019/. [Accessed 14 December 2019].
[32] GitLab, [Online]. Available: https://about.gitlab.com. [Accessed 12 December 2019].
[33] Microsoft, Native Minimum Rules rule set - Visual Studio Docs, 4 November 2016. [Online]. Available: https://docs.microsoft.com/en-us/visualstudio/code-quality/native-minimum-rules-rule-set?view=vs-2019. [Accessed 5 June 2019].
[34] IncrediBuild, [Online]. Available: https://www.incredibuild.com/. [Accessed 14 August 2019].
[35] Cplusplus, Function printf, 2020. [Online]. Available: http://www.cplusplus.com/reference/cstdio/printf/. [Accessed 15 August 2019].
[36] A. Karpov, Big Brother helps you, PVS-Studio, 13 July 2010. [Online]. Available: https://www.viva64.com/en/b/0073/. [Accessed 4 June 2019].
[37] GitLab, GitLab Markdown, [Online]. Available: https://docs.gitlab.com/ee/user/markdown.html. [Accessed 15 August 2019].
[38] A. Karpov, Null Pointer Dereferencing Causes Undefined Behavior, Cplusplus, 16 February 2015. [Online]. Available: http://www.cplusplus.com/articles/LAqpX9L8/. [Accessed 15 June 2019].


APPENDIX 1: RULE SET CONTENT

The full list of rules in Microsoft's code analysis Native Minimum rule set [33].

Rule — Description

C6001 Using Uninitialized Memory

C6011 Dereferencing Null Pointer

C6029 Use Of Unchecked Value

C6053 Zero Termination From Call

C6059 Bad Concatenation

C6063 Missing String Argument To Format Function

C6064 Missing Integer Argument To Format Function

C6066 Missing Pointer Argument To Format Function

C6067 Missing String Pointer Argument To Format Function

C6101 Returning uninitialized memory

C6200 Index Exceeds Buffer Maximum

C6201 Index Exceeds Stack Buffer Maximum

C6270 Missing Float Argument To Format Function

C6271 Extra Argument To Format Function

C6272 Non-Float Argument To Format Function

C6273 Non-Integer Argument To Format Function

C6274 Non-Character Argument To Format Function

C6276 Invalid String Cast

C6277 Invalid CreateProcess Call

C6284 Invalid Object Argument To Format Function

C6290 Logical-Not Bitwise-And Precedence

C6291 Logical-Not Bitwise-Or Precedence

C6302 Invalid Character String Argument To Format Function

C6303 Invalid Wide Character String Argument To Format Function

C6305 Mismatched Size And Count Use

C6306 Incorrect Variable Argument Function Call

C6328 Potential Argument Type Mismatch

C6385 Read Overrun

C6386 Write Overrun

C6387 Invalid Parameter Value

C6500 Invalid Attribute Property

C6501 Conflicting Attribute Property Values

C6503 References Cannot Be Null

C6504 Null On Non-Pointer

C6505 MustCheck On Void

C6506 Buffer Size On Non-Pointer Or Array

C6508 Write Access On Constant

C6509 Return Used On Precondition

C6510 Null Terminated On Non-Pointer

C6511 MustCheck Must Be Yes Or No


C6513 Element Size Without Buffer Size

C6514 Buffer Size Exceeds Array Size

C6515 Buffer Size On Non-Pointer

C6516 No Properties On Attribute

C6517 Valid Size On Non-Readable Buffer

C6518 Writable Size On Non-Writable Buffer

C6522 Invalid Size String Type

C6525 Invalid Size String Unreachable Location

C6527 Invalid annotation: 'NeedsRelease' property may not be used on values of void type

C6530 Unrecognized Format String Style

C6540 The use of attribute annotations on this function will invalidate all of its existing __declspec annotations

C6551 Invalid size specification: expression not parsable

C6552 Invalid Deref= or Notref=: expression not parsable

C6701 The value is not a valid Yes/No/Maybe value

C6702 The value is not a string value

C6703 The value is not a number

C6704 Unexpected Annotation Expression Error

C6705 Expected number of arguments for annotation does not match actual number of arguments for annotation

C6706 Unexpected Annotation Error for annotation

C26450 RESULT_OF_ARITHMETIC_OPERATION_PROVABLY_LOSSY

C26451 RESULT_OF_ARITHMETIC_OPERATION_CAST_TO_LARGER_SIZE

C26452 SHIFT_COUNT_NEGATIVE_OR_TOO_BIG

C26453 LEFTSHIFT_NEGATIVE_SIGNED_NUMBER

C26454 RESULT_OF_ARITHMETIC_OPERATION_NEGATIVE_UNSIGNED

C26495 MEMBER_UNINIT

C28021 The parameter being annotated must be a pointer

C28182 Dereferencing NULL pointer. The pointer contains the same NULL value as another pointer did.

C28202 Illegal reference to non-static member

C28203 Ambiguous reference to class member.

C28205 _Success_ or _On_failure_ used in an illegal context

C28206 Left operand points to a struct, use '->'

C28207 Left operand is a struct, use '.'

C28210 Annotations for the __on_failure context must not be in explicit pre context

C28211 Static context name expected for SAL_context

C28212 Pointer expression expected for annotation

C28213 The _Use_decl_annotations_ annotation must be used to reference, without modification, a prior declaration.

C28214 Attribute parameter names must be p1...p9

C28215 The typefix cannot be applied to a parameter that already has a typefix

C28216 The checkReturn annotation only applies to postconditions for the specific function parameter.


C28217 For function, the number of parameters to annotation does not match that found at file

C28218 For function parameter, the annotation's parameter does not match that found at file

C28219 Member of enumeration expected for annotation the parameter in the annotation

C28220 Integer expression expected for annotation the parameter in the annotation

C28221 String expression expected for the parameter in the annotation

C28222 __yes, __no, or __maybe expected for annotation

C28223 Did not find expected Token/identifier for annotation, parameter

C28224 Annotation requires parameters

C28225 Did not find the correct number of required parameters in annotation

C28226 Annotation cannot also be a PrimOp (in current declaration)

C28227 Annotation cannot also be a PrimOp (see prior declaration)

C28228 Annotation parameter: cannot use type in annotations

C28229 Annotation does not support parameters

C28230 The type of parameter has no member.

C28231 Annotation is only valid on array

C28232 pre, post, or deref not applied to any annotation

C28233 pre, post, or deref applied to a block

C28234 __at expression does not apply to current function

C28235 The function cannot stand alone as an annotation

C28236 The annotation cannot be used in an expression

C28237 The annotation on parameter is no longer supported

C28238 The annotation on parameter has more than one of value, stringValue, and longValue. Use paramn=xxx

C28239 The annotation on parameter has both value, stringValue, or longValue; and paramn=xxx. Use only paramn=xxx

C28240 The annotation on parameter has param2 but no param1

C28241 The annotation for function on parameter is not recognized

C28243 The annotation for function on parameter requires more dereferences than the actual type annotated allows

C28245 The annotation for function annotates 'this' on a non-member-function

C28246 The parameter annotation for function does not match the type of the parameter

C28250 Inconsistent annotation for function: the prior instance has an error.

C28251 Inconsistent annotation for function: this instance has an error.

C28252 Inconsistent annotation for function: parameter has another annotations on this instance.

C28253 Inconsistent annotation for function: parameter has another annotations on this instance.

C28254 dynamic_cast<>() is not supported in annotations

C28262 A syntax error in the annotation was found in function, for annotation

C28263 A syntax error in a conditional annotation was found for Intrinsic annotation

C28267 A syntax error in the annotations was found annotation in the function.


C28272 The annotation for function, parameter when examining is inconsistent with the function declaration

C28273 For function, the clues are inconsistent with the function declaration

C28275 The parameter to _Macro_value_ is null

C28279 For symbol, a 'begin' was found without a matching 'end'

C28280 For symbol, an 'end' was found without a matching 'begin'

C28282 Format Strings must be in preconditions

C28285 For function, syntax error in parameter

C28286 For function, syntax error near the end

C28287 For function, syntax Error in _At_() annotation (unrecognized name)

C28288 For function, syntax Error in _At_() annotation (invalid parameter name)

C28289 For function: ReadableTo or WritableTo did not have a limit-spec as a parameter

C28290 The annotation for function contains more Externals than the actual number of parameters

C28291 Post null/notnull at deref level 0 is meaningless for function.

C28300 Expression operands of incompatible types for operator

C28301 No annotations for first declaration of function.

C28302 An extra _Deref_ operator was found on annotation.

C28303 An ambiguous _Deref_ operator was found on annotation.

C28304 An improperly placed _Notref_ operator was found applied to token.

C28305 An error while parsing a token was discovered.

C28350 The annotation describes a situation that is not conditionally applicable.

C28351 The annotation describes where a dynamic value (a variable) cannot be used in the condition.