Documentation of the BISMON Static Source Code Analyzer
Total Page:16
File Type:pdf, Size:1020Kb
documentation of the BISMON static source code analyzer. This report and the Bismon static analyzer has been funded by European H2020 projects www.chariotproject.eu (grant 780075) and www.decoder-project.eu (grant 824231). The opinions are those of the author(s) only. The content of the publication herein is the sole responsibility of the publishers and it does not necessarily represent the views expressed by the European Commission or its services. Basile STARYNKEVITCH (CEA, LIST) DILS/LSL from its SOFTWARE SECURITY LABORATORY author: [email protected] http://starynkevitch.net/Basile/ unverified, unapproved, unchecked draft Ful of Mistaks. Some version of this DRAFT report might be downloadable from http://starynkevitch.net/Basile/bismon- doc.pdf This partly generated document and the Bismon software itself are co-developed in an agile and incremental manner, and have exactly 3932 git commits on Mon 01 Feb 2021 03:26:24 PM MET. For more, please see https://github.com/bstarynk/bismon/commits/ for details. These commits are too many and too fine grained to be considered as “versions”. Given the agile and continuous workflow, it is unreasonable, and practically impossible, to identify any formalized versions. This document is co-developed with the Bismon software itself, it was typeset using LATEX on Linux and contains some generated documentation 1, mixed with hand-written text. During development of bismon, the amount of gen- erated documentation will grow. The entire history of Bismon (both the software -including its persistent store- and this document) is available on https://github.com/bstarynk/bismon/commits and has, for this document of commit id 61e424abbb482e7e++ (done on 2021-Feb-01) generated on Feb 01, 2021, exactly 3932 commits (or elementary changes). Since changes on any file in the git repository can affect this document, no “version” is identifiable. For convenience to the reader, here are the last 2 three git commit-s: commit 61e424abbb482e7eb81dcd2da7d12bb1e667f2b8 Author: Basile Starynkevitch <[email protected]> Date: Mon Feb 1 14:08:34 2021 +0100 mention both chariot and decoder M doc/bismon-doc.tex commit 9e198ce90e47a60a45c6cdc025c166fb159d3bf4 Author: Basile Starynkevitch <[email protected]> Date: Mon Feb 1 11:05:16 2021 +0100 the output of ./bismon -help goes into the documentation M build-bismon-doc.sh M doc/appendix-bm.tex M doc/bismon-doc.tex A doc/genscripts/005-bismon-help.sh commit 3eee9ed1ff85ef0f28ef74049c31014618bd72e7 Author: Basile Starynkevitch <[email protected]> Date: Mon Feb 1 10:24:32 2021 +0100 show gitid in -help message M main_BM.c 1The generated parts are clearly identified as such, and are extracted from the Bismon system. 2Obtained by the git log -name-status -3 command running in bismon top source directory. the BISMON static source code analyzer There is no notion of any identifiable “version” in bismon, so also in this report. The work is incremental and the development is agile. copyright message 'Copyright © 2018 - 2021 CEA (Commissariat à l’énergie atomique et aux énergies alternatives). $ This deliverable contains original unpublished work except where clearly indicated otherwise. Acknowl- edgement of previously published material and of the work of others has been made through appropriate citation, quotation or both. Reproduction is authorised provided the source is acknowledged (see last page for licensing details). & % notice This work is funded (from start of 2018 to end of 2020) thru the CHARIOT project (see its web site on http://chariotproject.eu/) which has received funding from the European Unions Horizon 2020 re- search and innovation programme under the Grant Agreement No 780075. This work is also partly funded -from 2019 to 2021- by the DECODER H2020 project, under its Grant Agreement 824231. DRAFT 61e424abbb482e7e++ on 2021-Feb-01 Page 2 of 77 the BISMON static source code analyzer Contents 1 Introduction 5 1.1 Mapping CHARIOT output .................................... 5 1.2 Deliverable Overview and Report Structure ........................... 5 1.3 Expected audience ........................................ 6 1.4 The CHARIOT vision on specialized static source code analysis for more secure and safer IoT software development ....................................... 8 1.4.1 About static source code analysis and IoT ........................ 8 1.4.2 The power of an existing compiler: GCC ........................ 11 1.4.3 Leveraging simple static source analysis on GCC .................... 17 1.5 Lessons learned from GCC MELT ................................ 19 1.6 Driving principles for the Bismon persistent monitor ....................... 20 1.6.1 About Bismon ...................................... 20 1.6.2 About Bismon as a domain-specific language ...................... 22 1.6.3 About Bismon as a evolving software system ...................... 24 1.6.4 About Bismon as a static source code analyzer framework ............... 26 1.7 Multi-threaded and distributed aspects of Bismon ........................ 26 2 Data and its persistence in Bismon 29 2.1 Data processed in Bismon ..................................... 29 2.1.1 Immutable values ..................................... 29 2.1.2 Mutable objects ...................................... 30 2.2 garbage collection of values and objects ............................. 32 2.3 persistence in Bismon ....................................... 33 2.3.1 file organization of the persistent state .......................... 34 2.3.2 persisting objects ..................................... 35 3 Static analysis of source code in Bismon 37 3.1 static analysis of GCC code .................................... 37 3.2 static analysis of IoT firmware or application code ........................ 37 3.3 static analysis related to pointers and addresses ......................... 38 4 Using Bismon 41 4.1 How JSON is used by Bismon .................................. 41 4.1.1 The canonical JSON encoding of Bismon values .................... 41 4.1.2 The nodal JSON decoding into Bismon values ...................... 42 4.1.3 JSON extraction with extract_json ......................... 43 4.2 Web interface internal design ................................... 45 4.3 Using bismon for CHARIOT ................................... 48 5 Miscellanous work 49 5.1 Contributions to other free software projects ........................... 49 5.1.1 Aborted contribution to libonion ........................... 49 5.1.2 Contribution to GCC ................................... 49 5.2 Design and implementation of the compiler and linker extension ................ 49 6 Conclusion 50 A Building bismon from its source code 51 A.1 Prerequisites for building bismon ................................ 51 A.2 File naming conventions in bismon ............................... 51 A.3 Naming conventions and source files organization for bismon ................. 52 A.4 Generators and meta-programs in bismon ........................... 54 DRAFT 61e424abbb482e7e++ on 2021-Feb-01 Page 1 of 77 the BISMON static source code analyzer B Configuring bismon from its source code 54 C Building bismon from its source code 54 C.1 Checking the version of bismon ................................. 55 D Dumping and restoring the bismon persistent heap 57 Index 58 References 65 DRAFT 61e424abbb482e7e++ on 2021-Feb-01 Page 2 of 77 the BISMON static source code analyzer List of Figures 1 the CHARIOT compilation of some IoT firmware or application (simplified) .......... 11 2 The Bismon monitor used by some IoT developer team following the CHARIOT approach. .. 19 3 crude (soon deprecated) GTK interface oct. 22, 2018, git commit cbdcf1ec351c3f2a ......... 25 4 generated dump example: first_test_module in file store2.bmon .......... 35 5 syntax of values in dumped data files. .............................. 35 List of Tables 1 adherence to CHARIOT’s GA deliverable and tasks descriptions ................ 5 2 recursive inlining with constant folding in GCC (C source) ................... 12 3 recursive inlining with constant folding in GCC (generated early Gimple) ........... 12 4 recursive inlining with constant folding in GCC (generated SSA form) ............. 13 5 recursive inlining with constant folding in GCC (generated optimized) ............. 14 6 recursive inlining with constant folding in GCC (generated MIPS assembler) ......... 14 7 optimization around heap allocation by GCC (C source) ..................... 15 8 optimization around heap allocation by GCC (generated Gimple) ................ 16 9 optimization around heap allocation by GCC (generated SSA/optimized) ............ 16 10 optimization around heap allocation by GCC (generated x86-64 assembly) ........... 17 11 canonical JSON encoding v json of a Bismon value v. ..................... 41 12 nodal JSON decoding xjsyJnodK of a JSON value js. ........................ 42 13 simple extraction from some JSON thing js; (see also table 14 below.) ............... 43 14 complex extraction from some JSON thing js; (see also table 13 above.) .............. 44 DRAFT 61e424abbb482e7e++ on 2021-Feb-01 Page 3 of 77 the BISMON static source code analyzer Glossary of terms and abbreviations used Abbreviation / Description Term binutils GNU free software package containing assembler as, linker ld and other utilities to operate on object files or binary executables, etc... https://www.gnu.org/software/binutils/ bismon the free software framework and persistent monitor described here ; source repository on http://github.com/bstarynk/bismon/