
jbse-manual Documentation
Release latest
Pietro Braione
Aug 25, 2021

Contents

1 About this book
    1.1 What is JBSE?
    1.2 Who is the author of this book?
    1.3 Where do I find this book?
2 Introduction
    2.1 Software analysis
    2.2 What is symbolic execution?
    2.3 Symbolic execution with objects as inputs
3 Getting started with JBSE
    3.1 Obtaining and installing JBSE
    3.2 A basic example
    3.3 Assertions and assumptions
4 Using JBSE
    4.1 The symbolic execution classes
    4.2 Creating a symbolic executor

CHAPTER 1 About this book

This book teaches you how to install, use and modify JBSE, an open source framework for the analysis of Java programs.

1.1 What is JBSE?

JBSE is the Java Bytecode Symbolic Executor. Basically, it is a special-purpose Java Virtual Machine written in Java that can be used for program analysis and automated test generation. The homepage of JBSE is at https://pietrobraione.github.io/jbse/ and its GitHub repository is at https://github.com/pietrobraione/jbse.

1.2 Who is the author of this book?

The author of this book is the main maintainer of JBSE, Pietro Braione. You can contact me via email at [email protected].

1.3 Where do I find this book?

This book is available on GitHub. Its repository is https://github.com/pietrobraione/jbse-manual. The book is written in reStructuredText and is published on readthedocs at https://jbse-manual.readthedocs.io.

CHAPTER 2 Introduction

Welcome.
Let me introduce you to JBSE and explain what it is and what it can do.

JBSE is a special-purpose Java Virtual Machine (JVM). As you may know, a JVM is what you need to execute a program written in Java, Scala, Clojure, Groovy, and many other programming languages. To be more precise, a JVM executes the special format emitted by the compilers of these languages, the so-called Java bytecode. The programming languages that compile to Java bytecode are portable across platforms because their compilers do not translate programs directly to machine language, but to Java bytecode, which, unlike machine language, is CPU- and OS-independent. It is sufficient to port a JVM implementation to a platform, and all the programs compiled to Java bytecode can be executed unchanged on it. The Java bytecode format is precisely documented in the Java Virtual Machine Specification (JVMS) books, which describe how a compliant JVM must execute a program in Java bytecode. The reference JVM implementation is Oracle's HotSpot JVM, but there are many others, e.g., IBM's OpenJ9 or aicas' JamaicaVM.

So JBSE is a JVM, and therefore it can be used as a drop-in replacement for HotSpot to execute Java (or Scala, Clojure, Groovy...) software. Right? Well, not really. JBSE's main purpose is to analyze, rather than execute, Java software.

2.1 Software analysis

Let us face reality: too often software systems do not work as expected. There are many reasons why this happens, but the most cogent one is possibly that software systems quickly turn complex, and when they turn complex, they usually turn extremely complex. The source code of the Windows 10 operating system, for example, is about 50 million lines spread over 3.5 million files, and when checked out it occupies about 300 GB of disk space.
Complexity in structure implies complexity in behavior, and unforeseeable behaviors are its main consequence and the root cause of bugs. A possible way to dominate this complexity is to empower software engineers with tools that help them understand how the system behaves. These tools perform what is commonly called software analysis and can be roughly classified into two categories: static and dynamic analysis tools.

Static analysis tools extract information about a software system without executing it. The well-known FindBugs, Checkstyle and PMD tools perform a kind of static analysis based on the idea of scanning a system's source code in search of a number of predefined code patterns, each indicating the possible occurrence of a different kind of bug. Static analysis techniques usually require the availability of the source code and produce approximate answers, where false alarms and missed bugs are the norm, but their answers, when correct, can provide very general information on the correctness of the program.

Conversely, dynamic analysis tools gather information on the software under analysis by observing the effects of its execution. Testing is the quintessential dynamic analysis activity: it observes the effects of executing the software system on a finite set of inputs, in search of manifestations of software defects. Dynamic analyses are usually very precise but, being bound to the (few) executions they are able to observe, they usually produce less general results than static analyses. For example, as observed by Dijkstra, testing alone cannot be used to establish the absence of any category of software bugs, while static analyses, in principle, may.
JBSE performs a kind of analysis called symbolic execution, which can be used both to verify the correctness of a program with respect to some desired properties expressed as assertions, and to generate test vectors for the program. When used for verification, JBSE expects you to specify the verification properties of interest for your project as a set of assumptions and assertions. Assumptions specify the conditions that must be satisfied for an execution to be relevant. Preconditions are a typical form of assumptions: they allow you, e.g., to specify the range of the possible values for the program inputs. Assertions specify the conditions that must be satisfied for an execution to be correct. JBSE attempts to determine whether some input exists that satisfies all the assumptions and falsifies at least one assertion. In this regard JBSE is similar in spirit, implementation and mode of use to tools like Symbolic PathFinder, Sireum/Kiasan and JNuke.

2.2 What is symbolic execution?

If you do not know what "symbolic execution" is, then you may have a look at the corresponding Wikipedia article or at some textbook. But if you are really impatient, here is a very short tutorial.

To explain what symbolic execution is, we can say that symbolic execution is to testing what symbolic equation solving is to numeric equation solving. Let us consider, for instance, the equation $x^2 - 2 \cdot x + 1 = 0$, whose real solutions we are asked to find. This second-degree equation is numeric, meaning that all its coefficients are numbers. Depending on the value of the discriminant $\Delta$, the equation can have two real solutions (when $\Delta > 0$), one real solution (when $\Delta = 0$) or no real solution (when $\Delta < 0$). In this case the equation has one real solution, since $\Delta = (-2)^2 - 4 \cdot 1 \cdot 1 = 4 - 4 = 0$.
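The numeric case above can be written down as a few lines of Java. This is only a minimal sketch for illustration: the class NumericQuadratic and its method names are hypothetical and have nothing to do with JBSE. It computes the discriminant of a quadratic with concrete coefficients and follows the single case that its sign selects.

```java
// Illustrative sketch: class and method names are hypothetical, not part of JBSE.
class NumericQuadratic {
    /** Discriminant of a*x^2 + b*x + c = 0 for concrete coefficients. */
    static double discriminant(double a, double b, double c) {
        return b * b - 4 * a * c;
    }

    /** Number of distinct real solutions, decided by the sign of the discriminant. */
    static int realSolutionCount(double a, double b, double c) {
        double delta = discriminant(a, b, c);
        if (delta > 0) {
            return 2;
        }
        if (delta == 0) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        // The running example: x^2 - 2*x + 1 = 0.
        System.out.println(discriminant(1, -2, 1));      // prints 0.0
        System.out.println(realSolutionCount(1, -2, 1)); // prints 1
    }
}
```

Because every coefficient is a number, exactly one of the three branches in realSolutionCount is taken for any given call.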
Conversely, the equation $x^2 - b \cdot x + 1 = 0$ is symbolic, because one of the coefficients, $b$, is not a number but a symbol, standing for an unknown numeric value ranging over a (possibly infinite) set of admissible values. If we assume that this set is the set of all real numbers, then the discriminant of the second equation is $\Delta = b^2 - 4$, for any real value of $b$. As with numeric equations, to determine the solutions of the symbolic equation we need to split cases based on the sign of the discriminant. But unlike in our first example, where exactly one case holds, symbolic equation solving may require following more than one of them. Depending on the possible values of $b$, our example symbolic equation may fall in one of three cases: if $|b| > 2$ the discriminant is greater than zero and the equation has two real solutions. If $b = 2$ or $b = -2$ the discriminant is zero and the equation has one real solution. Finally, if $-2 < b < 2$, the discriminant is less than zero and the equation has no real solutions. Since all three subsets for $b$ are nonempty, any of the three cases may hold.

As a consequence, the solution of a symbolic equation is usually expressed as a set of summaries. A summary associates a condition on the symbolic parameters with a corresponding possible result of the equation, where the result can be a number or an expression in the symbols. For our running example the solution produces as summaries $|b| > 2 \rightarrow x = (b + \sqrt{b^2 - 4})/2$, $|b| > 2 \rightarrow x = (b - \sqrt{b^2 - 4})/2$, $b = 2 \rightarrow x = 1$, and $b = -2 \rightarrow x = -1$. Note that summaries overlap where a combination of parameter values ($|b| > 2$ in the previous case) yields multiple results, and that the union of the summaries does not span the whole domain of $b$, because some values of $b$ yield no result.

Symbolic execution is a program analysis technique based on executing a program with input values that may be symbols standing for sets of possible numeric (concrete) values.
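To make the notion of a summary more tangible, here is a small Java sketch of the summaries of $x^2 - b \cdot x + 1 = 0$, taking the two roots for $|b| > 2$ to be $(b \pm \sqrt{b^2 - 4})/2$. The class QuadraticSummaries and its method are hypothetical names chosen for illustration only. Each branch of the method corresponds to one summary condition, and the method collects the results of all the summaries whose condition a concrete b satisfies.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: class and method names are hypothetical, not part of JBSE.
class QuadraticSummaries {
    /**
     * Real solutions of x^2 - b*x + 1 = 0 for a concrete b, obtained by
     * checking which summary conditions b satisfies.
     */
    static List<Double> solutions(double b) {
        List<Double> result = new ArrayList<>();
        if (Math.abs(b) > 2) {
            // Two summaries share the condition |b| > 2.
            double root = Math.sqrt(b * b - 4);
            result.add((b + root) / 2);
            result.add((b - root) / 2);
        } else if (b == 2) {
            result.add(1.0);
        } else if (b == -2) {
            result.add(-1.0);
        }
        // For -2 < b < 2 no summary applies and the list stays empty.
        return result;
    }

    public static void main(String[] args) {
        System.out.println(solutions(2.5)); // prints [2.0, 0.5]
        System.out.println(solutions(2));   // prints [1.0]
        System.out.println(solutions(0));   // prints []
    }
}
```

The empty list returned for b = 0 reflects the fact that the summaries do not cover the whole domain of b. Symbolic execution applies the same case-splitting idea to programs rather than to equations.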
Consider, for example, the following Java program:

    package smalldemos.ifx;

    public class IfExample {
        boolean a, b;

        public void m(int x) {
            if (x > 0) {
                a = true;
            } else {
                a = false;
            }
            if (x > 0) {
                b = true;
            } else {
                b = false;
            }
            assert a == b;
        }
    }

This program is the customary "double-if" example that is often used to illustrate how symbolic execution works.
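As a rough sketch of why this example is instructive (and not of how JBSE actually works), the Java code below enumerates the four ways the two if statements can branch, treats each combination as a condition on x, and keeps only the combinations that some value of x can follow. The class DoubleIfPaths, its SAMPLES array and the sampling trick are assumptions made for illustration. A real symbolic executor would ask a constraint solver whether each condition is satisfiable, and checking a handful of samples is not a sound feasibility test in general.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative sketch only: this is NOT how JBSE works internally.
// A few sample values stand in for a constraint solver.
class DoubleIfPaths {
    static final int[] SAMPLES = {Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE};

    /**
     * Describes every branch combination of the double-if program
     * that at least one sample value can follow.
     */
    static List<String> feasiblePaths() {
        List<String> paths = new ArrayList<>();
        boolean[] outcomes = {true, false};
        for (boolean first : outcomes) {
            for (boolean second : outcomes) {
                // Condition for taking this combination of branches:
                // the first test (x > 0) evaluates to `first`
                // and the second test (x > 0) evaluates to `second`.
                IntPredicate pathCondition =
                    x -> ((x > 0) == first) && ((x > 0) == second);
                for (int x : SAMPLES) {
                    if (pathCondition.test(x)) {
                        // Along this path the program sets a = first and b = second.
                        paths.add("a=" + first + ", b=" + second
                            + ", assertion holds: " + (first == second));
                        break;
                    }
                }
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        feasiblePaths().forEach(System.out::println);
    }
}
```

Only two of the four combinations survive, the ones where both tests agree, and on both of them the assertion a == b holds. The two mixed combinations would require x > 0 to be true and false at once, so no input can follow them.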