Concolic Testing
Total Page:16
File Type:pdf, Size:1020Kb
DART and CUTE: Concolic Testing Koushik Sen University of California, Berkeley Joint work with Gul Agha, Patrice Godefroid, Nils Klarlund, Rupak Majumdar, Darko Marinov Used and adapted by Jonathan Aldrich, with permission, for 17-355/17-655 Program Analysis 3/28/2017 1 Today, QA is mostly testing “50% of my company employees are testers, and the rest spends 50% of their time testing!” Bill Gates 1995 2 Software Reliability Importance Software is pervading every aspects of life Software everywhere Software failures were estimated to cost the US economy about $60 billion annually [NIST 2002] Improvements in software testing infrastructure may save one- third of this cost Testing accounts for an estimated 50%-80% of the cost of software development [Beizer 1990] 3 Big Picture Safe Programming Languages and Type systems Model Testing Checking Runtime Static Program and Monitoring Analysis Theorem Proving Dynamic Program Analysis Model Based Software Development and Analysis 4 Big Picture Safe Programming Languages and Type systems Model Testing Checking Runtime Static Program and Monitoring Analysis Theorem Proving Dynamic Program Analysis Model Based Software Development and Analysis 5 A Familiar Program: QuickSort void quicksort (int[] a, int lo, int hi) { int i=lo, j=hi, h; int x=a[(lo+hi)/2]; // partition do { while (a[i]<x) i++; while (a[j]>x) j--; if (i<=j) { h=a[i]; a[i]=a[j]; a[j]=h; i++; j--; } } while (i<=j); // recursion if (lo<j) quicksort(a, lo, j); if (i<hi) quicksort(a, i, hi); } 6 A Familiar Program: QuickSort void quicksort (int[] a, int lo, int hi) { Test QuickSort int i=lo, j=hi, h; int x=a[(lo+hi)/2]; Create an array Initialize the elements of // partition the array do { Execute the program on while (a[i]<x) i++; this array while (a[j]>x) j--; if (i<=j) { How much confidence h=a[i]; a[i]=a[j]; do I have in this testing a[j]=h; method? i++; Is my test suite j--; *Complete*? } } while (i<=j); Can someone generate a small and *Complete* // recursion test suite for me? if (lo<j) quicksort(a, lo, j); if (i<hi) quicksort(a, i, hi); } 7 Automated Test Generation Studied since 70’s King 76, Myers 79 30 years have passed, and yet no effective solution What Happened??? 8 Automated Test Generation Studied since 70’s King 76, Myers 79 30 years have passed, and yet no effective solution What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving techniques were not efficient 9 Automated Test Generation Studied since 70’s King 76, Myers 79 30 years have passed, and yet no effective solution What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving techniques were not efficient In the recent years we have seen remarkable progress in static program-analysis and constraint solving SLAM, BLAST, ESP, Bandera, Saturn, MAGIC 10 Automated Test Generation Studied since 70’s King 76, Myers 79 Question: Can we use similar 30 years have passed, and yet no effective solutiontechniques in Automated Testing? What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving techniques were not efficient In the recent years we have seen remarkable progress in static program-analysis and constraint solving SLAM, BLAST, ESP, Bandera, Saturn, MAGIC 11 Systematic Automated Testing Model-Checking Static Analysis and Formal Methods Testing Automated Constraint Theorem Proving Solving 12 CUTE and DART Combine random testing (concrete execution) and symbolic testing (symbolic execution) [PLDI’05, FSE’05, FASE’06, CAV’06,ISSTA’07, ICSE’07] Concrete + Symbolic = Concolic 13 Goal Automated Unit Testing of real-world C and Java Programs Generate test inputs Execute unit under test on generated test inputs so that all reachable statements are executed Any assertion violation gets caught Sub-problem: Input Generation Given a statement s in program P, compute input i, such that P(i) executes s This formulation is due to Wolfram Schulte 14 Execution Paths of a Program Can be seen as a binary tree with possibly infinite depth 0 1 Computation tree Each node represents the 1 execution of a “if then else” 001 statement Each edge represents the 0 1 execution of a sequence of 1 non-conditional statements Each path in the tree 1 represents an equivalence class of inputs 0 1 15 Example of Computation Tree void test_me(int x, int y) { if(2*x==y){ if(x != y+10){ printf(“I am fine here”); } else { NY2*x==y printf(“I should not reach here”); ERROR; } } NYx!=y+10 } ERROR 16 Concolic Testing: Finding Security and Safety Bugs Divide by 0 Error Buffer Overflow x = 3 / i; a[i] = 4; 17 Concolic Testing: Finding Security and Safety Bugs Key: Add Checks Automatically and Perform Concolic Testing Divide by 0 Error Buffer Overflow if (i !=0) if (0<=i && i < a.length) x = 3 / i; a[i] = 4; else else ERROR; ERROR; 18 Existing Approach I Random testing test_me(int x){ generate random inputs if(x==94389){ execute the program on ERROR; generated inputs } Probability of reaching an error can be } astronomically less Probability of hitting ERROR = 1/232 19 Existing Approach II Symbolic Execution test_me(int x, int y){ use symbolic values for if((x%10)*4==y){ input variables ERROR; execute the program symbolically on symbolic } else { input values // OK collect symbolic path constraints } use theorem prover to } check if a branch can be taken Symbolic execution Does not scale for large cannot solve for x, y programs Missed error Cannot solve all constraints (or false positive) 20 Existing Approach II Symbolic Execution test_me(int x, int y){ use symbolic values for if(bbox(x)==y){ input variables ERROR; execute the program symbolically on symbolic } else { input values // OK collect symbolic path constraints } use theorem prover to } check if a branch can be taken Symbolic execution Does not scale for large cannot look inside the programs bbox function Cannot solve all Missed error constraints (or false positive) 21 Concolic Execution Concolic Execution test_me(int x, int y){ Randomly choose x and if(bbox(x)==y){ y the first time ERROR; Use concrete value of x and result of running } else { bbox(x) to solve for y // OK Finds the error! } } 22 Concolic Testing Approach int double (int v) { return 2*v; } Random Test Driver: random value for x void testme (int x, int y) { and y z = double (y); Probability of reaching if (z == x) { ERROR is extremely low if (x > y+10) { ERROR; } } } 23 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { x = 22, y = 7 x = x0, y = y0 z = double (y); if (z == x) { if (x > y+10) { ERROR; } } } 24 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { z = double (y); x = 22, y = 7, x = x0, y = y0, if (z == x) { z = 14 z = 2*y0 if (x > y+10) { ERROR; } } } 25 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { z = double (y); if (z == x) { 2*y0 != x0 if (x > y+10) { ERROR; } } x = 22, y = 7, x = x0, y = y0, } z = 14 z = 2*y0 26 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { Solve: 2*y0 == x0 z = double (y); Solution: x0 = 2, y0 = 1 if (z == x) { 2*y0 != x0 if (x > y+10) { ERROR; } } x = 22, y = 7, x = x0, y = y0, } z = 14 z = 2*y0 27 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { x = 2, y = 1 x = x0, y = y0 z = double (y); if (z == x) { if (x > y+10) { ERROR; } } } 28 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { z = double (y); x = 2, y = 1, x = x0, y = y0, if (z == x) { z = 2 z = 2*y0 if (x > y+10) { ERROR; } } } 29 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { z = double (y); if (z == x) { 2*y0 == x0 x = 2, y = 1, x = x0, y = y0, if (x > y+10) { z = 2 z = 2*y0 ERROR; } } } 30 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { z = double (y); if (z == x) { 2*y0 == x0 if (x > y+10) { x0 y0+10 ERROR; } } x = 2, y = 1, x = x0, y = y0, } z = 2 z = 2*y0 31 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { Solve: (2*y0 == x0) $ (x0 > y0 + 10) z = double (y); Solution: x0 = 30, y0 = 15 if (z == x) { 2*y0 == x0 if (x > y+10) { x0 y0+10 ERROR; } } x = 2, y = 1, x = x0, y = y0, } z = 2 z = 2*y0 32 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete symbolic path } state state condition void testme (int x, int y) { x = 30, y = 15 x = x0, y = y0 z = double (y); if (z == x) { if (x > y+10) { ERROR; } } } 33 Concolic Testing Approach Concrete Symbolic int double (int v) { Execution Execution return 2*v; concrete