Symbolic Execution
Michael Hicks
University of Maryland
and MC2
Software has bugs
2
Static analysis
4
Symbolic execution: a middle ground
• Testing works
■ But, each test only explores one possible execution
- assert(f(3) == 5)
■ We hope test cases generalize, but no guarantees
• Symbolic execution generalizes testing
■ Allows unknown symbolic variables in evaluation
- y = α; assert(f(y) == 2*y-1);
■ If execution path depends on unknown, conceptually
fork symbolic executor
- int f(int x) { if (x > 0) return 2*x - 1; else return 10; }
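As a minimal concrete illustration (our own OCaml transcription, not from the slides): a single test exercises only the x > 0 path, while the other path, which a symbolic executor would also explore, falsifies the property f y = 2*y - 1.

```ocaml
(* The function under test from the slide, transcribed to OCaml. *)
let f x = if x > 0 then 2 * x - 1 else 10

(* One concrete test exercises only the x > 0 path... *)
let () = assert (f 3 = 5)

(* ...while any x <= 0 takes the other path and falsifies
   f y = 2*y - 1 -- exactly the case a test suite may miss. *)
let () = assert (f 0 <> 2 * 0 - 1)
```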
5
Symbolic Execution Example
path condition
6
Insight
7
Early work on symbolic execution
• Robert S. Boyer, Bernard Elspas, and Karl N. Levitt. SELECT — a formal system for testing and debugging programs by symbolic execution. In ICRS, pages 234–245, 1975.
• James C. King. Symbolic execution and program testing. CACM, 19(7):385–394, 1976. (most cited)
• Leon J. Osterweil and Lloyd D. Fosdick. Program testing techniques using simulated execution. In ANSS, pages 171–177, 1976.
• William E. Howden. Symbolic testing and the DISSECT symbolic evaluation system. IEEE Transactions on Software Engineering, 3(4):266–278, 1977.
8
The problem
9
Today
10
[Figure: log-scale plot, y-axis from 1E+0 to 1E+18, x-axis spanning 1950 to 2020.]
Dongarra and Luszczek, Anatomy of a Globally Recursive Embedded LINPACK Benchmark, HPEC 2012 (Waltham, MA, September 10-12, 2012).
http://web.eecs.utk.edu/~luszczek/pubs/hpec2012_elb.pdf
Remainder of the tutorial
12
Symbolic Execution for IMP
a ::= n | X | a0+a1 | a0-a1 | a0×a1
b ::= bv | a0=a1 | a0≤a1 | ¬b | b0∧b1 | b0∨b1
c ::= skip | X:=a | goto pc | if b then pc | assert b
p ::= c; ...; c
• n ∈ N = integers, X ∈ Var = variables, bv ∈ Bool = {true, false}
• This is a typical way of presenting a language
■ Notice grammar is for ASTs
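The grammar above maps directly onto OCaml AST types. A sketch with constructor names of our own choosing (main.ml's actual names may differ):

```ocaml
(* One possible OCaml rendering of the IMP AST grammar. *)
type aexpr =
  | AConst of int                 (* n       *)
  | AVar   of string              (* X       *)
  | APlus  of aexpr * aexpr       (* a0 + a1 *)
  | AMinus of aexpr * aexpr       (* a0 - a1 *)
  | AMult  of aexpr * aexpr       (* a0 x a1 *)

type bexpr =
  | BConst of bool                (* bv       *)
  | BEq    of aexpr * aexpr       (* a0 = a1  *)
  | BLeq   of aexpr * aexpr       (* a0 <= a1 *)
  | BNot   of bexpr
  | BAnd   of bexpr * bexpr
  | BOr    of bexpr * bexpr

type com =
  | CSkip
  | CAssign of string * aexpr     (* X := a       *)
  | CGoto   of int                (* goto pc      *)
  | CIf     of bexpr * int        (* if b then pc *)
  | CAssert of bexpr

type prog = com array             (* c; ...; c -- indexed by pc *)
```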
13
Interpretation for IMP
• See main.ml
14
Symbolic Variables
• Add a new kind of expression
type aexpr = ... | ASym of string
type bexpr = ... | BSym of string
■ The string is the variable name
■ Naming variables is useful for understanding the output of
the symbolic executor
15
Symbolic Expressions
• Now change aeval and beval to work with symbolic
expressions
let rec aeval sigma = function
| ASym s -> new_symbolic_variable 32 s (* 32-bit *)
| APlus (a1, a2) ->
symbolic_plus (aeval sigma a1) (aeval sigma a2)
| ...
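For concreteness, here is one possible representation of the symbolic values these evaluators return, with symbolic_plus doing a little constant folding. The names and structure are illustrative, not necessarily main.ml's:

```ocaml
(* A symbolic value is either concrete, an unknown, or an
   unreduced symbolic expression over values. *)
type sval =
  | VInt  of int
  | VSym  of string           (* a symbolic variable  *)
  | VPlus of sval * sval      (* an unreduced sum     *)

(* Fold when both sides are concrete; otherwise stay symbolic. *)
let symbolic_plus v1 v2 =
  match v1, v2 with
  | VInt n1, VInt n2 -> VInt (n1 + n2)
  | _ -> VPlus (v1, v2)
```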
16
Symbolic State
• Previous step function, roughly speaking
cstep : sigma -> pc -> (sigma', pc')
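The symbolic step, by contrast, can fork: at a branch it returns one successor state per arm, each extending the path condition. A toy sketch (names are illustrative, with path conditions kept as strings):

```ocaml
(* A symbolic state: the program counter plus the path condition
   accumulated along the way. *)
type state = { pc : int; path_cond : string list }

(* At "if b then target", fork into the taken successor and the
   fall-through successor, recording b or its negation. *)
let fork_branch (st : state) (b : string) (target : int) : state list =
  [ { pc = target;    path_cond = b :: st.path_cond };
    { pc = st.pc + 1; path_cond = ("not " ^ b) :: st.path_cond } ]
```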
19
Path explosion
• Usually can’t run symbolic execution to exhaustion
■ Exponential in branching structure: the three independent branches below yield 2³ = 8 paths
int a = α, b = β, c = γ; // symbolic
if (a) ... else ...;
if (b) ... else ...;
if (c) ... else ...;
• Potential drawbacks of the two basic strategies, DFS and BFS
■ Neither is guided by any higher-level knowledge
- Probably a bad sign
■ DFS could easily get stuck in one part of the program
- E.g., it could keep going around a loop over and over again
■ Of these two, BFS is a better choice
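DFS and BFS differ only in the worklist discipline used to pick the next pending state. A toy sketch, assuming new states are consed onto the front of the worklist:

```ocaml
(* LIFO: take the newest state, diving deeper along one path. *)
let dfs_pick worklist =
  match worklist with
  | [] -> None
  | s :: rest -> Some (s, rest)

(* FIFO: take the oldest state, round-robining across paths. *)
let bfs_pick worklist =
  match List.rev worklist with
  | [] -> None
  | s :: rest_rev -> Some (s, List.rev rest_rev)
```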
21
Search strategies
• Need to prioritize search
■ Try to steer search towards paths more likely to contain
assertion failures
■ Only run for a certain length of time
- So if we don’t find a bug/vulnerability within time budget, too bad
22
Randomness
• We don’t know a priori which paths to take, so
adding some randomness seems like a good idea
■ Idea 1: pick next path to explore uniformly at random
(Random Path, RP)
■ Idea 2: randomly restart search if haven’t hit anything
interesting in a while
■ Idea 3: when there are equal-priority paths to explore, choose the next one at random
- All of these are good ideas, and randomness is very effective
• One drawback: reproducibility
■ Probably good to use pseudo-randomness based on a seed, and then record which seed is picked
■ (More important for symbolic execution implementers than
users)
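A minimal sketch of the reproducibility point: seed OCaml’s PRNG explicitly and record the seed, so a run that finds a bug can be replayed exactly:

```ocaml
(* Seeding the PRNG makes the "random" choice a pure function of
   the recorded seed, so the run can be reproduced. *)
let pick_random_path seed paths =
  Random.init seed;
  List.nth paths (Random.int (List.length paths))
```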
23
Coverage-guided heuristics
• Idea: Try to visit statements we haven’t seen before
• Approach
■ Score of a statement = # of times it has been visited so far
■ Pick the statement with the lowest score to explore next
• Why might this work?
■ Errors are often in hard-to-reach parts of the program
■ This strategy tries to reach everywhere.
• Why might this not work?
■ May never be able to reach a statement if the proper precondition has not been set up
• KLEE = RP + coverage-guided
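A sketch of the scoring idea: count visits per program point (pc) in a hash table and pick the least-visited pending pc. This is illustrative, not KLEE's actual code:

```ocaml
(* Visit counts per pc. *)
let seen : (int, int) Hashtbl.t = Hashtbl.create 64

let record pc =
  let n = try Hashtbl.find seen pc with Not_found -> 0 in
  Hashtbl.replace seen pc (n + 1)

let score pc = try Hashtbl.find seen pc with Not_found -> 0

(* Among the pending pcs, keep the one with the lowest score. *)
let pick_least_covered pcs =
  List.fold_left
    (fun best pc ->
      match best with
      | Some b when score b <= score pc -> best
      | _ -> Some pc)
    None pcs
```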
24
Generational search
• Hybrid of BFS and coverage-guided
• Generation 0: pick one path at random and run it to completion
• Generation 1: take paths from gen 0, negate one
branch condition on a path to yield a new path
prefix, find a solution for that path prefix, and then
take the resulting path
■ Note: it will semi-randomly assign values to any variables not constrained by the path prefix
• Generation n: similar, but branching off gen n-1
• Also uses a coverage heuristic to pick priority
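The branch-negation step above can be sketched as follows: given a path as a list of branch conditions, each child flips one condition and drops the suffix, yielding a new path prefix to hand to the solver (conditions kept as strings for illustration):

```ocaml
(* children ["a"; "b"] = [ ["not (a)"]; ["a"; "not (b)"] ] --
   one child per branch on the parent path. *)
let children path =
  let rec go prefix = function
    | [] -> []
    | b :: rest ->
      List.rev (("not (" ^ b ^ ")") :: prefix) :: go (b :: prefix) rest
  in
  go [] path
```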
25
Combined search
• Run multiple searches at the same time
• Alternate between them
■ E.g., Fitnext
26
SMT solver performance
• SAT solvers are at core of SMT solvers
■ In theory, could reduce all SMT queries to SAT queries
■ In practice, SMT and higher-level optimizations are critical
• Some examples
■ Simple identities (x + 0 = x, x * 0 = 0)
■ Theory of arrays (read(42, write(42, x, A)) = x)
- 42 = array index, A = array, x = element
■ Caching (memoize solver queries)
■ Remove useless variables
- E.g., if trying to show a path feasible, only the parts of the path condition related to variables in the guard are important
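A sketch of the caching idea: memoize query results in a hash table so an identical query never reaches the solver twice (fake_solve stands in for a real SMT call):

```ocaml
let cache : (string, bool) Hashtbl.t = Hashtbl.create 64
let solver_calls = ref 0

(* Stand-in for an expensive SMT query. *)
let fake_solve query =
  incr solver_calls;
  query <> "false"

(* Consult the cache first; only call the solver on a miss. *)
let solve_cached query =
  match Hashtbl.find_opt cache query with
  | Some r -> r
  | None ->
    let r = fake_solve query in
    Hashtbl.add cache query r;
    r
```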
27
Libraries and native code
• At some point, symbolic execution will reach the
“edges” of the application
■ Library, system, or assembly code calls
• In some cases, could pull in that code also
■ E.g., pull in libc and symbolically execute it
■ But glibc is insanely complicated
- Symbolic execution can easily get stuck in it
■ Could instead pull in a simpler version of libc, e.g., newlib
- libc versions for embedded systems tend to be simpler
• In other cases, need to make models of code
■ E.g., implement ramdisk to model kernel fs code
■ This is a lot of work!
28
Concolic execution
29
Concretization
31
Recent successes, run on binaries
• SAGE
■ Microsoft (Godefroid) concolic executor
■ Symbolic execution to find bugs in file parsers
- E.g., JPEG, DOCX, PPT, etc
■ Cluster of n machines continually running SAGE
• Mayhem
■ Developed at CMU (Brumley et al), runs on binaries
■ Uses BFS-style search and native execution
■ Automatically generates exploits when bugs found
32
KLEE
33
KLEE: Coverage for Coreutils
[Figure 6: bar chart of klee vs. Manual (ELOC %), from −100% to 100%, across applications 1–75.]
Figure 6: Relative coverage difference between KLEE and the COREUTILS manual test suite, computed by subtracting the executable lines of code covered by manual tests (Lman) from KLEE tests (Lklee) and dividing by the total possible: (Lklee − Lman)/Ltotal. Higher bars are better for KLEE, which beats manual testing on all but 9 applications, often significantly.
Cadar, Dunbar, and Engler. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, OSDI 2008
34
KLEE: Coreutils crashes
paste -d\\ abcdefghijklmnopqrstuvwxyz
pr -e t2.txt
tac -r t3.txt t3.txt
mkdir -Z a b
mkfifo -Z a b
mknod -Z a b p
md5sum -c t1.txt
ptx -F\\ abcdefghijklmnopqrstuvwxyz
ptx x t4.txt
seq -f %0 1
t1.txt: "\t \tMD5("
t2.txt: "\b\b\b\b\b\b\b\t"
t3.txt: "\n"
t4.txt: "a"
Figure 7: KLEE-generated command lines and inputs (modified for readability) that cause program crashes in COREUTILS version 6.10 when run on Fedora Core 7 with SELinux on a Pentium machine.
"KLEE achieves 76.9% overall branch coverage, while the test suites …"
Cadar, Dunbar, and Engler. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, OSDI 2008
35
Other symbolic executors
36
Research tools at UMD
37
Lab
38