Open-Source Model Checking
R. Grosu1 , X. Huang1 , S. Jain2 , and S.A. Smolka1
1
Dept. of Computer Science, Stony Brook Univ., Stony Brook, NY 11794, USA
2
Intel Corporation, Hillsboro, OR 97124, USA
E-mail: {grosu,xhuang,sumit,sas}@cs.sunysb.edu, sumit.jain@intel.com
Abstract. We present GMC2 , a software model checker for GCC, the opensource compiler from the Free Software Foundation (FSF). GMC2 , which
is part of the GMC static-analysis and model-checking tool suite for GCC
under development at SUNY Stony Brook, can be seen as an extension of
Monte Carlo model checking to the setting of concurrent, procedural programming languages. Monte Carlo model checking is a newly developed
technique that utilizes the theory of geometric random variables, statistical hypothesis testing, and random sampling of lassos in Büchi automata
to realize a one-sided error, randomized algorithm for LTL model checking. To handle the function call/return mechanisms inherent in procedural languages such as C/C++, the version of Monte Carlo model checking
implemented in GMC2 is optimized for pushdown-automaton models. Our
experimental results demonstrate that this approach yields an efficient
and scalable software model checker for GCC.
1
Introduction
During the past 15 years, GCC has evolved from a modest C compiler, to a fullblown, multi-language compiler that can generate code for more than 30 target
architectures. The set of programming languages handled by GCC now includes
C, C++, Objective-C, Fortran, Java, and Ada. This diversity of languages and
architectures has made GCC one of the most popular compilers in current use.
Traditionally, GCC has translated source code directly to RTL (register transfer
level), a very low-level intermediate language, before applying any optimizations.
This inevitably rendered the optimizations performed as low level, since higherlevel semantic information such as data types, structures and fields were lost
during translation. To remedy this situation, the Tree-SSA branch [19] of GCC
has resulted in the addition of two new intermediate languages to GCC: GENERIC,
which provides a common infrastructure for abstract syntax tree analysis and
optimization; and GIMPLE three-address code, which provides a common infrastructure for CFG (control flow graph) analysis and optimization.
Together with their associated APIs, GENERIC and GIMPLE make Tree-SSA suitable as a platform not only for the development of high-level code-optimization
techniques, but also for new static-analysis tools, applicable to all of GCC’s input
languages. The acceptance of the Tree-SSA branch by the open-source community has led, during the past year, to it being merged with the main line in
Release Version 3.5.
In this paper, we describe a software model checker for GCC that we have designed and implemented at the Tree-SSA level. Our model checker, which we call
GMC2 for GCC-based Model Checking, is an extension of the technique of Monte
Carlo model checking [8] to the setting of concurrent, procedural programming
R. Grosu, X. Huang and S. Jain were partially supported by the NSF Faculty Early
Career Development Award CCR01-33583.
languages. Monte Carlo model checking is a newly developed technique that
utilizes the theory of geometric random variables, statistical hypothesis testing, and random sampling of lassos in Büchi automata to realize a one-sided
error, randomized algorithm for LTL model checking. To handle the function
call/return mechanisms inherent in procedural languages such as C/C++, the
version of Monte Carlo model checking implemented in GMC2 is optimized for
pushdown-automaton models.
At the heart of GMC2 is a GIMPLE CFG interpreter interpret that traverses
CFGs using Tree-SSA statement iterators succ, tsucc and fsucc, interpreting each
statement encountered according to its semantics. Of particular interest is the
manner in which process creation and synchronization statements are processed,
which force a return whenever a context switch is required, as well as function
invocation and return statements, which induce a hierarchic structure on the
hash table GMC2 utilizes for lasso detection.
GMC2 and interpret are part of the GMC suite of analysis and verification
tools we are developing for the Tree-SSA level of GCC, which additionally includes
an intra-procedural slicer and a BDD implementation of a symbolic-execution
engine for GIMPLE CFGs. The tool suite is intended to provide an open-source
framework for GCC-based static analysis and model-checking.
The main contributions of this paper can be summarized as follows.
– By virtue of being implemented at the Tree-SSA level, GMC2 is at once a
software model checker for each of GCC’s 6 input languages and more than 30
target architectures.
– GMC2 is an open-source model checker : its integration into GCC renders it readily and widely accessible for usage, critique, and extension by the open-source
community.
– GMC2 implements the technique of Monte Carlo model checking [8] within
the setting of concurrent procedural programming languages. The version of
Monte Carlo model checking implemented in GMC2 is therefore optimized for
pushdown-automaton models.
– Our experimental results demonstrate that the Monte Carlo approach yields
an efficient and scalable software model checker for GCC.
The rest of the paper develops along the following lines. Section 2 considers
the technique of Monte Carlo model checking. Section 3 provides an overview of
the GCC compilation process. Section 4 describes GMC2 , our software model checker
for GCC, while Section 5 summarizes our experimental results. Section 6 discusses
related work. Section 7 contains our conclusions and directions for future work.
The GMC2 model checker is available from [21].
2
Monte Carlo Model Checking
Monte Carlo model checking [8] performs random sampling of lassos in a Büchi
automaton (BA) to realize a one-sided error, randomized algorithm for LTL
model checking. In this section, we provide an overview of this technique. In
Section 4, we show how to extend this technique to hierarchic Büchi automata
(HBA) in the context of software model checking.
2
Büchi automata. A Büchi automaton A = (Σ, Q, Q0 , δ, F ) is a five-tuple
where: Σ is a finite input alphabet; Q is a finite set of states; Q0 ⊆ Q is the set
of initial states; δ ⊆ Q × Σ × Q is the transition relation; F ⊆ Q is the set of
accepting states. We assume, without loss of generality, that every state of a BA
has at least one outgoing transition, even if this transition is a self-loop.
a
a1
a
..., where s0 ∈ Q0 and for all i ≥ 0, si →i si+1 ∈ δ
A sequence σ = s0 →0 s1 →,
is called an infinite run of A if the sequence is infinite and a finite run otherwise.
An infinite run is called accepting if there exists an infinite set of indices J ⊆ N,
such that for all i ∈ J, si ∈ F .
We say that σ is ultimately periodic if there exist i ≥ 0, l ≥ 1 such that for
a
all j ≥ 0, si+j = si+j mod l . This means that σ consists of a finite prefix s0 →0
a
ai−1
a
i+l−1
· · · si−1 → , followed by the “infinite unfolding” of a cycle si →i · · · → si .
The cycle is called simple if for all 0 ≤ j = k < l, si+j = si+k ; i.e., the cycle
does not visit the same node twice. In the following, we shall refer to such a
reachable simple cycle as a lasso, and say that a lasso is accepting if its simple
cycle contains an accepting state.
Let S be a concurrent system, AS the BA encoding S’s state transition graph,
and ϕ an LTL property. Using the tableau method, one can construct a Büchi
automaton A¬ϕ accepting the same language as ¬ϕ [5]. The LTL model-checking
problem AS |= ϕ is then naturally defined in terms of the emptiness problem for
B = AS × A¬ϕ , which reduces to finding accepting lassos in B [22].
Random lassos and hypothesis testing. Instead of searching the entire state
space of B for accepting lassos, we successively generate up to M lassos of B on
the fly, by performing random walks in B. The walks are uniform in the sense
that they are generated by imposing a uniform distribution on the outgoing
transitions of the current state along the walk. If the currently generated lasso
is accepting, we have found a counter-example to emptiness, and stop.
To determine the number M of lassos we need to generate, we aim to answer,
with confidence 1−δ and within error margin , the following question: how
many independent lassos do we need to generate until one of them is accepting?
The answer is based on the theory of geometric random variables and statistical
hypothesis testing. Let X be geometric random variable parameterized by the
Bernoulli random variable Z (defined below) that takes value 1 with probability
pZ and value 0 with probability qZ = 1 − pZ . Intuitively, pZ is the probability
that an arbitrary lasso of B is accepting.
The cumulative distribution function of X for N independent trials of Z is:
F (N ) = Pr[X ≤ N ] = 1 − (1 − pZ )N . Requiring that F (N ) = 1 − δ yields:
N = ln(δ)/ ln(1 − pZ ). Given that pZ is what we wish to determine, we assume
for the moment that pZ ≥ . Replacing pZ with yields M = ln(δ)/ ln(1 − )
which is greater than N and therefore Pr[X ≤ M ] ≥ Pr[X ≤ N ] = 1 − δ.
Summarizing:
pZ ≥
⇒
Pr[X ≤ M ] ≥ 1 − δ
where
M = ln(δ)/ ln(1 − )
(1)
Inequation 1 gives us the minimal number of attempts M needed to achieve
success with confidence ratio δ, under the assumption that pZ ≥ . The standard
way of discharging such an assumption is to use statistical hypothesis testing (see
3
e.g. [18]). Define the null hypothesis H0 as the assumption that pZ ≥ . Rewriting
inequality 1 with respect to H0 we obtain:
Pr[X ≤ M | H0 ] ≥ 1 − δ
(2)
We now perform M trials. If no counterexample is found, i.e., if X > M , we
reject H0 . This may introduce a type-I error: H0 may be true even though we
did not find a counter-example. However, the probability of making this error
is bounded by δ; this is shown in inequality 3 which is obtained by taking the
complement of X ≤ M in inequality 2:
Pr[X > M | H0 ] < δ
(3)
Because we seek to attain a one-sided error decision procedure, we do not consider type-II errors in our application of hypothesis testing: as soon as we find a
counter-example, we stop sampling and decide (with probability 1) that A |= ϕ.
The Monte Carlo model-checking algorithm. For a BA B, define the
probability space (P(L), Pr), where L = La ∪ Ln is the set of all lassos of B
and La and Ln are the sets of all accepting and non-accepting lassos of B,
an−1
a0
respectively. The probability Pr[σ] of a lasso σ = s0 →.
. . → sn is defined
an−1
a0
inductively as follows: Pr[s0 ] = k −1 if |Q0 | = k and Pr[s0 →.
. . → sn ] =
an−1
an−2
a0
a
a
Pr[s0 →. . . → sn−1 ] · π[sn−1 → sn ] where π[s→s ] = m−1 if s→s ∈ δ and
|δ(s)| = m. That (P(L), Pr) is actually a probability space is established in [8].
Example 1 (Probability of lassos). Consider BA B of Figure 1. It contains four
lassos, 11, 1244, 1231 and 12344, having probabilities 1/2, 1/4, 1/8 and 1/8,
respectively. Lasso 1231 is accepting.
1
2
3
4
Fig. 1. Example lasso probability space.
Definition 1 (Lasso Bernoulli variable). The random variable Z associated
with the probability space(P(L), Pr) of a Büchi automaton B is
defined as follows: pZ = Pr[Z = 1] = λa ∈La Pr[λa ] and qZ = Pr[Z = 0] = λn ∈Ln Pr[λn ].
Example 2 (Lassos Bernoulli variable). For the Büchi automaton B of Figure 1,
the lassos Bernoulli variable has associated probabilities pZ = 1/8 and qZ = 7/8.
Having defined Z, X and H0 , we are now ready to present our Monte Carlo
decision procedure for emptiness checking of Büchi automata, called MC2 in [8].
MC2 consists of three statements. The first uses inequation 1 to determine the
value for M , given parameters and δ. The second statement is a for-loop that
successively samples up to M lassos by calling the random lasso (rLasso) routine,
described in Section 4. If an accepting lasso l is found, MC2 decides false and
returns l as a counter-example. If no accepting lasso is found within M trials,
MC2 decides true, and reports that with probability less than δ, pZ > .
4
bool × lasso MC2 (BA B = (Σ, Q, Q0 , δ, F), float 0 < , δ < 1)
{
M = ln δ / ln(1 − );
for (i = 1; i ≤ M; i++) if (rLasso(B)==(true,l)) return (false,l);
return (true,nil); /* Pr[X > M | H0 ] < δ */;
}
Theorem 1 ([8]). Given a Büchi automaton B and parameters and δ, if MC2
returns false, then L(B) = ∅. Otherwise, Pr[X > M | H0 ] < δ where M =
ln(δ)/ ln(1 − ) and H0 ≡ pZ ≥ .
MC2 is very efficient in both time and space. The recurrence diameter of a Büchi
automaton B is the longest loop-free path in B starting from an initial state.
Theorem 2 ([8]). Let B be a Büchi automaton, D its recurrence diameter and
M = ln(δ)/ ln(1 − ). Then MC2 runs in time O(M D) and uses O(D) space.
In the worst case, D is exponential in |S| + |ϕ| and thus MC2 ’s does not improve
on the space complexity of a typical model checker. In practice, however, one
can expect MC2 to perform much better than this.
3
Overview of GCC
The block diagram of Figure 2 provides an overview of the GCC compilation
process from source input file to object code.
C
File
C++
File
Java
File
C Parser
C++ Parser
...
Java Parser
Front End
Parse
Tree
Genericize
GEN
AST
Gimplify
GPL
AST
BldCFG
SSA/
GPL
CFG
RstCMP
RTL
Code
CodeGen
Obj
Code
Tree SSA Framework
Back End
Fig. 2. Block diagram of the GCC compilation process.
The language-specific front-end of GCC translates a source input file to a (languagespecific) parse tree. The back end is largely language independent, and handles
code optimization and final code generation. Traditionally, GCC translated source
code directly to RTL (register transfer level), a very low-level intermediate language, before applying any optimizations. This inevitably rendered the optimizations performed as low level, since higher-level semantic information such
as data types, structures and fields were lost during translation.
Tree-SSA. To remedy this situation, the Tree-SSA branch [19] of GCC has resulted in the addition of two new intermediate languages (ILs): GENERIC and
GIMPLE [16]. Together with their APIs, these ILs make Tree-SSA suitable as a platform not only for the development of high-level code optimization techniques,
but also for new static analysis tools, applicable to all of GCC’s input languages.
The acceptance of Tree-SSA by the open-source community has led, during the
past year, to it being merged with the mainline in Release Version 3.5.
5
int main() {
int a, b, c;
a = 5;
b = a + 10;
c = b + foo(a, b);
if (a > b + c)
c = b++ / a + (b * a);
bar(a, b, c); }
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
=>
int main {
int a, b, c;
int T1, T2, T3, T4;
a = 5;
b = a + 10;
T1 = foo(a, b);
T2 = b + T1;
if (a > T2) goto fi
T3 = b / a;
T4 = b * a;
c = T2 + T3
b = b + 1;
fi: bar (a, b, c); }
Fig. 3. Sample C program and corresponding GIMPLE representation.
GENERIC realizes a common infrastructure for AST-level (abstract syntax tree)
analysis and optimization by providing a language-independent IL for all parsetree constructs produced by the language-specific front ends. GIMPLE is a C-like
three-address (3A) code which provides a common infrastructure for CFG-level
(control flow graph) analysis and optimization. As usual, complex expressions
(possibly with side effects) are broken into simple 3A statements by introducing
new, temporary variables. Similarly, complex control statements are broken into
simple 3A (conditional) gotos by introducing new labels.
Syntactically, GIMPLE is a subset of GENERIC. Each GIMPLE tree is used to construct a GIMPLE-CFG, which itself can subsequently be converted to static singleassignment (SSA) form. Figure 3 shows a C program and its corresponding GIMPLE
representation, which preserves source-level information such as data types and
procedure calls. While not shown in the example, GIMPLE types also include
pointers and structures.
Once a function is translated to GIMPLE form, the Tree-SSA framework builds
its associated control flow graph. Each node in the CFG is linked to a basic
block (sequence of non-branching instructions), represented as a GIMPLE AST. CFG
transitions correspond to (conditional) goto instructions. For example, the CFG
for the GIMPLE program of Figure 3 is shown in Figure 4.
FUNCTION_DECL
a
int
Entry
A
a = 5;
b = a + 10;
T1 = foo(a, b);
T2 = b + T1;
if (a>T2) goto B;
B
T3 = b / a;
T4 = b * a;
c = T3 + T4;
b = b + 1;
true
talse
b
c
int
T2
int
CE
a
T3
int
T4
int
CE: Compound Exp
=
CE
=
5
CE
+
b
C
bar (a, b, c);
return;
Exit
T1
int
int
CE
=
a
=
if
10
T1 Call Exp
foo
GIMPLE TREE
a
T2
b
B
+
b
T1
Fig. 4. Control flow graph for example C program.
6
>
a
T2
The Tree-SSA API provides functions to manipulate and traverse CFGs, their
associated ASTs, and their list of variables. For example, succ returns the address
of the immediate successor of a non-branching statement, and tsucc and fsucc
return the address of the immediate true and respectively false successor of a
branching statement. Similarly, if a is a variable in a CFG, its type attribute
a.type is also a GIMPLE AST containing various information such as type name,
size, and alignment. Basic data-flow, control-flow, alias and reachability-analysis
routines are also provided by the Tree-SSA API.
4
Monte Carlo Software Model Checking
We have implemented a software model checker for GCC based on the generic
Monte-Carlo model-checking algorithm of Section 2. Our model checker, GMC2 , is
applicable to any program written in one of the procedural languages supported
by GCC, e.g. C. Call this program the target program to be verified. GMC2 also requires as input a procedure or function, call it the property function, representing
the LTL property of interest. The target program can contain concurrency primitives similar to those supported by the Verisoft model checker [6]. In the case
of safety properties, the property function is called to check for property violations in the target program. In the case of liveness properties, the property
function is called to check if an accepting state of the target program is visited
infinitely often, viewing the target program as a succinct representation of a
Büchi automaton.
GMC2 operates at the Tree-SSA level and assumes that the target program
and property function have been compiled into CFGs. Let P be the array of CFGs
corresponding to the target program, one for each of its functions, and let ϕ be
the CFG for the property function. At the heart of GMC2 is a CFG interpreter that
traverses the CFGs in P using Tree-SSA’s statement iterators and interprets the
statements contained in the CFGs according to their semantics. This allows GMC2
to generate the random lassos of the target program on the fly.
4.1
The Main Routine
Due to space considerations, we limit our discussion to the treatment of safety
properties. Given an array of P of CFGs for the target C program, a CFG ϕ for
the C function encoding a safety property, and parameters and δ, GMC2 successively generates at most ln(δ)/ ln(1−) random lassos of P; see Section 2. While
generating a lasso, ϕ is called to check whether or not ϕ is violated in the newly
reached program state. If so, GMC2 stops and returns the counter-example path
leading to the violating state. If all states of all sampled executions satisfy ϕ, GMC2
.
stops and reports with confidence greater than 1−δ that it rejects H0 = pZ ≥ .
2
At the heart of GMC is the rLasso routine for generating random lassos; rLasso
conducts a random execution of the CFGs in P by interpreting their (possibly
concurrent) C statements and checking for property violations.
7
4.2
The rLasso Random-Lasso Routine
In order to detect (global) lassos, the (concurrent) program state is stored in a
hash table ht each time a context switch occurs. This is for efficiency purposes:
the alternative, less efficient approach would be to store the program state after
each statement execution. To ensure the soundness of this approach, we assume
that the time between context switches is finite.
bool × lasso rLasso() /* global cfg array P, cfg ϕ */
{
hashTbl ht = ∅; ready list ready = ∅; bool × state (f,s) = rInit();
while (s ∈ ht) {
insert(ht,s); if (¬f) return (true,lasso(ht));
(f,s) = rNext(s); }
return (false,lasso(ht));
}
Hash table. The ht hash table is optimized so that common information among
global states is shared. It is also hierarchic in the sense that all states belonging
to a callee are linked to each other so that they can easily be removed from ht
when the callee returns.
The pseudo-code for the rLasso routine is given above. The first line sets
ht and ready to empty, and initializes the violation flag f and the current-state
variable s by calling routine rInit. The while-loop searches for (violating) lassos.
If the current state s is not in ht, then it is a new state and is inserted in
ht. If it is also violating, signaled by ¬f being true, then a violating lasso was
found, which is returned together with the corresponding flag to GMC2 . Otherwise,
another random next state is generated by calling rNext.
4.3
Routines rInit and rNext
Given a set V of typed variables, a valuation (or environment) of V is a mapping
of variables in V to their type-correct values. If Γ and Γ are lists, and σ is
a list element, we write concat(Γ, Γ), append(Γ, σ) and rest(Γ) for the lists
obtained by concatenating Γ and Γ , appending σ to Γ, and taking the rest of Γ,
respectively, and we write Γ(i) for the i-th element of Γ. If ∆ is a stack and φ a
stack element, then we write push(∆, φ), pop(∆) and ∆.φ for pushing φ onto the
stack, popping the stack, and for the topmost element on the stack, respectively.
If s is a statement, i.e., AST of a CFG, then s.a is a child of s.
Program state. The state Σ = (χ, Γ) of a concurrent C program consists of
a valuation χ of the shared variables (channels and semaphores) and a list Γ
of process states, one for each active process. The list is ordered by the order
of process creation. The state σ = (κ, δ) of a process has two components: the
control state κ and the data state δ. The control state κ = (γ, ν) consists of a
function name γ and a statement number ν within γ. The data state δ = (π, β, ∆)
consists of a heap π, a valuation of global variables β and a frame stack ∆. Each
frame φ = (κ, ρ) of ∆ contains a return control state κ to the caller CFG and a
valuation ρ for the local variables of the callee CFG.
8
Routine rInit. Execution of P starts in a random state Σ0 defined as follows.
All channels in χ0 are empty and all semaphores are 0. The process-state list Γ0
contains only the state σ0 of the root process. The control state κ0 of the root
process has function main of P in γ0 and 0 in ν0 .
bool × prgState rInit() /* global cfg array P, cfg ϕ */
{
sharedState χ = χ0 ; procStates Γ = ∅; frameStack ∆ = ∅;
cfgNm γ = main; stmNo ν = 0; controlState κ = (γ,ν);
lclEnv ρ = ρ0 ; forall (x ∈ dom(P[γ].param)) ρ[x] = random(P[γ].param.type);
frame φ = (trap,ρ); push(∆,φ); dataState δ = (π0 ,β0 ,∆);
procState σ = (κ,δ); append(Γ,σ); prgState Σ = (χ,Γ);
if eval(ϕ) return (true,Σ) else return (false,Σ);
}
The data state δ0 of the root process consists of the empty heap π0 , valuation β0
of the global variables, and stack frame ∆0 with only frame φ0 of main pushed.
This frame has a predefined return control state trap (e.g. the stop point) and a
valuation ρ0 for the local variables. The valuation of the formal parameters in ρ0
is chosen randomly within their corresponding range. Function eval evaluates a
CFG in the current state and returns its value.
bool × prgState rNext(prgState s)
{
/* global cfg array P, hashTbl h, CFG ϕ, ready list ready */
int i = random(|ready|); int nxt = ready[i];
return interpret(s,nxt);
}
Routine rNext. Routine rNext randomly selects one of the ready processes and
interprets it by calling routine interpret, described next. It regains control when
interpret reaches a concurrency statement, which requires a context switch.
4.4
Routine interpret
Routine interpret traverses the CFGs in P, using statement iterators succ, tsucc
and fsucc, and interprets each statement according to its semantics. Of particular
interest are the process creation and synchronization statements, which force a
return whenever a context switch is required, as well as function invocation and
return statements, which induce a hierarchic structure on the hash table.
Since interpret may generate several states before it returns, it has to check
whether property ϕ is true in all of them. Properties to be checked may also
be inserted in a program, as assert (p) statements. The interpreter then checks
whether predicate p is true in the current state and returns with a violation if
this is not the case.
The pseudo-code for interpret is given below. Its body is an infinite loop,
which according to the type of the current statement ν within the current CFG
P[γ], undertakes the actions defining the semantics of the statement. Due to space
limitations, we consider a representative susbset of statement types, which does
not include heap and pointer manipulation statements.
9
bool × prgState interpret(prgState Σ, int i)
{
/* global cfg array P, hashTbl ht, cfg ϕ, ready list ready */
channels χ = Σ.χ; procStates Γ = Σ.Γ; procState σ = Γ[i];
while (true) {
cfgNm γ = σ.κ.γ; stmtNo ν = σ.κ.ν;
frameStack ∆ = σ.δ.∆; globalEnv β = σ.δ.β;
switch (P[γ][ν].type) of
if:
/* if e goto t */ {
ν = (eval(P[γ][ν].exp)) ? tsucc(P[γ][ν]) : fsucc(P[γ][ν]); }
assert: /* assert(e) */ {
if (!eval(P[γ][ν].exp)) return (false,Σ); ν = succ(P[γ][ν]); }
assign: /* x = rhs */ {
if (P[γ][ν].rhs.type == expr) { /* rhs == e */
ν = succ(P[γ][ν]); (∆.ρ:β)[P[γ][ν].var] = eval(P[γ][ν].rhs); }
else if (P[γ][ν].rhs.fnc == toss) { /* rhs == toss(e) */
ν = succ(P[γ][ν]);
(∆.ρ:β)[P[γ][ν].var] = random(eval(P[γ][ν].rhs.exp)); }
else if (P[γ][ν].rhs.fnc == fork) { /* rhs == fork() */
ν = succ(P[γ][ν]); (∆.ρ:β)[P[γ][ν].var] = 0;
append(Γ,((γ,ν),(π,β,∆))); (∆.ρ:β)[P[γ][ν].var] = |Γ|-1; }
else if (P[γ][ν].rhs.fnc == recv) { /* rhs == recv(c) */
c = P[γ][ν].rhs.chnl;
if (empty(χ.c)) {append(χ.c.swait,i); return (true,Σ); }
ν = succ(P[γ][ν]); (∆.ρ:β)[P[γ][ν].var] = fst(χ.c.queue));
rest(χ.c.queue); concat(ready,χ.c.swait); χ.c.swait = ∅; }
else { /* rhs == f(a) */
a = eval(P[γ][ν].rhs.act); κ = (γ,ν);
γ = P[γ][ν].rhs.fnc); ν = 0;
push(∆,(κ,ργ,0 )); (∆.ρ)[P[γ].fpar] = a; }
return: /* return e */ {
e = eval(P[γ][ν].exp); (γ,ν) = ∆.κ; popLocal(ht); pop(∆);
(∆.ρ:β)[P[γ][ν].var] = e; }
send:
/* send(c,e) */ {
c = P[γ][ν].rhs.chnl;
if (full(χ.c)) {append(χ.c.rwait,i); return (true,Σ);}
ν = succ(P[γ][ν]); append(χ.c.queue,eval(P[γ][ν].rhs.exp));
concat(ready,χ.c.rwait); χ.c.rwait = ∅;
}
σ = ((γ,ν),(π,β,∆)); Γ[i] = σ; Σ = (χ,Γ);
if (!eval(ϕ)) return (false,Σ);
}
}
For the sequential intra-procedural group of statements, we discuss the interpretation of if and (simple) assignment. The former evaluates the predicate in
10
the current state and branches to the appropriate location by modifying ν. The
latter evaluates the right-hand side expression and updates the corresponding
local environment ∆.ρ (within the frame on the top of the frame stack) or global
environment β, on the location given by the left-hand side variable, accordingly.
By writing ∆.ρ : β we mean that both valuations are considered and that ∆.ρ has
precedence over β; i.e., we first search the variable in the local valuation.
The modeling and verification statements presented are toss and assert. For
toss, the interpreter first evaluates the argument expression to obtain a value v,
and then it randomly generates a number within the range [0, v]. The obtained
number is assigned to the location given by the left-hand side variable, in either
∆.ρ or β. For assert, the interpreter checks whether the predicate is true in
the current state. If this is not the case, it returns false and Σ. Otherwise, it
continues with the next statement by updating ν.
The inter-procedural statements presented are call and return. For call, a
new frame φ = (κ, ρ) is allocated on top of the frame stack ∆; κ is the current control state; ρ has the local variables of the target function γ initialized
accordingly by ργ,0 and the formal parameters evaluated in the current state.
Control is then moved to the callee by updating γ and ν accordingly. For return,
the interpreter does the following. First, it evaluates the return expression in a
temporary variable. It then restores the control state from the frame stack, pops
the frame stack and erases all the states corresponding to the callee from the
hash table. Finally it assigns the temporary variable, to the location given by
the variable of the statement pointed to by the control state, in either ∆.ρ or β.
The concurrency primitives considered so far include process creation, channels and semaphores. For simplicity, we only discuss fork, send and recv. The
other are treated in a similar manner. The fork statement is handled by creating
a new process (state) in Γ which is identical to the current except for the value
assigned to the variable on the left-hand side of the fork assignment statement.
This is zero for the child process and the index in Γ of the new process for the
parent process. The send statement is treated as expected. If the channel is full,
the process is put in the send wait queue of the corresponding channel, and
control is returned to rNext. Otherwise, the message expression is evaluated and
appended to the channel. Moreover, the process waiting in the receive queue
of the channel is awaken, by moving it to the ready list. The recv primitive is
treated in a similar way.
5
Experimental Results
To assess the performance and scalability of GMC2 , we compared it to VeriSoft, a
popular software model checker from Lucent Technologies [6], on two C benchmarks: dining philosophers and the Needham-Schroeder. VeriSoft and GMC2 were
given the same C source files as input, each of which can be downloaded from [21].
We also ran GMC2 on the TCAS benchmark. All GMC2 experiments were performed
on an Athlon 2600+ MHz processor with 1GB RAM running Linux 2.6.5.
Dining philosophers. For this classical synchronization problem, we used
a faulty symmetric but fair variant, where the number of philosophers varied
11
from 4 to 16. The safety property we checked was deadlock freedom. Our experimental results are given in Table 1. The meaning of the column headings is
the following: phi. is the number of philosophers; time is the execution time in
mins:secs; ce.len is the length of the counter-example found; states is the number of states VeriSoft visited until finding an error; transitions is the number of
transitions that VeriSoft traversed. The VeriSoft experiments were performed
on Sun Sparc Ultra-5.10 server running SunOS 5.6. Our experience shows that
the Athlon/Linux environment performs approximately 3.4 times faster than the
Sparc/SunOS environment.
phi.
4
6
8
10
12
14
16
time
0:00.07
0:00.11
0:00.78
0:02.17
0:04.82
0:06.22
0:11.56
GMC2
VeriSoft
samples ce.len. time states transitions
2
12 0:00.61 16
37
4
12 0:16.60 773
1171
11
20 2:57.29 5431
8449
31
24 10:41 17908 31433
24
27 > 2hr N/A
N/A
22
44 > 2hr N/A
N/A
14
32 > 2hr N/A
N/A
Table 1. Deadlock freedom for the symmetric and fair C implementation.
Needham-Schroeder protocol. This classic public-key protocol provides mutual authentication for two parties, before they engage in a transaction. In 1995,
Lowe first reported a flaw in the protocol [14], by exhibiting an attack involving
six message exchanges. Suppose A is the initiator, B is the responder and I is
the intruder. Then the attack is as follows: (i) A sends a nonce to I. (ii) I sends
same nonce to B. (iii) B sends the above received nonce and its new nonce to I.
(iv) I sends the above received message to A. (v) A validates the authenticity of
I and sends the second nonce from the message back to I. (vi) I sends this nonce
back to B which now also validates I. We checked for the existence of the above
attack in a C implementation of the protocol we obtained from Patrice Godefroid, who we greatfully acknowledge. GMC2 found it in 6 hours and 37 minutes
after having checked 10,682,639 lassos.
The same example and implementation was used in [7] to evaluate a novel
genetic algorithm. The time usage reported there is 2 hours and 33 minutes to
find 3 errors, which is superior to GMC2 on this benchmark. They also attempted
exhaustive and randomized search algorithms on this C program, neither of
which could find an error in 8 hours. Their experiments were performed on a
Pentium III 700 MHz processor with 256 MB RAM. Unfortunately, the genetic
version of VeriSoft is not publicly available, and we could not reproduce this
result on our own machine. Its superior performance might be explained by
the sequential nature of the protocol implementation, which essentially executes
only one round of a reactive system. In this round, the system either deadlocks,
produces a counterexample or it behaves correctly. Hence, lasso search seems to
be less useful in this case than applying genetic heuristics.
TCAS. The traffic alert and collision avoidance system (TCAS) is used on board
all US commercial aircrafts. It continuously monitors radar information to sense
12
whether a neighboring aircraft could become a threat by getting too close. Such
an aircraft is said to be an “intruder”, which is entering the protected zone. In
this situation TCAS issues a traffic advisory (TA) and estimates the time remaining
until the two aircrafts reach the closest point of approach. Such estimates are
used to compute the vertical separation between the two aircraft assuming that
the controlled aircraft maintains its current trajectory. Depending on the results
obtained, TCAS issues a resolution advisory (RA) suggesting the pilot to climb or
to descend.
property
rule
1
2
1
Best Advisory Selection
2
1
Avoid unnecessary crossing
2
1
No Crossing Advisory Selection
2
1
Optimal Advisory Selection
2
Safe Advisory Selection
GMC2
bugs found time samples
No
0.23 1278
Yes
0.03 147
No
0.25 1278
Yes
0.04 206
Yes
0.01
36
Yes
0.03 180
Yes
0.01
27
Yes
0.01
8
No
0.23 1278
Yes
0.06 217
Table 2. Running time of GMC2 for TCAS.
We have verified the RA component from Georgia Techs Siemens suite [20],
with respect to the specifications in [3]. Each property is verified by checking the
satisfiability of two rules, with specific initial values for variables. The details
of these rules, initial conditions on values and the properties, can be found in
[3]. Our experimental results are presented in Table 2, where the meaning of
the column headings is as follows: property name; corresponding rule number;
indication of whether or not GMC2 found a counter-example; time usage in seconds
within which either a counter-example was found or a predefined number of
samples was reached; if a counter-example was found, the last colum gives the
number of samples taken to that point; otherwise it is the predefined number of
samples to be taken: 1,278 corresponding to δ = 0.1 and = 1.8 × 10−3 .
6
Related Work
At the confluence of model checking, static analysis and theorem proving, software model checking has become an area of intense research. Given a software
system S, software model checkers either work directly on S, or extract a model
M from S and apply more traditional model-checking techniques to M . The software model checkers most closely related to GMC2 are those for concurrent procedural languages, such as C/C++, and include VeriSoft [6], Spin [11], Blast [10],
Magic [2] and C Wolf [4]. In the case of VeriSoft, a randomized search strategy based on genetic algorithms has been developed to guide state-space search
towards error states [7]. A comparison of the relative performance of GMC2 and
VeriSoft is given in Section 5.
13
The Cooperative Bug Isolation (CBI) project at Berkeley performs compiletime instrumentation on a a number of large open-source projects and distributes
the resulting binaries [13]. Information is then be gathered about how many
times a program terminated successfully or not. Subsequent statistical analysis
is used to isolate erroneous code segments. In contrast, GMC2 is a model checker
embedded at the Tree-SSA level of the open-source GCC compiler.
Other researchers have developed randomized methods for software verification and analysis. The key idea behind the program-analysis technique of random interpretation [9] is to execute a code fragment on a few random inputs in
a contrived manner, which includes giving random linear interpretations to the
operators in the program. Both branches of a conditional are executed on each
run and at joint points, a random affine combination of the joining states is performed. In the branches of an equality conditional, the data values are adjusted
on the fly to reflect the truth value of the guarding boolean expression.
In [17], Monte Carlo and abstract-interpretation techniques are used to analyze programs whose inputs are divided into two classes: those that behave
according to some fixed probability distribution and those considered nondeterministic.
7
Conclusions
We have presented GMC2 , a software model checker for GCC based on the technique
of Monte Carlo model checking. By virtue of being implemented at the Tree-SSA
level, GMC2 is at once a model checker for each of GCC’s 6 input languages, including
C and C++, and more than 30 target architectures. GMC2 is also an open-source
model checker, and therefore readily and widely accessible for usage, critique,
and extension by the open-source community.
An important advantage of our approach concerns the treatment of pointers
in GMC2 . Basically, it is much easier to interpret pointer operations than it is to
statically analyze them.
As ongoing and future work, we are in the process of creating a software model
checking branch of GCC for the public distribution of the GMC tool suite. Also,
we are developing automated abstraction [1] and interpolation techniques [15]
to handle programs with infinite-domain variables. Currently, we are manually
applying a form of bounded-range abstraction [12], for example, on the TCAS
benchmark.
References
1. T. Ball, R. Majumdar, T. Millstein, and S. K. Rajamani. Automatic predicate
abstraction of C programs. In Proc. PLDI 2001, SIGPLAN Notices 36(5), pages
203–213, 2001.
2. S. Chaki, E. Clarke, A. Groce, S. Jha, and H. Veith. Modular verification of
software components in C. Transactions on Software Engineering, 30(6):388–402,
June 2004.
14
3. A. Coen-Porisini, G. Denaro, C. Ghezzi, and M. Pezzè. Using symbolic execution
for verifying safety-critical systems. In ESEC/FSE-9: Proc. 8th European Software
Engineering Conference, pages 142–151. ACM Press, 2001.
4. D. C. DuVarney and S. P. Iyer. C Wolf – a toolset for extracting models from C
programs. In FORTE 2002, pages 260–275, 2002.
5. R. Gerth, D. Peled, M. Y. Vardi, and P. Wolper. Simple on-the-fly automatic verification of linear temporal logic. In Protocol Specification Testing and Verification,
pages 3–18, Warsaw, Poland, 1995. Chapman & Hall.
6. P. Godefroid. Model checking for programming languages using VeriSoft. In Proceedings of 24th ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages (POPL ’97), pages 174–186. ACM Press, 1997.
7. P. Godefroid and S. Khurshid. Exploring very large state spaces using genetic
algorithms. STTT, 6(2):117–127, 2004.
8. R. Grosu and S. A. Smolka. Monte carlo model checking. In Proc. 11th Intl. Conf.
on Tools and Algorithms for the Construction and Analysis of Systems (TACAS
2005), pages 271–286. Springer-Verlag, 2005.
9. S. Gulwani and G. C. Necula. Precise interprocedural analysis using random interpretation. In Proc. 32nd Annual ACM Symposium on Principles of Programming
Languages. ACM, January 2005.
10. T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Software verification with
Blast. In Proceedings of the Tenth International Workshop on Model Checking of
Software (SPIN 2003), pages 235–239. Lecture Notes in Computer Science 2648,
Springer-Verlag, 2003.
11. G. J. Holzmann. The model checker SPIN. IEEE Trans. Softw. Eng., 23(5):279–
295, 1997.
12. D. Kroening, J. Ouaknine, S. A. Seshia, and O. Strichman. Abstraction-based
satisfiability solving of Presburger arithmetic. In CAV 2004, 2004.
13. B. Liblit, M. Naik, A. X. Zheng, A. Aiken, and M. I. Jordan. Public deployment
of cooperative bug isolation. In Workshop on Remote Analysis and Measurement
of Software Systems (RAMSS), 2004.
14. G. Lowe. An attack on the Needham-Schroeder public-key authentication protocol.
Information Processing Letters, pages 131–133, 1995.
15. K. L. McMillan. Applications of Craig interpolants in model checking. In TACAS
2005, pages 1–12, 2005.
16. J. Merrill. GENERIC and GIMPLE: A New Tree Representation for Entire Functions. In Proceedings of the GCC Developers Summit3, pages 171–180, May 25-27,
2003.
17. D. Monniaux. An abstract Monte-Carlo method for the analysis of probabilistic
programs. In Proc. 28th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pages 93–101. ACM Press, 2001.
18. A. M. Mood, F.A. Graybill, and D.C. Boes. Introduction to the Theory of Statistics.
McGraw-Hill Series in Probability and Statistics, 1974.
19. D. Novillo. Tree SSA: A New Optimization Infrastructure for GCC. In Proceedings
of the GCC Developers Summit3, pages 181–193, May 25-27, 2003.
20. G. Rothermel and M. J. Harrold. Empirical studies of a safe regression test selection
technique. Software Engineering, 24(6):401–419, 1998.
21. Stony Brook University. GCC Open-Source Software Model-Checking Tool Kit.
http://www.cs.sunysb.edu/~gmc.
22. M. Vardi and P. Wolper. An automata-theoretic approach to automatic program
verification. In Proc. IEEE Symposium on Logic in Computer Science, pages 332–
344, 1986.
15
View publication stats