Spring20 Lecture8 StructuralTesting
Gregory Gay
DIT635 - February 14, 2020
Every developer must answer:
Are our tests any good?
Have We Done a Good Job?
What we want:
• We’ve found all the faults.
• Impossible.
What we (usually) get:
• We compiled and it worked.
• We ran out of time or budget.
• Inadequate.
Test Adequacy Metrics
Instead - can we compromise between the
impossible and the inadequate?
(In)Adequacy Metrics
• We do not know what faults exist before testing, so
we rely on an approximation of “we found all of the
faults”.
• Criteria identify inadequacies in the tests.
• If the test does not reach a statement, it is inadequate for
finding faults in that statement.
• If the requirements discuss two outcomes of a function,
but the tests only cover one, then the tests are
inadequate for verifying that requirement.
Adequacy Metrics
• Adequacy metrics are based on coverage of factors
(hopefully) correlated with finding faults.
• Widely used in industry - easy to understand, cheap to
calculate, offer a checklist.
• Some metrics are based on coverage of requirement
statements and are used for verification.
• The majority are based on exercising elements of the source
code in ways that might trigger faults.
• This is the basis of structural testing.
We Will Cover
• Structural Testing:
• Derive tests from the program structure, directed by a
chosen adequacy metric.
• Common structural coverage metrics:
• Statement coverage
• Branch coverage
• Condition coverage
• Path coverage
Structural Testing
• The structure of the software itself is a valuable
source of information.
• Structural testing is the practice of using that
structure to derive test cases.
• Sometimes called white-box testing.
• Functional testing is the black-box counterpart.
Structural Testing
• Uses a family of metrics that define how and what
code is to be executed.
• Goal is to exercise a certain percentage of the
code.
• Why??

while (*eptr){
    char c;
    c = *eptr;
    if(c == '+'){
        *dptr = ' ';
    } else{
        *dptr = *eptr;
    }
}
The basic idea:
You can’t find all of the
faults without exercising all
of the code.
Structural Testing - Motivation
• Requirements-based tests should execute most
code, but will rarely execute all of it.
• Helper functions
• Error-handling code
• Requirements missing outcomes
• Structural testing complements functional testing by
requiring that code elements are exercised in
prescribed ways.
Structural Does Not Replace Functional
• Structural testing should not be the basis for “How
do I choose tests?”
• Structure-based tests do not directly make an argument
for verification or expose missing functionality.
• Structural testing is useful for supplementing functional
tests to help reveal faults.
• Functional tests are good at exposing conceptual faults.
Structural tests are good at exposing coding mistakes.
Structural Testing Usage
Take code, derive information about structure, use
obligation information to:
• Create tests
Control and Data Flow
• We need context on how the system executes.
• Code is rarely sequential - conditional statements
result in branches in execution, jumping between
blocks of code.
• Control flow is information on how control passes
between blocks of code.
• Data flow is information on how variables are used
in other expressions.
Control-Flow Graphs
• A directed graph representing the flow of control
through the program.
• Nodes represent sequential blocks of program
commands.
• Edges connect nodes in the sequence they are
executed. Multiple edges indicate conditional
statements (loops, if statements, switches).
[Figure: CFG with nodes i=0, i<N, A[i]<0, A[i] = - A[i], i++, and return(1); the edges leaving i<N and A[i]<0 are labeled True/False.]
Structural Coverage Criteria
• Criteria based on exercising of:
• Statements (nodes of CFG)
• Branches (edges of CFG)
• Conditions
• Paths
• … and many more
• Measurements used as (in)adequacy criteria
• If significant parts of the program are not tested, testing is
surely inadequate.
Statement Coverage
• The most intuitive criterion. Did we execute every
statement at least once?
• Cover each node of the CFG.
• The idea: a fault in a statement cannot be revealed
unless we execute the statement.
• Coverage = (Number of Statements Covered) / (Number of Total Statements)
Statement Coverage
int flipSome(int A[], int N, int X)
{
    int i=0;
    while (i<N and A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: CFG of flipSome with nodes i=0, i<N and A[i] <X, A[i]<0, A[i] = - A[i], i++, and return(1); decision edges labeled True/False.]
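As a rough illustration of measuring statement coverage, the sketch below instruments flipSome with probes, one per CFG node. The statement IDs, the flipSomeTraced name, and the statementCoverage helper are hypothetical and exist only for this example; real projects would normally use a coverage tool instead of hand-written probes.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical statement IDs, one per CFG node of flipSome.
enum Stmt { INIT, LOOP_COND, IF_COND, NEGATE, INCREMENT, RETURN, STMT_COUNT };

// flipSome with a probe that records every statement reached.
int flipSomeTraced(std::vector<int>& A, int N, int X, std::set<int>& hit) {
    int i = 0;
    hit.insert(INIT);
    // The comma operator records the loop condition each time it is evaluated.
    while (hit.insert(LOOP_COND), i < N && A[i] < X) {
        hit.insert(IF_COND);
        if (A[i] < 0) {
            hit.insert(NEGATE);
            A[i] = -A[i];
        }
        hit.insert(INCREMENT);
        i++;
    }
    hit.insert(RETURN);
    return 1;
}

// Coverage = statements covered / total statements.
double statementCoverage(const std::set<int>& hit) {
    return static_cast<double>(hit.size()) / STMT_COUNT;
}
```

Under these assumptions, a single test with A = {-1}, N = 1, X = 5 reaches every statement, while a test with an empty array reaches only the initialization, the loop condition, and the return.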
Subsumption
• Coverage metric (A) subsumes another metric
(B) if, for every program P, every test suite
satisfying A also satisfies B with respect to P.
• If we satisfy A, there is no point in measuring B.
• Branch coverage subsumes statement coverage.
• Covering all edges requires covering all nodes in a graph.
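A small sketch of why the subsumption only goes one way: an if without an else has a False edge that contains no statement, so a suite can reach every statement while leaving that edge untested. The absTraced function and its numeric IDs are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical example: absolute value written as an if with no else.
// Statement IDs: 0 = condition, 1 = negate, 2 = return.
// Branch IDs:    0 = condition true, 1 = condition false.
int absTraced(int x, std::set<int>& stmts, std::set<int>& branches) {
    stmts.insert(0);
    if (x < 0) {
        branches.insert(0);
        stmts.insert(1);
        x = -x;
    } else {
        // Probe only: the original code has no statement on this edge.
        branches.insert(1);
    }
    stmts.insert(2);
    return x;
}
```

With the single input x = -3, all three statements are reached but only one of the two branches is taken; adding x = 3 covers the remaining branch.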
Subsumption
• Shouldn’t we always choose the stronger metric?
• Not always…
• Stronger metrics typically impose more obligations (so, you have to
come up with more tests).
• Or, at least, tougher obligations - making it harder to come up with
the test cases.
• You may end up with a large number of unsatisfiable obligations.
Branch Coverage
int flipSome(int A[], int N, int X)
{
    int i=0;
    while (i<N and A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: CFG of flipSome with nodes i=0, i<N and A[i] <X, A[i]<0, A[i] = - A[i], i++, and return(1); decision edges labeled True/False.]
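To make the branch obligations for flipSome concrete, here is a minimal sketch that records which outgoing edge of each decision is taken. The Branch IDs, flipSomeBranches, and branchCoverage are hypothetical names introduced for this example, and the loop is rewritten with an explicit break so that each edge can be probed.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical branch IDs: the two edges leaving each decision in flipSome.
enum Branch { LOOP_TRUE, LOOP_FALSE, IF_TRUE, IF_FALSE, BRANCH_COUNT };

// flipSome with probes on both outcomes of the loop and if decisions.
int flipSomeBranches(std::vector<int>& A, int N, int X, std::set<int>& taken) {
    int i = 0;
    while (true) {
        bool enter = i < N && A[i] < X;
        taken.insert(enter ? LOOP_TRUE : LOOP_FALSE);
        if (!enter) break;
        if (A[i] < 0) {
            taken.insert(IF_TRUE);
            A[i] = -A[i];
        } else {
            taken.insert(IF_FALSE);
        }
        i++;
    }
    return 1;
}

// Coverage = branches covered / total branches.
double branchCoverage(const std::set<int>& taken) {
    return static_cast<double>(taken.size()) / BRANCH_COUNT;
}
```

In this sketch, A = {-1} with N = 1 and X = 5 executes every statement but misses the False edge of the if, giving 75% branch coverage. Adding a test with a non-negative element, such as A = {1}, covers the missing edge.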
Decisions and Conditions
• A decision is a complex Boolean expression.
• Made up of conditions connected with simple Boolean
connectives (and, or, xor, not).
• Conditions are:
• Boolean variables: Boolean b = false;
• Subexpressions that evaluate to true/false involving (<, >, <=, >=, ==,
and !=): Boolean x = (y < 12);
Decision Coverage
• Branch Coverage deals with a subset of decisions.
• Branching decisions that decide how control is routed
through the program.
• Decision coverage requires that every Boolean
decision evaluates to both true and false.
• Coverage = (Number of Decisions Covered) / (Number of Total Decisions)
Basic Condition Coverage
• Several coverage metrics examine the individual
conditions that make up a decision.
• Identify faults in decision statements, such as writing
(a == 1 || b == -1) instead of (a == -1 || b == -1).
Basic Condition Coverage
int flipSome(int A[], int N, int X)
{
    int i=0;
    while (i<N and A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: CFG of flipSome with nodes i=0, i<N and A[i] <X, A[i]<0, A[i] = - A[i], i++, and return(1); decision edges labeled True/False.]
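Basic condition coverage is commonly defined as requiring each individual condition to evaluate to both true and false. The sketch below applies that definition to the three conditions in flipSome (i<N, A[i]<X, A[i]<0). The probe helper, the Outcomes alias, and the condition numbering are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Set of (condition id, observed truth value) pairs.
using Outcomes = std::set<std::pair<int, bool>>;

// Record that condition `id` evaluated to `value`, then pass the value through.
bool probe(int id, bool value, Outcomes& seen) {
    seen.insert({id, value});
    return value;
}

// flipSome with a probe around each condition.
// Condition 0: i < N, condition 1: A[i] < X, condition 2: A[i] < 0.
int flipSomeConditions(std::vector<int>& A, int N, int X, Outcomes& seen) {
    int i = 0;
    // && short-circuits, so A[i] < X is evaluated only when i < N holds.
    while (probe(0, i < N, seen) && probe(1, A[i] < X, seen)) {
        if (probe(2, A[i] < 0, seen))
            A[i] = -A[i];
        i++;
    }
    return 1;
}
```

Three conditions with two outcomes each give six obligations. Because the loop exits exactly once per call, and short-circuiting skips A[i]<X whenever i<N is false, a single call cannot observe both i<N false and A[i]<X false. In this sketch at least two tests are needed, for example A = {-1, 1, 10} with N = 3, X = 5 and an empty array with N = 0.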
All 16 combinations of condition values for (A and (B and (C and D))):
Test Case   A       B       C       D
1           True    True    True    True
2           True    True    True    False
3           True    True    False   True
4           True    True    False   False
5           True    False   True    True
6           True    False   True    False
7           True    False   False   True
8           True    False   False   False
9           False   True    True    True
10          False   True    True    False
11          False   True    False   True
12          False   True    False   False
13          False   False   True    True
14          False   False   True    False
15          False   False   False   True
16          False   False   False   False
Short-Circuit Evaluation
• In many languages, if the first condition determines
the result of the entire decision, then fewer tests are
required.
• If A is false, B is never evaluated.
Decision: (A and B)
Test Case   A       B
1           True    True
2           True    False
3           False   -
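The effect in the table can be observed directly in C++, where && short-circuits. In this minimal sketch, Counter, condB, and decision are hypothetical names; condB counts how often the second condition is actually evaluated.

```cpp
#include <cassert>

// Counts how often the second condition is actually evaluated.
struct Counter {
    int evalsOfB = 0;
};

// Stand-in for condition B: bumps the counter, then returns its value.
bool condB(bool value, Counter& c) {
    c.evalsOfB++;
    return value;
}

// (A and B): && skips condB entirely when a is false.
bool decision(bool a, bool b, Counter& c) {
    return a && condB(b, c);
}
```

Calling decision with a = false leaves the counter unchanged whatever b is, which is why test case 3 in the table carries no value for B.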
Modified Condition/Decision Coverage (MC/DC)
• Requires:
• Each condition evaluates to true/false
• Each decision evaluates to true/false
• Each condition shown to independently affect outcome
of each decision it appears in.
Test Case A B (A and B)
1 True True True
2 True False False
3 False True False
4 False False False
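One way to read "independently affect" is to look for pairs of test cases that differ in exactly one condition and produce different decision outcomes. The sketch below searches for such pairs by brute force. The bitmask encoding and the names independencePairs and aAndB are assumptions made for this example, not part of any standard tool.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// For condition k of an n-condition decision, list the pairs of input
// vectors (encoded as bitmasks, bit j = value of condition j) that differ
// only in condition k and produce different decision outcomes.
std::vector<std::pair<unsigned, unsigned>>
independencePairs(const std::function<bool(unsigned)>& decision,
                  unsigned n, unsigned k) {
    std::vector<std::pair<unsigned, unsigned>> pairs;
    for (unsigned v = 0; v < (1u << n); ++v) {
        if (v & (1u << k)) continue;   // visit each pair once, from the k=false side
        unsigned w = v | (1u << k);    // same vector with condition k set to true
        if (decision(v) != decision(w))
            pairs.push_back({w, v});
    }
    return pairs;
}

// The decision from the slide: bit 0 is A, bit 1 is B.
bool aAndB(unsigned v) {
    return (v & 1u) && (v & 2u);
}
```

For (A and B) this finds one pair per condition: A is shown independent by test cases 1 and 3 (B held at True), and B by test cases 1 and 2 (A held at True). Under this reading, test cases 1, 2, and 3 are enough and test case 4 adds nothing.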
Let’s take a break.
Activity
Draw the CFG and write tests that provide statement, branch,
and basic condition coverage over the following code:
int search(string A[], int N, string what){
    int index = 0;
    if ((N == 1) && (A[0] == what)){
        return 0;
    } else if (N == 0){
        return -1;
    } else if (N > 1){
        while(index < N){
            if (A[index] == what)
                return index;
            else
                index++;
        }
    }
    return -1;
}
Activity
[Figure: CFG for search. Nodes: index=0; (N==1) && (A[0] == what); return 0; N==0; return -1; N>1; index<N; A[index] == what; index++; return -1. Decision edges are labeled True/False.]
Activity - Possible Solution
[Figure: CFG for search, starting at index=0.]
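One candidate test suite is sketched below; it is not necessarily the solution shown in the lecture. The search function is copied from the activity, while candidateSuite and its inputs are hypothetical. Each comment names the obligations the test is meant to cover.

```cpp
#include <cassert>
#include <string>

using std::string;

// search() as given in the activity.
int search(string A[], int N, string what) {
    int index = 0;
    if ((N == 1) && (A[0] == what)) {
        return 0;
    } else if (N == 0) {
        return -1;
    } else if (N > 1) {
        while (index < N) {
            if (A[index] == what)
                return index;
            else
                index++;
        }
    }
    return -1;
}

// A candidate suite aiming at statement, branch, and basic condition coverage.
void candidateSuite() {
    string one[] = {"a"};
    string two[] = {"a", "b"};
    assert(search(one, 1, "a") == 0);   // N==1 true, A[0]==what true
    assert(search(one, 1, "b") == -1);  // A[0]==what false; N==0 false; N>1 false
    assert(search(one, 0, "a") == -1);  // N==1 false; N==0 true
    assert(search(two, 2, "b") == 1);   // N>1 true; inner if false, then true
    assert(search(two, 2, "c") == -1);  // index<N becomes false; final return
}
```

The third test passes N = 0 with a non-empty array only to keep the sketch short; short-circuit evaluation means A[0] is never read in that call.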
Path Coverage
• Other criteria focus on single elements.
• However, all tests execute a sequence of elements - a
path through the program.
• Combination of elements matters - interaction sequences
are the root of many faults.
• Path coverage requires that all paths through the
CFG are covered.
• Coverage = (Number of Paths Covered) / (Number of Total Paths)
Path Coverage
int flipSome(int A[], int N, int X)
{
    int i=0;
    while (i<N and A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: CFG of flipSome, annotated with the loop bound: loop <= 20.]
Number of Tests
Path coverage for that loop bound requires:
3,656,158,440,062,976 test cases
Boundary Interior Coverage
• Need to partition the infinite set of paths into a finite
number of classes.
• Boundary Interior Coverage groups paths that
differ only in the subpath they follow when
repeating the body of a loop.
• Executing a loop 20 times is a different path than
executing it twice, but the same subsequences of
statements repeat over and over.
Boundary Interior Coverage
[Figure: a CFG (nodes including A, B, F, G, H, I, L, and M) next to its unfolded form for boundary interior testing. Paths shown: B -> M; B -> C -> D -> G -> H -> L -> B; B -> C -> D -> G -> I -> L -> B.]
Number of Paths
• Boundary Interior Coverage removes the problem of
infinite loop-based paths.
• However, the number of paths through this code can
still be exponential.
• N non-loop branches results in 2^N paths.
• Additional limitations may need to be imposed on
the paths tested.

if (a) S1;
if (b) S2;
if (c) S3;
…
if (x) SN;
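The 2^N growth can be checked by enumerating the paths directly. This is a minimal sketch, assuming N independent if statements in sequence and N small enough for a 64-bit mask; enumeratePaths is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Enumerate every path through n sequential, independent if statements.
// Each path is the list of decision outcomes (true = body executed).
// Assumes n < 64 so that the mask fits in 64 bits.
std::vector<std::vector<bool>> enumeratePaths(unsigned n) {
    std::vector<std::vector<bool>> paths;
    for (std::uint64_t mask = 0; mask < (std::uint64_t{1} << n); ++mask) {
        std::vector<bool> path(n);
        for (unsigned k = 0; k < n; ++k)
            path[k] = (mask >> k) & 1u;   // outcome of the k-th if on this path
        paths.push_back(path);
    }
    return paths;
}
```

Three if statements give 8 paths and ten give 1024, so each additional branch doubles the number of path obligations.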
Loop Boundary Coverage
• Focus on problems related to loops.
• Cover scenarios representative of how loops might be executed.
• For simple loops, write tests that:
• Skip the loop entirely.
• Take exactly one pass through the loop.
• Take two or more passes through the loop.
• (optional) Choose an upper bound N, and:
• M passes, where 2 < M < N
• (N-1), N, and (N+1) passes
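Applied to the flipSome loop from earlier slides, the three required cases might look like the sketch below. flipSomePasses is a hypothetical variant that returns the number of loop passes instead of the constant 1, so each test can confirm how many times the body actually ran.

```cpp
#include <cassert>
#include <vector>

// flipSome, modified (for illustration only) to return the number of
// loop passes so that tests can check how often the body ran.
int flipSomePasses(std::vector<int>& A, int N, int X) {
    int i = 0;
    while (i < N && A[i] < X) {
        if (A[i] < 0)
            A[i] = -A[i];
        i++;
    }
    return i;   // i equals the number of completed passes
}

// One test per simple-loop obligation.
void loopBoundarySuite() {
    std::vector<int> none{};
    std::vector<int> one{-4};
    std::vector<int> many{-1, 2, -3};
    assert(flipSomePasses(none, 0, 5) == 0);   // skip the loop entirely
    assert(flipSomePasses(one, 1, 5) == 1);    // exactly one pass
    assert(flipSomePasses(many, 3, 5) == 3);   // two or more passes
    assert(many[0] == 1 && many[1] == 2 && many[2] == 3);
}
```

The optional upper-bound cases would follow the same pattern with longer arrays.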
Nested Loops
• Often, loops are nested within other loops.
• For each level, you should execute similar strategies to
simple loops.
• In addition:
• Test innermost loop first with outer loops executed
minimum number of times.
• Move one loop out, keep the inner loop at “typical”
iteration numbers, and test this layer as you did the
previous layer.
• Continue until the outermost loop is tested.
Concatenated Loops
• One loop executes. The next line of code starts a
new loop.
• These are generally independent.
• Most of the time...
• If not, follow a similar strategy to nested loops.
• Start with bottom loop, hold higher loops at minimal iteration
numbers.
• Work up towards the top, holding lower loops at “typical”
iteration numbers.
Why These Loop Strategies?
• In proving formal correctness of a loop, we would establish
preconditions, postconditions, and invariants that are true on
each execution of the loop, then prove that these hold.
• The loop executes zero times when the postconditions
are true in advance.
• The loop invariant is true on loop entry (one), then each
loop iteration maintains the invariant (many).
• (invariant and !(loop condition) implies postconditions)
• Loop testing strategies echo these cases.
The Infeasibility Problem
Sometimes, no test can satisfy an obligation.
• Impossible combinations of conditions.
• Unreachable statements as part of defensive
programming.
• Error-handling code for conditions that can’t actually
occur in practice.
• Dead code in legacy applications.
• Inaccessible portions of off-the-shelf systems.
The Infeasibility Problem
Stronger criteria call for potentially infeasible
combinations of elements.
(a > 0 && a < 10)
It is not possible for both conditions to be false.
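A brute-force check makes the infeasible combination visible. In this sketch, observedCombinations is a hypothetical helper that evaluates both conditions for every integer in a range and records which outcome pairs occur.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Collect every (a > 0, a < 10) outcome pair seen for a in [lo, hi].
std::set<std::pair<bool, bool>> observedCombinations(int lo, int hi) {
    std::set<std::pair<bool, bool>> seen;
    for (int a = lo; a <= hi; ++a)
        seen.insert({a > 0, a < 10});
    return seen;
}
```

Over any range, at most three of the four combinations appear. The pair (false, false) would need a <= 0 and a >= 10 at the same time, so a criterion that demands it creates an obligation no test can satisfy.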
The Infeasibility Problem
How this is usually addressed:
• Adequacy “scores” based on coverage.
• 95% branch coverage, 80% MC/DC coverage, etc.
• Decide to stop once a threshold is reached.
• Unsatisfactory solution - elements are not equally
important for fault-finding.
• Manual justification for omitting each
impossible test obligation.
• Helps refine code and testing efforts.
• … but very time-consuming.
In Practice... Budget Coverage
• Industry’s answer to “when is testing done”
• When the money is used up
• When the deadline is reached
• This is sometimes a rational approach!
• Implication 1:
• Adequacy criteria answer the wrong question. Selection is more
important.
• Implication 2:
• Practical comparison of approaches must consider the cost of test
case selection
Which Coverage Metric Should I Use?
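[Figure: coverage metrics arranged along an axis of increasing power and cost: Statement Coverage at the bottom, Branch Coverage and Basic Condition Coverage above it, Path Coverage at the top.]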
Activity: Loop-Covering Tests
For the binary-search code:
1. Draw the control-flow graph for the method.
2. Identify the subpaths through the loop and draw the
unfolded CFG for boundary interior testing.
3. Develop a test suite that achieves loop boundary
coverage.
CFG
[Figure: CFG for the binary-search method, with decision edges labeled T/F. Node contents:
• int bott, top, mid; bott=0; top=size-1; L = 0;
• T[L] == key
• found=true;
• found=false;
• bott<=top && !found
• EXIT
• mid=round(top + bott/2);
• T[mid] == key
• found=true; L= mid;
• T[mid] < key
• bott=mid+1;
• top=mid-1;]
CFG
[Figure: CFG with nodes labeled A through K and EXIT; decision edges labeled T/F.]
CFG
[Figure: CFG excerpt showing nodes A, E, and H and the edge E -> EXIT.]
CFG
Tests that execute the loop:
● 0 times: key = 1, T = [1], size = 1
● 1 time: key = 2, T = [1, 2], size = 2
● 2+ times: key = 3, T = [1, 2, 3], size = 3
[Figure: CFG with nodes labeled A through K and EXIT; decision edges labeled T/F.]
We Have Learned
• Test adequacy metrics let us “measure” how good
our testing efforts are.
• They prescribe test obligations that can be used to
remove inadequacies from test suites.
• Code structure is used in many adequacy metrics.
Many different criteria, based on:
• Statements, branches, conditions, paths, etc.
We Have Learned
• Coverage metrics are tuned towards particular types of
faults. Some are theoretically stronger than others,
but are also more expensive and difficult to satisfy.
• Full path coverage is impractical.
• However, there are strategies to get the benefits of path
coverage without the cost.
• These strategies are based on covering “important” paths
or subpaths.
Next Time
• Exercise Today: Functional Testing
• Next class: Data-Flow Testing
• Optional Reading - Pezze and Young, Chapters 6 and 13