Module 4
Path Testing: DD-Paths, Test coverage metrics, Basis path testing, Guidelines and observations.
Data flow testing: Definition-Use testing, Slice-based testing, Guidelines and observations.
Levels of Testing, Integration Testing: Traditional view of testing levels, Alternative life cycle
models, the SATM system, Separating integration and system testing, Guidelines and observations.
➢ Definition
Given a program written in an imperative programming language, the program graph is a directed
graph in which nodes are statement fragments and edges represent flow of control.
• If i and j are nodes in the program graph, an edge exists from node i to node j iff the
statement fragment corresponding to node j can be executed immediately after the
statement fragment corresponding to node i.
▪ The group of statements that makes up a node in the program graph is called a
basic block.
▪ There is a straightforward algorithm to segment a code fragment into basic blocks
and create the corresponding program graph.
o Construction of a program graph from a given program is illustrated here with the
pseudocode implementation of the triangle program and the maximum of three numbers.
o Line numbers refer to statements and statement fragments. The importance of the program
graph is that program executions correspond to paths from the source node to the sink node.
o We also need to decide whether to associate nodes with nonexecutable statements such as
variable and type declarations; here we do not.
Example 1
• Nodes 4 through 8 are a sequence, nodes 9 through 12 are an if-then-else construct, and nodes
13 through 22 are nested if-then-else constructs. Nodes 4 and 23 are the program source and
sink nodes, corresponding to the single-entry, single-exit criteria. No loops exist, so this is
a directed acyclic graph.
• Test cases force the execution of such program paths. We now have a very explicit description
of the relationship between a test case and the part of the program it exercises.
• We also have an elegant, theoretically respectable way to deal with the potentially large
number of execution paths in a program.
The figure below is a graph of a simple (but unstructured) program; it is typical of the kind of example
used to show the impossibility of completely testing even simple programs. In this program, five
paths lead from node B to node F in the interior of the loop. If the loop may have up to 18 repetitions,
some 4.77 trillion distinct program execution paths exist.
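The 4.77 trillion figure can be reproduced with a short calculation. The sketch below assumes that the loop body is traversed between 1 and 18 times and that each traversal takes one of the five B-to-F paths independently; the function name is ours, not part of the original example.

```python
def count_loop_paths(paths_per_iteration: int, max_iterations: int) -> int:
    """Count execution paths when the loop body runs 1..max_iterations times."""
    return sum(paths_per_iteration ** k for k in range(1, max_iterations + 1))


total = count_loop_paths(5, 18)
print(total)  # 4768371582030, i.e. about 4.77 trillion
```

Even at one test per microsecond, running this many paths would take roughly 55 days, which is why coverage criteria are used instead of exhaustive path testing.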
Example 2
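/* Function header assumed; only the body appears in the original notes. */
int max3(int a, int b, int c)
{
    /* 1 */ int max;
    /* 2 */ if (a > b && a > c)
    /* 3 */     max = a;
    /* 4 */ else if (b > c)
    /* 5 */     max = b;
    /* 6 */ else max = c;
    /* 7 */ return max;
}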
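To make the correspondence between program executions and source-to-sink paths concrete, the following sketch encodes a program graph for Example 2 and enumerates its paths. The edge list is our reading of the fragment (node numbers are the statement numbers, the declaration at 1 is not a node, and the compound condition at 2 is treated as a single node); it is an illustration, not the figure from the original notes.

```python
# Program graph for the maximum-of-three fragment: node -> successor nodes.
PROGRAM_GRAPH = {
    2: [3, 4],  # if (a > b && a > c)
    3: [7],     # max = a
    4: [5, 6],  # else if (b > c)
    5: [7],     # max = b
    6: [7],     # max = c
    7: [],      # return max (sink)
}


def source_to_sink_paths(graph, source, sink):
    """Enumerate every path from source to sink in an acyclic program graph."""
    if source == sink:
        return [[sink]]
    paths = []
    for successor in graph[source]:
        for tail in source_to_sink_paths(graph, successor, sink):
            paths.append([source] + tail)
    return paths


for path in source_to_sink_paths(PROGRAM_GRAPH, 2, 7):
    print(path)
```

The three paths printed correspond to the three possible outcomes (a, b, or c is the maximum), so three test cases are enough to execute every path of this small program.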
Example 3
✓ DD Paths
Definition
A DD-Path is a sequence of nodes in a program graph such that:
Case 1: It consists of a single node with indeg = 0.
Case 2: It consists of a single node with outdeg = 0.
Case 3: It consists of a single node with indeg ≥ 2 or outdeg ≥ 2.
Case 4: It consists of a single node with indeg =1 and outdeg=1.
Case 5: It is a maximal chain of length ≥ 1.
Definition: Given a program written in an imperative language, its DD-Path graph is a labeled
directed graph, in which nodes are DD-Paths of its program graph, and edges represent control
flow between successor DD-Paths.
• The DD-Path graph is a condensation graph. For example, 2-connected program graph nodes are
collapsed into a single DD-Path graph node, as shown in the table and figure below.
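The five cases of the DD-Path definition can be sketched as a condensation routine. This is a minimal illustration on a small hypothetical program graph (not the triangle program): nodes with indegree 1 and outdegree 1 are merged into maximal chains, and every other node forms a DD-Path on its own. It assumes the graph has a source and a sink, so every chain eventually reaches a non-chain node.

```python
def dd_paths(graph):
    """Condense a program graph (node -> list of successors) into DD-Paths."""
    nodes = sorted(set(graph) | {s for succs in graph.values() for s in succs})
    preds = {n: [] for n in nodes}
    for n, succs in graph.items():
        for s in succs:
            preds[s].append(n)

    def succs_of(n):
        return graph.get(n, [])

    def is_chain_node(n):
        return len(preds[n]) == 1 and len(succs_of(n)) == 1

    result = []
    for n in nodes:
        if not is_chain_node(n):
            result.append([n])                  # Cases 1 to 3: single node
        elif not is_chain_node(preds[n][0]):
            chain = [n]                         # n starts a chain (Cases 4 and 5)
            nxt = succs_of(n)[0]
            while is_chain_node(nxt):
                chain.append(nxt)
                nxt = succs_of(nxt)[0]
            result.append(chain)
    return result


# Hypothetical graph: a sequence 1-2-3-4 followed by an if-then-else and an exit.
GRAPH = {1: [2], 2: [3], 3: [4], 4: [5, 6], 5: [7], 6: [7], 7: [8], 8: []}
print(dd_paths(GRAPH))  # [[1], [2, 3], [4], [5], [6], [7], [8]]
```

Nodes 2 and 3 collapse into one DD-Path (Case 5), while nodes 5 and 6 are the "short branches" that Case 4 exists to capture.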
• The motivation of using DD-paths is that they enable very precise descriptions of test
coverage.
• In our quest to identify gaps and redundancy in the test cases as these are used to exercise
(test) different aspects of a program, we use formal models of the program structure to
reason about testing effectiveness.
• Test coverage metrics are a device to measure the extent to which a set of test cases
covers a program.
• Several widely accepted test coverage metrics are used; most of those are in Table below
(Miller, 1977).
• Having an organized view of the extent to which a program is tested makes it possible to
sensibly manage the testing process.
• Most quality organizations now expect the C1 metric (DD-Path coverage) as the minimum
acceptable level of test coverage. The statement coverage metric (C0) is less adequate but is still
widely accepted.
• Statement coverage based testing aims to devise test cases that collectively exercise
all statements in a program (C0).
• Predicate coverage (or branch coverage, or decision coverage) based testing aims
to devise test cases that evaluate each simple predicate of the program to both True and
False (C1).
• Decision coverage is good for exposing faults in the way a computation has been
decomposed into cases.
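The difference between C0 and C1 shows up as soon as an if statement has no else part. In the sketch below (a hypothetical function with hand-written instrumentation, not a real coverage tool), a single negative input executes every statement, yet the False outcome of the predicate is never exercised.

```python
def abs_value(x, outcomes):
    """Return |x|; `outcomes` records each result of the predicate (instrumentation)."""
    is_negative = x < 0
    outcomes.add(is_negative)
    if is_negative:
        x = -x
    return x


def predicate_outcomes(test_inputs):
    """Run the function on every test input and collect the predicate outcomes seen."""
    outcomes = set()
    for x in test_inputs:
        abs_value(x, outcomes)
    return outcomes


print(sorted(predicate_outcomes([-3])))     # [True]: all statements run, C1 not met
print(sorted(predicate_outcomes([-3, 4])))  # [False, True]: C1 met
```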
❖ Loop Coverage - C2
• Test cases that exercise the two possible outcomes of the decision of a loop
condition, that is one to traverse the loop and the other to exit (or not enter) the
loop.
• An extension would be to consider a modified boundary value analysis approach
where the loop index is given a minimum, minimum+, nominal, maximum-,
and maximum value, or even robustness testing.
• Once a loop is tested, then the tester can collapse it into a single node to simplify
the graph for the next loop tests. In the case of nested loops we start with the inner
most loop and we proceed outwards.
• If loops are knotted / unstructured then we must apply data flow analysis testing
techniques.
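The boundary value idea for loops can be sketched as follows. The helper and the loop under test are both hypothetical; the point is only that the five iteration counts include the case that bypasses the loop entirely as well as the case that runs it the maximum number of times.

```python
def loop_boundary_values(minimum, maximum):
    """Iteration counts at min, min+, nominal, max-, and max."""
    nominal = (minimum + maximum) // 2
    return [minimum, minimum + 1, nominal, maximum - 1, maximum]


def sum_first(n):
    """Loop under test: add the integers 1..n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


for n in loop_boundary_values(0, 18):
    # n = 0 bypasses the loop; the other values traverse it n times.
    assert sum_first(n) == n * (n + 1) // 2

print(loop_boundary_values(0, 18))  # [0, 1, 9, 17, 18]
```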
Loop Testing
• In the simple C1 coverage criterion we are interested only in traversing all edges in the
DD-Path graph.
• CMCC (multiple condition coverage) is a more complete extension that includes both the basic
condition and branch adequacy criteria.
• CMCC requires a test case T for each possible evaluation of compound conditions.
• For N basic conditions, 2^N combinations of test cases are required.
• Short-circuit evaluation is effective in reducing the above number to a more
manageable number.
• For a compound predicate P1 (A or B), CMCC requires that each possible
combination of inputs be tested for each decision. Ex: if (A or B) requires
A = True/False, B = True/False (4 combinations of test cases)
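The 2^N growth, and the reduction that short-circuit evaluation brings, can be illustrated with a few lines of code. The function names are ours; `None` is used to mark a condition that is never evaluated.

```python
from itertools import product


def multiple_condition_cases(n):
    """All truth-value combinations for n basic conditions (2**n of them)."""
    return list(product([True, False], repeat=n))


def short_circuit_or_cases():
    """Distinct evaluations of `A or B` when B is skipped once A is True."""
    cases = set()
    for a, b in multiple_condition_cases(2):
        cases.add((a, None) if a else (a, b))  # None marks "B not evaluated"
    return cases


print(len(multiple_condition_cases(2)))  # 4
print(len(multiple_condition_cases(3)))  # 8
print(len(short_circuit_or_cases()))     # 3
```

For `A or B` the two cases with A = True become indistinguishable under short-circuit evaluation, so three test cases cover what would otherwise need four.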
• Exhaustive testing of software is not practical because variable input values and
variable sequencing of inputs result in too many possible combinations to test.
• NIST developed techniques for applying statistical methods to derive sample test
cases. These address how to select the best sample of test cases and provide
a statistical level of confidence or probability that a program implements its
functional specification correctly.
• The goal of statistically significant coverage is to develop methods for software
testing based on statistical methods, such as Multivariable Testing, Design of
Experiments, and Markov Chain usage models, and to develop methods for
software testing based on statistical measures and confidence levels.
A test suite T for a program P satisfies the path adequacy criterion iff,
• for each path pi of P, there exists at least one test case in T that causes the
execution of pi.
• This is the same as stating that every path in the flow graph model of program P is
exercised by at least one test case in T.
• If we consider the paths in a program graph (or DD-Path graph) to form a vector space V, we
want to devise a subset B of V that captures the essence of V; that is, every
element of V can be represented as a linear combination of elements of B.
• Addition of paths means that one path is followed by another, and multiplication of a
path by a number denotes repetition of the path.
• If such a set B contains linearly independent paths and forms a “basis” for V,
then it certainly captures the essence of V.
• Input:
▪ Source code and a path selection criterion
• Process:
▪ Generation/Construction of a CFG using the design or code as a foundation
▪ Determine the cyclomatic complexity of the resultant flow graph, which gives a
measure of the unit's logical complexity
▪ Determine a basis set of linearly independent paths using the measure
▪ Selection of Paths
▪ Prepare test cases that will force execution of each path
▪ Feasibility Test of a Path
• Cyclomatic complexity, introduced by McCabe in 1976, is one of the most commonly used metrics in software
development.
• Provides a quantitative measure of the logical complexity of a program in terms of
Cyclomatic number
• Defines the number of independent paths in the basis set
• Provides an upper bound for the number of tests that must be conducted to ensure all
statements have been executed at least once
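• The cyclomatic complexity V(G) of a program can be computed from its Control
Flow Graph (CFG) G in two ways:
– V(G) = e - n + 2p, for the program graph as drawn
– V(G) = e - n + p, for the strongly connected graph obtained by adding an edge from the sink node back to the source node
where e = number of edges, n = number of nodes, and p = number of connected regions.
The number of linearly independent paths for the above graph is V(G) = e - n + 2p = 10 - 7 + 2(1) = 5.
The number of linearly independent circuits for the graph in the figure below (the same graph with the added sink-to-source edge) is V(G) = e - n + p = 11 - 7 + 1 = 5.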
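Both formulas can be checked mechanically. In the sketch below, the edge endpoints of McCabe's graph are reconstructed from the path/edge traversal table in this section (for example, path A, B, C, G uses edges 1, 4, and 9); since the figure itself is not reproduced here, treat the edge list as our reading of it.

```python
# Edge number -> (from node, to node), reconstructed from the path/edge table.
EDGES = {
    1: ("A", "B"), 2: ("A", "D"), 3: ("C", "B"), 4: ("B", "C"), 5: ("B", "E"),
    6: ("D", "E"), 7: ("D", "F"), 8: ("E", "F"), 9: ("C", "G"), 10: ("F", "G"),
}


def cyclomatic_complexity(edges, components=1, strongly_connected=False):
    """V(G) = e - n + 2p, or e - n + p for a strongly connected graph."""
    nodes = {node for edge in edges for node in edge}
    e, n = len(edges), len(nodes)
    if strongly_connected:
        return e - n + components
    return e - n + 2 * components


program_graph = list(EDGES.values())
print(cyclomatic_complexity(program_graph))                    # 10 - 7 + 2 = 5

closed_graph = program_graph + [("G", "A")]                    # add sink-to-source edge
print(cyclomatic_complexity(closed_graph, strongly_connected=True))  # 11 - 7 + 1 = 5
```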
[Figure: McCabe's control graph with nodes A to G and edges numbered 1 to 10]
These paths can be made to look like a vector space by defining notions of addition and scalar
multiplication: path addition is simply one path followed by another path, and multiplication
corresponds to repetitions of a path. With this formulation, the path A,B,C,B,E,F,G is the basis sum
p2 + p3 - p1, and the path A,B,C,B,C,B,C,G is the linear combination 2p2 - p1.
The entries in the table below are obtained by following a path and noting which edges are
traversed. For example, path p1 traverses edges 1, 4, and 9, while path p2 traverses the edge
sequence 1, 4, 3, 4, 9. Because edge 4 is traversed twice by path p2, its entry in the edge 4 column
is 2. This table is called an incidence matrix.
Path / Edges traversed         1  2  3  4  5  6  7  8  9  10
p1: A, B, C, G                 1  0  0  1  0  0  0  0  1  0
p2: A, B, C, B, C, G           1  0  1  2  0  0  0  0  1  0
p3: A, B, E, F, G              1  0  0  0  1  0  0  1  0  1
p4: A, D, E, F, G              0  1  0  0  0  1  0  1  0  1
p5: A, D, F, G                 0  1  0  0  0  0  1  0  0  1
ex1: A, B, C, B, E, F, G       1  0  1  1  1  0  0  1  0  1
ex2: A, B, C, B, C, B, C, G    1  0  2  3  0  0  0  0  1  0
The independence of the paths p1 to p5 can be checked by examining the first five rows of the
incidence matrix. Edges 3, 5, 6, and 7 each appear in exactly one path (p2, p3, p4, and p5, respectively), so paths p2 to p5
are independent. Path p1 is independent of all of these, because any attempt to express p1 in terms
of the others introduces unwanted edges, so all five paths are independent. At this point, we might
check the linear combinations of the two example paths.
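The independence claim and the two linear combinations can be checked numerically. The sketch below rebuilds each row of the incidence matrix from its node sequence and computes the matrix rank with exact rational arithmetic; the edge numbering is our reconstruction from the table, and the helper names are ours.

```python
from fractions import Fraction

# (from node, to node) -> edge number, reconstructed from the incidence matrix.
EDGE_OF = {("A", "B"): 1, ("A", "D"): 2, ("C", "B"): 3, ("B", "C"): 4, ("B", "E"): 5,
           ("D", "E"): 6, ("D", "F"): 7, ("E", "F"): 8, ("C", "G"): 9, ("F", "G"): 10}


def edge_vector(path):
    """Count how often each of the 10 edges is traversed by a node path like 'ABCG'."""
    vec = [0] * 10
    for a, b in zip(path, path[1:]):
        vec[EDGE_OF[(a, b)] - 1] += 1
    return vec


def rank(rows):
    """Rank of an integer matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


p1, p2, p3, p4, p5 = (edge_vector(p) for p in ("ABCG", "ABCBCG", "ABEFG", "ADEFG", "ADFG"))
ex1 = edge_vector("ABCBEFG")
ex2 = edge_vector("ABCBCBCG")

print(rank([p1, p2, p3, p4, p5]))                           # 5: the paths are independent
print(ex1 == [a + b - c for a, b, c in zip(p2, p3, p1)])    # True: ex1 = p2 + p3 - p1
print(ex2 == [2 * a - b for a, b in zip(p2, p1)])           # True: ex2 = 2p2 - p1
```

Adding ex1 and ex2 as extra rows leaves the rank at 5, which is another way of saying that they lie in the span of the basis paths.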
McCabe next develops an algorithmic procedure called the baseline method to determine a
set of basis paths. The method begins with the selection of a baseline path, which should
correspond to some “normal case” program execution. This can be somewhat arbitrary; McCabe
advises choosing a path with as many decision nodes as possible. Next, the baseline path is
retraced, and in turn each decision is “flipped”; that is, when a node of outdegree ≥ 2 is reached, a
different edge must be taken.
Example 1
Example 2
Path p2 is infeasible, because passing through node D means the sides do not form a triangle, so the
outcome of the decision at node F must be node G. Similarly, in p3, passing through node C
means the sides do form a triangle, so node G cannot be traversed. The other paths are feasible and
produce corresponding results.
We can identify two rules:
• If node C is traversed, then we must traverse node H.
• If node D is traversed, then we must traverse node G.
These logical dependencies reduce the size of a basis set when basis paths must be feasible, as
shown below.
Example 1
Example 2
✓ Essential Complexity
[Figure: structured programming constructs (If-Then, While-Do, and Do-While / Repeat-Until) that are condensed when computing essential complexity]
Finally, we have a graph with cyclomatic complexity V(G) = 1; that is, a well-structured program can
always be reduced to a graph with one path.
[Figure: branching out of a loop]
The figure shows the relationship among S, P, and T. Region 1 is the most desirable: it contains
specified behaviors that are implemented by feasible paths. Regions 2 and 6 must be empty, because
every feasible path is topologically possible. Region
3 contains feasible paths that correspond to unspecified behaviors. Regions 4 and 7 contain
infeasible paths. Region 5 corresponds to specified behaviors that have not been implemented.
Region 7 contains unspecified, infeasible, yet topologically possible paths.
[Figure: Venn diagram of specified behaviors (S), programmed behaviors / feasible paths (P), and topologically possible paths (T), with regions numbered 1 to 7]
Data flow testing is an unfortunate term, because it suggests a connection with dataflow diagrams; no
such connection exists. Data flow
testing refers to forms of structural testing that focus on the points at which variables receive values
and the points at which these values are used (or referenced). We will see that data flow testing
serves as a reality check on path testing.
Most programs deliver functionality in terms of data. Variables that represent data somehow
receive values, and these values are used to compute values for other variables. Since the early 1960s,
programmers have analyzed source code in terms of the points (statements) at which variables
receive values and the points at which these values are used. Early data flow analysis often centered
on a set of faults that are now known as define/reference anomalies:
• A variable that is defined but never used (referenced)
• A variable that is used before it is defined
• A variable that is defined twice before it is used
Each of these anomalies can be recognized from the concordance of a program. Because the
concordance information is compiler generated, these anomalies can be discovered by what is
known as static analysis: finding faults in source code without executing it.
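A static check for define/reference anomalies can be sketched in a few lines. The version below is a simplification that handles only straight-line code, described as a list of (node, defined variables, used variables) in textual order; a real tool would work on the full program graph. The four-statement program is hypothetical.

```python
def define_reference_anomalies(statements):
    """Report anomalies for straight-line code given as (node, defs, uses) tuples."""
    anomalies = []
    pending = {}      # variable -> node of a definition that has not been used yet
    defined = set()
    for node, defs, uses in statements:
        for v in uses:                      # uses are read before this node's defs
            if v not in defined:
                anomalies.append((node, v, "used before defined"))
            pending.pop(v, None)
        for v in defs:
            if v in pending:
                anomalies.append((node, v, "defined twice before use"))
            pending[v] = node
            defined.add(v)
    for v, node in pending.items():
        anomalies.append((node, v, "defined but never used"))
    return anomalies


PROGRAM = [
    (1, ("x",), ()),         # x = input()
    (2, ("x",), ()),         # x = 0
    (3, ("y",), ("x", "z")), # y = x + z
    (4, ("w",), ("y",)),     # w = y * 2
]
for anomaly in define_reference_anomalies(PROGRAM):
    print(anomaly)
```

No statement is executed here; the anomalies are found purely from where variables are defined and referenced, which is what makes this static analysis.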
1. Define/Use testing
– paths to the locations and properties of references to variables within the program
code.
– a program can be analyzed in terms of
• how the variables are affected,
• assigned and
➢ Define/Use Testing
Much of the formalization of define/use testing was done in the early 1980s. It presumes a program
graph in which nodes are statement fragments (a fragment may be an entire statement), and
programs that follow the structured programming precepts.
The following definitions refer to a program P that has a program graph G(P), and a set of program
variables V.
Definition
Node n ∈ G (P) is a defining node of the variable v ∈ V, written as DEF (v, n), iff the value of the
variable v is defined at the statement fragment corresponding to node n.
Ex : Input (x)
Ex : x = 20
Input statements, assignment statements, loop control statements, and procedure calls are all
examples of statements that are defining nodes.
When the code corresponding to such statements executes, the contents of the memory location(s)
associated with the variables are changed.
Definition
Node n ∈ G (P) is a usage node of the variable v ∈ V, written as USE (v, n), iff the value of the
variable v is used at the statement fragment corresponding to node n.
Output statements, assignment statements, conditional statements, loop control statements, and
procedure calls are all examples of statements that are usage nodes.
When the code corresponding to such statements executes, the contents of the memory location(s)
associated with the variables remain unchanged.
Definition
A usage node USE(v, n) is a predicate use (denoted as P-use) iff the statement n is a predicate
statement; otherwise USE(v, n) is a computation use , (denoted C-use).
The nodes corresponding to predicate uses always have an out degree ≥ 2, and nodes corresponding
to computation uses always have out degree ≤ 1.
Definition
A definition-use path with respect to a variable v (denoted du-path) is
a path in PATHS(P) such that, for some v ∈ V, there are define and usage nodes DEF(v, m) and
USE(v, n), where m and n are the initial and final nodes of the path.
Definition
A definition-clear path with respect to a variable v (denoted dc-path) is a definition-use path in
PATHS(P) with initial and final nodes DEF(v, m) and USE(v, n) such that no other node in the path
is a defining node of v.
Testers should notice how these definitions capture the essence of computing with stored data
values. Du-paths and dc-paths describe the flow of data across source statements from points at
which the values are defined to points at which the values are used. Du-paths that are not
definition-clear are potential trouble spots.
✓ Example
This program computes the commission on the sales of the total numbers of locks, stocks, and
barrels sold. The While-loop is a classical sentinel controlled loop in which a value of -1 for locks
signifies the end of the sales data. The totals are accumulated as the data values are read in the
while loop. After printing this preliminary information, the sales value is computed, using the
constant item prices defined at the beginning of the program. The sales value is then used to
compute the commission in the conditional portion of the program.
Figure below shows the decision-to-decision path (DD-Path) graph of the program graph given
above. More compression exists in this DD-Path graph because of the increased computation in
the commission problem. Table below details the statement fragments associated with DD-Paths.
Some DD-Paths are combined to simplify the graph.
• Table below lists define and usage nodes for the variables in the commission problem.
We use this information in conjunction with the program graph to identify various
definition-use and definition-clear paths.
• It is a judgment call whether nonexecutable statements such as constant and variable declaration
statements should be considered as defining nodes.
• We will refer to the various paths as sequences of node numbers.
P1 = <13, 14>
P2 = <13, 14, 15, 16>
P3 = <19, 20, 14>
P4 = <19, 20, 14, 15, 16>
• Du-paths p1 and p2 refer to the priming value of locks, which is read at node 13: locks has a
predicate use in the While statement (node 14), and if the condition is true (as in path p2), a
computation use at statement 16.
• The other two du-paths start near the end of the While loop and occur when the loop repeats.
• These four paths provide the loop coverage: bypass the loop, begin the loop, repeat the loop, and
exit the loop. All these du-paths are definition-clear.
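Whether a du-path is definition-clear can be checked by looking for defining nodes strictly inside the path. The sketch below uses the node numbers given in these notes (locks is defined at nodes 13 and 19; totalLocks is defined at nodes 10 and 16); the function itself is our illustration and treats only interior nodes as possible redefinitions.

```python
def is_definition_clear(path, defining_nodes):
    """True if no interior node of the du-path redefines the variable."""
    return not any(node in defining_nodes for node in path[1:-1])


LOCKS_DEFS = {13, 19}
p1 = [13, 14]
p2 = [13, 14, 15, 16]
p3 = [19, 20, 14]
p4 = [19, 20, 14, 15, 16]
print(all(is_definition_clear(p, LOCKS_DEFS) for p in (p1, p2, p3, p4)))  # True

# A du-path for totalLocks from node 10 to node 24 that passes through the
# redefinition at node 16 is not definition-clear.
TOTAL_LOCKS_DEFS = {10, 16}
p7 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 14, 21, 22, 23, 24]
print(is_definition_clear(p7, TOTAL_LOCKS_DEFS))                           # False
```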
P7 = <10,11,12,13,14,15,16,17,18,19,20,14,21,22,23,24>
P7 = <p6, 22, 23, 24>
• Only one defining node is used for sales; therefore, all the du-paths with respect to sales must be
definition-clear. They are interesting because they illustrate predicate and computation uses. The
first three du-paths are easy:
• Path p12 is a definition-clear path with three usage nodes; it also contains paths p10 and p11.
• The IF, ELSE IF logic in statements 29 through 40 highlights an ambiguity in the original research.
Two choices for du-paths begin with path p11: one choice is the path <27, 28, 29, 30, 31, 32, 33>, and
the other is the path <27, 28, 29, 34>. The remaining du-paths for sales are:
In the following definitions, T is a set of paths in the program graph G(P) of a program P, with the set V of
variables. In the next definitions, we assume that the define/use paths are all feasible.
Definition
The set T satisfies the All-Defs criterion for the program P iff for every variable v ∈ V, T contains
definition-clear paths from every defining node of v to a use of v.
Definition
The set T satisfies the All-Uses criterion for the program P iff for every variable v ∈ V, T contains
definition-clear paths from every defining node of v to every use of v, and to the successor node of each
USE(v,n).
Definition
The set T satisfies the All-P-Uses /Some C-Uses criterion for the program P iff for every variable v ∈ V, T
contains definition-clear paths from every defining node of v to every predicate use of v, and if a definition
of v has no P-uses, there is a definition-clear path to at least one computation use.
Definition
The set T satisfies the All-C-Uses /Some P-Uses criterion for the program P iff for every variable v ∈ V, T
contains definition-clear paths from every defining node of v to every computation use of v, and if a
definition of v has no C-uses, there is a definition-clear path to at least one predicate use.
Definition
The set T satisfies the All-DU-paths criterion for the program P iff for every variable v ∈ V, T contains
definition-clear paths from every defining node of v to every use of v, and to the successor node of each
USE(v, n), and that these paths are either single loop traversals or they are cycle free.
These test coverage metrics have several set-theory based relationships, which are referred to as
subsumption. These relationships are shown in Figure below.
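The difference between All-Defs and All-Uses can be seen on the locks variable of the commission problem (defined at nodes 13 and 19, used at nodes 14 and 16). The sketch below is a simplification: it checks only definition-to-use coverage, ignores the "successor node of each USE" clause, and assumes every definition/use pair is feasible. The helper names are ours.

```python
def covers(path, d, u, defs):
    """True if `path` contains a definition-clear subpath from node d to node u."""
    for i, node in enumerate(path):
        if node != d:
            continue
        for later in path[i + 1:]:
            if later == u:
                return True
            if later in defs:
                break              # variable redefined before reaching u
    return False


def satisfies_all_defs(T, defs, uses):
    """Every definition reaches at least one use along some path in T."""
    return all(any(covers(p, d, u, defs) for p in T for u in uses) for d in defs)


def satisfies_all_uses(T, defs, uses):
    """Every definition reaches every use along some path in T."""
    return all(any(covers(p, d, u, defs) for p in T) for d in defs for u in uses)


DEFS, USES = {13, 19}, {14, 16}
one_iteration = [13, 14, 15, 16, 17, 18, 19, 20, 14]
two_iterations = [13, 14, 15, 16, 17, 18, 19, 20, 14, 15, 16, 17, 18, 19, 20, 14]

print(satisfies_all_defs([one_iteration], DEFS, USES))   # True
print(satisfies_all_uses([one_iteration], DEFS, USES))   # False: 19 never reaches 16
print(satisfies_all_uses([two_iterations], DEFS, USES))  # True
```

A single loop iteration already satisfies All-Defs, but the value read at node 19 reaches the computation use at node 16 only when the loop repeats, so All-Uses needs the longer path. This matches the subsumption relationship: any set of paths that satisfies All-Uses also satisfies All-Defs.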
➢ Slice-Based Testing
• Program slices have surfaced and submerged in software engineering literature since the
early 1980s. Informally, a program slice is a set of program statements that contribute to,
or affect a value for a variable at some point in the program.
• We continue with the notation we used for define-use paths: a program P that has a program
graph G(P), and a set of program variables V.
Definition
Given a program P, and a set V of variables in P, a slice on the variable set V at statement n,
written S(V,n), is the set of all statements in P that contribute to the values of variables in V at
node n.
Listing elements of a slice S(V,n) will be cumbersome, because the elements are program
statement fragments. Since it is much simpler to list fragment numbers in G(P), we make the
following trivial change:
Definition
Given a program P, and a program graph G(P) in which statements and statement fragments are
numbered, and a set V of variables in P, the slice on the variable set V at statement fragment n,
written S(V,n), is the set of node numbers of all statement fragments in P prior to n that contribute
to the values of variables in V at statement fragment n.
• The idea of slices is to separate a program into components that have some useful meaning.
• Slice captures the execution time behavior of a program with respect to the variable(s) in
the slice.
• Eventually, we will develop a lattice (a directed, acyclic graph) of slices, in which nodes
are slices, and edges correspond to the subset relationship.
• Declarative statements have an effect on the value of a variable. For now, we simply
exclude all nonexecutable statements. The notion of contribution is partially clarified by
the predicate (P-use) and computation (C-use) usage distinction, but we need to refine these
forms of variable usage. Specifically, the USE relationship pertains to five forms of usage:
P-use: used in a predicate (decision)
C-use: used in computation
O-use: used for output
L-use: used for location (pointers, subscripts)
I-use: used for iteration (internal counters, loop indices)
Definition nodes come in two forms:
I-def: defined by input
A-def: defined by assignment
For now, assume that the slice S(V, n) is a slice on one variable, that is, the set V consists of a
single variable, v.
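For straight-line code, a slice S(v, n) can be computed by walking backward from n and tracking which variables still need an explanation. The sketch below makes that assumption explicit (no loops or branches, so no control dependencies) and uses a small hypothetical program rather than the commission problem. Following the definition above, node n itself is included only if it defines v.

```python
def backward_slice(statements, variable, n):
    """Slice S(variable, n) for straight-line code.

    statements maps node number -> (defined_vars, used_vars).
    """
    relevant = {variable}
    result = set()
    for node in sorted((k for k in statements if k <= n), reverse=True):
        defs, uses = statements[node]
        if relevant & set(defs):
            result.add(node)
            # The defined variables are explained; their inputs now need explaining.
            relevant = (relevant - set(defs)) | set(uses)
    return result


PROGRAM = {
    1: (("a",), ()),       # a = input()
    2: (("b",), ()),       # b = input()
    3: (("c",), ("a",)),   # c = a + 1
    4: (("d",), ("b",)),   # d = b * 2
    5: (("e",), ("c",)),   # e = c + 5
    6: ((), ("d",)),       # print(d)
}
print(sorted(backward_slice(PROGRAM, "e", 5)))  # [1, 3, 5]
print(sorted(backward_slice(PROGRAM, "d", 6)))  # [2, 4]
```

Statements 2 and 4 do not appear in the slice on e, which is the practical value of slicing: when e is wrong, they cannot be the cause.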
✓ Example
• The commission problem is used because it contains interesting data flow properties.
• Follow these examples while looking at the source code for the commission problem that
we analyzed in terms of define-use paths.
• Slices on the locks variable show why it is potentially fault-prone. It has a P-use at node
14 and a C-use at node 16, and has two definitions, the I-defs at nodes 13 and 19.
The slices for stocks and barrels are dull. They are short, definition-clear paths contained entirely
within a loop, so they are not affected by iteration of the loop.
The next four slices illustrate how repetition appears in slices. Node 10 is an A-def for totalLocks,
and node 16 contains both an A-def and a C-use. The remaining nodes in S10 (13, 14, 19, and 20)
pertain to the While-loop controlled by locks. Slices S10 and S11 are equal because nodes 21 and
24 are an O-use and a C-use of totalLocks, respectively.
The slices on totalStocks and totalBarrels are quite similar. They are initialized by A-defs at nodes
11 and 12, and then are redefined by A-defs at nodes 17 and 18. Again, the remaining nodes (13,
14, 19, and 20) pertain to the While-loop controlled by locks.
The next six slices demonstrate our convention regarding values defined by assignment statements
(A-defs).
The slices on sales and commission are the interesting ones. There is only one defining node for
sales, the A-def at node 27. The remaining slices on sales show the P-uses, C-uses, and the O-use
in definition-clear paths.
S24: S(sales, 27) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S25: S(sales, 28) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S26: S(sales, 29) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S27: S(sales, 33) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S28: S(sales, 34) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S29: S(sales, 37) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
S30: S(sales, 39) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27}
Think about slice S24 in terms of its components, which are the slices on the C-use variables. We
can write S24 = S10 U S13 U S16 U S21 U S22 U S23 U {27}. Notice how the formalism
corresponds to our intuition: if the value of sales is wrong, we first look at how it is computed,
and if this is OK, we check how the components are computed.
Everything comes together with the slices on commission. There are six A-def nodes for
commission (corresponding to the six du-paths we identified earlier). Three computations of
commission are controlled by P-uses of sales in the IF, ELSE IF logic. This yields three “paths”
of slices that compute commission.
S37: S(commission, 42) = {7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26,
27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39}
The slice information improves our insight. The lattice in the figure below is a directed
acyclic graph in which slices are nodes and an edge represents the proper subset relationship.
This lattice is drawn so that the position of the slice nodes roughly corresponds with their position
in the source code. The definition-clear paths <34, 41>, <37, 41>, and <39, 41> correspond to the
edges that show slices S33, S35, and S36 are subsets of slice S37.
Figure below shows a lattice of slices for the entire program. Some slices (those that are identical
to others) have been deleted for clarity.
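Because slices are sets of node numbers, the subset relationships behind the lattice can be computed directly. The sketch below uses S24, S25, and S37 exactly as listed in these notes; it drops identical slices and returns every proper-subset pair, which is a superset of the edges one would actually draw in the lattice.

```python
S24 = set(range(7, 21)) | {24, 25, 26, 27}   # S(sales, 27)
S25 = set(S24)                               # S(sales, 28), identical to S24
S37 = S24 | set(range(29, 40))               # S(commission, 42)
SLICES = {"S24": S24, "S25": S25, "S37": S37}


def lattice_edges(named_slices):
    """Return (smaller, larger) name pairs where one slice is a proper subset of another."""
    unique = {}
    for name, nodes in named_slices.items():
        unique.setdefault(frozenset(nodes), name)   # keep one name per distinct slice
    return sorted(
        (a_name, b_name)
        for a, a_name in unique.items()
        for b, b_name in unique.items()
        if a < b                                     # proper subset test on frozensets
    )


print(lattice_edges(SLICES))  # [('S24', 'S37')]
```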
1. Never make a slice S(V, n) for which variables v of V do not appear in statement
fragment n. As an example, suppose we defined a slice on the locks variable at node
27. Defining such slices necessitates tracking the values of all variables at all points in
the program.
2. Make slices on one variable. The set V in slice S(V, n) can contain several variables,
and sometimes such slices are useful. The slice S(V, 26) where
contains all the elements of the slice S({sales}, 27) except statement 27.
3. Make slices for all A-def nodes. When a variable is computed by an assignment
statement, a slice on the variable at that statement will include all du-paths of the
variables used in the computation. Slice S({sales}, 27) is a good example of an A-def
slice.
4. Make slices for P-use nodes. When a variable is used in a predicate, the slice on that
variable at the decision statement shows how the predicate variable got its value. This
is very useful in decision-intensive programs like the Triangle program and NextDate.
5. Slices on non-P-use usage nodes are not very interesting. We discussed C-use slices in
point 2, where we saw they were very redundant with the A-def slice. Slices on O-use
variables can always be expressed as unions of slices on all the A-defs (and I-defs) of
the O-use variable. Slices on I-use and O-use variables are useful during debugging,
but if they are mandated for all testing, the test effort is dramatically increased.
6. Consider making slices compilable. Nothing in the definition of a slice requires that the
set of statements is compilable, but if we make this choice, it means that a set of
compiler directive and declarative statements is a subset of every slice.
Unitwise Questions
VTU Questions