Software Testing
Contributing Author:
Dr. B.N. Subraya
Infosys Technologies Ltd.,
Mysore
Contents
Chapter 1
INTRODUCTION TO SOFTWARE TESTING
1.1 Learning Objectives
1.2 Introduction
1.3 What is Testing?
1.4 Approaches to Testing
1.5 Importance of Testing
1.6 Hurdles in Testing
1.7 Testing Fundamentals
Chapter 2
SOFTWARE QUALITY ASSURANCE
2.1 Learning Objectives
2.2 Introduction
2.3 Quality Concepts
2.4 Quality of Design
2.5 Quality of Conformance
2.6 Quality Control (QC)
2.7 Quality Assurance (QA)
2.8 Software Quality Assurance (SQA)
2.9 Formal Technical Reviews (FTR)
2.10 Statistical Quality Assurance
2.11 Software Reliability
2.12 The SQA Plan
Chapter 3
PROGRAM INSPECTIONS, WALKTHROUGHS AND REVIEWS
3.1 Learning Objectives
3.2 Introduction
3.3 Inspections and Walkthroughs
3.4 Code Inspections
3.5 An Error Checklist for Inspections
3.6 Walkthroughs
Chapter 4
TEST CASE DESIGN
4.1 Learning Objectives
4.2 Introduction
4.3 White Box Testing
4.4 Basis Path Testing
4.5 Control Structure Testing
4.6 Black Box Testing
4.7 Static Program Analysis
4.8 Automated Testing Tools
Chapter 5
TESTING FOR SPECIALIZED ENVIRONMENTS
5.1 Learning Objectives
5.2 Introduction
5.3 Testing GUIs
5.4 Testing of Client/Server Architectures
5.5 Testing Documentation and Help Facilities
Chapter 6
SOFTWARE TESTING STRATEGIES
6.1 Learning Objectives
6.2 Introduction
6.3 A Strategic Approach to Software Testing
6.4 Verification and Validation
6.5 Organizing for Software Testing
6.6 A Software Testing Strategy
6.7 Strategic Issues
6.8 Unit Testing
Chapter 1
INTRODUCTION TO SOFTWARE TESTING
You will learn about:
1.2 INTRODUCTION
Software testing is a critical element of software quality assurance and represents the ultimate process
to ensure the correctness of the product. A quality product enhances customer confidence in using the
product and thereby improves the business economics. In other words, a good quality product means
zero defects, which is derived from a better quality process in testing.
The definition of testing is not well understood. People often use a totally incorrect definition of the word
testing, and this is a primary cause of poor program testing. Examples of such definitions are statements
like “Testing is the process of demonstrating that errors are not present”, “The purpose of testing is to
show that a program performs its intended functions correctly”, and “Testing is the process of establishing
confidence that a program does what it is supposed to do”.
Testing the product means adding value to it, which means raising the quality or reliability of the
program. Raising the reliability of the product means finding and removing errors. Hence one should not
test a product to show that it works; rather, one should start with the assumption that the program contains
errors and then test the program to find as many errors as possible. Thus a more appropriate definition is:
Testing is the process of executing a program with the intent of finding errors.
Purpose of Testing
• To show the software works: known as demonstration-oriented testing.
• To show the software doesn’t work: known as destruction-oriented testing.
• To minimize the risk of the software not working up to an acceptable level: known as evaluation-oriented testing.
• In April of 1999, a software bug caused the failure of a $1.2 billion military satellite launch, the
costliest unmanned accident in the history of Cape Canaveral launches. The failure was the
latest in a string of launch failures, triggering a complete military and industry review of U.S.
space launch programs, including software integration and testing processes. Congressional
oversight hearings were requested.
• On June 4, 1996, the first flight of the European Space Agency’s new Ariane 5 rocket failed
shortly after launching, resulting in an estimated uninsured loss of a half billion dollars. It was
reportedly due to the lack of exception handling of a floating-point error in a conversion from a
64-bit integer to a 16-bit signed integer.
• In January of 2001 newspapers reported that a major European railroad was hit by the aftereffects
of the Y2K bug. The company found that many of their newer trains would not run due to their
inability to recognize the date ‘31/12/2000’; the trains were started by altering the control system’s
date settings.
• In April of 1998 a major U.S. data communications network failed for 24 hours, crippling a large
part of some U.S. credit card transaction authorization systems as well as other large U.S.
bank, retail, and government data systems. The cause was eventually traced to a software bug.
• The computer system of a major online U.S. stock trading service failed during trading hours
several times over a period of days in February of 1999, according to nationwide news reports.
The problem was reportedly due to bugs in a software upgrade intended to speed online trade
confirmations.
• In November of 1997 the stock of a major health industry company dropped 60% due to reports
of failures in computer billing systems, problems with a large database conversion, and inadequate
software testing. It was reported that more than $100,000,000 in receivables had to be written
off and that multi-million dollar fines were levied on the company by government agencies.
• Software bugs caused the bank accounts of 823 customers of a major U.S. bank to be credited
with $924,844,208.32 each in May of 1996, according to newspaper reports. The American
Bankers Association claimed it was the largest such error in banking history. A bank spokesman
said the programming errors were corrected and all funds were recovered.
All the above incidents reiterate the importance of thoroughly testing software applications and
products before they are put into production. They clearly demonstrate that the cost of rectifying a
defect during development is much less than that of rectifying a defect in production.
• Review, inspection and verification of documents (requirements, design documents, test plans
etc.), code and other work products of software is known as static testing.
• Static testing is found to be the most effective and efficient way of testing.
• Measurements show that a defect discovered during design that costs $1 to rectify at that stage
will cost $1,000 to repair in production. This clearly points out the advantage of early testing.
• Testing should start with small measurable units of code, gradually progress towards testing
integrated components of the applications, and finally be completed with testing at the application
level.
• Testing verifies the system against its stated and implied requirements, i.e., is it doing what it is
supposed to do? It should also check whether the system is doing anything it is not supposed to
do, whether it takes care of boundary conditions, how the system performs in a production-like
environment, and how fast and consistently the system responds when data volumes are high.
• Changing requirements: the customer may not understand the effects of changes, or may
understand and request them anyway. Changes force redesign, rescheduling of engineers, effects
on other projects, rework or abandonment of work already completed, changed hardware
requirements, etc. If there are many minor changes or any major changes, known and unknown
dependencies among parts of the project are likely to interact and cause problems, and the
complexity of keeping track of changes may result in errors. Enthusiasm of the engineering staff
may be affected. In some fast-changing business environments, continuously modified requirements
may be a fact of life. In this case, management must understand the resulting risks, and QA and
test engineers must adapt and plan for continuous extensive testing to keep the inevitable bugs
from running out of control.
• Time pressures: scheduling of software projects is difficult at best, often requiring a lot of
guesswork. When deadlines loom and the crunch comes, mistakes will be made.
• Poorly documented code: it’s tough to maintain and modify code that is badly written or poorly
documented; the result is bugs. In many organizations management provides no incentive for
programmers to document their code or write clear, understandable code. In fact, it’s usually the
opposite: they get points mostly for quickly turning out code, and there’s job security if nobody
else can understand it (‘if it was hard to write, it should be hard to read’).
• Software development tools: visual tools, class libraries, compilers, scripting tools, etc. often
introduce their own bugs or are poorly documented, resulting in added bugs.
Debugging-oriented:
This approach identifies errors while debugging the program; no distinction is made between testing
and debugging.
Demonstration-oriented:
The purpose of testing is to show that the software works. Here, most of the time, the software is
demonstrated in a normal sequence/flow, and all the branches may not be tested. This approach mainly
satisfies the customer and adds no value to the program.
Destruction-oriented:
The purpose of testing is to show that the software doesn’t work.
It is a sadistic process, which explains why most people find it difficult: it is hard to design test cases
intended to break the program.
Evaluation-oriented:
The purpose of testing is to reduce the perceived risk of the software not working to an acceptable level.
Prevention-oriented:
Testing is viewed as a mental discipline that results in low-risk software. It is always better to forecast
possible errors and rectify them early.
In general, program testing is more properly viewed as the destructive process of trying to find the
errors (whose presence is assumed) in a program. A successful test case is one that furthers progress in
this direction by causing the program to fail. However, one also wants to use program testing to establish
some degree of confidence that a program does what it is supposed to do and does not do what it is not
supposed to do, and this purpose is best achieved by a diligent exploration for errors.
In a typical service-oriented project, about 20-40% of project effort is spent on testing. It is much more
in the case of “human-rated” software.
For example, at Microsoft the tester-to-developer ratio is 1:1, whereas at NASA’s shuttle development
center (SEI Level 5) the ratio is 7:1. This shows how integral testing is to quality assurance.
Defect Distribution
In a typical project life cycle, testing is a late activity. When the product is tested, the defects may be
due to many reasons: a programming error, a defect in design, or a defect introduced at any other stage
in the life cycle. The overall defect distribution is shown in Fig 1.1.
Fig 1.1: Software Defect Distribution (Requirements: 56%, Design: 27%, Code: 7%, Other: 10%)
• A good test is one that has a high probability of finding an as yet undiscovered error.
The objective is to design tests that systematically uncover different classes of errors and do so with
a minimum amount of time and effort.
Secondary benefits include:
Testing cannot show the absence of defects; it can only show that software defects are present.
Fig 1.2: Test information flow in a typical software test life cycle
• A test configuration includes a test plan and procedures, test cases, and testing tools.
• It is difficult to predict the time to debug the code; hence it is difficult to schedule.
• Testing cannot prove correctness, as not all execution paths can be tested.
Consider the following example, shown in Fig 1.3.
Fig 1.3
A program with a structure as illustrated above (with fewer than 100 lines of Pascal code) has about
100,000,000,000,000 (10^14) possible paths. If one attempted to test these at a rate of 1,000 tests per
second, it would take 3,170 years to test all paths. This shows that exhaustive testing of software is not
possible.
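The arithmetic behind that figure is easy to reproduce. A minimal sketch in Python follows; the path count and the assumed rate of 1,000 tests per second are taken directly from the example above.

```python
# Back-of-the-envelope check of the exhaustive-testing example above.
paths = 10 ** 14              # ~100,000,000,000,000 possible paths
tests_per_second = 1_000      # assumed testing rate
seconds_per_year = 365 * 24 * 60 * 60

years = paths / tests_per_second / seconds_per_year
print(f"{years:,.0f} years")  # roughly 3,171 years
```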
QUESTIONS
1. What is software testing? Explain the purpose of testing.
2. Explain the origin of the defect distribution in a typical software development life cycle.
_________
Chapter 2
SOFTWARE QUALITY ASSURANCE
You will learn about:
• Software Reliability
2.2 INTRODUCTION
Quality is defined as “a characteristic or attribute of something”. As an attribute of an item, quality
refers to measurable characteristics, things we are able to compare to known standards such as length,
color, electrical properties, malleability, and so on. However, software, largely an intellectual entity, is
more challenging to characterize than physical objects.
Quality of design refers to the characteristics that designers specify for an item. The grade of materials,
tolerances, and performance specifications all contribute to the quality of design.
Quality of conformance is the degree to which the design specifications are followed during
manufacturing. The greater the degree of conformance, the higher the level of quality of conformance.
Figure 2.1: Achieving Software Quality (software engineering methods, formal technical reviews,
measurement, standards and SCM, SQA, and testing all contribute)
• Quality
• Quality control
• Quality assurance
• Cost of quality
1. Cyclomatic complexity
2. Cohesion
4. Lines of code
When we examine an item based on its measurable characteristics, two kinds of quality may be
encountered:
• Quality of design
• Quality of conformance
• Prevention
• Appraisal
• Failure
Prevention costs include:
• Quality planning
• Test equipment
• Training
Appraisal costs include activities undertaken to gain insight into product condition the “first time
through” each process.
Examples of appraisal costs include:
• Testing
Failure costs are costs that would disappear if no defects appeared before shipping a product to the
customer. Failure costs may be subdivided into internal and external failure costs.
Internal failure costs are costs incurred when we detect an error in our product prior to shipment.
Internal failure costs include:
• Rework
• Repair
External failure costs are the costs associated with defects found after the product has been shipped
to the customer.
Examples of external failure costs are:
1. Complaint resolution
2. Product return and replacement
3. Helpline support
4. Warranty work
The SQA group serves as the customer’s in-house representative. That is, the people who perform
SQA must look at the software from the customer’s point of view.
The SQA group attempts to answer the questions asked below and hence ensure the quality of the
software. The questions are:
2. Have technical disciplines properly performed their role as part of the SQA activity?
SQA Activities
The SQA Plan is interpreted as shown in Fig 2.2.
SQA is composed of a variety of tasks associated with two different constituencies:
1. The software engineers who do technical work, like
QA activities performed by the SE team and the SQA group are governed by the following plan.
• Evaluations to be performed.
SQA Plan
• Audits designated software work products to verify compliance with those defined as part of
the software process.
• Ensures that deviations in software work and work products are documented and handled
according to a documented procedure.
2. Confirm those parts of a product in which improvement is either not desired or not needed.
3. Achieve technical work of more uniform, or at least more predictable, quality than can be achieved
without reviews, in order to make technical work more manageable.
There are many different types of reviews that can be conducted as part of software engineering, like
3. Formal technical review, the most effective filter from a quality assurance standpoint. Conducted
by software engineers for software engineers, the FTR is an effective means of improving
software quality.
Assume that an error uncovered during design will cost 1.0 monetary unit to correct. Relative to this
cost, the same error uncovered just before testing commences will cost 6.5 units; during testing, 15 units;
and after release, between 60 and 100 units.
preliminary design, detail design, and coding steps of the software engineering process. The model is
illustrated schematically in Figure 2.3.
A box represents a software development step. During the step, errors may be inadvertently generated.
Review may fail to uncover newly generated errors as well as errors from previous steps, resulting in
some number of errors that are passed through. In some cases, errors passed through from previous
steps are amplified (amplification factor, x) by current work. The box subdivisions represent each of
these characteristics and the percent efficiency for detecting errors, a function of the thoroughness of
review.
Figure 2.3: Defect Amplification Model. (Each development step receives errors from the previous
step; some errors pass through, some are amplified 1:x by current work, and new errors are generated.
Defect detection operates at a given percent efficiency, and the remaining errors pass to the next step.)
Figure 2.4 illustrates a hypothetical example of defect amplification for a software development process
in which no reviews are conducted. As shown in the figure, each test step is assumed to uncover and
correct fifty percent of all incoming errors without introducing new errors (an optimistic assumption). Ten
preliminary design errors are amplified to 94 errors before testing commences. Twelve latent defects are
released to the field. Figure 2.5 considers the same conditions except that design and code reviews are
conducted as part of each development step. In this case, ten initial preliminary design errors are amplified
to 24 errors before testing commences.
Only three latent defects exist. By recalling the relative costs associated with the discovery and
correction of errors, the overall costs (with and without reviews, for our hypothetical example) can be
established. To conduct reviews, a developer must expend time and effort, and the development organization
must spend money. However, the results of the preceding example leave little doubt that we have
encountered a “pay now or pay much more later” syndrome.
Formal technical reviews (for design and other technical activities) provide a demonstrable cost benefit
and they should be conducted.
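Using the relative cost units quoted earlier (15 units during testing, 60 to 100 units after release), a rough comparison of the two hypothetical scenarios can be sketched as below. The error counts come from Figures 2.4 and 2.5; the 80-unit field cost is an assumed midpoint of the quoted range, and the sketch ignores the much cheaper errors corrected during the reviews themselves.

```python
# Rough cost comparison for the defect amplification example.
# Assumptions: errors removed during testing cost 15 units each; latent
# (field) defects cost 80 units each (within the quoted 60-100 range).

def total_cost(entering_test, latent, test_cost=15, field_cost=80):
    found_in_test = entering_test - latent
    return found_in_test * test_cost + latent * field_cost

no_reviews = total_cost(entering_test=94, latent=12)    # Figure 2.4
with_reviews = total_cost(entering_test=24, latent=3)   # Figure 2.5
print(no_reviews, with_reviews)   # 2190 vs 555 cost units
```

Even with these crude assumptions, the no-review scenario costs roughly four times as much, which is the "pay now or pay much more later" point in numbers.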
Figure 2.4: Defect Amplification - No Reviews. (Ten preliminary design errors are amplified to 94
errors entering testing; twelve latent errors remain in the released product.)
Figure 2.5: Defect Amplification - Reviews Conducted. (Ten preliminary design errors are amplified to
only 24 errors entering testing; three latent errors remain in the released product.)
• To uncover errors in function, logic, or implementation for any representation of the software.
In addition, the FTR serves as a training ground, enabling junior engineers to observe different approaches
to software analysis, design, and implementation. The FTR also serves to promote backup and continuity
because a number of people become familiar with parts of the software that they may not have otherwise
seen.
The FTR is actually a class of reviews that includes walkthroughs, inspections, round-robin reviews,
and other small-group technical assessments of software. Each FTR is conducted as a meeting and will
be successful only if it is properly planned, controlled and attended.
It should be noted that Humphrey [1995] has developed a review method, called Personal
Review (PR), which is similar to desk checking. In PR, each programmer examines his own
products to find as many defects as possible utilizing a disciplined process in conjunction with
Humphrey’s Personal Software Process (PSP) to improve his own work. The review strategy
includes the use of checklists to guide the review process, review metrics to improve the process,
and defect causal analysis to prevent the same defects from recurring in the future. The approach
taken in developing the Personal Review process is an engineering one; no reference is made in
Humphrey [1995] to cognitive theory.
2. Peer Rating is a technique in which anonymous programs are evaluated in terms of their
overall quality, maintainability, extensibility, usability and clarity by selected programmers who
have similar backgrounds [Myers 1979]. Shneiderman [1980] suggests that peer ratings of
programs are productive, enjoyable, and non-threatening experiences. The technique is often
referred to as Peer Reviews [Shneiderman 1980], but some authors use the term peer reviews
for generic review methods involving peers [Paulk et al 1993; Humphrey 1989].
3. Walkthroughs are presentation reviews in which a review participant, usually the software
author, narrates a description of the software and the other members of the review group
provide feedback throughout the presentation [Freedman and Weinberg 1990; Gilb and Graham
1993]. It should be noted that the term “walkthrough” has been used in the literature variously.
Some authors unite it with “structured” and treat it as a disciplined, formal review process
[Myers 1979; Yourdon 1989; Adrion et al. 1982]. However, the literature generally describes
walkthrough as an undisciplined process without advance preparation on the part of reviewers
and with the meeting focus on education of participants [Fagan 1976].
4. Round-robin Review is an evaluation process in which a copy of the review materials is made
available and routed to each participant; the reviewers then write their comments/questions
concerning the materials and pass the materials with comments to another reviewer and to the
moderator or author eventually [Hart 1982].
5. Inspection was developed by Fagan [1976, 1986] as a well-planned and well-defined group
review process to detect software defects – defect repair occurs outside the scope of the
process. The original Fagan Inspection (FI) is the most cited review method in the literature and
is the source for a variety of similar inspection techniques [Tjahjono 1996]. Among the FI-
derived techniques are Active Design Review [Parnas and Weiss 1987], Phased Inspection
[Knight and Myers 1993], N-Fold Inspection [Schneider et al. 1992], and FTArm [Tjahjono
1996]. Unlike the review techniques previously discussed, inspection is often used to control the
quality and productivity of the development process.
A Fagan Inspection consists of six well-defined phases:
i. Planning. Participants are selected and the materials to be reviewed are prepared and checked
for review suitability.
ii. Overview. The author educates the participants about the review materials through a presentation.
iii. Preparation. The participants learn the materials individually.
iv. Meeting. The reader (a participant other than the author) narrates or paraphrases the review
materials statement by statement, and the other participants raise issues and questions. Questions
continue on a point only until an error is recognized or the item is deemed correct.
B. No vs. Single vs. Multiple Session Reviews. The traditional Fagan Inspection provided for one
session to inspect the software artifact, with the possibility of a follow-up session to inspect
corrections. However, variants have been suggested.
Humphrey [1989] comments that three-quarters of the errors found in well-run inspections are
found during preparation. Based on an economic analysis of a series of inspections at AT&T,
Votta [1993] argues that inspection meetings are generally not economic and should be replaced
with depositions, where the author and (optionally) the moderator meet separately with inspectors
to collect their results.
On the other hand, some authors [Knight and Myers 1993; Schneider et al. 1992] have argued
for multiple sessions, conducted either in series or parallel. Gilb and Graham [1993] do not use
multiple inspection sessions but add a root cause analysis session immediately after the inspection
meeting.
C. Nonsystematic vs. Systematic Defect-Detection Technique Reviews. The most frequently
used detection methods (ad hoc and checklist) rely on nonsystematic techniques, and reviewer
responsibilities are general and not differentiated for single session reviews [Siy 1996]. However,
some methods employ more prescriptive techniques, such as questionnaires [Parnas and Weiss
1987] and correctness proofs [Britcher 1988].
D. Single Site vs. Multiple Site Reviews. The traditional FTR techniques have assumed that the
group-meeting component would occur face-to-face at a single site. However, with improved
telecommunications, and especially with computer support (see item F below), it has become
increasingly feasible to conduct even the group meeting from multiple sites.
E. Synchronous vs. Asynchronous Reviews. The traditional FTR techniques have also assumed
that the group meeting component would occur in real-time; i.e., synchronously. However, some
newer techniques that eliminate the group meeting or are based on computer support utilize
asynchronous reviews.
F. Manual vs. Computer-supported Reviews. In recent years, several computer supported review
systems have been developed [Brothers et al. 1990; Johnson and Tjahjono 1993; Gintell et al.
1993; Mashayekhi et al 1994]. The type of support varies from simple augmentation of the
manual practices [Brothers et al. 1990; Gintell et al. 1993] to totally new review methods [Johnson
and Tjahjono 1993].
The Wheeler et al. [1996] analysis does not specify the relative value of Practitioner Evaluation to
FTR, but two recent economic analyses provide indications.
l Votta [1993]. After analyzing data collected from 13 traditional inspections conducted at AT&T,
Votta reports that the approximately 4% increase in faults found at collection meetings (synergy)
does not economically justify the development delays caused by the need to schedule meetings
and the additional developer time associated with the actual meetings. He also argues that it is
not cost-effective to use the collection meeting to reduce the number of items incorrectly identified
as defective prior to the meeting (“false positives”). Based on these findings, he concludes that
almost all inspection meetings requiring all reviewers to be present should be replaced with
Depositions, which are three person meetings with only the author, moderator, and one reviewer
present.
• Siy [1996]. In his analysis of the factors driving inspection costs and benefits, Siy reports that
changes in FTR structural elements, such as group size, number of sessions, and coordination of
multiple sessions, were largely ineffective in improving the effectiveness of inspections. Instead,
inputs into the process (reviewers and code units) accounted for more outcome variation than
the structural factors did.
1. Egoless Programming. Gerald Weinberg [1971] began the examination of psychological issues
associated with software review in his work on egoless programming. According to Weinberg,
programmers are often reluctant to allow their programs to be read by other programmers
because the programs are often considered to be an extension of the self and errors discovered
in the programs to be a challenge to one’s self-image. Two implications of this theory are as
follows:
i. The ability of a programmer to find errors in his own work tends to be impaired since he
tends to justify his own actions, and it is therefore more effective to have other people check
his work.
ii. Each programmer should detach himself from his own work. The work should be considered
a public property where other people can freely criticize, and thus, improve its quality; otherwise,
one tends to become defensive, and reluctant to expose one’s own failures.
These two concepts have led to the justification of FTR groups, as well as the establishment
of independent quality assurance groups that specialize in finding software defects in many
software organizations [Humphrey 1989].
2. Role of Management. Another psychological aspect of FTR that has been examined is the
recording of data and its dissemination to management. According to Dobbins [1987], this must
be done in such a way that individual programmers will not feel intimidated or threatened.
3. Positive Psychological Impacts. Hart [1982] observes that reviews can make one more careful
in writing programs (e.g., double checking code) in anticipation of having to present or share the
programs with other participants. Thus, errors are often eliminated even before the actual review
sessions.
4. Group Process. Most FTR methods are implemented using small groups. Therefore, several
key issues from small group theory apply to FTR, such as group think (tendency to suppress
dissent in the interests of group harmony), group deviants (influence by minority), and domination
of the group by a single member. Other key issues include social facilitation (presence of others
boosts one’s performance) and social loafing (one member free rides on the group’s effort)
[Myers 1990]. The issue of moderator domination in inspections is also documented in the
literature [Tjahjono 1996].
Perhaps the most interesting research from the perspective of the current study is that of Sauer
et al. [2000]. This research is unusual in that it has an explicit theoretical basis and outlines a
behaviorally motivated program of research into the effectiveness of software development
technical reviews. The finding that most of the variation in effectiveness of software development
technical reviews is the result of variations in expertise among the participants provides additional
motivation for developing a solid understanding of Formal Technical Review at the individual
level.
It should be noted that all of this work, while based on psychological theory, does not address the issue
of how practitioners actually evaluate software artifacts.
3. Accept the work product provisionally (minor errors have been encountered and must be corrected,
but no additional review will be required).
Once the decision is made, all FTR attendees complete a sign-off indicating their participation in the
review and their concurrence with the review team’s findings.
2. To serve as an action-item checklist that guides the producer as corrections are made. An
issues list is normally attached to the summary report.
It is important to establish a follow-up procedure to ensure that items on the issues list have been
properly corrected. Unless this is done, it is possible that issues raised will “fall between the cracks”. One
approach is to assign responsibility for follow-up to the review leader. A more formal approach assigns
responsibility to an independent SQA group.
• Enunciate problem areas, but don’t attempt to solve every problem noted.
• Using the Pareto principle (80% of the defects can be traced to 20% of all possible causes), isolate
the 20% (the “vital few”).
• Once the vital few causes have been identified, move to correct the problems that have caused
the defects.
This relatively simple concept represents an important step toward the creation of an adaptive software
engineering process in which changes are made to improve those elements of the process that introduce
errors. To illustrate the process, assume that a software development organization collects information on
defects for a period of one year. Some errors are uncovered as software is being developed. Other
defects are encountered after the software has been released to its end user.
Although hundreds of errors are uncovered, all can be traced to one of the following causes:
• Incomplete or Erroneous Specification (IES)
• Misinterpretation of Customer Communication (MCC)
• Intentional Deviation from Specification (IDS)
• Violation of Programming Standards (VPS)
• Error in Data Representation (EDR)
• Inconsistent Module Interface (IMI)
• Error in Design Logic (EDL)
• Incomplete or Erroneous Testing (IET)
• Inaccurate or Incomplete Documentation (IID)
• Error in Programming Language Translation of design (PLT)
• Ambiguous or inconsistent Human-Computer Interface (HCI)
• Miscellaneous (MIS)
To apply statistical SQA, Table 2.1 is built. Once the vital few causes are determined, the software
development organization can begin corrective action.
After analysis, design, coding, testing, and release, the following data are gathered:
Ei = the total number of errors uncovered during the ith step in the software engineering process
Si = the number of serious errors
Mi = the number of moderate errors
Ti = the number of minor (trivial) errors
PS = the size of the product (LOC, design statements, pages of documentation) at the ith step
Ws, Wm, Wt = weighting factors for serious, moderate and trivial errors, where recommended values
are Ws = 10, Wm = 3, Wt = 1.
The weighting factors for each phase should become larger as development progresses. This rewards
an organization that finds errors early.
At each step in the software engineering process, a phase index, PIi, is computed:
PIi = Ws(Si/Ei) + Wm(Mi/Ei) + Wt(Ti/Ei)
The error index EI is computed by calculating the cumulative effect of each PIi, weighting errors
encountered later in the software engineering process more heavily than those encountered earlier:
EI = Σ(i × PIi)/PS = (PI1 + 2PI2 + 3PI3 + ... + i×PIi)/PS
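A direct transcription of these two formulas into Python may make the bookkeeping clearer. The step data below are hypothetical; only the weights Ws = 10, Wm = 3, Wt = 1 come from the text.

```python
# Phase index PIi and error index EI, as defined above.
WS, WM, WT = 10, 3, 1   # recommended weights: serious, moderate, trivial

def phase_index(serious, moderate, trivial):
    errors = serious + moderate + trivial              # Ei
    return (WS * serious + WM * moderate + WT * trivial) / errors

# Hypothetical (Si, Mi, Ti) counts per step: analysis, design, code, test
steps = [(3, 8, 10), (5, 12, 20), (2, 9, 25), (1, 4, 11)]
product_size = 12_000                                  # PS, e.g. LOC

EI = sum(i * phase_index(*s) for i, s in enumerate(steps, 1)) / product_size
print(f"Error index: {EI:.4f}")
```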
The error index can be used in conjunction with the information collected in Table 2.1 to develop an
overall indication of improvement in software quality.
Table 2.1: Data collection for statistical SQA

Error   Total   %     Serious   %     Moderate   %     Minor   %
MCC     156     17    12        9     68         18    76      17
IDS      48      5     1        1     24          6    23       5
VPS      25      3     0        0     15          4    10       2
EDR     130     14    26       20     68         18    36       8
IMI      58      6     9        7     18          5    31       7
EDL      45      5    14       11     12          3    19       4
IET      95     10    12        9     35          9    48      11
IID      36      4     2        2     20          5    14       3
PLT      60      6    15       12     19          5    26       6
HCI      28      3     3        2     17          4     8       2
MIS      56      6     0        0     15          4    41       9
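The "vital few" causes can be read off such a table mechanically. The sketch below sorts the totals from Table 2.1 and reports the causes that account for roughly 80% of the defects; note that it works only with the rows shown in this excerpt.

```python
# Pareto-style pass over the Table 2.1 totals.
totals = {"MCC": 156, "EDR": 130, "IET": 95, "PLT": 60, "IMI": 58,
          "MIS": 56, "IDS": 48, "EDL": 45, "IID": 36, "HCI": 28,
          "VPS": 25}

grand_total = sum(totals.values())
running, vital_few = 0, []
for cause, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    running += count
    vital_few.append(cause)
    if running / grand_total >= 0.8:    # stop at ~80% of all defects
        break
print(vital_few)   # the causes to attack first with corrective action
```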
MTBF = MTTF + MTTR
The acronyms MTTF and MTTR stand for Mean Time To Failure and Mean Time To Repair, respectively.
In addition to a reliability measure, we must develop a measure of availability. Software availability is the
probability that a program is operating according to requirements at a given point in time and is defined as:
Availability = [MTTF / (MTTF + MTTR)] × 100%
The MTBF reliability measure is equally sensitive to MTTF and MTTR. The availability measure is
somewhat more sensitive to MTTR, an indirect measure of the maintainability of the software.
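A small numeric example, with hypothetical failure data in hours, shows how the two measures behave:

```python
# MTBF and availability from the definitions above (hypothetical data).
mttf = 480.0   # mean time to failure, hours
mttr = 4.0     # mean time to repair, hours

mtbf = mttf + mttr
availability = mttf / (mttf + mttr) * 100   # percent

print(f"MTBF = {mtbf} h, availability = {availability:.2f}%")  # 99.17%
```

Adding ten hours to MTTF or to MTTR changes MTBF by the same amount, but only the MTTR increase pulls availability down noticeably, which is why availability is the better indirect measure of maintainability.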
understand the subtle difference between them. Software reliability uses statistical analysis to determine
the likelihood that a software failure will occur; however, the occurrence of a failure does not necessarily
result in a hazard or mishap. Software safety examines the ways in which failures result in conditions that
can lead to mishaps. That is, failures are not considered in a vacuum, but are evaluated in the context of
an entire computer-based system.
The structure of an SQA plan, as defined in ANSI/IEEE Standards 730-1984 and 983-1986, is shown below.
I. Purpose of Plan
II. References
III. Management
1. Organization
2. Tasks
3. Responsibilities
IV. Documentation
1. Purpose
2. Required software engineering documents
3. Other Documents
a. Software requirements
b. Design reviews
c. Software V & V reviews
d. Functional Audits
e. Physical Audit
f. In-process Audits
g. Management reviews
VII. Test
X. Code Control
XIV. Training
ISO 9001 specifies 20 requirements that must be present for an effective quality assurance system.
Because the ISO 9001 standard is applicable in all engineering disciplines, a special set of ISO guidelines
has been developed to help interpret the standard for use in the software process.
The 20 requirements delineated by ISO 9001 address the following topics:
1. Management responsibility
2. Quality system
3. Contract review
4. Design control
5. Document and data control
6. Purchasing
9. Process control
20. Statistical techniques
In order for a software organization to become registered to ISO 9001, it must establish policies and
procedures to address each of the requirements noted above and then be able to demonstrate that these
policies and procedures are being followed.
Fig 2.6: The software capability maturity model is used to assess a software company’s maturity at software
development
Level 1: Initial. The software development processes at this level are ad hoc and often chaotic.
There are no general practices for planning, monitoring or controlling the process. The test process is just
as ad hoc as the rest of the process.
Level 2: Repeatable. This maturity level is best described as project-level thinking. Basic project
management processes are in place to track the cost, schedule, functionality and quality of the project.
Basic software testing disciplines, such as test plans and test cases, are used.
Level 3: Defined. Organizational, not just project-specific, thinking comes into play at this level.
Common management and engineering activities are standardized and documented. These standards are
adapted and approved for use in different projects. Test documents and plans are reviewed and approved
before testing begins.
Level 4: Managed. At this maturity level, the organization’s process is under statistical control. Product
quality is specified quantitatively beforehand and the software isn’t released until that goal is met.
27% were rated at Level 1, 39% at 2, 23% at 3, 6% at 4, and 5% at 5. (For ratings during the period
1992-96, 62% were at Level 1, 23% at 2, 13% at 3, 2% at 4, and 0.4% at 5.) The median size of
organizations was 100 software engineering/maintenance personnel; 32% of organizations were U.S.
federal contractors or agencies. For those rated at Level 1, the most problematical key process area
was Software Quality Assurance.
QUESTIONS
1. Quality and reliability are related concepts, but are fundamentally different in a number of ways. Discuss them.
2. Can a program be correct and still not be reliable? Explain.
3. Can a program be correct and still not exhibit good quality? Explain.
4. Explain in more detail, the review technique adopted in Quality Assurance.
Chapter 3
PROGRAM INSPECTIONS, WALKTHROUGHS AND REVIEWS
You will learn about:
• Review techniques
3.2 INTRODUCTION
The majority of the programming community has worked under the assumption that programs are
written solely for machine execution and are not intended to be read by people; the only way to test a
program is to execute it on a machine. Weinberg built a convincing argument for why programs should be
read by people, and indicated this could be an effective error-detection process.
Experience has shown that “human testing” techniques are quite effective in finding errors, so much
so that one or more of these should be employed in every programming project. The methods discussed in
this chapter are intended to be applied between the time that the program is coded and the time that
computer-based testing begins. Two observations support this timing:
• It is generally recognized that the earlier errors are found, the lower the cost of correcting the
errors and the higher the probability of correcting the errors correctly.
• Programmers seem to experience a psychological change when computer-based testing
commences.
The general procedure is that the moderator distributes the program’s listing and design specification
to the other participants well in advance of the inspection session. The participants are expected to
familiarize themselves with the material prior to the session. During inspection session, two main activities
occur:
1. The programmer is requested to narrate, statement by statement, the logic of the program.
During the discourse, questions are raised and pursued to determine if errors exist. Experience
has shown that many of the errors discovered are actually found by the programmer, rather than
the other team members, during the narration. In other words, the simple act of reading aloud
one’s program to an audience seems to be a remarkably effective error-detection technique.
2. The program is analyzed with respect to a checklist of historically common programming errors
(such a checklist is discussed in the next section).
It is the moderator’s responsibility to ensure the smooth conduct of the proceedings and that the
participants focus their attention on finding errors, not correcting them.
After the session, the programmer is given a list of the errors found. The list of errors is also analyzed,
categorized, and used to refine the error checklist to improve the effectiveness of future inspections.
The main benefits of this method are:
• The programmer usually receives feedback concerning his or her programming style and choice
of algorithms and programming techniques.
• Other participants also gain in a similar way by being exposed to another programmer’s errors
and programming style.
• The inspection process is a way of identifying early the most error-prone sections of the program,
thus allowing one to focus more attention on these sections during the computer-based testing
processes.
Data-Reference Errors
1. Is a variable referenced whose value is unset or uninitialized? This is probably the most frequent
programming error; it occurs in a wide variety of circumstances.
2. For all array references, is each subscript value within the defined bounds of the corresponding
dimension?
3. For all array references, does each subscript have an integer value? This is not necessarily an
error in all languages, but it is a dangerous practice.
4. For all references through pointer or reference variables, is the referenced storage currently
allocated? This is known as the “dangling reference” problem. It occurs in situations where the
lifetime of a pointer is greater than the lifetime of the referenced storage.
5. Are there any explicit or implicit addressing problems if, on the machine being used, the units of
storage allocation are smaller than the units of storage addressability?
6. If a data structure is referenced in multiple procedures or subroutines, is the structure defined
identically in each procedure?
7. When indexing into a string, are the limits of the string exceeded?
Data-Declaration Errors
1. Have all variables been explicitly declared? A failure to do so is not necessarily an error, but it
is a common source of trouble.
2. If all attributes of a variable are not explicitly stated in the declaration, are the defaults well
understood?
3. Where a variable is initialized in a declarative statement, is it properly initialized?
4. Is each variable assigned the correct length, type, and storage class?
Computation Errors
1. Are there any computations using variables having inconsistent (e.g., nonarithmetic) data types?
3. Are there any computations using variables having the same data type but different lengths?
7. Where applicable, can the value of a variable go outside its meaningful range?
8. Are there any invalid uses of integer arithmetic, particularly division? For example, if I is an
integer variable, whether the expression 2*I/2 is equal to I depends on whether I has an odd or
an even value and whether the multiplication or division is performed first.
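Python's floor-division operator makes the order-of-operations hazard in item 8 easy to demonstrate; the snippet below is a sketch of the check an inspector would perform mentally.

```python
# Integer (floor) division and evaluation order: (2*i)//2 always
# recovers i, but (i//2)*2 silently drops the remainder when i is odd.
for i in (6, 7):
    print(i, (2 * i) // 2, (i // 2) * 2)
# 6 -> 6, 6
# 7 -> 7, 6   (the second form loses the odd remainder)
```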
Comparison Errors
1. Are there any comparisons between variables having inconsistent data types (e.g. comparing a
character string to an address)?
2. Are there any mixed-mode comparisons or comparisons between variables of different lengths?
If so, ensure that the conversion rules are well understood.
3. Does each Boolean expression state what it is supposed to state? Programmers often make
mistakes when writing logical expressions involving “and”, “or”, and “not”.
4. Are the operands of a Boolean operator Boolean? Have comparison and Boolean operators
been erroneously mixed together?
Control-Flow Errors
1. If the program contains a multiway branch (e.g. a computed GO TO in Fortran), can the index
variable ever exceed the number of branch possibilities? For example, in the Fortran statement
GOTO (200, 300, 400), I
will I always have a value of 1, 2 or 3?
2. Will every loop eventually terminate? Devise an informal proof or argument showing that each
loop will terminate
4. Is it possible that, because of the conditions upon entry, a loop will never execute? If so, does
this represent an oversight? For instance, for loops headed by the following statements:
DO WHILE (NOTFOUND)
DO I=X TO Z
5. Are there any non-exhaustive decisions? For instance, if an input parameter’s expected values
are 1, 2, or 3; does the logic assume that it must be 3 if it is not 1 or 2? If so, is the assumption
valid?
Interface Errors
1. Does the number of parameters received by this module equal the number of arguments sent by
each of the calling modules? Also, is the order correct?
2. Do the attributes (e.g. type and size) of each parameter match the attributes of each corresponding
argument?
3. Does the number of arguments transmitted by this module to another module equal the number
of parameters expected by that module?
4. Do the attributes of each argument transmitted to another module match the attributes of the
corresponding parameter in that module?
5. If built-in functions are invoked, are the number, attributes, and order of the arguments correct?
6. Does the subroutine alter a parameter that is intended to be only an input value?
Input/Output Errors
1. If files are explicitly declared, are their attributes correct?
3. Is the size of the I/O area in storage equal to the record size?
6. Are there spelling or grammatical errors in any text that is printed or displayed by the program?
3.6 WALKTHROUGHS
The code walkthrough, like the inspection, is a set of procedures and error-detection techniques for
group code reading. It shares much in common with the inspection process, but the procedures are slightly
different, and a different error-detection technique is employed.
The walkthrough is an uninterrupted meeting of one to two hours in duration. The walkthrough team
consists of three to five people playing the roles of moderator, secretary (a person who records all errors
found), tester and programmer. It is suggested to have other participants like:
• A programming-language expert,
The initial procedure is identical to that of the inspection process: the participants are given the materials
several days in advance to allow them to study the program. However, the procedure in the meeting is
different. Rather than simply reading the program or using error checklists, the participants “play computer”.
The person designated as the tester comes to the meeting armed with a small set of paper test cases:
representative sets of inputs (and expected outputs) for the program or module. During the meeting, each
test case is mentally executed. That is, the test data are walked through the logic of the program. The
state of the program (i.e. the values of the variables) is monitored on paper or a blackboard.
The test cases must be simple and few in number, because people execute programs at a rate that is
very slow compared to machines. In most walkthroughs, more errors are found during the process of
questioning the programmer than are found directly by the test cases themselves.
QUESTIONS
1. Are code reviews relevant to software testing? Explain the process involved in a typical code review.
2. Explain the need for inspection and list the different types of code reviews.
3. Consider a program and perform a detailed review and list the review findings in detail.
Chapter 4
TEST CASE DESIGN
You will learn about:
4.2 INTRODUCTION
Software can be tested either by running the programs and verifying each step of its execution against
expected results or by statically examining the code or the document against its stated requirement or
objective. In general, software testing can be divided into two categories, viz. Static and dynamic testing.
Static testing is a non-execution-based testing and carried through by mostly human effort. In static
testing, we test, design, code or any document through inspection, walkthroughs and reviews as discussed
in Chapter 2. Many studies show that the single most cost-effective defect reduction process is the
classic structural test; the code inspection or walk-through. Code inspection is like proof reading and
developers will be benefited in identifying the typographical errors, logic errors and deviations in styles
and standards normally followed.
Dynamic testing is an execution based testing technique. Program must be executed to find the possible
errors. Here, the program, module or the entire system is executed(run) and the output is verified against
the expected result. Dynamic execution of tests is based on specifications of the program, code and
methodology.
2. All logical decisions are exercised for both true and false paths.
3. All loops are executed at their boundaries and within operational bounds.
• Designers may make incorrect assumptions about execution paths, and so make design errors;
white box testing can find these errors.
• Typographical errors are random, and just as likely to be on an obscure logical path as on a
mainstream path.
“Bugs lurk in corners and congregate at boundaries”
On a flow graph:
Fig 4.2: Control flow of a program and the corresponding flow diagram
In Fig 4.2, the statements are numbered and the corresponding nodes are numbered with the same
numbers. The sample program contains one DO and three nested IF statements.
From the example we can observe that:
l Independent Paths:
1. 1, 8
2. 1, 2, 3, 7b, 1, 8
3. 1, 2, 4, 5, 7a, 7b, 1, 8
4. 1, 2, 4, 6, 7a, 7b, 1, 8
Cyclomatic complexity provides an upper bound for the number of tests required to guarantee coverage
of all program statements.
Note: some paths may only be executable as part of another test.
• Rows and columns of the matrix correspond to the nodes in the flow graph.
For the example above, the cyclomatic complexity is 4.
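The same bound can be computed directly with the standard formula V(G) = E - N + 2 (edges minus nodes plus two). The edge list below is a plausible reading of Fig 4.2 based on the independent-path listing above, so treat it as an assumption.

```python
# Cyclomatic complexity V(G) = E - N + 2 for the flow graph above.
# Edge list assumed from the independent-path listing (nodes 7a, 7b).
edges = [("1", "2"), ("1", "8"), ("2", "3"), ("2", "4"), ("3", "7b"),
         ("4", "5"), ("4", "6"), ("5", "7a"), ("6", "7a"),
         ("7a", "7b"), ("7b", "1")]
nodes = {n for edge in edges for n in edge}

v_of_g = len(edges) - len(nodes) + 2
print(v_of_g)   # 4 -> at most four basis-path tests are required
```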
Some other interesting link weight can be measured by the graph as:
• Relational expression: (E1 op E2), where E1 and E2 are arithmetic expressions. For example,
(x+y) – (s/t), where x, y, s and t are variables.
• Simple condition: a Boolean variable or relational expression, possibly preceded by a NOT operator.
• Compound condition: composed of two or more simple conditions, Boolean operators and
parentheses, along with relational operators.
• Mismatch of types
Condition testing methods focus on testing each condition in the program, whatever its type.
There are many strategies to identify errors. Some of the proposed strategies include:
• Domain testing: uses three or four tests for every relational operator, depending on the complexity
of the statement.
• Branch and relational operator testing: uses condition constraints. Based on the complexity of
the relational operators, many branches will be executed.
Example 1: C1 = B1 & B2
• Condition constraints have the form (D1, D2), where D1 and D2 can be true (t) or false (f).
• The branch and relational operator test requires the constraint set {(t,t), (f,t), (t,f)} to be covered
by the execution of C1.
Coverage of the constraint set guarantees detection of relational operator errors.
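A quick sketch of what "covering the constraint set" means in practice: each test supplies concrete truth values for B1 and B2, and the suite satisfies the criterion when every required (D1, D2) pair has been exercised.

```python
# Constraint-set coverage check for C1 = B1 and B2.
required = {(True, True), (False, True), (True, False)}   # {(t,t),(f,t),(t,f)}

tests = [(True, True), (False, True), (True, False)]      # candidate inputs
covered = {(b1, b2) for b1, b2 in tests}

print(required <= covered)                 # True: constraint set covered
print([b1 and b2 for b1, b2 in tests])     # observed outcomes of C1
```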
K: kill the variable, another state a variable can be in at any time during the execution of the program.
Any variable that is part of the program will pass through these states. However, the sequence of
states is important; the following sequences should be watched for during program execution:
• DU: normal
• UK, UU: normal
• DD: suspicious
• DK: probable bug
• KD: normal
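A define-define (DD) sequence looks like this in miniature; the function and names are purely illustrative. The first definition is never used before the variable is redefined, which usually signals a lost assignment or a copy-paste slip rather than intent.

```python
def compute_discount(order_total):
    discount = order_total * 0.05   # D: define discount
    discount = order_total * 0.10   # D again: DD anomaly, first value unused
    return discount                 # U: use
```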
• Simple loops: test (n-1), n, and (n+1) passes through the loop; this helps in testing the boundaries
of the loop (see the sketch after this list).
• Nested loops: start with the inner loop and set all other loops to minimum values.
• Concatenated loops
• Unstructured loops
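For the simple-loop case, the sketch below exercises an assumed internal limit n at and around its boundaries; the function under test is hypothetical.

```python
# Loop-boundary testing: 0, 1, 2, n-1, n, and n+1 passes through the loop.

def sum_first_n(values, n=10):
    """Sum at most the first n values."""
    total = 0
    for i, v in enumerate(values):
        if i >= n:
            break
        total += v
    return total

N = 10
for count in (0, 1, 2, N - 1, N, N + 1):
    data = [1] * count
    assert sum_first_n(data, N) == min(count, N), f"failed at {count}"
print("all loop-boundary cases passed")
```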
• Black box tests normally determine the quality of the software. It is an advantage to create the
quality criteria from this point of view from the beginning.
• In black box testing, software is subjected to a full range of inputs and the outputs are verified
for their correctness. Here, the structure of the program is immaterial.
• Black box testing techniques can be applied once unit and integration testing is completed.
Some of the techniques used for black box testing are discussed below:
• If an input condition specifies a range or a specific value, one valid and two invalid equivalence
classes are defined.
• If an input condition specifies a boolean or a member of a set, one valid and one invalid equivalence
class are defined.
Test cases for each input-domain data item are developed and executed.
This method uses far fewer input data than exhaustive testing. However, the data for boundary
values are not considered.
Though this method significantly reduces the number of input data to be tested, it does not test the
combinations of the input data.
BVA complements equivalence partitioning: instead of selecting any element in an equivalence class,
select those at the “edge” of the class.
Examples:
1. For a range of values bounded by a and b, test (a-1), a, (a+1), (b-1), b, (b+1).
2. If input conditions specify a number of values n, test with (n-1), n and (n+1) input values.
3. Apply 1 and 2 to output conditions (e.g., generate table of minimum and maximum size).
4. If internal program data structures have boundaries (e.g., buffer size, table limits), use input data
to exercise structures on boundaries.
BVA and equivalence partitioning together help in testing programs and cover most of the conditions.
Neither method, however, tests combinations of input conditions.
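A small generator makes the BVA recipe concrete for a numeric range [a, b]; the interior value stands in for the equivalence-class pick.

```python
# Boundary value analysis for an input range [a, b]: test just below,
# on, and just above each edge, plus one representative interior value.

def bva_values(a, b):
    interior = (a + b) // 2
    return [a - 1, a, a + 1, interior, b - 1, b, b + 1]

print(bva_values(1, 100))   # [0, 1, 2, 50, 99, 100, 101]
```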
Executive Order 10358 provides, in the case of an employee whose work week varies from the normal
Monday through Friday work week, that Labor Day and Thanksgiving Day each were to be observed on
the next succeeding workday when the holiday fell on a day outside the employee's regular basic work
week. Now, when Labor Day, Thanksgiving Day, or any of the new Monday holidays falls outside an
employee's basic work week, the immediately preceding workday will be his holiday when the non-workday
on which the holiday falls is the second non-workday or the non-workday designated as the employee's
day off in lieu of Saturday. When the non-workday on which the holiday falls is the first non-workday or
the non-workday designated as the employee's day off in lieu of Sunday, the holiday observance is moved
to the next succeeding workday.
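One way to make such a specification testable is to reduce it to explicit decision logic, with each combination of causes becoming a row of a decision table. A minimal sketch follows; the function and flag names are hypothetical:

    # Encodes the rule above for a holiday falling on a non-workday:
    #   second non-workday, or day off in lieu of Saturday -> preceding workday
    #   first non-workday, or day off in lieu of Sunday    -> next workday
    def observed_on(second_nonworkday: bool, in_lieu_of_saturday: bool) -> str:
        if second_nonworkday or in_lieu_of_saturday:
            return "immediately preceding workday"
        return "next succeeding workday"

    # Each cause combination is one decision table column / test case.
    for second, saturday in [(True, False), (False, True), (False, False)]:
        print(second, saturday, "->", observed_on(second, saturday))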
There are a number of approaches to proving program correctness; we consider only the axiomatic
approach.
Suppose that at points P(1), ..., P(n) assertions concerning the program variables and their relationships
can be made. The assertion a(1) concerns the program's inputs, and a(n) its outputs.
We can now attempt, for each k between 1 and (n-1), to prove that the statements between P(k) and
P(k+1) transform the assertion a(k) into a(k+1).
Given that a(1) is true, this sequence of proofs shows that a(n) will be true, which establishes partial
program correctness. If it can also be shown that the program will terminate, the proof is complete.
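In practice the assertions a(1), ..., a(n) can be approximated by executable checks. A minimal sketch, with a hypothetical program and invented assertions:

    def integer_divide(x: int, y: int):
        assert x >= 0 and y > 0                 # a(1): assertion about the inputs
        q, r = 0, x
        while r >= y:
            assert x == q * y + r and r >= 0    # intermediate assertion (loop invariant)
            r -= y
            q += 1
        assert x == q * y + r and 0 <= r < y    # a(n): assertion about the outputs
        return q, r

    print(integer_divide(17, 5))                # -> (3, 2)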
4. Undeclared variables
5. Uninitialised variables.
2. Code Auditors
3. Assertion processors
4. Test file generators
5. Test Data Generators
6. Test Verifiers
7. Output comparators.
The programmer can select any of these tools, depending on the complexity of the program.
QUESTIONS
1. What is black box testing? Explain.
2. What different techniques are available to conduct black box testing?
3. Explain different methods available in white box testing with examples.
Chapter 5
Testing for Specialized Environments
You will learn about:
5.2 INTRODUCTION
The need for specialized testing approaches has become apparent as computer software has grown
more complex. White-box and black-box testing methods are applicable across all environments,
architectures, and applications, but unique guidelines and approaches to testing are sometimes important.
This chapter addresses the testing guidelines for specialized environments, architectures, and applications
that are commonly encountered by software engineers.
Graphical user interfaces (GUIs) present interesting challenges for test engineers. Because of the
reusable components provided as part of GUI development environments, the creation of the user interface
has become less time consuming and more precise. A GUI has become practically mandatory for any
application, as users have come to expect it. Sometimes the user interface may be treated as a separate
layer, easily distinguished from the traditional functional or business layer; the design and development of
a user interface layer then requires its own design and development methodology. The main problem here
is understanding user psychology at development time. Due to the complexity of GUIs, testing them and
generating test cases has become more complex and tedious.
Because modern GUIs follow common standards (the same look and feel), a series of generic tests can be derived.
What guidelines help in creating such generic tests for GUIs?
The guidelines can be categorized by the operations they cover. Some of them are discussed below:
For windows:
l Will the window open properly based on related typed or menu-based commands?
l Does the window properly regenerate when it is overwritten and then recalled?
l Are all functions that relate to the window available when needed?
l Are all relevant pull-down menus, tool bars, scroll bars, dialog boxes, and buttons, icons, and
other controls available and properly represented?
l Do multiple or incorrect mouse picks within the window cause unexpected side effects?
l Are audio and/or color prompts within the window or as a consequence of window operations
presented according to specification?
l Does the application menu bar display system-related features (e.g. a clock display)?
l Are all menu functions and pull-down sub functions properly listed?
l Is it possible to invoke each menu function using its alternative text-based command?
l Are menu functions highlighted (or grayed-out) based on the context of current operations
within a window?
l If the mouse has multiple buttons, are they properly recognized in context?
l Do the cursor, processing indicator (e.g. an hour glass or clock), and pointer properly change as
different operations are invoked?
Data entry:
l Is basic, standard validation of each data item performed during data entry itself?
l Once the data has been entered completely and a correction must be made to one specific item, does the
system require the entire data to be entered again?
Documentation testing can be approached in two phases. The first phase, formal technical review,
examines the document for editorial clarity. The second phase, live test, uses the documentation in
conjunction with the actual program.
Some of the guidelines are discussed here:
l Does the documentation accurately describe how to accomplish each mode of use?
l Are terminology, menu descriptions, and system responses consistent with the actual program?
l Are the document table of contents and index accurate and complete?
l Is the design of the document (layout, typefaces, indentation, graphics) conducive to understanding
and quick assimilation of information?
l Are all error messages displayed for the user described in more detail in the document?
QUESTIONS
3. Select your own GUI-based software system and test the GUI-related functions using the guidelines listed in
this chapter.
Chapter 6
Software Testing Strategies
You will learn about:
l Unit Testing
l Integration Testing
l Validation Testing
l System Testing
l Debugging Process
6.2. INTRODUCTION
A strategy for software testing integrates software test case design methods into a well-planned
series of steps that result in the successful construction of software. Just as important, a software testing
strategy provides a road map for the software developer, the quality assurance organization, and the
customer: a road map that describes the steps to be conducted as part of testing, when these steps are
planned and then undertaken, and how much effort, time, and resources will be required. Any testing
strategy must therefore incorporate test planning, test case design, test execution, and resultant data
collection and evaluation.
A software testing strategy should be flexible enough to promote the creativity and customization that
are necessary to adequately test all large software-based systems. At the same time, the strategy must be
rigid enough to promote reasonable planning and management tracking as the project progresses. Shooman
suggests these issues:
In many ways, testing is an individualistic process, and the number of different types of tests varies as
much as the different development approaches. For many years, our only defense against programming
errors was careful design and the native intelligence of the programmer. We are now in an era in which
modern design techniques are helping us to reduce the number of initial errors that are inherent in the
code. Similarly, different test methods are beginning to cluster themselves into several distinct approaches
and philosophies.
Different test methods thus begin to cluster into several distinct approaches and philosophies; such a
cluster is what we call a strategy. Any strategy must address:
l Test planning
l Flexibility: a software testing strategy should be flexible enough to promote the creativity and
customization that are necessary to adequately test all large software-based systems.
l Rigidity: at the same time, the strategy must be rigid enough to promote reasonable planning and
management tracking as the project progresses.
Types of Testing
The level of test reflects the primary focus of a test and derives from the way a software system is
designed and built up. Conventionally this is known as the "V" model, which maps the types of test to
each stage of development.
Component Testing
Starting from the bottom the first test level is “Component Testing”, sometimes called Unit Testing. It
involves checking that each feature specified in the “Component Design” has been implemented in the
component.
In theory an independent tester should do this, but in practice the developer usually does it, as developers
are the only people who understand how a component works. The problem with a component is that it
performs only a small part of the functionality of a system, and it relies on co-operating with other parts of
the system, which may not have been built yet. To overcome this, the developer either builds special
software, or uses existing harness software, to trick the component into believing it is working in a fully
functional system.
Interface Testing
As the components are constructed and tested, they are linked together to check whether they work with
each other. It is a fact of life that two components that have each passed all their tests may, when connected
to each other, produce one new component full of faults. These tests can be done by specialists, or by the developers.
Interface Testing is not focussed on what the components are doing but on how they communicate
with each other, as specified in the “System Design”. The “System Design” defines relationships between
components, and this involves stating:
System Testing
Once the entire system has been built then it has to be tested against the “System Specification” to
check if it delivers the features required. It is still developer focussed, although specialist developers
known as systems testers are normally employed to do it.
In essence System Testing is not about checking the individual parts of the design, but about checking
the system as a whole. In effect it is one giant component.
System testing can involve a number of specialist types of test to see if all the functional and non-
functional requirements have been met. In addition to functional requirements these may include the
following types of testing for the non-functional requirements:
There are many others, the needs for which are dictated by how the system is supposed to perform.
Acceptance Testing
Acceptance Testing checks the system against the “Requirements”. It is similar to systems testing in
that the whole system is checked but the important difference is the change in focus:
l Systems Testing checks that the system that was specified has been delivered.
l Acceptance Testing checks that the system delivers what was requested.
The customer, and not the developer should always do acceptance testing. The customer knows what
is required from the system to achieve value in the business and is the only person qualified to make that
judgement.
The forms of the tests may follow those in system testing, but at all times they are informed by the
business needs.
Release Testing
Even if a system meets all its requirements, there is still a case to be answered that it will benefit the
business. The linking of “Business Case” to Release Testing is looser than the others, but is still important.
Release Testing is about seeing if the new or changed system will work in the existing business
environment. Mainly this means the technical environment, and checks concerns such as:
These tests are usually run by the computer operations team in a business. The answers to their
questions could have a significant financial impact if new computer hardware were required, and could
adversely affect the "Business Case".
It would appear obvious that the operations team should be involved right from the start of a project to
give their opinion of the impact a new system may have. They could then make sure the "Business Case"
is relatively sound, at least from the capital expenditure and ongoing running cost aspects. However, in
practice many operations teams only find out about a project just weeks before it is supposed to go live,
which can result in major problems.
l Testing begins at the module level (or at the class or object level in object-oriented systems) and works
outward toward the integration of the entire computer based system.
l Different techniques are appropriate at different points in time
l Testing is conducted by the developer of the software and, for large projects, by an independent
test group.
l Testing and debugging are different activities, but debugging must be accommodated in any
testing strategy
Software testing is one element of a broader topic that is often referred to as verification and validation
(V&V).
l Verification refers to the set of activities that ensure that software correctly implements a specific function.
l Validation refers to a different set of activities that ensure that the software that has been built
is traceable to customer requirements.
Boehm states this as:
Verification: "Are we building the product right?"
Validation: "Are we building the right product?"
[Figure 5.1: Quality is achieved through the combined application of software engineering methods, formal technical reviews, measurement, standards and procedures, SCM and SQA, and testing.]
Fig 5.1 shows that the application of methods and tools, effective formal technical reviews, and solid
management and measurement all lead to quality that is confirmed during testing.
Testing provides the last bastion from which quality can be assessed and, more pragmatically, errors
can be uncovered.
However, testing should not be viewed as a safety net. Quality cannot be tested in: if it is not there before
you begin testing, it will not be there when you finish testing. Quality is incorporated throughout the software process.
Note:
It is important to note that V&V encompass a wide array of SQA activities that include formal
technical reviews, quality and configuration audits, performance monitoring, Simulation, feasibility study,
documentation review, database review, algorithm analysis, development testing, qualification testing and
installation testing.
Although testing plays an extremely important role in V&V, many other activities are also necessary.
The role of an independent test group (ITG) is to remove the inherent problems associated with letting the builder test the thing
that has been built. Independent testing removes the conflict of interest that may otherwise be present. After
all, the personnel in the ITG team are paid to find errors.
However, the software developer does not simply turn the program over to the ITG and walk away. The developer
and the ITG work closely throughout a software project to ensure that thorough tests will be conducted.
While testing is conducted, the developer must be available to correct the errors that are uncovered.
The ITG is part of the software development project team in the sense that it becomes involved during
the specification process and stays involved (planning and specifying test procedures) throughout a large
project.
In many cases, the ITG reports to the SQA organization, thereby achieving a degree of
independence that might not be possible if it were a part of the software development organization.
[Figure 5.2: Testing strategy. System engineering, requirements, design, and code map onto system test, validation test, integration test, and unit test respectively.]
The strategy for software testing may also be viewed in the context of the spiral.
Unit testing begins at the vortex of the spiral and concentrates on each unit of the software as
implemented in source code. Testing progresses by moving outward along the spiral to integration testing,
where the focus is on design and the construction of the software architecture. Taking another turn
outward on the spiral, we encounter validation testing, where requirements established as part of software
requirements analysis are validated against the software that has been constructed. Finally, we arrive at
system testing, where the software and other system elements are tested as a whole.
To test computer software, we spiral out along streamlines that broaden the scope of testing with each
turn.
Considering the process from a procedural point of view, testing within the context of software engineering
is a series of four steps that are implemented sequentially.
The steps are shown in Figure 5.3. Initially, tests focus on each module individually, assuring that it
functions properly as a unit; hence the name unit testing. Unit testing makes heavy use of white-box testing
techniques, exercising specific paths in a module’s control structure to ensure complete coverage and
maximum error detection. Next, modules must be assembled or integrated to form the complete software
package. Integration testing addresses the issues associated with the dual problems of verification and
program construction. Black-box test case design techniques are most prevalent during integration, although
a limited amount of white-box testing may be used to ensure coverage of major control paths. After the
software has been integrated (constructed), sets of high-order tests are conducted. Validation criteria
(established during requirements analysis) must be tested. Validation testing provides final assurance
that the software meets all functional, behavioral, and performance requirements. Black-box testing techniques
are used exclusively during validation.
The last high-order testing step falls outside the boundary of software engineering and into the broader
context of computer system engineering. Once validated, software must be combined with other system
elements (e.g., hardware, people, and databases). System testing verifies that all elements mesh properly
and that overall system function/performance is achieved.
[Figure 5.3: Software testing steps. Testing direction moves from unit tests of the code, through integration tests against the design, toward tests against the requirements.]
[Figure 5.4: Failure intensity as a function of execution time.]
l State testing objectives explicitly. The specific objectives of testing should be stated in
measurable terms: for example, test effectiveness, test coverage, mean time to failure, the cost
to find and fix defects, remaining defect density or frequency of occurrence, and test work-hours
per regression test should all be stated within the test plan.
l Understand the users of the software and develop a profile for each user category. Use
cases, which describe the interaction scenario for each class of user, can reduce overall testing
effort by focusing testing on actual use of the product.
l Develop a testing plan that emphasizes “rapid cycle testing”. The feedback generated
from the rapid cycle tests can be used to control quality levels and corresponding test strategies.
l Build "robust" software that is designed to test itself. Software should be designed using
antibugging techniques; that is, software should be capable of diagnosing certain classes of errors.
In addition, the design should accommodate automated testing and regression testing.
l Use effective formal technical reviews as a filter prior to testing. Formal technical reviews
can be as effective as testing in uncovering errors. For this reason, reviews can reduce the
amount of testing effort required to produce high-quality software.
l Conduct formal technical reviews to assess the test strategy and test cases themselves. Formal
technical reviews can uncover inconsistencies, omissions, and outright errors in the testing
approach. This saves time and improves product quality.
l Develop a continuous improvement approach for the testing process. The test strategy should
be measured. The metrics collected during testing should be used as part of a statistical process
control approach for software testing.
Unit testing focuses verification effort on the smallest unit of software design: the module. Important
control paths are tested to uncover errors within the boundary of the module. The relative complexity of
the tests, and of the errors they uncover, is limited by the constrained scope established for unit testing.
The unit test is normally white-box oriented, and the step can be conducted in parallel for multiple modules.
[Figure: Unit test considerations. Test cases exercise the module's interface, local data structures, boundary conditions, independent paths, and error handling paths.]
When a module performs external I/O, the following additional interface tests must be
conducted:
1. Are file attributes correct?
3. Incorrect initialization
4. Precision inaccuracy
Among the potential errors that should be tested when error handling is evaluated are:
1. Error description is unintelligible.
5. Error description does not provide enough information to assist in locating the cause of the
error.
Boundary testing is the last task of the unit test step. Software often fails at its boundaries. That is,
errors often occur when the nth element of an n-dimensional array is processed, when the ith repetition of
a loop with i passes is invoked, or when the maximum or minimum allowable value is encountered. Test
cases that exercise data structures, control flow, and data values just below, at, and just above maxima and
minima are very likely to uncover errors.
Because a module is not a standalone program, driver and/or stub software must be developed for
each unit test. The unit test environment is illustrated in figure 5.6. In most applications a driver is nothing
more than a "main program" that accepts test case data, passes such data to the module under test, and prints
relevant results. Stubs serve to replace modules that are subordinate to the module being tested. A
stub, or "dummy subprogram", uses the subordinate module's interface, may do minimal data manipulation,
prints verification of entry, and returns.
Drivers and stubs represent overhead. That is, both are software that must be developed but that is
not delivered with the final software product. If drivers and stubs are kept simple, actual overhead is
relatively low. Unfortunately, many modules cannot be adequately unit tested with "simple" overhead
software. In such cases, complete testing can be postponed until the integration test step (where drivers
or stubs are also used).
Unit test is simplified when a module with high cohesion is designed. When a module addresses only
one function, the number of test cases is reduced and errors can be more easily predicted and uncovered.
[Figure 5.6: The unit test environment. A driver supplies test cases to the module to be tested; stubs replace its subordinate modules; results are collected for evaluation of the interface, local data structures, boundary conditions, independent paths, and error handling paths.]
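A minimal sketch of a driver and a stub (the module names are invented for illustration): the driver feeds test case data to the module under test, while the stub stands in for a subordinate module that may not exist yet.

    # Stub: replaces the subordinate module 'fetch_rate'; prints verification
    # of entry, does minimal data manipulation, and returns a canned value.
    def fetch_rate_stub(currency: str) -> float:
        print(f"stub entered: fetch_rate({currency!r})")
        return 1.0

    # Module to be tested; in the real system it would call the real fetch_rate.
    def convert(amount: float, currency: str, fetch_rate=fetch_rate_stub) -> float:
        return amount * fetch_rate(currency)

    # Driver: a 'main program' that accepts test case data, passes it to the
    # module under test, and prints relevant results.
    if __name__ == "__main__":
        for amount, currency in [(10.0, "EUR"), (0.0, "USD")]:
            print(amount, currency, "->", convert(amount, currency))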
Often there is a tendency to attempt non-incremental, "big bang" integration: all modules are combined
in advance and the entire program is tested as a whole. And chaos usually results! A set of errors is
encountered. Correction is difficult, because isolation of causes is complicated by the vast expanse of the
entire program. Once these errors are corrected, new ones appear and the process continues in a seemingly
endless loop.
Incremental integration is the antithesis of the big bang approach. The program is constructed and
tested in small segments, where errors are easier to isolate and correct; interfaces are more likely to be
tested completely; and a systematic test approach may be applied. We discuss some of the incremental
methods here:
1. The main control module is used as a test driver, and stubs are substituted for all modules
directly subordinate to the main control module.
2. Depending on the integration approach selected (i.e., depth-first or breadth-first), subordinate stubs
are replaced one at a time with actual modules.
3. Tests are conducted as each module is integrated.
4. On completion of each set of tests, another stub is replaced with the real module.
5. Regression testing may be conducted to ensure that new errors have not been introduced.
The process continues from step 2 until the entire program structure is built.
Top-down strategy sounds relatively uncomplicated, but in practice, logistical problems arise. The
most common of these problems occurs when processing at low levels in the hierarchy is required to
adequately test upper levels. Stubs replace low-level modules at the beginning of top-down testing; therefore,
no significant data can flow upward in the program structure.
The tester is left with three choices
1. Delay many tests until stubs are replaced with actual modules.
2. Develop stubs that perform limited functions that simulate the actual module
3. Integrate the software from the bottom of the hierarchy upward
The first approach causes us to lose some control over the correspondence between specific tests and
the incorporation of specific modules; this can lead to difficulty in determining the cause of errors and
tends to violate the highly constrained nature of the top-down approach. The second approach is workable
but can lead to significant overhead, as stubs become increasingly complex. The third approach is discussed
in the next section.
In bottom-up integration, low-level modules are combined into clusters, a driver is written to coordinate
test case input and output, the cluster is tested, and then drivers are removed and clusters are combined
moving upward in the program structure.
As integration moves upward, the need for separate test drivers lessens. In fact, if the top two levels
of program structure are integrated top-down, the number of drivers can be reduced substantially and
integration of clusters is greatly simplified.
New data flow paths are established, new I/O may occur, and new control logic is invoked. These
changes may cause problems with functions that previously worked flawlessly. In the context of an
integration test strategy, regression testing is the re-execution of a subset of tests that have already been
conducted, to ensure that changes have not propagated unintended side effects.
Regression testing is the activity that helps to ensure that changes do not introduce unintended behavior
or additional errors.
The regression test suite contains three different classes of test cases.
1. A representative sample of tests that will exercise all software functions.
2. Additional tests that focus on software functions that are likely to be affected by the change.
3. Tests that focus on software components that have been changed.
Note:
It is impractical and inefficient to re-execute every test for every program function once a change has
occurred.
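A minimal sketch of selecting the third class of regression tests, those focused on changed components; the test-to-component mapping is hypothetical:

    # Hypothetical mapping from regression test cases to the components they exercise.
    test_map = {
        "test_login":    {"auth"},
        "test_checkout": {"cart", "payment"},
        "test_search":   {"catalog"},
    }
    changed = {"payment"}                      # components modified by this change

    # Re-run only the tests whose components intersect the changed set.
    selected = [t for t, comps in test_map.items() if comps & changed]
    print(selected)                            # -> ['test_checkout']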
Selection of an integration strategy depends upon software characteristics and, sometimes, the project
schedule. In general, a combined approach that uses a top-down strategy for the upper levels of the program
structure, coupled with a bottom-up strategy for the subordinate levels, may be the best compromise.
Regression tests should focus on critical module functions. A critical module is one that, for example:
l Is complex or error-prone
The test specification for integration typically addresses, among other things:
l Overhead software
l Order of integration
q Purpose
q Modules to be tested
q Expected results
l Test environment
V. References
VI. Appendices
The following criteria and corresponding tests are applied in all test phases.
Interface integrity. Internal and external interfaces are tested as each module is incorporated into
the structure.
Information content. Tests designed to uncover errors associated with local or global data structures
are conducted.
Performance. Tests designed to verify performance bounds established during software design are
conducted.
A schedule for integration, overhead software, and related topics are also discussed as part of the
"test plan" section. Start and end dates for each phase are established, and availability windows for unit-tested
modules are defined. A brief description of overhead software (stubs and drivers) concentrates on
characteristics that might require special effort. Finally, test environments and resources are described.
Information contained in the validation criteria section of the software requirements specification forms
the basis for a validation testing approach.
The alpha test is conducted at the developer's site by a customer. The software is used in a natural setting,
with the developer "looking over the shoulder" of the user and recording errors and usage problems.
Alpha tests are conducted in a controlled environment.
The beta test is conducted at one or more customer sites by the end user(s) of the software. Unlike
alpha testing, the developer is generally not present; therefore the beta test is "live".
The customer records all problems (real or imagined) encountered during beta testing and reports
these to the developer at regular intervals. As a result of the problems reported during beta tests, the software
developer makes modifications and then prepares for release of the software product to the entire customer
base.
A classic system testing problem is "finger pointing". This occurs when an error is uncovered and
each system element developer blames the others for the problem. Rather than indulging in such nonsense,
the software engineer should anticipate potential interfacing problems and 1) design error-handling paths
that test all information coming from other elements of the system; 2) conduct a series of tests that
simulate bad data or other potential errors at the software interface; 3) record the results of tests to use as
"evidence" if finger pointing does occur; and 4) participate in planning and design of system tests to
ensure that the software is adequately tested.
In the sections that follow, we discuss the types of system tests that are worthwhile for software-based
systems.
Many computer-based systems must recover from faults and resume processing within a prespecified
time. In some cases, a system must be fault tolerant; that is, processing faults must not cause overall
system function to cease. In other cases, a system failure must be corrected within a specified period of
time or severe economic damage will occur.
Recovery testing is a system test that forces the software to fail in a variety of ways and verifies that
recovery is properly performed. If recovery is automatic (performed by the system itself), re-initialization,
check pointing, mechanism, data recovery, and restart are each evaluated for correctness. If recovery
requires human intervention, the mean time to repair is evaluated to determine whether it is within acceptable
limits.
6.11.2 Debugging
Software testing is a process that can be systematically planned and specified. Test case design can
be conducted, a strategy can be defined, and results can be evaluated against prescribed expectations.
Debugging occurs as a consequence of successful testing. That is, when a test case uncovers an error,
debugging is the process that results in the removal of the error. Debugging is not testing, but it always
occurs as a consequence of testing, as shown in figure 6.1.
The debugging process attempts to match symptom with cause, thereby leading to error correction.
Debugging always has one of two outcomes: either the cause is found and corrected, or it is not. In the
latter case, the person performing debugging may suspect a cause, design a test case to help validate the
suspicion, and work toward error correction in an iterative fashion.
[Figure 6.1: The debugging process. Execution of test cases yields results; suspected causes lead to additional tests and to identified causes, which lead to corrections.]
5. The symptom may be a result of timing problems, rather than processing problems.
6. It may be difficult to accurately reproduce input conditions (e.g. a real-time application in which
input ordering is indeterminate).
7. The symptom may be due to causes that are distributed across a number of tasks running on
different processors.
l Brute force
l Back tracking
l Cause elimination
The brute force category of debugging is probably the most common, though least efficient, method for
isolating the cause of a software error; brute force debugging methods are applied when all other methods
fail. Using a "let the computer find the error" philosophy, memory dumps are taken, run-time traces are
invoked, and the program is loaded with WRITE statements. Somewhere in the mass of information produced,
we hope to find a clue that leads to the cause of the error.
Backtracking is a common debugging approach that can be used successfully in small programs.
Beginning at the site where a symptom has been uncovered, the source code is traced backward (manually)
until the site of the cause is found. This process becomes impractical as the number of source lines grows.
Cause elimination is manifested by induction or deduction and introduces the concept of binary
partitioning. Data related to the error occurrence are organized to isolate potential causes.
Alternatively, a list of all possible causes is developed and tests are conducted to eliminate each one.
If initial tests indicate that a particular cause hypothesis shows promise, the data are refined in an
attempt to isolate the bug.
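The binary partitioning idea can be sketched as a bisection over the data related to the error. The failing computation and the assumption of a single offending record are hypothetical:

    # Stand-in for the real computation; it fails if any record is negative.
    def process(records) -> bool:
        return all(r >= 0 for r in records)

    def bisect_failure(records):
        # Repeatedly halve the input until the single offending record is isolated.
        while len(records) > 1:
            mid = len(records) // 2
            left, right = records[:mid], records[mid:]
            records = left if not process(left) else right
        return records[0]

    print(bisect_failure([3, 7, 2, -9, 4, 1]))   # -> -9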
6.12 SUMMARY
q Software testing accounts for the largest percentage of technical effort in the software process.
Yet, we are only beginning to understand the subtleties of systematic test planning, execution
and control.
q To fulfill this objective, a series of test steps (unit, integration, validation, and system tests) are
planned and executed.
q Unit and integration tests concentrate on functional verification of a module and incorporation of
modules into a program structure.
q System testing validates software once it has been incorporated into a larger system.
q Each test step is accomplished through a series of systematic test techniques that assist in the
design of test cases. With each testing step, the level of abstraction with which software is
considered is broadened.
q Unlike testing, debugging must be viewed as an art. Beginning with a symptomatic indication of
a problem, the debugging activity tracks down the cause of an error. Of the many resources
available during debugging, the most valuable is the counsel of other software engineers.
q The requirement for higher-quality software demands a more systematic approach to testing.
QUESTIONS
1. What is the difference between Verification and Validation? Explain in your own words.
2. Explain unit test method with the help of your own example.
3. Develop an integration testing strategy for any system that you have already implemented. List the problems
encountered during the process.
4. What is validation test? Explain.
REFERENCES
1. Software Engineering, A practitioner’s Approach, Fourth Edition, by Roger S. Pressman, McGraw Hill.
3. The Art of Software Testing, by Glenford J. Myers, John Wiley & Sons.
Chapter 7
Testing Web Applications
7.1 INTRODUCTION
Currently the web is the most popular and fastest growing information system deployed on the
internet, accounting for more than 80% of its traffic.
Today, web based applications deserve a high level of all the software quality characteristics
defined in the ISO standards, namely:
Functionality: Verified web content must be ensured, as well as fitness for the intended purpose.
Reliability: Security and availability are of utmost importance, especially for applications that require
trusted transactions or that must exclude the possibility of information being tampered with.
Efficiency: Response times are one of the success criteria for on-line services.
Usability: High user satisfaction is the basis for success
Portability: Platform independence must be ensured at client level.
Maintainability: High evolution speed of services requires that applications can be evolved very
quickly.
l Web based applications consist to a large degree of components written by somebody else and
"integrated" together with the application software;
l The user interface is often more complex than that of many GUI based client-server applications;
l Performance behavior is largely unpredictable and depends on many factors that are not
under the control of the developers;
l We do not have only HTML but also Perl, Java, VRML, etc.;
l Browser compatibility is mandatory but is made difficult by layers and multiple platforms;
l Reference platforms are brand new and are being changed constantly;
l Interoperability issues are magnified, and thorough testing requires substantial investments in
software and hardware.
q Syntax Problems
q Stylistic problems
q Lexical problems.
User interaction
User interaction covers
l Links,
l Fast loading,
l Compatibility
l Usability testing
l Fast loading concerns aspects such as the presence of a fast-loading abstract/index page and the
presence of width and height attributes for IMG tags.
l Fast loading testing is very important if we consider that 85% of web users indicate slow loading
times as the reason for avoiding further visits to web sites.
l The following rules related to page weight should be established in support of fast loading
testing (see the sketch after this list):
l Every page's weight should be less than a specified size (e.g. 50K); "graphical sugar" pictures
(e.g. bullet headers) should be less than 3K
l To reach greater sizes, use multiple tables separated one from the other
l Minimize pictures within the tables and always specify width and height
l Every page should contain text or other information before the first <TABLE> tag.
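These rules lend themselves to automated checks. A minimal sketch using only the Python standard library; the 50K threshold follows the rule above, while the file name and everything else are assumptions:

    import os
    from html.parser import HTMLParser

    PAGE_LIMIT = 50 * 1024          # page weight rule: e.g. 50K

    class ImgChecker(HTMLParser):
        def handle_starttag(self, tag, attrs):
            if tag == "img":
                names = {name for name, _ in attrs}
                if not {"width", "height"} <= names:
                    print("IMG missing width/height:", dict(attrs).get("src"))

    path = "index.html"             # hypothetical page under test
    if os.path.getsize(path) > PAGE_LIMIT:
        print(path, "exceeds the page weight limit")
    with open(path, encoding="utf-8") as f:
        ImgChecker().feed(f.read())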
Compatibility testing
Compatibility testing concerns cross-browser compatibility: it checks site behaviour across industry
standard browsers and their recent versions, checks that pages conform to W3C standards for HTML
and other languages, and checks site behaviour for Java applets and ActiveX controls. Cross-platform
compatibility checking verifies the site's behaviour across industry standard desktop hardware and operating systems.
Usability Testing
Usability testing refers to coherence of look and feel, navigational aids, user interactions, and
printing. These aspects must be tested with respect to normal behaviour, destructive behaviour, and
inexperienced users.
Structural Aspects
This includes both portability and integrity topics. All filenames on the server side must be in lowercase.
Links to URLs outside the web site must be in canonical form, and links to URLs within the web site must
be in relative form.
Moreover, it must be checked that every directory has an index page, that every anchor points
to an existing page, and that there are no limbo pages. A sketch of such a check follows.
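A minimal sketch of verifying, on a locally mirrored copy of the site, that every relative anchor points to an existing page; the site root and page name are hypothetical:

    import os
    from html.parser import HTMLParser

    class LinkChecker(HTMLParser):
        def __init__(self, page_dir):
            super().__init__()
            self.page_dir = page_dir
        def handle_starttag(self, tag, attrs):
            if tag != "a":
                return
            href = dict(attrs).get("href", "")
            if href.startswith(("http://", "https://", "#", "mailto:")):
                return                                  # external or in-page link
            target = os.path.join(self.page_dir, href.split("#")[0])
            if not os.path.exists(target):              # anchor must point to an existing page
                print("broken internal link:", href)

    page = os.path.join("site", "index.html")           # hypothetical local mirror
    with open(page, encoding="utf-8") as f:
        LinkChecker(os.path.dirname(page)).feed(f.read())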
Stress testing
Stress tests are designed to confront programs with abnormal situations. In essence, the tester who
performs stress testing asks: "How high can we crank this up before it fails?"
Stress testing executes a system in a manner that demands resources in abnormal quantity, frequency,
or volume.
For example :
1. Special tests may be designed that generate 10 interrupts per second when one or two is the average
rate.
2. Input data rates may be increased by an order of magnitude to determine how input functions will
respond.
3. Test cases that require maximum memory or other resources may be executed.
4. Test cases that may cause thrashing in a virtual operating system may be designed.
5. Test cases that may cause excessive hunting for disk-resident data may be created.
A variation of stress testing is a technique called sensitivity testing. In some situations, a very small
range of data contained within the bounds of valid data for a program may cause extreme, even
erroneous, processing or profound performance degradation. This situation is analogous to a singularity in
a mathematical function.
Sensitivity testing attempts to uncover data combinations within valid input classes that may cause
instability or improper processing.
Performance Testing
Performance testing is designed to test run-time performance testing occurs throughout all steps in the
testing process. Even at the unit level, the performance of an individual module may be assessed as white-
box tests are conducted. However, it is not until all system elements are fully integrated that the true
performance of a system can be ascertained.
Performance tests are often coupled with stress testing and often require both hardware and software
instrumentation; that is, it is often necessary to measure resource utilization in an exacting fashion.
External instrumentation can monitor execution intervals, log events as they occur, and sample machine
states on a regular basis. By instrumenting a system, the tester can uncover situations that lead to degradation
and possible system failure.
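A minimal sketch of software instrumentation for measuring execution intervals; the operation under test is invented for illustration:

    import time

    def timed(fn, samples=100):
        # Record the execution interval of one operation over repeated samples.
        intervals = []
        for _ in range(samples):
            start = time.perf_counter()
            fn()
            intervals.append(time.perf_counter() - start)
        return min(intervals), sum(intervals) / len(intervals), max(intervals)

    def operation():                    # hypothetical unit whose performance is assessed
        sum(i * i for i in range(10_000))

    print("min/avg/max seconds:", timed(operation))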
l Emergence of XML
l Privacy issues
l Digital signatures
l Micro payments
These and other emerging technologies and services will require Internet testing approaches to be
continually fine-tuned to guarantee reliability and quality of service.
QUESTIONS
Chapter 8
Test Process Improvement
The standard model for process assessment and improvement includes the main processes for the
test processes. As a starting point for all process improvement activities, a more detailed process
model is needed.
When looking at the software engineering process, a lot of models are available. They are documented
and distributed, and their use and tailoring needs are widely known and discussed. The test process, by
contrast, is often characterized by trial-and-error implementation by people who are mainly experienced in
software engineering and not in testing. In addition, the models are not well documented and are not
available to those responsible for QA in the companies.
Based on the experience of the problems known from BOOTSTRAP assessments in many
companies, and on more than 15 years of experience in organizing and performing tests in a wide range of
companies, a standard test process model was defined by SQS and integrated into the standard BOOTSTRAP
assessment method.
[Figure: The test process model. Its processes fall into three clusters: organizational processes, support processes, and operative test processes.]
Infrastructure management
The purpose of the infrastructure management process is to provide a stable and up-to-date environment,
with apt methods and tools for the software test, and to provide the testing staff with an environment for
work.
The processes of this cluster deal with the framework that allows an efficient test to be performed.
Test management
The purpose of the test management process is to define the processes necessary for coordinating
and managing a test project and the appropriate resources for testing a software product.
Risk Management
The purpose of the risk management process is to continuously identify and mitigate the project risks
throughout the life cycle of the project. The process involves establishing a focus on management of risks
at both the project and the organizational level.
The processes of this cluster deal with the management of the test project, with a focus on
the appropriate organization of the process and a permanent watch on potential risks.
Configuration Management
The purpose of the configuration management process is to provide a mechanism for identifying,
controlling, and tracking the versions of all work products of a test project or process.
Joint Reviews
The purpose of the joint reviews process is to maintain a common understanding with the customer of
the progress toward the customer's goals, and of what should be done to help ensure the development
of a product that satisfies the customer.
The processes in this cluster deal with the supporting activities that provide techniques and
infrastructure for a successful test project.
Test Documentation
The purpose of the test documentation process is to ensure that the documented work products of the
project activities (e.g. requirement documents) comply with their defined requirements regarding form and content.
Module Test
The purpose of the module test process is to ensure that the modules of the software comply with their
defined formal coding requirements and with the requirements of the software design.
OO class test
The purpose of the OO class test is to ensure that the OO classes comply with their defined formal
coding requirements and with the requirements of the software design.
Functional Test
The purpose of the functional test process is to ensure that the functions of the application fulfill their
functional requirements.
Performance Test
The purpose of the performance test process is to ensure that the performance of the application
complies with its requirements.
Installation Test
The purpose of the installation test process is to ensure that the deliverable application can be installed
in the defined target environments.
Compatibility test
The purpose of the compatibility test process is to ensure that the application is compatible with other
specified applications in the target environment.
As a starting point for deriving an improvement strategy, the evaluated goals and business strategies are
analyzed. To support the definition of the goals and their priorities, as well as to measure the actual quality of
the processes, metrics are selected and adapted to the needs of the company.
Based on these metrics and on the analysis of the process capabilities, and of the strengths and weaknesses
of the test processes, the improvement steps are defined. The selection of improvements from the improvement
suggestions is made by the assessed team and a management representative, supported by the
experienced assessor in an improvement workshop. The workshop ensures that the improvement areas
are defined based on the needs and experiences of the assessed organization and on the knowledge of
the experts.
The next step is the concrete definition of activities. The activities for the test improvement are
planned and controlled like a "normal" software engineering project. The predefined metrics are used to
define measures that gauge the success of the improvement.
Typical improvements for projects and organizations that are starting with test process improvement
are:
l Implement a structure of testing phases with specified goals and criteria for the next phase
Fig 8.2 shows the basic steps on the way to test process improvement
[Figure 8.2: SQA test analysis, advice, and guidance, supported by test metrics.]
QUESTIONS
1. Explain the different test processes and their benefits in software application development.
Chapter 10
Test Metrics
10.0 INTRODUCTION
The idea of understanding test metrics is to investigate the benefits of adopting a specific
approach to testing.
If no formal, quantitative measurements are made, it is possible only to make qualitative statements
about the effectiveness of the testing process, which may in the short term assure senior management,
but which in the long term will not help to improve the testing process. Typical examples of where metrics
can be used in the testing process include:
l For estimating the testing effort required to complete a given testing project/task.
l For highlighting complex elements of the system under test that may be error-prone and require
additional testing effort.
This chapter discusses the role and use of metrics in process improvement, reviews the metrics employed
in such programs, discusses the issues involved in setting up and adopting a metrics program, and makes
a number of proposals for a simple and effective metrics set that can be used to improve the testing
process.
Primitive metrics typically form the raw data for a metrics program and represent the observed
data collected during the project. Plotting the progress of primitive metrics over time is often a powerful
means of observing trends within the testing project. For example, plotting the number of defects
detected against time can provide the test team leader with one means of determining when to stop
testing, by observing when the rate of detection of defects declines.
A computed metric is one that must be calculated from other data or metrics. Examples include:
1. The number of non-comment lines of code written per day
2. The defect density
3. The number of defects detected and/or reported per unit time or per development phase
Computed metrics typically form the basis for conclusions regarding the progress of a process
improvement program. For example, observing the defect detection effectiveness percentage achieved
by a testing team across a number of testing projects provides a valuable indication of the change in
efficiency of the testing process over time. A small worked example follows.
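A minimal sketch of deriving two computed metrics from primitive data; all counts are invented for illustration:

    # Primitive (observed) data for one testing project -- hypothetical values.
    ncloc = 12_000                 # non-comment lines of code
    defects_found_in_test = 96
    defects_found_by_users = 24    # defects that escaped to the field

    # Computed metrics derived from the primitives.
    defect_density = defects_found_in_test / (ncloc / 1000)          # defects per KLOC
    detection_effectiveness = 100 * defects_found_in_test / (
        defects_found_in_test + defects_found_by_users)              # percentage
    print(f"{defect_density:.1f} defects/KLOC, "
          f"{detection_effectiveness:.0f}% defect detection effectiveness")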
Cost/effort: This metric is typically measured in payroll months and includes time taken by staff
during testing, as well as time spent by managers engaged on testing tasks.
It may also be of benefit to measure the cost/effort involved in administering and collecting the
metrics, since comparison of this value with the total effort expended in the testing process will provide
information about the efficiency of the metrics program.
Total defects found in testing: In this context, a defect can be defined as any aspect of the
behavior of the software that would not exist if the software were fit for purpose. It may be of benefit to
record the severity of the observed defects to help in comparing their relative impact. A
simple three-point scale of critical, serious, and minor is often sufficient for this purpose.
Communications: This metric is typically measured as the number of interfaces that a given
project team has, with the purpose of characterizing the constraints on the project team due to
dependencies on organizationally and physically distant entities, such as users, managers, and/or the
supplier.
This metric is most useful for large and/or complex organizations where development and testing take
place across geographically distinct sites. In particular, the metric can be used to identify possible improvements
to the organization and administration of complex testing projects, and it provides a means for assessing the
effectiveness of such improvements. For small testing projects involving a few staff located in the same
office or site, there is unlikely to be much benefit from this metric.