Black Box
Software testing is an important technique for assessing the quality of a software product.
In this chapter, we will explain the following:
• the basics of software testing, a verification and validation practice, throughout
the entire software development lifecycle
• the two basic techniques of software testing, black-box testing and white-box
testing
• six types of testing that involve both black- and white-box techniques
• strategies for writing fewer test cases and still finding as many faults as possible
• using a template for writing repeatable, defined test cases
1 Introduction to Testing
Software testing is the process of analyzing a software item to detect the differences
between existing and required conditions (that is, bugs) and to evaluate the features of
the software item (IEEE, 1986; IEEE, 1990). Software testing is an activity that should
be done throughout the whole development process (Bertolino, May 2001).
Software testing is one of the “verification and validation,” or V&V, software practices.
Some other V&V practices, such as inspections and pair programming, will be discussed
throughout this book. Verification (the first V) is the process of evaluating a system or
component to determine whether the products of a given development phase satisfy the
conditions imposed at the start of that phase (IEEE, 1990). Verification activities include
testing and reviews. For example, in the software for the Monopoly game, we can verify
that two players cannot own the same house. Validation is the process of evaluating a
system or component during or at the end of the development process to determine
whether it satisfies specified requirements (IEEE, 1990). At the end of development,
validation (the second V) activities are used to evaluate whether the features that have
been built into the software satisfy the customer requirements and are traceable to
customer requirements. For example, we validate that when a player lands on “Free
Parking,” they get all the money that was collected. Boehm (Boehm, 1981) has
informally defined verification and validation as follows:
“knew better” than the customer that the game of Life was more fun than Monopoly and
wanted to “delight” the customer with something more fun than the specifications stated.
This example may seem exaggerated – but as programmers we can miss the mark by that
much if we don’t listen well enough or don’t pay attention to details – or if we second
guess what the customer says and think we know better how to solve the customer’s
problems.
Verification: Are we building the product right?
    “I landed on ‘Go’ but didn’t get my $200!”
Validation: Are we building the right product?
    “I know this game has money and players and ‘Go’, but this is not the game I wanted.”
Both of Boehm’s informal definitions use the term “right.” But what is “right”? In
software we need to have some kind of standard or specification to measure against so
that we can identify correct results from incorrect results. Let’s think about how the
incorrect results might originate. The following terms with their associated definitions
(IEEE, 1990) are helpful for understanding these concepts:
A mistake committed by a person becomes a fault (or defect) in a software artifact, such
as the specification, design, or code. This fault, unless caught, propagates as a defect in
the executable code. When a defective piece of code is executed, the fault may become a
visible anomaly (a variance from the specification or desired behavior) and a failure is
observed. Otherwise, the fault remains latent. Testing can reveal failures, but it is the
faults that must be found and removed (Bertolino, May 2001); finding a fault (the cause
of a failure) can be time consuming and unpredictable. Error is a measure of just how
incorrect the results are.
1 The IEEE does not define defect; however, the term defect is considered to be synonymous with fault.
because of the need to redesign, recode, or otherwise remove the fault. These faults can
remain latent in the product through a follow-on release or perhaps forever.
For faults that are not discovered and removed before the software has been shipped,
there are costs. Some of these costs are monetary, and some could be significant in less
tangible ways. Customers can lose faith in our business and can get very angry. They can
also lose a great deal of money if their system goes down because of our defects. (Think
of the effect on a grocery store that can’t check out the shoppers because of its “down”
point-of-sale system.) And, software development organizations have to spend a great
deal of money to obtain specific information about customer problems and to find and fix
the cause of their failures. Sometimes, programmers have to travel to customer locations
to work directly on the problem. These trips are costly to the development organization,
and the customers might not be overly cheerful to work with when the programmer
arrives. When we think about how expensive it is to test, we must also consider how
expensive it is to not test – including these intangible costs as well as the more obvious
direct costs.
We also need to consider the relative risk associated with a failure depending upon the
type of project we work on. Quality is much more important for safety- or mission-
critical software, like aviation software, than it is for video games. Therefore, when we
balance the cost of testing versus the cost of software failures, we will test aviation
software more than we will test video games. As a matter of fact, safety-critical software
can spend as much as three to five times as much on testing as all other software
engineering steps combined (Pressman, 2001)!
To minimize the costs associated with testing and with software failures, a goal of testing
must be to uncover as many defects as possible with as little testing as possible. In other
words, we want to write test cases that have a high likelihood of uncovering the faults
that are the most likely to be observed as a failure in normal use. It is simply impossible
to test every possible input-output combination of the system; there are simply too many
permutations and combinations. As testers, we need to consider the economics of testing
and strive to write test cases that will uncover as many faults in as few test cases as
possible. In this chapter, we provide you with disciplined strategies for creating efficient
sets of test cases – those that will find more faults with less effort and time.
• Black box testing (also called functional testing) is testing that ignores the
internal mechanism of a system or component and focuses solely on the outputs
generated in response to selected inputs and execution conditions.
• White box testing (also called structural testing) is testing that takes into account
the internal mechanism of a system or component.
With black box testing, the software tester does not (or should not) have access to the
source code itself. The code is considered to be a “big black box” to the tester who can’t
see inside the box. The tester knows only that information can be input into the black
box, and the black box will send something back out. Based on the requirements
knowledge, the tester knows what to expect the black box to send out and tests to make
sure the black box sends out what it’s supposed to send out. Alternatively, white box
testing focuses on the internal structure of the software code. The white box tester (most
often the developer of the code) knows what the code looks like and writes test cases by
executing methods with certain parameters. In the language of V&V, black box testing is
often used for validation (are we building the right software?) and white box testing is
often used for verification (are we building the software right?). This chapter focuses on
black box testing.
All software testing is done with executable code. To do so, it might be necessary to
create scaffolding code. Scaffolding is defined as computer programs and data files built
to support software development and testing but not intended to be included in the final
product (IEEE, 1990). Scaffolding code is code that simulates the functions of
components that don’t exist yet and allows the program to execute (Myers, 1979).
Scaffolding code involves the creation of stubs and test drivers. Stubs are modules that
simulate components that aren’t written yet, formally defined as a computer program
statement substituting for the body of a software module that is or will be defined
elsewhere (IEEE, 1990). For example, you might write a skeleton of a method with just
the method signature and a hard-coded but valid return value. Test drivers are defined as
a software module used to invoke a module under test and, often, provide test inputs,
control and monitor execution, and report test results (IEEE, 1990). Test drivers
simulate the calling components (e.g. hard-coded method calls) and perhaps the entire
environment under which the component is to be tested (Beizer, 1990). Another concept
is mock objects. Mock objects are temporary substitutes for domain code that emulate
the real code. For example, if the program is to interface with a database, you might not
want to wait for the database to be fully designed and created before you write and test a
partial program. You can create a mock object of the database that the program can use
temporarily. The interface of the mock object and the real object would be the same.
The implementation of the object would mature from a dummy implementation to an
actual database.
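The three kinds of scaffolding can be sketched together in a few lines. The following is a minimal illustration, not code from the Monopoly project: get_rent, MockPlayerDatabase, charge_rent, and run_driver are hypothetical names, and the rent and balance values are made up for the example.

```python
# Hypothetical names throughout; none of them come from the original text.

def get_rent(property_name):
    """Stub: a method signature plus a hard-coded but valid return value."""
    return 50


class MockPlayerDatabase:
    """Mock object: exposes the interface a real database wrapper would,
    but keeps the data in an in-memory dict for now."""

    def __init__(self):
        self._balances = {}

    def save_balance(self, player, amount):
        self._balances[player] = amount

    def load_balance(self, player):
        return self._balances[player]


def charge_rent(db, player, property_name):
    """Module under test: deducts rent from the player's stored balance."""
    new_balance = db.load_balance(player) - get_rent(property_name)
    db.save_balance(player, new_balance)
    return new_balance


def run_driver():
    """Test driver: invokes the module under test with hard-coded inputs
    and reports the result."""
    db = MockPlayerDatabase()
    db.save_balance("Player 1", 1500)
    actual = charge_rent(db, "Player 1", "Boardwalk")
    expected = 1450  # 1500 minus the stubbed rent of 50
    return "PASS" if actual == expected else "FAIL"


if __name__ == "__main__":
    print(run_driver())
```

When the real rent calculation and the real database exist, they replace the stub and the mock object without any change to charge_rent, because the interfaces stay the same.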
1. Unit testing
Type: White box testing
Specification: Low-level design and/or code structure
Unit testing is the testing of individual hardware or software units or groups of related
units (IEEE, 1990). Using white box testing techniques, testers (usually the developers
creating the code implementation) verify that the code does what it is intended to do at a
very low structural level. For example, the tester will write some test code that will call a
method with certain parameters and will ensure that the return value of this method is as
expected. Looking at the code itself, the tester might notice that there is a branch (an if-
then) and might write a second test case to go down the path not executed by the first test
case. When available, the tester will examine the low-level design of the code; otherwise,
the tester will examine the structure of the code by looking at the code itself. Unit testing
is generally done within a class or a component.
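As a rough sketch of what such unit tests might look like, the example below uses Python's standard unittest module. The method under test, rent_due, and its doubling rule are assumptions made up for this illustration; the point is only that the second test case exists because the tester saw the branch in the code.

```python
import unittest


def rent_due(base_rent, owns_full_color_group):
    """Hypothetical method under test: rent doubles for a full color group."""
    if owns_full_color_group:
        return base_rent * 2
    return base_rent


class RentDueTest(unittest.TestCase):
    def test_full_color_group_doubles_rent(self):
        # First test case: takes the "if" branch.
        self.assertEqual(rent_due(10, True), 20)

    def test_single_property_pays_base_rent(self):
        # Second test case: goes down the path not executed by the first.
        self.assertEqual(rent_due(10, False), 10)
```

Saved in a file, these tests can be run with "python -m unittest" followed by the file name.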
2. Integration testing
Type: Black- and white-box testing
Specification: Low- and high-level design
Integration testing is testing in which software components, hardware components, or both
are combined and tested to evaluate the interaction between them (IEEE, 1990). Using
both black and white box testing techniques, the tester (still usually the software
developer) verifies that units work together when they are integrated into a larger code
base. Just because the components work individually, that doesn’t mean that they all
work together when assembled or integrated. For example, data might get lost across an
interface, messages might not get passed properly, or interfaces might not be
implemented as specified. To plan these integration test cases, testers look at high- and
low-level design documents.
4. Acceptance testing
Type: Black-box testing
Specification: requirements specification
After functional and system testing, the product is delivered to a customer and the
customer runs black box acceptance tests based on their expectations of the functionality.
Acceptance testing is formal testing conducted to determine whether or not a system
satisfies its acceptance criteria (the criteria the system must satisfy to be accepted by a
customer) and to enable the customer to determine whether or not to accept the system
(IEEE, 1990). These tests are often pre-specified by the customer and given to the test
team to run before attempting to deliver the product. The customer reserves the right to
refuse delivery of the software if the acceptance test cases do not pass. However,
customers are not trained software testers. Customers generally do not specify a
“complete” set of acceptance test cases. Their test cases are no substitute for creating
your own set of functional/system test cases. The customer is probably very good at
specifying at most one good test case for each requirement. As you will learn below,
many more tests are needed. Whenever possible, we should run customer acceptance test
cases ourselves so that we can increase our confidence that they will work at the
customer location.
5. Regression testing
Type: Black- and white-box testing
Specification: Any changed documentation, high-level design
Throughout all testing cycles, regression test cases are run. Regression testing is
selective retesting of a system or component to verify that modifications have not caused
unintended effects and that the system or component still complies with its specified
requirements (IEEE, 1990). Regression tests are a subset of the original set of test cases.
These test cases are re-run often, after any significant changes (bug fixes or
enhancements) are made to the code. The purpose of running the regression test cases is
to make a “spot check” to examine whether the new code works properly and has not
damaged any previously-working functionality by propagating unintended side effects.
Most often, it is impractical to re-run all the test cases when changes are made. Since
regression tests are run throughout the development cycle, there can be white box
regression tests at the unit and integration levels and black box tests at the integration,
function, system, and acceptance test levels.
The following guidelines should be used when choosing a set of regression tests (also
referred to as the regression test suite):
• Choose a representative sample of tests that exercise all the existing software
functions;
• Choose tests that focus on the software components/functions that have been
changed; and
• Choose additional test cases that focus on the software functions that are most
likely to be affected by the change.
A subset of the regression test cases can be set aside as smoke tests. A smoke test is a
group of test cases that establish that the system is stable and all major functionality is
present and works under “normal” conditions (Craig and Jaskiel, 2002). Smoke tests are
often automated, and the selection of the test cases is broad in scope. The smoke tests
might be run before deciding to proceed with further testing (why dedicate resources to
testing if the system is very unstable?). The purpose of smoke tests is to demonstrate
stability, not to find bugs with the system.
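One possible way to apply the regression selection guidelines and the smoke test idea mechanically is sketched below. The test catalog, the function names in it, and the "smoke" flag are all hypothetical, and "tests that focus on a function" is interpreted loosely here as "tests that exercise it".

```python
# Hypothetical test catalog: each test case lists the functions it exercises.
TEST_CASES = {
    "TC1": {"functions": {"move"}, "smoke": True},
    "TC2": {"functions": {"jail"}, "smoke": True},
    "TC3": {"functions": {"jail", "bank"}, "smoke": False},
    "TC4": {"functions": {"bank"}, "smoke": False},
    "TC5": {"functions": {"move"}, "smoke": False},
}


def select_regression_suite(test_cases, changed, likely_affected):
    """Pick a representative test for every function, plus every test that
    touches a changed or likely affected function."""
    selected = set()
    covered = set()
    for test_id in sorted(test_cases):
        functions = test_cases[test_id]["functions"]
        if functions - covered:
            # Representative sample: this test exercises a function
            # that no previously selected test covers.
            selected.add(test_id)
            covered |= functions
        if functions & (changed | likely_affected):
            selected.add(test_id)
    return sorted(selected)


def select_smoke_tests(test_cases, regression_suite):
    """Smoke tests are the regression tests flagged as broad sanity checks."""
    return [t for t in regression_suite if test_cases[t]["smoke"]]


suite = select_regression_suite(TEST_CASES, changed={"bank"}, likely_affected=set())
smoke = select_smoke_tests(TEST_CASES, suite)
```

With "bank" as the only changed function, TC5 is left out because TC1 already represents "move" and TC5 touches nothing that changed.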
6. Beta testing
Type: Black-box testing
Specification: None.
When an advanced partial or full version of a software package is available, the
development organization can offer it free to one or more (and sometimes thousands of)
potential users or beta testers. These users install the software and use it as they wish,
with the understanding that they will report any errors revealed during usage back to the
development organization. These users are usually chosen because they are experienced
users of prior versions or competitive products. The advantages of running beta tests are
as follows (Galin, 2004):
• Identification of unexpected errors because the beta testers use the software in
unexpected ways.
• A wider population search for errors in a variety of environments (different
operating systems with a variety of service releases and with a multitude of other
applications running).
• Low costs because the beta testers generally get free software but are not
compensated.
The disadvantages of beta testing are as follows (Galin, 2004):
• Lack of systematic testing because each user uses the product in any manner they
choose.
• Low quality error reports because the users may not actually report errors or may
report errors without enough detail.
• Much effort is necessary to examine error reports particularly when there are
many beta testers.
Write the test plan early in the development cycle when things are generally still going
pretty smoothly and calmly. This allows you to think through a thorough set of test cases.
If you wait until the end of the cycle to write and execute test cases, you might be in a
very chaotic, hurried time period. Often good test cases are not written in this hurried
environment, and ad hoc testing takes place. With ad hoc testing, people just start trying
anything they can think of without any rational roadmap through the customer
requirements. The tests done in this manner are not repeatable.
start writing black box test cases against that requirements document. By doing so this
early, the testers might realize the requirements are not complete. The team may ask
questions of the customer to clarify the requirements so a specific test case can be written.
The answer to the question is helpful to the code developer as well. Additionally, the
tester may request (of the programmer) that the code be designed and developed to allow
some automated test execution to be done. To summarize, the earlier testing is planned at
all levels, the better.
It is also very important to consider test planning and test execution as iterative processes.
As soon as requirements documentation is available, it is best to begin to write functional
and system test cases. When requirements change, revise the test cases. As soon as some
code is available, execute test cases. When code changes, run the test cases again. By
knowing how many and which test cases actually run, you can accurately track the
progress of the project. All in all, testing should be considered an iterative and essential
part of the entire development process.
It is best if the person who plans and executes black box tests is not the programmer of
the code and does not know anything about the structure of the code. The programmers
of the code are innately biased and are likely to test that the program does what they
programmed it to do. What are needed are tests to make sure that the program does what
the customer wants it to do. As a result, most organizations have independent testing
groups to perform black box testing. These testers are not the developers and are often
referred to as third-party testers. Testers should just be able to understand and specify
what the desired output should be for a given input into the program, as shown in Figure
3.
[Figure 3 diagram: Input → Executable Program (black-box test) → Output]
Figure 3: Black Box Testing. A black-box test takes into account only the input and output of the
software without regard to the internal code of the program.
The format of your test case design is very important. We will use a particular format for
our test cases, as shown in Table 2. We recommend you use this template in your test
planning.
The problem is that the description does not give exact values of how many spaces the
players moved. This is an overly simplistic problem – but maybe the program crashes
for some reason when Player 1 and Player 2 land on the same spot. If you don’t
remember what was actually rolled (you let the rolls be determined randomly and don’t
record them), you might never be able to cause the problem to happen again because you
don’t remember the circumstances leading up to the problem. Recreating the problem is
essential in testing so that problems that are identified can be repeated and
corrected. Instead write specific descriptions, such as shown in Table 4.
There’s also something else important to notice in the Preconditions for test case 3 in
Table 4. How can the test case ensure the player rolled a 3 when the value of the dice roll
needs to be random in the real game? Sometimes we have to add a bit of extra
functionality to put a program in “test mode” so we can run our test cases in a repeatable
manner and so we can easily force a condition to happen. For example, we may want to test
what happens when a player lands on “Go” or on “Go to Jail” and want to force this
situation to occur. The Monopoly programmers needed to create a test mode in which (1)
the dice rolls could be input manually and (2) the amount of money each player starts
with is input manually. It is also important to run some non-repeatable test cases in the
regular game mode to check that random dice input does not appear to change expected
behavior.
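A test mode of this kind could be as simple as letting the dice accept a list of manually entered rolls. The sketch below is an assumption about how it might be built, not the Monopoly programmers' actual design: the Dice class, the take_turn function, the single six-sided die, and the 40-space board are all choices made for the illustration.

```python
import random


class Dice:
    """Hypothetical dice with a test mode in which rolls are supplied manually."""

    def __init__(self, scripted_rolls=None):
        # scripted_rolls stays None in regular game mode.
        self._scripted = list(scripted_rolls) if scripted_rolls is not None else None

    def roll(self):
        if self._scripted is not None:
            # Test mode: return the next manually entered roll.
            return self._scripted.pop(0)
        # Regular mode: one six-sided die (an assumption for this sketch).
        return random.randint(1, 6)


def take_turn(position, dice, board_size=40):
    """Advance a player by one roll and return the new position."""
    return (position + dice.roll()) % board_size


# Test mode: the roll is forced, so the test case is repeatable.
forced_position = take_turn(0, Dice(scripted_rolls=[3]))

# Regular mode: the roll is random, so only a range can be checked.
random_position = take_turn(0, Dice())
```

Because the scripted rolls are passed in from outside, the game logic in take_turn is identical in both modes, which keeps the test mode from hiding faults in the real code path.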
The expected results must also be written in a very specific way, as in Table 4. You need
to record what the output of the program should be, given a particular input/set of steps.
Otherwise, how will you know if the answer is correct (every time you run it) if you don’t
know what the answer is supposed to be? Perhaps your program performs mathematical
calculations. You need to take out your calculator, perform some calculations by hand,
and put the answer in the expected result field. You need to pre-determine what your
program is supposed to do ahead of time, so you’ll know right away if your program
responds properly or not.
Ideally, we’d like to test every possible thing that can be done with our program. But, as
we said, writing and executing test cases is expensive. We want to make sure that we
definitely write test cases for the kinds of things that the customer will do most often or
even fairly often. We also want to avoid writing redundant test cases that won’t tell us
anything new (because they have similar conditions to other test cases we already wrote).
Each test case should probe a different mode of failure. We also want to design the
simplest test cases that could possibly reveal this mode of failure – test cases themselves
can be error-prone if we don’t keep this in mind.
Black box test cases are based on customer requirements. We begin by looking at each
customer requirement. To start, we want to make sure that every single customer
requirement has been tested at least once. As a result, we can trace every requirement to
its test case(s) and every test case back to its stated customer requirement. The first test
case we’d write for any given requirement is the most-used success path for that
requirement. By success path, we mean that we want to execute some desirable
functionality (something the customer wants to work) without any error conditions. We
proceed by planning more success path test cases, based on other ways the customer
wants to use the functionality and some test cases that execute failure paths. Intuitively,
failure paths intentionally have some kind of errors in them, such as errors that users can
accidentally input. We must make sure that the program behaves predictably and
gracefully in the face of these errors. Finally, we should plan the execution of our tests
out such that the most troublesome, risky requirements are tested first. This would allow
more time for fixing problems before delivering the product to the customer. It would be
devastating to find a critical flaw right before the product is due to be delivered.
We’ll start with one basic Monopoly requirement. We can write many test cases based
on this one requirement, which follows below. As we’ve said before, it is impossible to
test every single possible combination of input. We’ll outline an incomplete sampling of
test cases and reason about them in this section.
Requirement: When a player lands on the “Go to Jail” cell, the player goes directly to
jail, does not pass “Go,” does not collect $200. On the next turn, the player must pay $50
to get out of jail and does not roll the dice or advance. If the player does not have
enough money, he or she is out of the game.
There are many things to test in this short requirement above, including:
1. Does the player get sent to jail after landing on “Go to Jail”?
2. Does the player receive $200 if “Go” is between the current space and jail?
3. Is $50 correctly decremented if the player has $50 or more?
4. Is the player out of the game if he or she has less than $50?
At first it is good to start out by testing some input that you know should definitely pass
or definitely fail. If these kinds of tests don’t work properly, you know you should just
quit testing and put the code back into development. We can start with two obvious
passing test cases, as shown in Table 5.
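For a developer who wants to automate checks like these, the jail fee rule might be sketched as follows. This is an illustration only: pay_jail_fee is a hypothetical function, the starting amounts are example values rather than the contents of Table 5, and the treatment of a player with exactly $50 is an assumption, since the requirement says only “enough money.”

```python
JAIL_FEE = 50  # the $50 fee stated in the requirement


def pay_jail_fee(money):
    """Return (money_after_fee, still_in_game) for a player leaving jail.

    Assumption: a player with exactly $50 can pay and stays in the game.
    """
    if money >= JAIL_FEE:
        return money - JAIL_FEE, True
    return money, False


def test_player_with_plenty_of_money_pays_fee():
    # Obvious success path: the fee is deducted and the player continues.
    assert pay_jail_fee(1200) == (1150, True)


def test_player_with_too_little_money_is_out():
    # Obvious failure path: the player cannot pay and leaves the game.
    assert pay_jail_fee(25) == (25, False)
```

If either of these obvious cases fails, there is little point in running the subtler ones.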
Once you have identified these partitions, you choose test cases from each partition. To
start, choose a typical value somewhere in the middle of (or well into) each of these two
ranges. See Table 6 for test cases written to test the equivalence classes of money.
However, you will note that Test Cases 6 (Player 1 has $1200) and 7 (Player 1 has $100)
are both in the same equivalence class. Therefore, Test Case 7 is unlikely to discover any
defect not found in Test Case 6.
Money for Player 1: $50
Test ID: 8
Description:
    Precondition: Game is in test mode.
    Number of players: 2
    Money for player 1: $25
    Money for player 2: $25
    Player 1 dice roll: 3
    Player 2 dice roll: 2
Expected results:
    Sent to jail
    Player 2: no roll
    Player 1 is out of game
Table 6: Test Plan #2 for the Jail Requirement
For each equivalence class, the test cases can be defined using the following guidelines
(Pressman, 2001):
1. If input conditions specify a range of values, create one valid and one or two
invalid equivalence classes. In the above example, this is (1) less than 50/invalid;
(2) 50 or more/valid.
2. If input conditions require a certain value (for example R and L for the side in our
train example), create an equivalence class of the valid values (R and L) and one
of invalid values (all letters other than R and L). In this case, you need to
test all valid values individually and several invalid values.
3. If input conditions specify a member of a set, create one valid and one invalid
equivalence class.
4. If an input condition is a Boolean, define one valid and one invalid class.
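The first two guidelines can be made concrete with a short sketch. The classifier functions and the chosen inputs below are hypothetical; in particular, treating a lowercase “r” and an empty string as invalid sides is an assumption, not something the train requirements state.

```python
def money_class(money):
    """Range condition from the jail example: $50 or more is valid."""
    return "valid" if money >= 50 else "invalid"


def side_class(side):
    """Specific-value condition: only R and L are valid sides."""
    return "valid" if side in ("R", "L") else "invalid"


# Range condition: one typical value from well inside each class.
MONEY_REPRESENTATIVES = {"valid": 1200, "invalid": 25}

# Specific-value condition: every valid value plus several invalid ones.
SIDE_CASES = {"R": "valid", "L": "valid", "X": "invalid", "r": "invalid", "": "invalid"}


def check_representatives():
    """Return True when every chosen input falls in its intended class."""
    money_ok = all(money_class(v) == c for c, v in MONEY_REPRESENTATIVES.items())
    side_ok = all(side_class(s) == c for s, c in SIDE_CASES.items())
    return money_ok and side_ok
```

Writing the classes down this way makes redundant test cases easy to spot: two inputs that map to the same class are unlikely to reveal different faults.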
Equivalence class partitioning is just the start, though. An important partner to this
partitioning is boundary value analysis.
Boris Beizer, well-known author of testing books, advises, “Bugs lurk in corners and
congregate at boundaries.” (Beizer, 1990) Programmers often make mistakes on the
boundaries of the equivalence classes/input domain. As a result, we need to focus testing
at these boundaries. This type of testing is called Boundary Value Analysis (BVA) and
guides you to create test cases at the “edge” of the equivalence classes. Boundary value
Figure 5: Boundary Value Analysis. Test cases should be created for the boundaries (arrows)
between equivalence classes.
When creating BVA test cases, consider the following (Pressman, 2001):
1. If input conditions have a range from a to b (such as a=100 to b=300), create test
cases:
• immediately below a (99)
• at a (100)
• immediately above a (101)
• immediately below b (299)
• at b (300)
• immediately above b (301)
2. If input conditions specify a number of values that are allowed, test these limits.
For example, input conditions specify that only one train is allowed to start in
each direction on each station. In testing, try to add a second train to the same
station/same direction. If (somehow) three trains could start on one
station/direction, try to add two trains (pass), three trains (pass), and four trains
(fail).
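The six boundary values for a range can be generated rather than worked out by hand each time. The sketch below assumes an integer range with at least two values between a and b; boundary_cases, accepts, and run_boundary_tests are hypothetical names, and accepts stands in for whatever code is actually under test.

```python
def boundary_cases(a, b):
    """Return (input, should_be_accepted) pairs for an integer range a..b."""
    return [
        (a - 1, False),  # immediately below a
        (a, True),       # at a
        (a + 1, True),   # immediately above a
        (b - 1, True),   # immediately below b
        (b, True),       # at b
        (b + 1, False),  # immediately above b
    ]


def accepts(value, low=100, high=300):
    """Hypothetical code under test: accepts values from low to high inclusive."""
    return low <= value <= high


def run_boundary_tests():
    """Return the inputs whose actual result differs from the expected one."""
    return [v for v, expected in boundary_cases(100, 300) if accepts(v) != expected]
```

If a programmer had mistakenly written "low < value" in accepts, run_boundary_tests would return [100], which is exactly the kind of boundary mistake that a typical mid-range value never exposes.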
My test programs are intended to break the system, to push it to its extreme limits,
to pile complication on complication, in ways that the system programmer never
consciously anticipated. To prepare such test data, I get into the meanest,
nastiest frame of mind that I can manage, and I write the cruelest code I can think
of; then I turn around and embed that in even nastier constructions that are
almost obscene. (Knuth, 1992)
Think diabolically! Think of every possible thing a user could possibly do with your
system to demolish the software. You need to make sure your program is robust – in that
it can properly respond in the face of erroneous user input. This type of testing is called
robustness testing, whereby test cases are chosen outside the domain to test robustness to
unexpected, erroneous input (Bertolino, May 2001), and is included in defensive testing
which includes tests under both normal and abnormal conditions (Copeland, 2004).
Look at every input. Does the program respond “gracefully” to these error conditions?
1. Can any form of input to the program cause division by zero? Get creative!
2. What if the input type is wrong? (You’re expecting an integer, they input a float.
You’re expecting a character, you get an integer.)
3. What if the customer takes an illogical path through your functionality?
4. What if mandatory fields are not entered?
5. What if the program is aborted abruptly or input or output devices are unplugged?
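Some of these questions translate directly into automated checks. The sketch below assumes a hypothetical input handler, parse_dice_roll, that is supposed to reject bad input with a ValueError instead of failing in some uncontrolled way; the list of bad inputs covers the missing-field and wrong-type questions only.

```python
def parse_dice_roll(raw):
    """Hypothetical input handler: return an int from 1 to 6 or raise ValueError."""
    if raw is None or str(raw).strip() == "":
        raise ValueError("a roll is required")
    text = str(raw).strip()
    if not text.isdigit():
        raise ValueError("roll must be a whole number")
    value = int(text)
    if not 1 <= value <= 6:
        raise ValueError("roll must be between 1 and 6")
    return value


def rejects_gracefully(raw):
    """Return True when the input is rejected with a ValueError."""
    try:
        parse_dice_roll(raw)
    except ValueError:
        return True
    return False


# Erroneous inputs: missing field, blank field, text, float, out of range, negative.
BAD_INPUTS = [None, "", "   ", "abc", "3.5", "0", "7", "-2"]

all_rejected = all(rejects_gracefully(raw) for raw in BAD_INPUTS)
```

Any other exception type escaping from parse_dice_roll would propagate out of rejects_gracefully, which is the signal that the program did not respond gracefully.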
implementing a “whole” requirement. In industry, this could quite feasibly mean they
keep their code to themselves for several months. However, this is a dangerous practice
– and can lead to what is known in industry as integration hell. Just because a
component works on a programmer’s own computer, this doesn’t mean it will work when
it is assembled with the code other programmers are working on. The earlier it is known
that there are some interface problems or some data that’s not getting passed properly, the
better. This knowledge can only be gained by integrating code and testing early and
often. Then, integration problems can be more easily localized in the work that was just
integrated. By localizing the code that contains a new defect, the programmer can
efficiently identify and remove defects.
4 Acceptance Testing
Acceptance test cases are written by the customer. In custom software development,
often contracts between the customer and the development organization state that the
customer can refuse to take delivery of the product if their acceptance test cases do not
run properly in the customer’s own (software and hardware) environment. Sometimes the
customer shares the acceptance test cases with the team, which gives them a shared
specific goal. Other times, the customer hides the acceptance test cases from the
developers and runs them after receiving the code (in the same way as a teacher often
doesn’t tell the students the test cases they will run to grade their class projects). We
believe it is much more productive for the customer and the development team to work
openly and collaboratively on the creation of the acceptance test cases. Then, together
the customer and the development team have a similar vision of what the software has to
look like for the customer to be happy. In our experience, the collaborative acceptance
test case creation serves as an excellent means of clarifying requirements by specifying
them in a way that is quantifiable, measurable, and unambiguous long
before testing commences. Likewise, they can together track the progress of system
development as the team can tell the customer which acceptance test cases are passing.
By their nature, black box test cases are designed and run by people who do not see the
inner workings of the code. Ultimately, system and acceptance cases are intended to be
run through the product user interface (UI) to show that the whole product really works.
Test automation can be difficult because the tester has no knowledge of the inner
workings of the software and because system and acceptance cases must be run through
the UI. However, the more automated testing can be, the easier it is to run the test cases
and to re-run them again and again. The simpler it is to run a suite of tests, the more
often those tests will be run. The more the tests are run, the faster any deviation from
those tests will be found. (Martin, 2003)
If your role on the team is as a software developer, it is always good to consider the types
of black box test cases (functional, system, and acceptance) that will ultimately be run on
your code and to automate test cases to test the logic (separate from the UI logic) behind
these black box test cases. Automated test cases can be run often with minimal time
investment once they are written. By automating the testing of the logic behind the black
box test cases, (1) you are ensuring that the logic “behind the scenes” is working properly
so that the inevitable black box test cases can run smoothly through the UI by the testers
and the customers; and (2) you are more motivated to decouple program/business logic
from the UI logic (which is always a good design technique).
When test cases are automated, they can then become compile-able and executable
documentation.
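As a minimal sketch of what automating the logic behind a black box test case might look like, consider the Monopoly rule that two players cannot own the same property. The class PropertyRegistry and its methods are hypothetical names invented for this illustration; the point is only that the rule lives in a class with no UI code, so it can be exercised directly by automated tests.

```python
class PropertyRegistry:
    """Hypothetical business-logic class; it contains no UI code."""

    def __init__(self):
        self._owners = {}  # property name -> owning player

    def buy(self, property_name, player):
        """Record a purchase; return False if the property is already owned."""
        if property_name in self._owners:
            return False
        self._owners[property_name] = player
        return True

    def owner_of(self, property_name):
        """Return the owning player, or None if the property is unowned."""
        return self._owners.get(property_name)


def test_unowned_property_can_be_bought():
    registry = PropertyRegistry()
    assert registry.buy("Boardwalk", "Alice") is True
    assert registry.owner_of("Boardwalk") == "Alice"


def test_owned_property_cannot_be_bought_again():
    registry = PropertyRegistry()
    registry.buy("Boardwalk", "Alice")
    # A second purchase of the same property must be refused.
    assert registry.buy("Boardwalk", "Bob") is False
    assert registry.owner_of("Boardwalk") == "Alice"
```

Each test function reads as a small, executable statement of the requirement, which is the sense in which automated test cases can double as documentation. A test runner such as pytest would discover and run functions named this way, but they can also simply be called directly.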
6 Summary
Several practical tips for black box testing were presented throughout this chapter. The
keys for successful black box testing are summarized in Table 8.
You need to test for what the customer wants the program to do, not what the
programmer programmed it to do. The programmer is biased (through no fault of
his/her own) by knowing the intimate details of what the program does. Black
box testing is best done by someone with a fresh, objective perspective of the
customer requirements.
Use the four-item test case template (ID, Description, Expected Results, Actual
Results) when planning your test cases.
In the test case, specify exactly what the tester has to do to create the desired input
conditions and exactly how the program should respond (the output). Be explicit
in this documentation so that multiple testers (other than yourself) would be able
to run the exact same test case using the directions in the test case. These
directions will be especially important if a failure needs to be re-created for the
programmer to debug the failure.
Test early and often.
Write the simplest test cases that could possibly reveal a mode of failure. (Test
cases can also be error-prone.)
Use equivalence class partitioning to manage the number of test cases run. Test
cases in the same equivalence class will all reveal the same fault.
Use boundary value analysis to find the very-common bugs that lurk in corners
and congregate at boundaries.
Use decision tables to record complex business rules that the system must
implement and that must be tested.
Run the equivalence class test cases first. If the program doesn’t work for the
simplest case (smack in the middle of an equivalence class), it probably won’t
work for the boundaries either. If you run a boundary test first, you’ll probably
go run the general case (equivalence class test) before investigating the problem.
So, instead just run the simple case first.
Avoid having test cases dependent upon each other (i.e., having preconditions of
another test case passing). Consider that you have 17 test cases, each having a
precondition of the prior test case passing – and you pass the first 16 test cases but
fail the 17th test case. It takes you some time (until the next day) to debug your
program. Now, in order to re-run the 17th test case to see if it now passes, you
have to re-run the 16 you know pass. This can be time consuming.
Write each test case so that it can reveal one type of fault. Consider a test case
that has three different forms of invalid input. If the test case fails, you might not
know which of the three inputs made the test case fail, and you will have to run
different, smaller test cases to see which of the inputs caused problems.
Think diabolically! What are the worst things someone could try to do to your
program? Write tests for these.
Encourage a collaborative approach to acceptance testing with the customer.
When black box test cases surface failures, they only reveal the symptoms of
faults. You need to use your detective skills to find the fault in the code that
caused the failure to occur.
Table 8: Key Ideas for Black Box Testing
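Several of the ideas in Table 8 can be illustrated together in a short sketch. The rule below (the total of two six-sided dice must be between 2 and 12) and the names is_valid_dice_total, EQUIVALENCE_CASES, BOUNDARY_CASES, and run_cases are assumptions made up for this example, not part of any real system. Each case carries an ID, a description, an input, and an expected result; the actual result is computed when the case is run. The cases are independent of each other, and each one targets a single kind of fault.

```python
def is_valid_dice_total(total):
    """Hypothetical rule under test: two six-sided dice sum to 2 through 12."""
    return 2 <= total <= 12


# One representative from the middle of each equivalence class (run these first).
# Each tuple is (ID, description, input, expected result).
EQUIVALENCE_CASES = [
    ("EC-1", "total below the valid range", 0, False),
    ("EC-2", "total inside the valid range", 7, True),
    ("EC-3", "total above the valid range", 20, False),
]

# Values on and just outside each boundary of the valid range.
BOUNDARY_CASES = [
    ("BV-1", "just below the lower bound", 1, False),
    ("BV-2", "on the lower bound", 2, True),
    ("BV-3", "on the upper bound", 12, True),
    ("BV-4", "just above the upper bound", 13, False),
]


def run_cases(cases):
    """Run every case independently; return the IDs of the cases that failed."""
    failed = []
    for case_id, _description, total, expected in cases:
        actual = is_valid_dice_total(total)
        if actual != expected:
            failed.append(case_id)
    return failed
```

Calling run_cases(EQUIVALENCE_CASES) before run_cases(BOUNDARY_CASES) follows the advice to run the simple cases first. Because no case depends on another, any single failing case can be re-run on its own after the fault is fixed.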
As Dijkstra reminds us, “Program testing can be used to show the presence of bugs, but never
to show their absence!” (Dijkstra, April 1970) Mostly, testing can be used to check how
well defect-prevention activities worked. As a beneficial side effect, testing can also be
used to identify anomalies in code via dynamic execution of the code.
specification or requirement
Stubs: computer program statement substituting for the body of a software module
   that is or will be defined elsewhere (IEEE, 1990)
Success path: a test case that executes some desirable functionality (something the
   customer wants to work) without any error conditions
System testing: testing conducted on a complete, integrated system to evaluate the
   system compliance with its specified requirements (IEEE, 1990)
Test case: set of test inputs, execution conditions, and expected results developed
   for a particular objective, such as to exercise a particular program path or to
   verify compliance with a specific requirement (IEEE, 1990)
Test driver: software module used to invoke a module under test and, often, provide
   test inputs, control and monitor execution, and report test results (IEEE, 1990)
Test plan: document describing the scope, approach, resources, and schedule of
   intended test activities. It identifies test items, the features to be tested, the
   testing tasks, who will do each task, and any risks requiring contingency plans
   (IEEE, 1990)
Unit testing: testing of individual hardware or software units or groups of related
   units (IEEE, 1990)
Usability testing: testing conducted to evaluate the extent to which a user can learn
   to operate, prepare inputs for, and interpret outputs of a system or component
   (IEEE, 1990)
Validation: the process of evaluating a system or component during or at the end of
   the development process to determine whether it satisfies specified requirements
   (IEEE, 1990)
Verification: the process of evaluating a system or component to determine whether
   the products of a given development phase satisfy the conditions imposed at the
   start of that phase (IEEE, 1990)
White box testing: testing that takes into account the internal mechanism of a system
   or component (IEEE, 1990)
References:
Bertolino, A. (May 2001). Chapter 5: Software Testing. IEEE SWEBOK Trial Version
1.00.
Boehm, B. W. (1981). Software Engineering Economics. Englewood Cliffs, NJ, Prentice-
Hall, Inc.
Copeland, L. (2004). A Practitioner's Guide to Software Test Design. Boston, Artech
House Publishers.
Craig, R. D. and S. P. Jaskiel (2002). Systematic Software Testing. Norwood, MA,
Artech House Publishers.
Dijkstra, E. W. (April 1970). "Notes on Structured Programming," Technological
University Eindhoven T.H. Report 70-WSK-03, Second edition.
Galin, D. (2004). Software Quality Assurance. Harlow, England, Pearson, Addison
Wesley.
IEEE (1986). "ANSI/IEEE Standard 1008-1987, IEEE Standard for Software Unit
Testing."
IEEE (1987). "ANSI/IEEE Standard 1008-1987, IEEE Standard for Software Unit
Testing."
IEEE (1990). IEEE Standard 610.12-1990, IEEE Standard Glossary of Software
Engineering Terminology.
IEEE, "IEEE Standards Collection: Glossary of Software Engineering Terminology,"
IEEE Standard 610.12-1990.
Kaner, C., J. Bach, et al. (2002). Lessons Learned in Software Testing, John Wiley &
Sons.
Knuth, D. E. (1992). The errors of TeX. Software--Practice and Experience. Literate
Programming; CSLI Lecture Notes, no. 27, CSLI. 19: 607--681.
Martin, R. C. (2003). Agile Software Development: Principles, Patterns, and Practices.
Upper Saddle River, Prentice Hall.
Myers, G. J. (1979). The Art of Software Testing. New York, John Wiley.
Pressman, R. (2001). Software Engineering: A Practitioner's Approach. Boston,
McGraw Hill.
Chapter Questions
1. What is the difference between black-box and white-box testing? During the software
development, how can we derive black-box tests? How about white-box tests?
2. Dharma City is installing the AutoCop Traffic Law Enforcement System. AutoCop is
a sensor-camera combo installed near a traffic light. When the sensor detects a
speeding car passing by or a car running the red light, AutoCop will activate the
camera and take a picture of the plate. Use the equivalence partitioning and boundary
value analysis methods to derive the test cases to test the camera activation logic.
3. From the perspective of automating software testing, what is the problem if the user
interface and the business logic are heavily coupled?
4. Describe in your own words the difference between validation and verification.
5. In XP, the customer and developers work cooperatively to specify the acceptance
tests. What are the pros and cons if the customer and developers work together on
acceptance tests?
6. What’s the advantage if acceptance tests can be automated?
7. Suppose you are writing a program that counts the number of alphanumeric
characters in a string. May we apply equivalence partitioning for this program? What
about boundary value analysis? Do we need more test cases to validate the program?
8. Suppose we are developing a program which decides, in a two-dimensional
coordinate system, whether a point P falls in a circle C or on its border. The program
reads five real numbers. The first two numbers are the x- and y-coordinate of the
center of C, the third number is the radius of C, and the fourth and fifth numbers
represent the coordinate of P. Develop the test cases that you feel are adequate for this
program.
9. Some organizations have independent testing groups. What tests are best designed by
the testing group? What tests are best designed by the developers? And what tests are
best designed by the customer? Justify your answer.
10. It is impractical to run all the test cases every time changes are made. An
organization may adopt a prioritization scheme for the test cases to choose
appropriate test cases to run. Give three test case prioritization criteria.
11. A teacher wants to write a program that will average 10 course grades. Using the
equivalence partitioning and boundary value analysis methods, derive a set of test
cases for the program. Also give some “dirty” test cases.
12. Suppose you are writing a simple calculator program. This program can handle
positive integer calculation, including addition, subtraction, multiplication, and
division. The input is a string composed of digits (0, 1, 2, …, 9) and operators
(+, -, *, /). No space is allowed. The input string can be at most 100 characters long,
and each number can consist of at most 10 digits. Division of two integers produces
one integer by truncation. If the answer contains more than 10 digits, this program
simply outputs an overflow error message. Using the equivalence partitioning and
boundary value analysis methods, derive a set of test cases for the program. Also give
some dirty test cases.
13. Acceptance tests are specified by the customer with the help of developers. Usually
the customer has better knowledge in their business than in programming. Therefore,
it is next to impossible for the customer to write or understand the tests using the
programming language. What do you think is a feasible form of acceptance tests?
(Remember that we’d like the acceptance tests to be compile-able and executable.)