Software Engineering Notes Complete
1.0 Objectives
The objective of this lesson is to acquaint the students with the problem of the software crisis, which ultimately resulted in the development of software engineering as a discipline.
1.1 Introduction
redesign and modification of the source code, thorough testing of the modified code, and delivery of the modified work products to the appropriate user. The need for systematic approaches to software development and maintenance became apparent in the 1960s. Much of the software developed at that time was subject to cost overruns and schedule slippage, and it became apparent that the demand for computer software was growing faster than our ability to produce and maintain it. As a result, the field of software engineering emerged.
The headlines have been screaming about the Y2K Software Crisis for years
now. Lurking behind the Y2K crisis is the real root of the problem: The Software
Is there a crisis at all? As you stroll through the aisles of neatly packaged
software in your favorite computer discount store, it wouldn’t occur to you that
there’s a problem. You may be surprised to learn that those familiar aisles of
software represent only a small share of the software market--of the $90 Billion
specifications.
than 50,000 lines of high-level language code. It’s those large systems that bring
the software crisis to light. You know that in large projects the work is done in
problem?
Why is it that the team produces fewer than 10 lines of code per day over the life of the project?
Why is one of every three large projects scrapped before ever being completed?
And more:
¾ The cost of owning and maintaining software in the 1980’s was twice as expensive as developing the software.
¾ During the 1990’s, the cost of ownership and maintenance increased by 30% over the 1980’s.
¾ Three quarters of all large software products delivered to the customer are
failures that are either not used at all, or do not meet the customer’s
requirements.
Software projects are notoriously behind schedule and over budget. Over the last twenty years many different paradigms have been created in an attempt to make software development more predictable and controllable, yet none has proved to be a complete solution to the crisis. It appears that the software crisis can be boiled down to the fact that software development is still practiced as a craft rather than as an engineering discipline.
Software developers are certainly talented and skilled, but they work like craftsmen, relying on their talents and skills and using techniques that cannot be measured or reproduced. Engineers, on the other hand, use repeatable, measurable techniques - the marks of science. The software industry is still many years away from becoming a mature engineering discipline: formal processes exist, but their use is not widespread. A similar crisis faced other engineering fields before their processes became tried and true. To make matters worse, software technology changes rapidly; because hardware advances at a faster pace than software, developers are constantly trying to catch up, and they often fall back on ad hoc software development in an attempt to get products out on time for the new hardware. Under such pressure, formal processes are treated as being of secondary importance and are omitted or completed after the fact. However, as the statistics
show, the ad hoc approach just doesn’t work. Software developers have
becomes embedded in more and more consumer electronics. Sixty errors per
or impossible to predict what sort of effect a simple change might have on other
are beginning to see that following a formal software process consistently leads
to better quality products, more efficient teams and individuals, reduced costs,
The SEI (Software Engineering Institute) uses a Capability Maturity Model (CMM) to assess the maturity of an organization's software process. The Software CMM has become a de facto standard for assessing and improving software processes. The highest rating is Maturity Level 5, at which an organization not only has a formal process, but also
continually refines and improves it. Each maturity level is further broken down
into key process areas that indicate the areas an organization should focus on to
change control).
and another at Loral (the on-board space shuttle software project), had earned
Maturity Level 5. Another study showed that only 2% of reviewed projects rated
in the top two Maturity Levels, in spite of many of those projects placing an
large projects will naturally seek organizations with high CMM ratings, and that
improvement.
Mature software is also reusable software. Artisans are not concerned with
standardized to such an extent that it could be marketed as a "part", with its own
part number and revision, just as though it were a hardware part. The software
Though it would seem that nothing less than a software development revolution
could make that happen, the National Institute of Standards and Technology
(NIST) founded the Advanced Technology Program (ATP), one purpose of which
The consensus seems to be that software has become too big to treat as a craft.
And while it may not be necessary to apply formal software processes to daily
The term software engineering was popularized after 1968, during the 1968 NATO Software Engineering Conference.
The term software engineering has been commonly used with a variety of distinct
meanings:
¾ As the informal contemporary term for the broad range of activities that was
¾ As the broad term for all aspects of the practice of computer programming, as opposed to the theory of computer programming, which is studied in computer science;
software, that is, the application of engineering to software," and "(2) the
assignments.
the computer hardware is made useful to the user via software (computer
most complex modern machines. For example, a modern airliner has several
million physical parts (and the space shuttle about ten million parts), while the
carry out and coordinate their efforts: pair programming, code reviews and daily
idea out of a previous planned model, which should be transparent and well
documented.
1.2.7.1 Mathematics
proven. Programs are finite, so in principle, developers could know many things
theory shows that not everything useful about a program can be proven.
Mathematics works best for small pieces of code and has difficulty scaling up.
1.2.7.2 Engineering
many properties that can be measured. For example, the performance and
analysis, but often are meaningless when comparing different small fragments of
code.
1.2.7.3 Manufacturing
out those steps, much like a manufacturing assembly line, advocates hope to
There are budgets and schedules to set. People to hire and lead. Resources
(office space, computers) to acquire. All of this fits more appropriately within the
purview of management.
interfaces should be aesthetically pleasing and provide optimal audio and visual
graphic artists create graphic elements for graphical user interfaces, graphic
interfaces with user-read text and voice may also be enhanced from professional
The act of writing software requires that developers summon the energy to find
the answers they need while they are at the keyboard. Creating software is a
performance that resembles what athletes do on the field, and actors and
to spark the creation of code. Sometimes a creative spark is needed to create the
problem. Others argue that discipline is the key attribute. Pair programming
or professionalism issues.
argue that the practice of engineering involves the use of mathematics, science,
and the technology of the day, to build trustworthy products that are "fit for
Recently, software engineering has been finding its own identity and emerging as
an important freestanding field. Practitioners are slowly realizing that they form a
huge community in their own right. Software engineering may need to create a
Some people believe that software development is a more appropriate term than software engineering for the process of creating software. Pete McBreen (author of "Software Craftsmanship") argues that the term software engineering implies levels of rigor and proven processes that are not appropriate for all kinds of software development. He argues that the software craftsmanship term brings into sharper focus the skills of the developer as the key to success: just as not everyone who works in construction is a civil engineer, not everyone who writes software is a software engineer.
Some people dispute the notion that the field is mature enough to warrant the title "engineering". They compare it to other engineering disciplines, whose practitioners usually object to the use of the title "engineer" by those without a formal engineering education. In each of the last few decades, at least one radical new approach has entered the mainstream of software development (e.g. Structured Programming, Object Orientation, ...), implying that the field is still changing too rapidly to be considered an engineering discipline. Others argue, however, that the supposedly radical new approaches are actually evolutionary rather than revolutionary changes.
Software is a conceptual (logical) entity, while hardware is a physical entity. When hardware is built, the result is something physical whose attributes can be easily measured. Software, being logical, has different characteristics that have to be understood. Software is developed or engineered; it is not manufactured in the classical sense.
Although some similarities exist between software development and hardware manufacture, the two activities are fundamentally different. In both activities, high quality is achieved through good design, but the manufacturing phase for hardware can introduce quality problems that are nonexistent (or easily corrected) for software. Both activities are dependent on people, but the relationship between people applied and work accomplished is entirely different. Both activities require the construction of a "product", but the approaches are different.
Hardware exhibits wear and tear with the passage of time, but software, being a conceptual entity, does not wear in the same way. For hardware, the relationship between failure rate and time, often called the "bathtub curve," indicates that hardware exhibits
relatively high failure rates early in its life (these failures are often attributable to
design or manufacturing defects); defects are corrected and the failure rate drops
to a steady-state level (ideally, quite low) for some period of time. As time
passes, however, the failure rate rises again as hardware components suffer
from the cumulative effects of dust, vibration, abuse, temperature extremes, and
many other environmental maladies. Stated simply, the hardware begins to wear
out.
Software is not susceptible to the environmental maladies that cause hardware to wear out. In theory, therefore, the failure rate curve for software should take the
form of the "idealized curve" shown in following Figure 1.2. Undiscovered defects
will cause high failure rates early in the life of a program. However, these are
corrected (ideally, without introducing other errors) and the curve flattens as
shown.
Figure 1.2 Failure curve for software (idealized curve: failure rate plotted against time)
The idealized curve is a gross oversimplification, however; the actual failure curve for software resembles the "spiked curve" shown in following Figure 1.3. During its life, software will undergo change
(maintenance). As the changes are made, it is likely that some new defects will
be introduced, causing the failure rate curve to spike as shown in Figure. Before
the curve can return to the original steady-state failure rate, another change is
requested, causing the curve to spike again. Slowly, the minimum failure rate level begins to rise - the software is deteriorating due to change.
Another aspect of wear illustrates the difference between hardware and software.
When a hardware component wears out, it is replaced by a spare part. There are
no software spare parts. Every software failure indicates an error in design or an error in the process through which the design was translated into machine-executable code. Therefore, software maintenance involves considerably more complexity than hardware maintenance. It is also worth noting that, although the industry is moving toward component-based assembly, most software continues to be custom built rather than assembled from
existing components.
product is designed and built: The design engineer draws a simple schematic of
the digital circuitry, does some fundamental analysis to assure that proper
function will be achieved, and then goes to the shelf where catalogs of digital
only two of thousands of standard components that are used by mechanical and
electrical engineers as they design new systems. The reusable components have
been created so that the engineer can concentrate on the truly innovative
elements of a design, that is, the parts of the design that represent something
new. In the hardware world, component reuse is a natural part of the engineering
achieved on a broad scale. In the end, we can say that software design is a
had a limited domain of application. Today, we have extended our view of reuse
to encompass not only algorithms but also data structure. Modern reusable
components encapsulate both data and the processing applied to the data,
enabling the software engineer to create new applications from reusable parts.
For example, today's graphical user interfaces are built using reusable
components that enable the creation of graphics windows, pull-down menus, and
detail required to build the interface are contained with a library of reusable
If you observe the work of electrical or civil engineers, you will see the frequent use of reusable components. To build a computer, they do not have to start from scratch: they take components like the monitor, keyboard, mouse, hard disk, etc. and assemble them together. Reuse is a natural part of their engineering process.
Reusability of components has also become a highly desirable characteristic in software engineering. If you have to design software, do not start from scratch; rather, first check for reusable components and assemble them. A properly designed component can be reused in many different applications. In languages like C and Pascal, routines that are frequently required, such as computing a square root, are provided in a library and can be used as such. With the advent of object-oriented languages such as C++ and Java, reusability has become a primary issue. Reuse directly attacks two of the major problems in software development: (1) cost overrun and (2) schedule slippage. If we start from scratch every time, these problems are inevitable,
procedural detail rather we specify the desired result and supporting software
procedural steps (i.e., an algorithm) has been defined. Information content and
application. Content refers to the meaning and form of incoming and outgoing
information. For example, many business applications use highly structured input
data (e.g., a database) and produce formatted "reports." Software that controls
succession.
that have varied content and arbitrary timing, executes algorithms that can be
indeterminate.
external interfaces.
events as they occur is called real time. Elements of real-time software include a
software and has many components related to a particular field of the business.
have evolved into management information system (MIS) software that accesses
computing.
simulation, and other interactive applications have begun to take on real-time and
used to control products and systems for the consumer and industrial markets.
Embedded software can perform very limited and esoteric functions (e.g., keypad
software market has burgeoned over the past two decades. Word processing,
Web-based software: The Web pages processed by the browser are the
Java), and data (e.g. hypertext and a variety of visual and audio formats). In
software, which thinks and behaves like a human. AI software makes use of
based systems, pattern recognition (image and voice), artificial neural networks,
theorem proving, and game playing are representative of applications within this
category.
manner in which software is engineered. If the criteria are not followed, lack
3. There is a set of implicit requirements that often goes unmentioned (e.g., the desire for ease of use). If software conforms to its explicit requirements but
"Good Software" needs to be fit for its purpose i.e. it does what it is intended to
do. Software has various attributes that lend towards it being good or bad.
External quality attributes are visible to anyone using the software. Reliability is
The internal quality of the software can be measured in terms of its technical
attributes such as coupling and cohesion. Some may question the importance of
internal quality attributes especially if the software seems to work well and the
client is satisfied with it. It can be reasoned though that the internal quality
attributes have an impact upon the external quality of the software. Low cohesion
for example can lead to messy code, which may be very hard to understand and maintain, and this in turn degrades the external quality of the product.
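To see how an internal attribute such as cohesion shows up in code, consider the following illustrative C sketch (the function names and record format are invented for the example, not taken from any particular system). The first function mixes parsing, calculation and reporting; the refactored version gives each responsibility its own small, cohesive function.

#include <stdio.h>

/* Low cohesion: one function mixes unrelated responsibilities
   (input parsing, business logic and report formatting). */
void process_everything(const char *record)
{
    int quantity, price;
    sscanf(record, "%d,%d", &quantity, &price);   /* parsing           */
    int total = quantity * price;                 /* business logic    */
    printf("TOTAL: %d\n", total);                 /* report formatting */
}

/* Higher cohesion: each function has a single, well-defined purpose,
   which makes the code easier to understand, test and change. */
void parse_record(const char *record, int *quantity, int *price)
{
    sscanf(record, "%d,%d", quantity, price);
}

int compute_total(int quantity, int price)
{
    return quantity * price;
}

void print_total(int total)
{
    printf("TOTAL: %d\n", total);
}

int main(void)
{
    int q, p;
    parse_record("3,50", &q, &p);
    print_total(compute_total(q, p));
    return 0;
}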
ISO 9126 is a quality model that outlines the factors that contribute to software quality. These factors are grouped into the following characteristics:
¾ Functionality - how well the software system caters for the client's needs
• Suitability
• Interoperability
• Compliance
• Security - how well access to programs and data is controlled
¾ Reliability - capability of the software to maintain its level of performance under stated conditions
• Maturity
• Recoverability
• Fault Tolerance
¾ Usability - how much effort is needed on the part of users to properly use the
software system
• Learnability
• Understandability
¾ Efficiency - the relationship between the level of performance of the software and the amount of resources it requires
• Time Behaviour
• Resource Behaviour
¾ Maintainability - how easy it is to modify and improve the software system
• Stability
• Analysability
• Changeability
• Testability
¾ Portability - how well the systems can be transported from one environment to
another
• Installability
• Conformance
• Replaceability
• Adaptability
According to McCall's quality model, there are three dimensions of a software product (as shown in figure 1.4) dealing with different quality factors. These are discussed
below:
¾ Product Operation: It is concerned with those aspects of the software while it is in operation. Its factors include:
• Correctness: The extent to which a program satisfies its specification.
• Reliability: Informally, software is reliable if the user can depend on it. The
over a specified time interval. For the purpose of this chapter, however, the
• Integrity:
it.
systems.
¾ Product Revision: This dimension is concerned with all those factors that relate to the modification of the operational program.
Product Revision: Maintainability, Flexibility, Testability
Product Transition: Portability, Reusability, Inter-operability
Product Operation: Reliability, Integrity, ...
Figure 1.4 Three dimensions of software quality (McCall)
1.3 Summary
number of problems such as cost overrun, schedule slippage, poor quality, etc. It
This lesson gives an overview of the quality attributes of software. And two
quality models were discussed: McCall’s quality model and ISO 9126. According
to McCall, there are three dimensions of a software product dealing with different quality factors.
1.4 Keywords
ISO 9126: The ISO 9126 is a quality model that outlines the factors that contribute to software quality.
3. Define Software quality and discuss the ISO 9126 model of software
quality.
model.
Pressman, McGraw-Hill.
2.1 Objectives
The objective of this lesson is to introduce the students with the concept of
software measurement. After studying this lesson they will be familiar with
different types of metrics such as Function Points (FP), Source Line Of Code
2.2 Introduction
development. Tom DeMarco stated, "You can't control what you can't measure" (Controlling Software Projects: Management, Measurement & Estimation, Yourdon Press, New York, p. 3). Ejiogu suggested that a metric
should possess the following characteristics. (1) Simple and computable: It should be easy to learn how to derive the metric, and its computation should not demand inordinate effort or time. (2) Empirically and intuitively persuasive: The metric should satisfy the engineer's intuitive notion about the product attribute under consideration, and should behave appropriately under various conditions. (3) Consistent and objective: The metric should always yield results that are unambiguous; a third party should be able to derive the same metric value using the same information. (4) Consistent in its use of units and dimensions: It uses only those measures that do not lead to bizarre combinations of units.
specific products and processes. Software metric domain can be partitioned into
process, project, and product metrics. Process metrics are used for software process improvement. Project metrics are used by the software project manager to adapt project work flows.
Common software metrics include:
• Cyclomatic complexity
• Code coverage
• Cohesion
• Coupling
2.3.2 Limitations
prediction of such prior to the detail design, is very difficult to satisfactorily define
or measure. The practical utility of software metrics has thus been limited to
have therefore focused more on process metrics which assist in monitoring and
2.3.3 Criticisms
small number of numerical variables and then judge him/her by that measure.
A supervisor may assign the most talented programmer to the hardest tasks
on a project; which means it may take the longest time to develop the task
and may generate the most defects due to the difficulty of the task.
If, for example, developers are rewarded on the basis of lines of code written, then employees will write as many separate lines of code as possible, and if
they find a way to shorten their code, they may not use it.
Lines of code measure exactly what is typed, but not the difficulty of the problem. Function points were developed to better measure the complexity of the code and the functionality delivered, but they require subjective assessment, so different estimators will produce different results. This makes function points hard to use as an objective measure.
Industry experience suggests that the design of metrics will encourage certain
kinds of behaviour from the people being measured. The common phrase
function points, the metric is wide open to gaming - that is, cheating.
One school of thought on metrics design suggests that metrics communicate the
real intention behind the goal, and that people should do exactly what the metric
are encouraged to write the code specifically to pass the test. If that's the wrong
code, then they wrote the wrong test. In the metrics design process, gaming is a
useful tool to test metrics and help make them more robust, as well as for helping
It should be noted that there are very few industry-standard metrics that stand up to serious scrutiny in isolation; the usual remedy is to use a suite of metrics that balance each other out. In software projects, it's advisable to have at least metrics covering:
¾ Schedule
¾ Risk
¾ Cost
¾ Quality
The Balanced scorecard is a useful tool for managing a suite of metrics that
When you can measure what you are speaking about, and express it in numbers,
you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind (Lord Kelvin).
Software plays an important role in our life. We want products which affect our
quality of software we must have some metrics to measure quality. The key point
here is that the quality of the same product may change over time; software is not an exception. Quality must be considered for both the process and the product. If the software engineer knows what he/she will do, then we can
estimates of software cost had relatively little impact. Today, software is the most
development.
of the software. One of the first software metrics to measure the size of the software as length is LOC (Lines of Code). The LOC measure is used to measure the length of a program. Another size measure is function points (FP), which reflect the user's view of a system's functionality and give size accordingly. Most complexity metrics are restricted to code; the best known are Halstead's software science measures and McCabe's cyclomatic complexity. Halstead's measures are derived from the numbers of operators and operands in a program:
¾ The vocabulary n, defined as n = n1 + n2, where n1 is the number of distinct operators and n2 is the number of distinct operands.
¾ The length N, defined as N = N1 + N2, where N1 is the total number of operator occurrences and N2 is the total number of operand occurrences.
Operators can be "+" and "*" but also an index "[...]" or a statement separator ";".
Length Equation: the estimated length is N^ = n1 log2 n1 + n2 log2 n2.
From these counts Halstead derived the following measures and formulas:
Program Volume: This metric is for the size of any implementation of any algorithm:
V = N log2 n
Program Level: The ratio of the potential volume V* (the volume of the algorithm's shortest possible form, in which neither operators nor operands require repetition) to the actual volume:
L = V* / V
An estimated Program Level, L^ = (2 / n1) x (n2 / N2), is used when the value of the Potential Volume is not known, because it is computed directly from measurable counts.
Intelligence Content: I = L^ x V
In this equation all terms on the right-hand side are directly measurable from any expression of the algorithm.
Programming Effort
Effort Equation:
E = V / L = V^2 / V*
The effort required to implement a program increases with its size, and the program volume V is a measure of it. Another aspect that influences the effort is the program difficulty, D = 1 / L.
A concept concerning the processing rate of the human brain, developed by the
psychologist John Stroud, can be used. Stroud defined a moment as the time
required by the human brain to perform the most elementary discrimination. The
Stroud number S is then the number of Stroud moments per second, with 5 <= S <= 20. Thus we can derive the time equation T = E / S, in which, except for the Stroud number S, all of the terms on the right-hand side are directly measurable.
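To make the Halstead formulas concrete, here is a small C sketch that computes them for an assumed set of token counts (n1 = 10 distinct operators, n2 = 15 distinct operands, N1 = 40 operator occurrences, N2 = 30 operand occurrences) and an assumed Stroud number S = 18; all of these values are invented for illustration only.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Assumed token counts for some program (illustrative only). */
    double n1 = 10.0, n2 = 15.0;   /* distinct operators / operands   */
    double N1 = 40.0, N2 = 30.0;   /* total operator / operand usages */
    double S  = 18.0;              /* assumed Stroud number           */

    double n     = n1 + n2;                        /* vocabulary            */
    double N     = N1 + N2;                        /* observed length       */
    double N_hat = n1 * log2(n1) + n2 * log2(n2);  /* estimated length      */
    double V     = N * log2(n);                    /* program volume        */
    double L_hat = (2.0 / n1) * (n2 / N2);         /* estimated level       */
    double I     = L_hat * V;                      /* intelligence content  */
    double E     = V / L_hat;                      /* effort, E = V / L     */
    double T     = E / S;                          /* time equation         */

    printf("n = %.0f  N = %.0f  N^ = %.1f\n", n, N, N_hat);
    printf("V = %.1f  L^ = %.3f  I = %.1f\n", V, L_hat, I);
    printf("E = %.1f elementary discriminations, T = %.1f seconds\n", E, T);
    return 0;
}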
Advantages of Halstead:
¾ Simple to calculate.
Drawbacks of Halstead
Cyclomatic complexity is computed using a graph that describes the control flow of the program. The nodes of the graph correspond to the commands of the program. A directed edge connects two nodes if the second command might be executed immediately after the first.
Definition
M=E−N+P
where
M = cyclomatic complexity
E = the number of edges of the graph
N = the number of nodes of the graph
P = the number of connected components.
"M" is alternatively defined to be one larger than the number of decision points (if statements and conditional loops) in the module.
Alternative definition
v(G) = e − n + 2
There is another simple way to determine the cyclomatic number. This is done by
counting the number of closed loops in the flow graph, and incrementing that
number by one.
i.e.
M = (number of closed loops) + 1
where
M = cyclomatic number.
¾ M is a lower bound for the number of possible paths through the control flow
graph.
¾ M is an upper bound for the number of test cases that are necessary to achieve complete branch coverage.
For example, consider a program that consists of two sequential if-then-else statements.
if (c1) { f1(); } else { f2(); }
if (c2) { f3(); } else { f4(); }
• To achieve complete branch coverage, two test cases are sufficient here.
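A worked calculation for this example may help (an illustrative sketch; the exact node and edge counts depend on how the flow graph is drawn). Writing the two statements as one small function, the flow graph has n = 7 nodes and e = 8 edges in a single connected component, so v(G) = e - n + 2 = 8 - 7 + 2 = 3; counting decision points gives 2 + 1 = 3, and counting closed loops (regions) in the graph gives 2 + 1 = 3 as well.

/* The two sequential if-then-else statements, viewed as one function.
   Flow graph: 7 nodes, 8 edges, one connected component.
   v(G) = 8 - 7 + 2 = 3   (also: 2 decision points + 1 = 3).
   Two test cases give complete branch coverage; full path coverage
   would need four (TRUE/TRUE, TRUE/FALSE, FALSE/TRUE, FALSE/FALSE). */
int example(int c1, int c2)
{
    int r;
    if (c1) { r = 1; } else { r = 2; }
    if (c2) { r += 10; } else { r += 20; }
    return r;
}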
Key Concept
The cyclomatic complexity of a section of source code is the count of the number
of linearly independent paths through the source code. For instance, if the source code contained no decision points such as IF statements or FOR loops, the complexity would be 1, since there is only a single path through the code. If the
code had a single IF statement there would be two paths through the code, one
path where the IF statement is evaluated as TRUE and one path where the IF statement is evaluated as FALSE. The complexity is normally calculated by creating a graph of the source code, with each line of source code being a node on the graph and arrows between nodes showing the possible execution pathways. As some programming languages can be quite terse and compact, a single source code statement may actually create several nodes in the graph (for instance, when using the C and C++ language "?" conditional operator, also known as the ternary operator, within a statement).
In general, in order to fully test a module, all execution paths through the module should be exercised. This implies that a module with a higher complexity number requires more testing effort than a module with a lower value, since the higher
complexity number indicates more pathways through the code. This also implies
One would also expect that a module with higher complexity would tend to have
lower cohesion (less than functional cohesion) than a module with lower complexity, since a module with higher complexity and lower cohesion is generally implementing more than a single well-defined function. However there
are certain types of modules that one would expect to have a high complexity
number, such as user interface (UI) modules containing source code for data validation and error recovery. The results of multiple experiments (G. A. Miller) suggest that modules approach zero defects when their cyclomatic complexity stays within a limit of about 7 ± 2.
Advantages of cyclomatic complexity:
¾ Measures the minimum effort and best areas of concentration for testing.
development.
¾ Is easy to apply.
Drawbacks of cyclomatic complexity:
¾ The same weight is placed on nested and non-nested loops; however, deeply nested conditional structures are harder to understand than non-nested structures.
¾ It may give a misleading figure with regard to a lot of simple comparisons and decision structures.
Henry and Kafura (1981) identified a form of the fan in - fan out complexity, which
maintains a count of the number of data flows from a component plus the number
of global data structures that the program updates. The data flow count includes
Henry and Kafura validated their metric using the UNIX system and suggested that the measured complexity of a component allowed potentially faulty system components to be identified. They found that high values of this metric were often associated with components where there had been design or implementation problems, and with components that have a large number of external interactions.
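As a sketch of how such an information-flow measure is applied in practice, the snippet below uses the commonly cited Henry-Kafura formulation complexity = length x (fan-in x fan-out)^2; both this formula and the component values plugged into it are assumptions for illustration rather than something stated in the text above.

#include <stdio.h>

int main(void)
{
    /* Invented values for a single component. */
    int length  = 150;  /* lines of code in the component                */
    int fan_in  = 4;    /* number of data flows into the component       */
    int fan_out = 3;    /* data flows out, including global data updates */

    long flow       = (long)fan_in * fan_out;
    long complexity = (long)length * flow * flow;  /* length * (fan_in * fan_out)^2 */

    printf("information-flow complexity = %ld\n", complexity);
    return 0;
}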
The basis of the SLOC measure is that program length can be used as a predictor of program characteristics such as effort and ease of maintenance. The LOC measure is used to measure the size of the software. Source lines of code (SLOC) is a software metric used to measure the amount of code in a software program. SLOC is typically used to estimate the amount of effort that will be required to develop a program. Typical counting conventions are:
¾ Only Source lines that are DELIVERED as part of the product are included --
¾ SOURCE lines are created by the project staff -- code created by applications
generators is excluded
Advantages of LOC
¾ Simple to measure
Drawbacks of LOC
¾ It is language dependent
Because of the criticisms above there have been extensive efforts to characterize
Measuring SLOC
Many useful comparisons involve only the order of magnitude of lines of code in
a project. Software projects can vary between 100 to 100,000,000 lines of code.
Using lines of code to compare a 10,000 line project to a 100,000 line project is
far more useful than comparing a 20,000 line project with a 21,000 line project, since SLOC is not precise enough to differentiate between projects of the same order of magnitude.
There are two major types of SLOC measures: physical SLOC and logical SLOC.
Specific definitions of these two measures vary, but the most common definition
of physical SLOC is a count of lines in the text of the program's source code
including comment lines. Blank lines are also included unless a section consists largely of blank lines. Logical SLOC measures attempt to count the number of "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like languages is the number of statement-terminating semicolons). It is much easier to create tools that measure physical SLOC, and physical SLOC definitions are easier to explain. However, physical SLOC measures are sensitive to logically irrelevant formatting and style
conventions. Unfortunately, SLOC measures are often stated without giving their
definition, and logical SLOC can often be significantly different from physical
SLOC.
for (i = 0; i < 100; ++i) printf("hello"); /* How many lines of code is this? */
In this example we have:
¾ 1 Physical Line of Code (LOC)
¾ 1 Logical Line of Code (LLOC)
¾ 1 Comment Line
Depending on the programmer and/or coding standards, the above "line of code" could be written on many separate lines:
for (i = 0; i < 100; ++i)
{
    printf("hello");
} /* How many lines of code is this? */
In this example we have:
¾ 4 Physical Lines of Code (LOC): is placing braces work to be estimated?
¾ 2 Logical Lines of Code (LLOC): what about all the work writing non-statement lines?
¾ 1 Comment Line (tools must account for all code and comments regardless of comment placement.)
Even the "logical" and "physical" SLOC values can have a large number of varying definitions, so anyone reporting SLOC should carefully explain and define the SLOC measure used in a project. For example, most software systems reuse code, and determining which (if any) reused code to include is important when reporting a measure.
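Counting physical SLOC is easy to automate. The following minimal C utility is a sketch of such a counter: it counts lines that contain at least one non-whitespace character, and deliberately leaves comment handling out, since whether comments count is exactly the kind of definitional choice discussed above.

#include <ctype.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <source-file>\n", argv[0]);
        return 1;
    }
    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    long sloc = 0;          /* non-blank physical lines                 */
    int ch, blank = 1;      /* does the current line hold only spaces?  */
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            if (!blank) sloc++;
            blank = 1;
        } else if (!isspace(ch)) {
            blank = 0;
        }
    }
    if (!blank) sloc++;     /* count a last line with no trailing newline */

    printf("%ld physical SLOC (non-blank lines)\n", sloc);
    fclose(fp);
    return 0;
}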
Origins of SLOC
At the time that people began using SLOC as a metric, the most commonly used languages, such as FORTRAN and assembly language, were line-oriented languages.
These languages were developed at the time when punch cards were the main
form of data entry for programming. One punch card usually represented one line
of code. It was one discrete object that was easily counted. It was the visible
SLOC measures are somewhat controversial, particularly in the way that they are sometimes misused. Experiments have repeatedly confirmed that effort is highly correlated with SLOC, that is, programs with larger SLOC values take more time to develop; thus SLOC can be an effective measure of effort. However, functionality is less well correlated with SLOC: skilled developers may be able to
develop the same functionality with far less code, so one program with less
SLOC may exhibit more functionality than another similar program. In particular, a skilled developer may develop only a few lines and yet be far more productive in terms of functionality
than a developer who ends up creating more lines (and generally spending more
effort). Good developers may merge multiple code modules into a single module,
improving the system yet appearing to have negative productivity because they
remove code. Also, especially skilled developers tend to be assigned the most
difficult tasks, and thus may sometimes appear less "productive" than other developers on simpler tasks. Furthermore, languages differ greatly in verbosity: a COBOL program may require many lines of code to perform the same task as a few characters in APL. The following two "Hello World" programs illustrate the difference between a terse language and a more verbose one.
Program in C
#include <stdio.h>
int main(void) {
    printf("Hello World");
    return 0;
}
Program in COBOL
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. HELLOWORLD.
000300
000400*
000900
100000 PROCEDURE DIVISION.
100100
100200 MAIN-LOGIC SECTION.
100300 BEGIN.
100400     DISPLAY "Hello World".
100500     STOP RUN.
100700 MAIN-LOGIC-EXIT.
100800     EXIT.
tools often have the capability to auto-generate enormous amounts of code with
a few clicks of a mouse. For instance, GUI builders automatically generate all the
source code for a GUI object simply by dragging an icon onto a workspace. The
work involved in creating this code cannot reasonably be compared to the work
There are several cost, schedule, and effort estimation models which use SLOC as an input parameter, including the widely used COCOMO series of models. While these models have shown good predictive power, they are only as good as the estimates (particularly the SLOC estimates) fed to them. Many have recommended the use of function points instead of SLOC to measure functionality, but since function points are highly correlated to SLOC (and cannot be automatically measured), this view is not universally held.
As an indication of typical sizes, Windows XP (2002) has been estimated at about 40 million SLOC.
David A. Wheeler studied the Red Hat distribution of the GNU/Linux operating
system, and reported that Red Hat Linux version 7.1 (released April 2001)
contained over 30 million physical SLOC. He also determined that, had it been developed by conventional proprietary means, it would have required about 8,000 person-years of development effort and would have cost over $1 billion (in year 2000 U.S. dollars).
A similar study was later made of Debian GNU/Linux version 2.2 (also known as
"Potato"); this version of GNU/Linux was originally released in August 2000. This
study found that Debian GNU/Linux 2.2 included over 55 million SLOC, and if
that the following release of Debian had 104 million SLOC, and as of year 2005,
System                SLOC (millions)
Debian 2.2            56
Mac OS X 10.4         86
Blender 2.42          ~1
Gimp-2.3.8            0.65
code in a program and the number of bugs that it contains. This relationship is
not simple, since the number of errors per line of code varies greatly according to
but it does appear to exist. More importantly, the number of bugs in a program
has been directly related to the number of security faults that are likely to be found in it. This has had a number of important implications for system security and these
can be seen reflected in operating system design. Firstly, more complex systems
are likely to be more insecure simply due to the greater number of lines of code
needed to develop them. For this reason, security focused systems such as
OpenBSD grow much more slowly than other systems such as Windows and
Linux. A second idea, taken up in both OpenBSD and many Linux variants, is
that separating code into different sections which run with different security
environments (with or without special privileges, for example) ensures that the
Advantages
process. Small utilities may be developed for counting the LOC in a program.
used for other languages due to the syntactical and structural differences
among languages.
the size of software due to the fact that it can be seen and the effect of it can
Disadvantages
using only results from the coding phase, which usually accounts for only
confirmed that effort is highly correlated with LOC, functionality is less well
correlated with LOC. That is, skilled developers may be able to develop the
same functionality with far less code, so one program with less LOC may
a few lines and still be more productive than a developer creating more lines
of code.
point (a), estimates done based on lines of code can adversely go wrong, in
all possibility.
in C++ and the other application written in a language like COBOL. The number
of function points would be exactly the same, but aspects of the application
would be different. The lines of code needed to develop the application would
develop the application would be different (hours per function point). Unlike
few mouse clicks, where the programmer virtually writes no piece of code,
most of the time. It is not possible to account for the code that is automatically
and other metrics with respect to different languages, making comparisons based on lines of code difficult. There is no standard definition of what a line of code is. Do comments count? Are data declarations included? What happens
if a statement extends over several lines? – These are the questions that
often arise. Though organizations like SEI and IEEE have published some
Function points are basic data from which productivity metrics can be computed.
¾ as baseline metrics collected from past projects and used in conjunction with
value, which varies from 3 (for simple external inputs) to 15 (for complex internal
files).
The sum of all the occurrences is computed by multiplying each function count
with a weighting and then adding up all the values. The weights are based on the
complexity of the feature being counted. Albrecht’s original method classified the
weightings as:
External Input      x3   x4   x6
External Output     x4   x5   x7
External Inquiry    x3   x4   x6
Low, average and high decision can be determined with this table:
There are 14 technical complexity factors, each rated on a scale of 0 to 5. They include:
1. Data communications
2. Performance
4. Transaction rate
7. Online update
8. Complex processing
9. Reusability
Function points have recently been used also for real-time systems.
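As a hypothetical illustration of the weighted count: a system with 10 average external inputs, 5 simple external outputs and 8 average external inquiries would contribute 10 x 4 + 5 x 4 + 8 x 4 = 92 unadjusted function points from these three categories; counts for the file categories would be added in the same way before the 14 technical complexity factors are applied.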
Advantages of FP
¾ Language independent
specification.
Drawbacks of FP
¾ Subjective counting
¾ Effort prediction using the unadjusted function count is often no worse than
Organizations such as the International Function Point Users Group IFPUG have
been active in identifying rules for function point counting to ensure that counts
At IFPUG's Message Board Home Page you can find solutions about practical
use of FP.
If sufficient data exists from previous programs, function points can reasonably
2.4 Summary
Software metrics have been partitioned into process metric, project metric and
product metric. These metrics help the managers in improving the software
processes; assist in the planning, tracking and control of a software project; and
Code (LOC) is a measure of size; it is very easy to compute but has limited usefulness on its own. The Halstead
measure computes the complexity of the software on the basis of operators and
operands present in the program. Function Point (FP) is also a measure of size
that can be used in the early stage of the software development. It reflects the
size on the basis of input, outputs, external interfaces, queries, and number of
files required. Metrics are the tools that help in better monitoring and control.
Cyclomatic Complexity: The cyclomatic complexity of a section of source code is the count of the number of linearly independent paths through the source code.
Function Point: Function points are basic data from which productivity metrics can be computed.
Lines of Code (LOC): It is a software metric that measures the size of software by counting the lines of program source text.
complexity? Write a program for bubble sort and compute its cyclomatic
complexity.
disadvantages? Explain.
Pressman, McGraw-Hill.
3.0 Objectives
software life cycle models. After studying this lesson, they will:
3.1 Introduction
A software process is a set of activities and associated results which lead to the production of a software product. Although there are many different software processes, there are fundamental activities which are common to all software processes.
These are:
specification is produced.
needs.
Lack of planning is the primary cause of schedule slippage, cost overrun, poor quality, and high maintenance costs. Careful planning is required to avoid such problems. The steps required to plan a software project are given below:
consideration i.e. defining a life cycle model. The software life cycle includes all
the activities required to define, develop, test, deliver, operate, and maintain a
integration, and maintenance. The origin of the term "waterfall" is often cited to be a 1970 article by Winston W. Royce, although Royce himself advocated an iterative approach to software development and did not even use the term "waterfall". Royce originally described what is now known as the
waterfall model as an example of a method that he argued "is risky and invites
failure".
In 1970 Royce proposed what is now popularly referred to as the waterfall model
as an initial concept, a model which he argued was flawed. His paper then
explored how the initial model could be developed into an iterative model, with
feedback from each phase influencing previous phases, similar to many methods
used widely and highly regarded by many today. Ironically, it is only the initial
model that received notice; his own criticism of this initial model has been largely
ignored. The "waterfall model" quickly came to refer not to Royce's final, iterative design, but rather to the purely sequentially ordered model he criticized.
Despite Royce's intentions for the waterfall model to be modified into an iterative
model, use of the "waterfall model" as a purely sequential process is still popular,
and, for some, the phrase "waterfall model" has since come to refer to any
(Figure: the waterfall model, with phases such as Design and Implementation cascading downward)
In the unmodified waterfall model, the following phases are followed perfectly in order:
1. Requirements specification
2. Design
3. Construction (implementation or coding)
4. Integration
5. Testing and debugging (validation)
6. Installation
7. Maintenance
To follow the waterfall model, one proceeds from one phase to the next in a purely sequential manner. For example, one first completes the "requirements specification" — the requirements of the software are set in stone. When and only when the requirements are fully completed, one proceeds to design: the software in question is designed and a "blueprint" is drawn for implementers to follow, a plan for implementing the requirements given. When and only when the design is fully completed, an implementation of that design is made by coders. Towards the later stages of this implementation phase, disparate software components produced by different teams are integrated. After the implementation and integration phases are
complete, the software product is tested and debugged; any faults introduced in
earlier phases are removed here. Then the software product is installed, and later maintained to introduce new functionality and remove bugs.
Thus the waterfall model maintains that one should move to a phase only when its preceding phase is completed and perfected. Phases of development in the waterfall model are thus discrete, and there is no jumping back and forth or overlapping between them. However, there are various modified waterfall models that may include slight or major variations on this process.
Time spent early on in software production can lead to greater economy later on
in the software lifecycle; that is, it has been shown many times that a bug found in the early stages of the production lifecycle (such as requirements specification or design) is more economical (cheaper in terms of money, effort and time) to fix
than the same bug found later on in the process. (it is said that "a requirements
defect that is left undetected until construction or maintenance will cost 50 to 200
times as much to fix as it would have cost to fix at requirements time.") This is why, to take an extreme example, if a program design turns out to be impossible to implement, it is easier to fix the design at the design stage than to realize, months down the track when program components are being integrated, that all the work done so far has to be scrapped because of a broken design. This is the central idea behind Big Design Up Front and the waterfall model - time spent early on making sure that requirements and design are
absolutely correct is very useful in economic terms (it will save you much time
and effort later). Thus, the thinking of those who follow the waterfall process
goes, one should make sure that each phase is 100% complete and absolutely correct before proceeding to the next phase. Program requirements should be set in stone before design is started (otherwise work put into a design based on incorrect requirements is wasted); the program's design should be perfect before people begin implementing it (otherwise they are implementing the "wrong" design and their work is wasted),
etc.
"partial deliverable" should a project not run far enough to produce any
date). An argument against agile development methods, and thus partly in favour
mentally by team members. Should team members leave, this knowledge is lost,
and substantial loss of project knowledge may be difficult for a project to recover
from. Should a fully working design document be present (as is the intent of Big
Design Up Front and the waterfall model), new team members or even entirely new teams should be able to familiarize themselves with the project by reading the documents themselves. With that said, agile methods do attempt to
compensate for this. For example, extreme programming (XP) advises that
familiarize all members with all sections of the project (allowing individual
As well as the above, some prefer the waterfall model for its simple and arguably
more disciplined approach. Rather than what the waterfall adherent sees as
"chaos" the waterfall model provides a structured approach; the model itself
and Big Design Up Front in general can be suited to software projects which are
"shrink wrap" software) and where it is possible and likely that designers will be
able to fully predict problem areas of the system and produce a correct design
implementers follow the well made, complete design accurately, ensuring that the
NASA and upon many large government projects. Those who use such methods
do not always formally distinguish between the "pure" waterfall model and the
Steve McConnell sees the two big advantages of the pure waterfall model as
producing a "highly reliable system" and one with a "large growth envelope", but
rates it as poor on all other fronts. On the other hand, he views any of several
progress visibility", and rating as "fair" on "manage risks", being able to "be
"provide customer with progress visibility". The only criterion on which he rates a
developers.
mainly because of their belief that it is impossible to get one phase of a software
product's lifecycle "perfected" before moving on to the next phases and learning
from them (or at least, the belief that this is impossible for any non-trivial
program). For example clients may not be aware of exactly what requirements
they want before they see a working prototype and can comment upon it - they
implementers may have little control over this. If clients change their
overly large amounts of time have been invested into "Big Design Up Front".
(Thus methods opposed to the naive waterfall model, such as those used in Agile
product. That is, it may become clear in the implementation phase that a particular area of program functionality is extraordinarily difficult to implement. If this is the case, it is better to revise the design than to persist in using a design that was made based on faulty predictions and which does not account for the newly discovered problem areas.
Steve McConnell in Code Complete (a book which criticizes the widespread use of the waterfall model) refers to design as a "wicked problem". To quote from David Parnas' "A Rational Design Process and How to Fake It": "Many of the details only become known to us as we progress in the [system's] implementation. Some of the things that we learn invalidate our design and we must backtrack."
The idea behind the waterfall model may be "measure twice; cut once", and
those opposed to the waterfall model argue that this idea tends to fall apart when the problem being solved is constantly changing due to requirement modifications and new realizations about the problem itself. The idea behind
those who object to the waterfall model may be "time spent in reconnaissance is
seldom wasted".
Management Implications
The problems that changing requirements introduce into the software life
cycle are reflected in schedule slippages and cost overruns. One argument is
that more time spent upstream in the software life cycle results in less turmoil
downstream in the life cycle. The more time argument is typically false when
Product Implications
expected, but there would be the environment for control and discipline with
the changes.
¾ Unless those who specify requirements and those who design the software system in question are highly competent, it is difficult to know exactly what is needed in each phase of the software process before some time is spent in the phase "following" it. That is, feedback from following phases is needed to complete "preceding" phases adequately. For example, the design phase may need feedback from the implementation phase to identify problem design areas. The counter-argument for the waterfall model is that experienced designers may have worked on similar systems before, and so may be able to accurately predict problem areas without needing to spend time prototyping and implementing.
and inform the design process; constant integration and verification of the
track. The counter-argument for the waterfall model here is that constant
waterfall model may argue that if designers follow a disciplined process and
¾ It is difficult to estimate time and cost for each phase of the development
process without doing some "recon" work in that phase, unless those
estimating time and cost are highly experienced with the type of software
product in question.
control over a project and planning control and risk management are not
¾ Only a certain number of team members will be qualified for each phase; thus
to have "code monkeys" who are only useful for implementation work do
In response to the perceived problems with the "pure" waterfall model, many
modified waterfall models have been introduced. These models may address
some or all of the criticisms of the "pure" waterfall model. While all software
development models will bear at least some similarity to the waterfall model, as
all software development models will incorporate at least some phases similar to
the waterfall model. For models which apply further differences to the waterfall
model, or for radically different models seek general information on the software
development process.
Royce's final model, his intended improvement upon his initial "waterfall model",
illustrated that feedback could (should, and often would) lead from code testing to
design (as testing of code uncovered flaws in the design) and from design back to requirements specification. In the same paper Royce also advocated large quantities of documentation, doing the job "twice if possible", and involving the customer as much as possible - now a hallmark of Extreme Programming.
The sashimi model (so called because it features overlapping phases, like the overlapping fish of Japanese sashimi) is sometimes referred to as "the waterfall model with feedback". Since phases in the sashimi model overlap,
information of problem spots can be acted upon during phases of the waterfall
model that would typically "precede" others in the pure waterfall model. For
example, since the design and implementation phases overlap in the sashimi model, implementation problems may be discovered during design, which helps alleviate many of the problems associated with the Big Design Up Front philosophy of the waterfall
model.
The conventional waterfall software life cycle model (or software process) is used
Software life cycle phase names differ from organization to organization. The
¾ specification,
¾ design,
¾ coding,
¾ testing, and
¾ Maintenance.
address the problems that are associated with the waterfall model. One
alternative software life cycle model uses prototyping as a means for providing
The waterfall model allows for a changing set of means for representing an
evolving software system. These documents then provide a basis for introducing
errors during the software life cycle. The user often begins to receive information
concerning the actual execution of the system after the system is developed.
Requirement analysis with systematic review helps to reduce the uncertainty about what the system should do. However, there is no substitute for trying out a requirement before agreeing to it: a prototype lets end users experiment with software that supports their work. In this process they can get new ideas and find strengths and weaknesses in the software. Prototyping offers benefits such as:
¾ Requirement validation: The prototype may reveal errors and omissions in the
throughout the conventional, waterfall software life cycle model. Nowadays many
and analysis.
demonstrating a portion of the system to the end user for feedback and system
growth. The prototype emerges as the actual system moves downstream in the
life cycle. With each iteration in development, functionality is added and then
evolutionary prototyping.
Learning Curve
training for the use of a prototyping technique, there is an often overlooked need
for developing corporate and project-specific underlying structure to support the technology. When this underlying structure is omitted, lower productivity can often result.
languages can have execution inefficiencies with the associated tools. The
Applicability
a process control system. The control room user interface could be described,
but not integrated with sensor monitoring deadlines under this approach.
This new approach of providing feedback early to the end user has resulted in a
problem related to the behavior of the end user and developers. An end user with
developed rapidly by using shortcuts and gives to the user to use. So that he will
start.
The end user cannot throw the software needs (stated in natural language) over
the transom and expect the development team to return the finished software
One of the major problems with incorporating this technology is the large
not feasible. There is, however, a threshold that exists where the expected life
span of a software system justifies that the system would be better maintained
prototyping technology, but the range of support for the concept varies widely.
proceeds. Similarly, during development of the actual system or even later out
The iterative enhancement model counters the third limitation of the waterfall
model and tries to combine the benefits of both prototyping and the waterfall
model. The basic idea is that the software should be developed in increments,
each increment adding some functional capability to the system until the full system is implemented. An advantage of this approach is that it can result in better testing, because testing each increment is likely to be easier than testing the entire system. Furthermore, as in prototyping, the increments provide feedback to the client that is useful for determining the final requirements
of the system.
In the first step of this model, a simple initial implementation is done for a subset
of the overall problem. This subset is one that contains some of the key aspects of the problem that are easy to understand and implement and which form a useful and usable system.
all the tasks that must be performed to obtain the final implementation. This
project control list gives an idea of how far the project is at any given step from the final system.
Each step consists of removing the next task from the list, designing the
implementation for the selected task, coding and testing the implementation,
performing an analysis of the partial system obtained after this step, and
updating the list as a result of the analysis. These three phases are called the design phase, implementation phase, and analysis phase. The process is iterated until the project control list is empty, at which time the final implementation of the system is available.
(Figure: the iterative enhancement model - a sequence of iterations, each consisting of a design step, an implementation step (Implement0, Implement1, ..., Implementn) and an analysis step (Analysis0, Analysis1, ..., Analysisn))
The project control list guides the iteration steps and keeps track of all the tasks
that must be done. Based on the analysis, one of the tasks in the list can include redesign of defective components or redesign of the entire system. However, such redesign of the system will generally occur only in the initial steps. In the later
steps, the design would have stabilized and there is less chance of redesign.
Each entry in the list is a task that should be performed in one step of the
reduce the redesign work. The design and implementation phases of each step
One effective use of this type of model is product development, in which the developers themselves provide the specifications and therefore have a lot of control over which specifications go in the system and which stay out. In fact, most products undergo this type of development process: first a version is released that contains some capability. Based on the feedback from users and experience
with this version, a list of additional features and capabilities is generated. These
features form the basis of enhancement of the software, and are included in the
next version. In other words, the first version contains some core capability and successive versions add new features to it. In projects where the client has to essentially provide and approve the specifications, it is not always clear how this
process can be applied. Another practical problem with this type of development
project comes in generating the business contract-how will the cost of additional
organization is likely to be tied to the original vendor who developed the first
version. Overall, in these types of projects, this process model can be useful if
this process has the major advantage that the client's organization does not have
to pay for the entire software together; it can get the main part of the software
This model was originally proposed by Boehm (1988). As is clear from the name, the activities in this model can be organized like a spiral that has many cycles. The radial dimension represents the cumulative cost incurred in accomplishing the steps done so far, and the angular dimension represents the
progress made in completing each cycle of the spiral. The model is shown in
Figure 3.3.
¾ Objective setting: Each cycle in the spiral begins with the identification of
objectives for that cycle, the different alternatives that are possible for
achieving the objectives, and the constraints that exist. This is the first quadrant of the cycle.
¾ Risk assessment and reduction: The next step in the cycle is to evaluate the different alternatives based on the objectives and constraints. The focus of evaluation in this step is based on the risk perception for the project.
Risks reflect the chances that some of the objectives of the project may not
be met.
Steps are then taken to resolve the uncertainties and risks. This step may involve activities such as benchmarking, simulation, and prototyping.
¾ Planning: Next, the software is developed, keeping in mind the risks. Finally,
the next stage is planned. The project is reviewed and a decision made whether to continue with a further cycle of the spiral. If it is decided to continue, plans are drawn up for the next phase of the project.
that involves developing a more detailed prototype for resolving the risks. On the
other hand, if the program development risks dominate and the previous
The risk-driven nature of the spiral model allows it to accommodate any mixture
type of approach. An important feature of the model is that each cycle of the
spiral is completed by a review that covers all the products developed during that
cycle, including plans for the next cycle. The spiral model works for development
In a typical application of the spiral model, one might start with an extra round
zero, in which the feasibility of the basic project objectives is studied. The objectives at this stage are typically very high-level, such as whether the organization should go for a new product line at all. In round one, a concept of operation might be developed. The objectives are stated more
precisely and quantitatively and the cost and other constraints are defined
precisely. The risks here are typically whether or not the goals can be met within
the constraints. The plan for the next phase will be developed, which will involve
defining separate activities for the project. In round two, the top-level requirements are developed. In succeeding rounds the actual development may be done.
management and planning activities into the model. For high-risk projects, this
3.3 Summary
There are a number of models with their merits and demerits, such as the waterfall model,
prototyping, iterative enhancement, and spiral model. The central idea behind
waterfall model - time spent early on making sure that requirements and design
are absolutely correct, is very useful in economic terms. A further argument for the waterfall model is that it places emphasis on documentation as well as source code, which helps new team members to bring themselves "up to speed" by reading the documents. Some also prefer the waterfall model for its simple and arguably more disciplined approach. The waterfall model, however,
is argued by many to be a bad idea in practice, mainly because of the belief that it is impossible to get one phase of a software product's lifecycle perfected
before moving on to the next phases and learning from them. Clients may not be
aware of exactly what requirements they want before they see a working
prototype and can comment upon it - they may change their requirements
constantly, and program designers and implementers may have little control over
this. To sort out the limitations of water fall model, other models are proposed
such as prototyping that helps the user in identifying their requirements better.
3.4 Keywords
Iterative enhancement: In this model the software is developed in increments, each increment adding some functional capability to the system until the full
system is implemented.
Spiral model: This model was proposed by Boehm and in it the activities can be organized like a spiral that has many cycles.
What is the difference between an evolutionary prototype and a throw-away prototype?
McGraw-Hill.
4.0 Objectives
project. The objective of this lesson is to make the students familiar with the
factors affecting the cost of the software, different versions of COCOMO and the
4.1 Introduction
Software cost estimation is the process of predicting the amount of effort required
to build a software system. Software cost estimation is one of the most difficult
and error-prone tasks in software engineering. Cost estimates are needed throughout the software lifecycle: preliminary estimates are required to determine the feasibility of a project, and detailed estimates are needed to assist with project planning. The actual effort for individual tasks is compared with estimated and planned values, enabling project managers to reallocate resources when necessary.
Analysis of historical project data indicates that cost trends can be correlated with measurable parameters. This observation has resulted in a wide range of models that can be used to assess, predict, and control software costs on a real-time basis. Models provide one or more mathematical algorithms that compute cost as a function of a number of variables.
There are a number of factors affecting the cost of the software. The major one is product complexity: it is generally acknowledged that utility programs are three times as difficult to write as application programs, and that system programs are three times as difficult to write as utility software.
time.
Reliability is the probability that a program will perform a required function under stated conditions for a stated period of time. Greater effort and cost are required to build software with higher required reliability.
Size is a primary cost factor in most models. There are two common ways to
Lines of Code
The most commonly used measure of source code program length is the number of lines of code (LOC).
Function Points
The first step in function point analysis is to compute the unadjusted function point count (UFC). Counts are made for the following categories:
¾ External outputs – those items provided to the user that generate distinct
Once this data has been collected, a complexity rating is associated with each
Weighting Factor
Item Simple Average Complex
External inputs 3 4 6
External outputs 4 5 7
External inquiries 3 4 6
External files 7 10 15
Internal files 5 7 10
Table 4.1 Function point complexity weights.
Each count is multiplied by its corresponding complexity weight and the results
are summed to provide the UFC. The adjusted function point count (FP) is calculated by multiplying the UFC by a technical complexity factor (TCF). Each of the 14 technical factors Fi is rated on a scale of 0 to 5, where 0 means the component has no influence on the system and 5 means the component is essential. The TCF is then calculated as TCF = 0.65 + 0.01 x (sum of the Fi ratings). The factor varies from 0.65 (if each Fi is set to 0) to 1.35 (if each Fi is set to 5).
FP = UFC x TCF
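The adjustment step can be sketched in a few lines of C; the UFC value and the 14 factor ratings below are invented purely for illustration.

#include <stdio.h>

int main(void)
{
    double ufc = 120.0;                               /* assumed unadjusted count */
    int fi[14] = { 3, 4, 2, 5, 3, 0, 4, 1, 2, 3, 5, 2, 1, 4 };  /* each rated 0..5 */

    int sum = 0;
    for (int i = 0; i < 14; i++)
        sum += fi[i];

    double tcf = 0.65 + 0.01 * sum;   /* ranges from 0.65 to 1.35 */
    double fp  = ufc * tcf;           /* FP = UFC x TCF           */

    printf("sum(Fi) = %d, TCF = %.2f, FP = %.1f\n", sum, tcf, fp);
    return 0;
}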
There are two types of models that have been used to estimate cost: cost models
Cost Models
Cost models provide direct estimates of effort. These models typically have a
primary cost factor such as size and a number of secondary adjustment factors
or cost drivers. Cost drivers are characteristics of the project, process, products,
or resources that influence effort. Cost drivers are used to adjust the preliminary
estimate provided by the primary cost factor. A typical cost model is derived
using regression analysis on data collected from past software projects. Effort is
plotted against the primary cost factor for a series of projects. The line of best fit
is then calculated among the data points. If the primary cost factor were a perfect
predictor of effort, then every point on the graph would lie on the line of best fit. In reality, however, there is usually significant deviation, so it is necessary to identify the factors that cause variation between predicted and
actual effort. These parameters are added to the model as cost drivers.
E = A + B x S^C
where A, B, and C are empirically derived constants, E is effort in person-
months, and S is the primary input (typically either LOC or FP). The following are
examples of cost models of this general form.
Constraint Models
Constraint models demonstrate the relationship over time between two or more
parameters of effort, duration, or staffing level. The RCA PRICE S model and
Most of the work in the cost estimation field has focused on algorithmic cost
modeling. In this process costs are analyzed using mathematical formulas linking
costs or inputs with metrics to produce an estimated output. The formulae used
in a formal model arise from the analysis of historical data. The accuracy of the
model can be judged by comparing its predictions with actual values. Calibration of
the model can improve these figures; however, models still tend to perform poorly
outside the environment from which they were derived.
Putnam's SLIM is one of the first algorithmic cost models. It is based on the
Norden/Rayleigh function and is generally known as a macro-estimation model (it
is intended for large projects). SLIM enables a software cost estimator to perform the
following functions:
following functions:
projects.
¾ Software sizing – SLIM uses an automated version of the lines of code (LOC)
costing technique. The sizing relationship can be rearranged to give the effort:
K = (LOC / (C x t^(4/3)))^3
where K is the total life-cycle effort in working (person) years, t is the development
time in years, and C is the technology constant, which reflects the development
environment.
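A minimal C sketch of this sizing equation follows; the values chosen for LOC, the
technology constant C, and the development time t are assumptions for illustration
only.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double loc = 50000.0;  /* estimated size in lines of code (assumed)          */
        double c   = 5000.0;   /* technology constant (assumed, environment-specific)*/
        double t   = 2.0;      /* development time in years (assumed)                */

        /* K = (LOC / (C * t^(4/3)))^3 -- total life-cycle effort in person-years */
        double k = pow(loc / (c * pow(t, 4.0 / 3.0)), 3.0);

        printf("Estimated life-cycle effort K = %.1f person-years\n", k);
        return 0;
    }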
Advantages of SLIM
Drawbacks of SLIM
4.2.5 COCOMO’81
Boehm's COCOMO model is one of the most widely used cost estimation models.
The first version of the model was delivered in 1981; COCOMO II, a revised
version, became available later (around 2002). The model was derived from data on
projects whose programs ranged in size from 2,000 to 100,000 lines of code, and it
exists in three forms: basic, intermediate, and detailed.
The basic model computes effort as a function of program size alone. The
intermediate model computes effort as a function of program size and a set of
"cost drivers" that include subjective assessments of product, hardware, personnel,
and project attributes. The detailed model incorporates all characteristics of the
intermediate version with an assessment of the cost driver's impact on each step (analysis,
design, etc.) of the software project.
COCOMO'81 also defines three development modes. The organic mode covers
relatively small, simple software projects in which small teams with good
application experience work to a set of less than rigid requirements. The semi-
detached mode covers intermediate software projects in which teams with mixed
experience levels must meet a mix of rigid and less than rigid requirements. The
embedded mode covers projects that must be developed within tight hardware,
software, and operational constraints.
The basic COCOMO equations take the form:
E = a_b (KLOC)^b_b
D = c_b (E)^d_b
P = E / D
where E is the effort applied in person-months, D is the development time in
chronological months, KLOC is the estimated number of delivered lines of code
for the project (expressed in thousands), and P is the number of people required.
The coefficients a_b, b_b, c_b and d_b are given in the table 4.6.
Software project    a_b    b_b    c_b    d_b
Basic COCOMO is good for quick, early, rough order-of-magnitude estimates of
software costs, but its accuracy is necessarily limited because of its lack of
factors to account for differences in hardware constraints, personnel quality and
experience, use of modern tools and techniques, and other project attributes
known to have a significant influence on cost.
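As an illustration, the C sketch below evaluates the basic COCOMO equations for an
assumed organic-mode project of 32 KLOC; since the body of Table 4.6 is not
reproduced above, the coefficient values used here are the commonly published ones
and should be treated as assumptions.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Commonly published basic COCOMO coefficients for an organic-mode
           project (assumed here): a = 2.4, b = 1.05, c = 2.5, d = 0.38      */
        double a = 2.4, b = 1.05, c = 2.5, d = 0.38;
        double kloc = 32.0;                     /* size in KLOC (assumed)     */

        double effort = a * pow(kloc, b);       /* E = a(KLOC)^b, person-months */
        double months = c * pow(effort, d);     /* D = c(E)^d, calendar months  */
        double people = effort / months;        /* P = E / D                    */

        printf("E = %.1f person-months, D = %.1f months, P = %.1f people\n",
               effort, months, people);
        return 0;
    }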
The intermediate COCOMO model refines the estimate using 15 cost drivers
grouped into four categories:
Product attributes
Hardware attributes
Personnel attributes
Project attributes
Each of the 15 attributes is rated on a 6-point scale that ranges from "very low" to
"extra high" (in importance or value). Based on the rating, an effort multiplier is
read from Table 4.7, and the product of all the effort multipliers is the
effort adjustment factor (EAF). Typical values for EAF range from 0.9 to 1.4.
Cost Drivers    Very Low    Low    Nominal    High    Very High    Extra High
RELY 0.75 0.88 1.00 1.15 1.40
DATA 0.94 1.00 1.08 1.16
CPLX 0.70 0.85 1.00 1.15 1.30 1.65
TIME 1.00 1.11 1.30 1.66
STOR 1.00 1.06 1.21 1.56
VIRT 0.87 1.00 1.15 1.30
TURN 0.87 1.00 1.07 1.15
ACAP 1.46 1.19 1.00 0.86 0.71
PCAP 1.29 1.13 1.00 0.91 0.82
AEXP 1.42 1.17 1.00 0.86 0.70
VEXP 1.21 1.10 1.00 0.90
LEXP 1.14 1.07 1.00 0.95
TOOL 1.24 1.10 1.00 0.91 0.82
MODP 1.24 1.10 1.00 0.91 0.83
SCED 1.23 1.08 1.00 1.04 1.10
Table 4.7 Effort adjustment factor
The intermediate COCOMO effort equation is E = a (KLOC)^b x EAF, where E is
the effort in person-months, KLOC is the estimated number of thousands of
delivered lines of code for the project, and EAF is the factor calculated above. The
coefficient a and the exponent b are given in the next table.
Software project a b
Organic 3.2 1.05
Semi-detached 3.0 1.12
Embedded 2.8 1.20
Table 4.8 Coefficients for intermediate COCOMO
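A small C sketch of an intermediate COCOMO estimate follows, using the organic-
mode coefficients from Table 4.8 and a handful of multipliers read from Table 4.7;
the project size and the particular cost-driver ratings are assumptions for
illustration.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Intermediate COCOMO, organic mode: a = 3.2, b = 1.05 (Table 4.8). */
        double a = 3.2, b = 1.05;
        double kloc = 32.0;                      /* size in KLOC (assumed)    */

        /* EAF = product of the selected multipliers from Table 4.7.
           Assumed ratings: RELY high (1.15), CPLX high (1.15),
           ACAP high (0.86); all other drivers nominal (1.00).               */
        double eaf = 1.15 * 1.15 * 0.86;

        double effort = a * pow(kloc, b) * eaf;  /* E = a(KLOC)^b x EAF       */

        printf("EAF = %.2f, estimated effort = %.1f person-months\n", eaf, effort);
        return 0;
    }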
These adjustments make the intermediate model more accurate than basic
COCOMO. The steps in producing an estimate using the intermediate model
are:
1. Identify the mode of development for the new product.
2. Estimate the size of the project to derive a nominal effort
prediction.
3. Adjust the 15 cost drivers to reflect your project, producing the effort
adjustment factor.
4. Calculate the predicted project effort using the first equation and the effort
adjustment factor.
Mode is organic
Size = 200KDSI
Cost drivers:
The detailed (Advanced) COCOMO model uses the same equations as the
intermediate model and a set of cost drivers weighted according to each phase of
the software lifecycle. The Advanced model applies the Intermediate model at the
component level, and a phase-based approach is then used to consolidate the
estimate.
The 4 phases used in the detailed COCOMO model are: requirements planning
and product design (RPD), detailed design (DD), code and unit test (CUT), and
integration and test (IT). Each cost driver is broken down by phase, as illustrated
in the accompanying table. Estimates made for each module are combined into
subsystems and eventually into an overall project estimate.
Advantages of COCOMO'81
¾ COCOMO is transparent; you can see how it works unlike other models such
as SLIM.
Drawbacks of COCOMO'81
of the project.
tools are used for translation of existing software, but COCOMO'81 made little
performance, or technology maturity. In COCOMO II's Application Composition
model, object points are used for sizing rather than lines of code or function
points. Object points are counts of the screens, reports, and third-generation
(3GL) components that will be used in the application. Each object is classified as
simple, medium, or difficult and weighted according to its complexity level.
The weighted instances are summed to provide a single object point number.
Reuse is then taken into account. Assuming that r% of the objects will be reused
from previous projects, the number of new object points (NOP) is calculated to
be:
NOP = (object points) x (100 - r) / 100
Developers' experience and capability Very Low Low Nominal High Very High
ICASE maturity and capability Very Low Low Nominal High Very High
PROD 4 7 13 25 50
Table 4.13 Average productivity rates based on developer’s experience and the
ICASE maturity/capability
E = NOP / PROD
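The following C sketch pulls these pieces together; the object-point count, the reuse
percentage, and the choice of the nominal PROD column from Table 4.13 are
assumptions for illustration.

    #include <stdio.h>

    int main(void) {
        int object_points = 120;   /* weighted object-point count (assumed)         */
        double reuse_pct  = 25.0;  /* r% of objects expected to be reused (assumed) */

        /* NOP = (object points) x (100 - r) / 100 -- new object points */
        double nop = object_points * (100.0 - reuse_pct) / 100.0;

        /* PROD from Table 4.13: 13 NOP per person-month for nominal developer
           experience and nominal ICASE maturity (the column is an assumption). */
        double prod = 13.0;

        double effort = nop / prod;   /* E = NOP / PROD, in person-months */

        printf("NOP = %.0f, estimated effort = %.1f person-months\n", nop, effort);
        return 0;
    }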
In the Early Design model, the unadjusted function point count (UFC) is used for
sizing. This value is converted to LOC using conversion tables such as Table
4.14, and effort is then estimated as
E = a x KLOC x EAF
The EAF for this model is calculated using the 7 cost drivers shown in Table 4.15.
The Post-Architecture model is used during the actual development and
maintenance of a product. Function points or LOC can be used for sizing, with
modifiers for reuse and software breakage. Boehm advocates the use of a set of
5 factors determining the project's scaling exponent. The 5 factors replace the
development modes of the original COCOMO model. The effort equation is
E = a x KLOC^b x EAF
where the exponent b is derived from the five scale factors W(i) given in Table 4.16.
W(i) Very Low Low Nominal High Very High Extra High
Precedentedness 4.05 3.24 2.42 1.62 0.81 0.00
Development/Flexibility 6.07 4.86 3.64 2.43 1.21 0.00
Architecture/Risk Resolution 4.22 3.38 2.53 1.69 0.84 0.00
Team Cohesion 4.94 3.95 2.97 1.98 0.99 0.00
Process Maturity 4.54 3.64 2.73 1.82 0.91 0.00
Table 4.16 COCOMO II scale factors
The EAF is calculated using the 17 cost drivers shown in Table 4.17.
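The text above does not reproduce the relation that combines the scale factors into
the exponent b; a commonly published form is b = 1.01 + 0.01 x sum(W(i)). The C
sketch below uses that form, and the constant 1.01, the multiplicative constant a, the
size, and the EAF value should all be treated as assumptions.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Scale factors W(i) from Table 4.16 for assumed ratings:
           Precedentedness nominal (2.42), Flexibility high (2.43),
           Risk Resolution nominal (2.53), Team Cohesion high (1.98),
           Process Maturity nominal (2.73).                                 */
        double w_sum = 2.42 + 2.43 + 2.53 + 1.98 + 2.73;

        /* Assumed form of the COCOMO II scaling exponent. */
        double b = 1.01 + 0.01 * w_sum;

        double a    = 2.5;     /* multiplicative constant (assumed)         */
        double kloc = 32.0;    /* size in KLOC (assumed)                    */
        double eaf  = 1.10;    /* product of the 17 cost drivers (assumed)  */

        double effort = a * pow(kloc, b) * eaf;   /* E = a x KLOC^b x EAF   */

        printf("b = %.3f, estimated effort = %.1f person-months\n", b, effort);
        return 0;
    }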
SLIM uses separate Rayleigh curves for design and code, test and validation,
Development effort is assumed to represent only 40 percent of the total life cycle cost.
Several researchers have criticized the use of a Rayleigh curve as a basis for
cost estimation. Norden’s original observations were not based in theory but
rather on observations. Moreover his data reflects hardware projects. It has not
projects sometimes exhibit a rapid manpower buildup which invalidate the SLIM
Putnam used some empirical observations about productivity levels to derive the
software equation from the basic Rayleigh curve formula. The software equation
is expressed as:
S = C x E^(1/3) x t^(4/3)
where S is the size in lines of code, C is a technology factor, E is the total project
effort in person-years, and t is the elapsed development time in years. The
technology factor C primarily reflects:
The software equation includes a fourth power and therefore has strong
The manpower acceleration is 12.3 for new software with many interfaces and
Using the software and manpower-buildup equations, we can solve for effort:
E = (S / C)^(9/7) x D^(4/7)
Effort is thus proportional to size raised to the power 9/7, or ~1.286, which is
similar to Boehm's exponent, which ranges from 1.05 to 1.20.
¾ Definition – Has the model clearly defined the costs it is estimating, and the
costs it is excluding?
¾ Fidelity – Are the estimates close to the actual costs expended on the
projects?
¾ Objectivity – Does the model avoid allocating most of the software cost
variance to poorly calibrated subjective factors such as complexity?
¾ Constructiveness – Can a user tell why the model gives the estimates it
does? Does it help the user understand the software job to be done?
cost estimates?
¾ Scope – Does the model cover the class of software projects whose costs
you need to estimate?
¾ Ease of Use – Are the model inputs and options easy to understand and
specify?
¾ Prospectiveness – Does the model avoid the use of information that will not
be known until the project is complete?
¾ Parsimony – Does the model avoid the use of highly redundant factors, or
factors that make no appreciable contribution to the results?
A model is generally considered accurate if at least 75 percent
of the predicted values fall within 25 percent of their actual values. Unfortunately
most models fall short of this criterion. Kemerer reports average
errors (in terms of the difference between predicted and actual project effort) of
over 600 percent in his independent study of COCOMO. The reasons why
existing modeling methods have fallen short of their goals include problems of model
structure, model complexity, and the difficulty of size estimation.
4.2.11.1 Structure
Although size is widely accepted as the primary determinant of effort, the exact
relationship between size and effort is unclear.
Most empirical studies express effort as a function of size with an exponent b and
a multiplicative term a. However the values of a and b vary from data set to data
set.
Many models also include an exponential size adjustment factor so that larger
projects require proportionally more effort than smaller ones.
Intuitively this makes sense, as larger projects would seem to require more effort
support this. Banker and Kemerer analyzed seven data sets, finding only one
Model    Adjustment Factor
Walston-Felix 0.91
Nelson 0.98
Freburger-Basili 1.02
Herd 1.06
Bailey-Basili 1.16
Frederic 1.18
Phister 1.275
Jones 1.40
Halstead 1.50
Schneider 1.83
There is also little consensus about the effect of reducing or extending duration.
increases effort, but increasing duration decreases effort (Fenton, 1997). Other
studies have shown that decreasing duration decreases effort, contradicting both
models.
Most models work well in the environments for which they were derived, but
perform poorly when applied more generally. The original COCOMO is based on
a particular set of past projects, and its structure and coefficients reflect the
characteristics of the data. This results in a high degree of accuracy for similar
projects, but much poorer accuracy for projects from other environments.
4.2.11.2 Complexity
Most models include adjustment factors, such as COCOMO's cost drivers and SLIM's
technology factor. These adjustment factors are intended to account for any
variations between the model's underlying data set and the project being
estimated, but in practice they are often inadequate.
Kemerer has suggested that application of the COCOMO cost drivers does not
always improve the accuracy of estimates. The COCOMO model assumes that
the cost drivers are independent, but this is not the case in practice. Many of the
cost drivers affect each other, resulting in the over emphasis of certain attributes.
The cost drivers are also extremely subjective. It is difficult to ensure that the
factors are assessed consistently and in the way the model developer intended.
SLIM estimates are extremely sensitive to the technology factor; however, this is
not an easy value to determine. Calculation of the EAF for the detailed COCOMO
model can also be quite complex.
predict early in the development lifecycle. Many models use LOC for sizing,
Although function points and object points can be used earlier in the lifecycle,
Size estimates can also be very inaccurate. Methods of estimation and data
Unless the size metrics used in the model are the same as those used in
4.3 Summary
Cost estimation is an important part of the software development
process. Models can be used to represent the relationship between effort and a
primary cost factor such as size. Cost drivers are used to adjust the preliminary
estimate provided by the primary cost factor. Although models are widely used to
predict software cost, many suffer from some common problems. The structure
of most models is based on empirical results rather than theory. Models are often
complex and rely heavily on size estimation. Despite these problems, models are
still important to the software development process. Model can be used most
4.4 Keywords
Detailed COCOMO: It incorporates all characteristics of the intermediate version
with an assessment of the cost driver's impact on each step of the software
engineering process.
Explain.
Pressman, McGraw-Hill.
5.0 Objectives
The objectives of this lesson are to get the students familiar with software
¾ Characteristics of SRS.
5.1 Introduction
tedious job. The description of the services and constraints are the requirements
for the system and the process of finding out, analyzing, documenting, and
for the software product in a concise and unambiguous manner, using formal
system definition. The requirement specification will state the “what of” the
5.2.3.3 Prototyping
functional requirements.
Functional requirements are the statements of services that the system is expected
to provide. They describe how the system should react to particular inputs and
how the system should behave in particular situations.
Non-functional requirements are constraints on the services or functions offered
by the system, such as timing constraints, constraints on the development process,
standards etc. These requirements are not directly concerned with the specific
functions delivered by the system. They may relate to such system properties
as reliability, response time, and storage. They may also define constraints
on the system such as the capabilities of I/O devices and the data representations
used in system interfaces.
document.
should satisfy:
1. Introduction
1.4 References
2. General description
3. Specific requirements
4. Appendices
5. Index
¾ Complete: An SRS is complete if all the requirements are documented and
the responses of the software to all classes of input data are specified in the
SRS.
¾ Verifiable: An SRS is verifiable if every requirement stated in it is
verifiable, i.e. there exists a procedure to check that the final software meets the
requirement.
¾ Consistent: An SRS is consistent if no requirement stated in it conflicts with
another.
¾ Traceable: An SRS is traceable if the origin of each requirement is clear and
each requirement can be identified to a source.
¾ Modifiable: An SRS is modifiable if its structure and style are such that any
change to the requirements can be made easily, completely, and consistently
while retaining the existing structure and style; redundancy should be avoided
or, where present, explicitly indicated.
The major components that a good SRS must specify are: (i) Functionality,
(ii) Performance, (iii) Design constraints imposed on an implementation, and
(iv) External interfaces.
Functionality
Functional requirements specify what output should be produced from the given
input. For each functional requirement, a detailed description of all the inputs,
their sources, the range of valid inputs, and the units of measure should
also be specified.
Performance requirements
In this component of the SRS all the performance constraints on the system should
be specified, such as response time, throughput, and capacity requirements.
Design constraints
There are a number of factors in the client's environment that may restrict the
choices of a designer. For example, the system may have to use some existing
hardware, or work with limited primary and/or secondary memory. Security
requirements may restrict access to data and require the use of passwords and
cryptography techniques etc.
Software has to interact with people, hardware, and other software. All these
interfaces should be specified. User interface has become a very important issue
It is also known as structured analysis. Here the aim is to identify the functions
performed in the problem and the data consumed and produced by these
functions.
What is a model?
Douglas T. Ross that a model answers questions; that the definition of a model
Why model?
manipulate large amounts of data and hence derive information which can assist
in decision making.
order to:
¾ Communicate:
found when attempting to analyze and understand a system. Models are also
extremely useful communication tools; i.e.: complex ideas and concepts can be
captured on paper and can be shown to users and clients for clarification and
etc. In this respect, the final models created in the Design and Development
The three most important modeling techniques used in analyzing and building
information systems are:
¾ Data Flow Diagramming (DFDs): Data Flow Diagrams (DFDs) model events
and processes (i.e. activities which transform data) within a system. DFDs
examine how data flows into, out of, and within the system. (Note: 'data' can
be understood as any 'thing' (eg: raw materials, filed information, ideas, etc.)
represent a system's information and data in another way. LDSs map the
Figure 5.2
¾ Entity Life Histories (ELHs): Entity Life Histories (ELHs) describe the changes
which happen to 'things' (entities) within the system as shown in figure 5.3.
These three techniques are common to many methodologies and are widely
used in system analysis. Notation and graphics style may vary across different
methodologies. In SSADM (Structured Systems Analysis and Design Method, which has
for a number of years been widely used in the UK), systems analysts and
modelers use the above techniques to build up three, inter-related, views of the
target system.
SSADM uses different sets of Data Flow Diagrams to describe the target system
in terms of:
¾ WHAT it does
¾ HOW it does it
¾ WHAT it should do
¾ HOW it should do it
Another way of looking at it is that, in SSADM, DFDs are used to answer the
However, we are not interested, here, in the development process in detail, only
¾ Just as a system must have input and output (if it is not dead), so a process
must also have input and output.
¾ Data enters the system from the environment; data flows between processes
within the system; and data is produced as output from the system
rectangular box; and data are shown as arrows coming to, or going from the
identifier (top left) and a unique name (an imperative - eg: 'do this' - statement
in the main box area) The top line is used for the location of, or the people
used to represent the flows must either start and/or end at a process box.
permanently.
Figure 5.4
5.2.3.2.4 General Data Flow Rules
¾ Entities are either 'sources of' or 'sinks' for data inputs and outputs - i.e. they
lie outside the boundary of the system.
The 'Context Diagram ' is an overall, simplified, view of the target system, which
contains only one process box, and the primary inputs and outputs as shown in figure 5.5.
Figure 5.5
Context diagram 2
Figure 5.6
Both the above figure 5.5 and 5.6 say the same thing. The second makes use of
the possibility in SSADM of including duplicate objects. (In context diagram 2 the
duplication of the Customer object is shown by the line at the left hand side.
system.)
The Context diagram above, and the one which follows (Figure 5.7), are a first
process it is likely that diagrams will be reworked and amended many times -
until all parties are satisfied with the resulting model. A model can usefully be
The Top or 1st level DFD describes the whole of the target system. The Top
level DFD 'bounds' the system - it shows the major processes which are included
within the system boundary.
Figure 5.7
Each Process box in the Top Level diagram may itself be made up of a number
of processes, and where this is the case, the process box will be decomposed as
Each box in a diagram has an identification number derived from the parent. Any
box in the second level decomposition may be decomposed to a third level. Very
levels.
Figure 5.9
Every page in a DFD should contain fewer than 10 components. If a process has
more than 10 components, related components should be combined into one
and another DFD be generated that describes that component in more detail. Each
component is numbered, as is each subcomponent, and so on. So, for example, a
top-level DFD might have components 1, 2, 3 and 4; the subcomponent DFD of
component 3 would have components 3.1, 3.2, 3.3, and 3.4; and the sub-subcomponent
DFD of component 3.2 would have components 3.2.1, 3.2.2, and so on.
SSADM uses different sets of Data Flow Diagrams to describe the target system
at different stages of the analysis, as summarized in Table 5.1.
Table 5.1
In the following section, two approaches to prepare the DFD are proposed.
Top-Down Approach
data stores, and the data flows between these processes and data stores.
Analysis",
3. Each process is linked (with incoming data flows) directly with other
Each diagram must contain 3 to 6 nodes plus interconnecting arcs. Two basic
types of diagram are used: the activity diagram (actigram) and the data diagram
(datagram). In actigrams the nodes denote activities and arcs specify the data flow
between activities while in datagrams nodes specify the data objects and arcs
denote activities. The following figure shows the formats of actigram and
datagram. It is important to note that there are four distinct types of arcs. Arcs
coming into the left side of a node show inputs and arcs leaving the right side of
a node convey output. Arcs entering the top of a node convey control and arcs
entering the bottom specify mechanism. Following Figures 5.10 and 5.11
illustrate the activity diagrams and data diagrams. As shown, in actigram arc
coming into the left side shows the input data on that activity works, arc coming
from right indicates the data produced by the activity. Arc entering the top of the
node specifies the control data for the activity, and the arc entering the bottom
specifies the mechanism (e.g. the processor or device) used to perform the activity.
[Figures 5.10 and 5.11: example actigram and datagram for a source-program
interpreter - input and output data enter and leave the sides of each node, control
(the operating system) enters the top, and the mechanism (interpreter, processor,
storage device, disk) enters the bottom.]
Figure 5.12 shows a further example: a 'Make Payment' activity that produces a
'Check'.
Sometimes when the system is a totally new system, the users and clients do not
have a good idea of their requirements. In this type of cases, prototyping can be
a good option. The idea behind prototyping is that clients and users can assess
their needs much better if they can see the working of a system, even if the
system shown is only partial. A prototype that implements part of the whole
system helps the users in understanding and refining their requirements.
There are two variants of prototyping: (i) Throwaway prototyping and (ii)
Evolutionary prototyping. In evolutionary prototyping an initial prototype is built and
progressively refined until it becomes the final system. Gradually the increments
are made to the prototype by taking into account the feedback from users and clients.
Figure 5.13: Evolutionary prototyping, starting from an outline requirements definition.
Evolutionary prototyping
It is often the only way to develop systems where it is difficult to establish a detailed
specification in advance. A drawback is that such systems tend to end up with poor
structure and poor system documentation.
Figure 5.14: Prototype evaluation loop - if the system is judged adequate, it is
delivered; otherwise it is refined further.
Throwaway prototyping
Figure 5.15
Customers and end users should resist the temptation to turn the throwaway
prototype into the delivered system: (i) important quality attributes may have been
ignored during prototype development, and (ii) the changes made during prototype
development will probably have degraded the system structure, making long-term
maintenance difficult and expensive.
To achieve unambiguity, one should use a formal language, which is difficult to use
and learn. Natural language is quite easy to use but tends to be ambiguous. Some of
the commonly used specification notations are discussed below.
Natural languages are the easiest to use, but they have some drawbacks: the
resulting specifications tend to be imprecise and ambiguous. To remove this
drawback some efforts have been made, and one is the use of structured English.
In structured English the specification is written using a restricted, standardized
subset of English and the text is broken into sub-paragraphs. Many organizations
specify the strict use of some standard words and templates.
Regular expressions are used to specify the syntactic structure of symbol strings.
Many requirements involve patterns of symbols or events, and regular
expressions provide a powerful notation for such cases. The rules for regular
expressions are:
by R1 and R2.
from R1 with zero or more strings from R1. A commonly used notation is
For example the requirement, a valid data stream must start with an “a”, followed
by “b”s and “c”s in any order but always interleaved by a and terminated by “b” or
Decision tables are used to specify complex decision logic. A decision table is
made up of four quadrants: condition stub, condition entries, action stub, and action
entries. The condition stub lists the conditions of interest, and the condition entries
are used to combine conditions into decision rules. The action stub specifies the
actions to be taken, and the action entries relate decision rules to actions.
Decision Rules
Rule 1 Rule 2 Rule 3 Rule 4
Following is the decision table (Table 5.3) to find the largest of three numbers.
The above decision table is an example of a limited entry decision table, in which
each condition entry is restricted to Y, N, or '-' (where Y denotes yes, N denotes no, '-'
denotes don't care, and X denotes perform action). If more than one decision rule
has identical (Y, N, -) entries, the table is said to be ambiguous. Ambiguous pairs of
rules that specify different actions are contradictory.
The above decision table (Table 5.4) illustrates redundant rules (R3 and R4) and
the kinds of ambiguity that must be checked for.
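Since Table 5.3 itself is not reproduced here, the following C sketch shows one way
the decision logic of such a "largest of three numbers" table can be rendered directly
in code; the choice of conditions (a > b, a > c, b > c) and the rule layout are
assumptions for illustration.

    #include <stdio.h>

    /* Each condition corresponds to an entry in the condition stub, and each
       if-branch corresponds to one decision rule / action entry.            */
    int largest(int a, int b, int c) {
        int a_gt_b = (a > b);   /* condition 1 */
        int a_gt_c = (a > c);   /* condition 2 */
        int b_gt_c = (b > c);   /* condition 3 */

        if (a_gt_b && a_gt_c)        return a;   /* rule: Y Y -  -> a is largest */
        else if (!a_gt_b && b_gt_c)  return b;   /* rule: N - Y  -> b is largest */
        else                         return c;   /* remaining rules -> c largest */
    }

    int main(void) {
        printf("%d\n", largest(3, 9, 5));   /* prints 9 */
        return 0;
    }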
Event tables specify actions to be taken when events occur under different sets of
conditions. Rows of the table correspond to conditions and columns to the
events of interest. For example, the following table 5.5 specifies that if condition C1 is
there and event E1 occurs, then one must take action A1. A "-" entry indicates
that the combination is impossible or that no action is required.
Conditions Event
E1 E2 E3 E4 E5
C1 A1 - A4; A5
C2 X A2, A3
C3
C4
Table 5.5
Transition tables are used to specify changes in the state of a system as a function
of driving forces. The following table shows the format of a transition table (Table 5.6).
Current input
Current state
A B
S0 S0 S1
S1 S1 S0
It indicates that if B is the input in the S0 state then a transition will take place to
the S1 state.
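A transition table maps directly onto a two-dimensional array in code. The following
C sketch renders Table 5.6; the input sequence used in main() is invented for
illustration.

    #include <stdio.h>

    enum state { S0, S1 };
    enum input { A, B };

    /* Rows are the current state, columns the current input (Table 5.6). */
    static const enum state next_state[2][2] = {
        /*             input A  input B */
        /* state S0 */ { S0,      S1 },
        /* state S1 */ { S1,      S0 }
    };

    int main(void) {
        enum state s = S0;
        enum input trace[] = { A, B, B, A };   /* an assumed input sequence */

        for (int i = 0; i < 4; i++) {
            s = next_state[s][trace[i]];
            printf("after input %c: state S%d\n", trace[i] == A ? 'A' : 'B', (int)s);
        }
        return 0;
    }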
Summary
For specifying the requirements, a number of notations can be
used such as DFD, decision table, event table, transition table, regular
expressions etc. The SRS is the agreed statement of the system requirements. It
should be organized so that both clients and developers can use it. To satisfy its
goal, the SRS should have certain desirable characteristics such as consistency,
completeness, and unambiguity. The lesson also discussed several
approaches to problem analysis. The aim of problem analysis is to gain a clear
understanding of the needs of the clients and the users. The approaches discussed include
data flow modeling using DFD, Structured Analysis and Design Technique
(SADT) and prototyping. Data flow modeling and SADT focus mainly on the
functions performed in the problem domain and the data consumed and produced
by these functions.
Keywords
DFD: They model events and processes and examine how data flows into, out
of, and within the system.
Transition table: These are used to specify changes in the state of a system as
a function of driving forces.
Event tables: They specify actions to be taken when events occur under
different sets of conditions.
Decision tables: They specify complex decision logic in a tabular form
consisting of four quadrants: condition stub, condition entries, action stub, and
action entries.
Self-assessment questions
Specification? Explain.
requirement specification? What are the merits and demerits of using the
(DFDs).
Pressman, McGraw-Hill.
6.0 Objectives
The objective of this lesson is to make the students familiar with the concepts of
design, design notations and design concepts. After studying the lesson students
3. Modularization criteria
4. Design notations
6.1 Introduction
a design specification. Consider an example where Mrs. & Mr. XYZ want a new
and so on. An architect takes these requirements and designs a house. The
architect may produce several designs to meet this requirement. For example, one may
addition, the style of the proposed houses may differ: traditional, modern and
two-storied. All of the proposed designs solve the problem, and there may not be
a “best” design.
The designer starts with the requirements specification to define the problem and
transforms this into a solution that satisfies all the requirements in the specification.
Design is the first step in the development phase for any engineered product. The
designer's goal is to produce a model or representation of an entity that will later
be built.
6.2.4.1 Abstraction
6.2.4.3 Modularity
6.2.5.2 Cohesion
6.2.7.3 Pseudocode
¾ The process of applying various techniques and principles for the purpose of
defining a device, a process, or a system in sufficient detail to permit its
physical realization.
¾ Inflexible since planning for long term changes was not given due emphasis.
¾ Unmaintainable since standards & guidelines for design & construction are
coupled modules with low cohesion. Data disintegrity may also result.
Design is different from programming. Design brings out a representation for the
program – not the program or any component of it. The difference is tabulated
below.
Functional: It is a very basic quality attribute. Any design solution should work,
Flexibility: It is another basic and important attribute. The very purpose of doing
design activities is to build systems that are modifiable in the event of any future
changes in the requirements.
Portability & Security: These are to be addressed during design - so that such
Reliability: It tells the goodness of the design - how it work successfully (More
aesthetics, directness, forgiveness, user control, ergonomics, etc) and how much
¾ Budget
¾ Time
¾ Skills
¾ Standards
Budget and Time cannot be changed. The problems with respect to integrating to
other systems (typically the client may ask to use a proprietary database that he is
already using) have to be studied and solutions found. 'Skills' is alterable (for
example, by arranging appropriate training for the team). Mutually agreed upon
standards have to be adhered to. Hardware and software platforms may remain a
constraint.
Designer try answer the “How” part of “What” is raised during the requirement
low cost computers in an open systems environment We are moving away from
internet based.
The process of design involves conceiving and planning out in the mind" and
“making a drawing, pattern, or sketch of". In software design, there are three
distinct types of design activity: external design, architectural design, and detailed
design; architectural and detailed design are collectively referred to as internal
design. External design involves conceiving, planning out, and specifying the
externally observable characteristics of the software product. These
characteristics include user displays and report formats, external data sources
and high-level process structure for the product. External design begins during
the analysis phase and continues in the design phase. In practice, it is not
handling and the other items. External design is concerned with refining those
requirement and establishing the high level structural view of the system, Thus,
the distinction between requirements definition and external design is not sharp,
but is rather a gradual shift in emphasis from detailed "what" to high-level 'how".
Internal design involves conceiving, planning out, and specifying the internal
structure and processing details of the software product. The goals of internal
design are to specify internal structure and processing details, to record design
decisions and indicate why certain alternatives and trade-offs were chosen, to
Architectural design is concerned with refining the conceptual view of the system,
decomposing major functions into sub-functions, defining internal data streams and
data stores, and establishing interconnections among functions and data stores.
Detailed design is concerned with specifying algorithmic details, the
concrete data structures that implement the data stores, interconnection among
The test plan describes the objectives of testing, the test completion criteria, the
and techniques to be used, and the actual test cases and expected results.
analysis and are refined during the design phase. Tests that examine the internal
structure of the software product and tests that attempt to break the system
External design and architectural design typically span the period from the Software
Requirements Review to the Preliminary Design Review (PDR). Detailed
design spans the period from preliminary design review to Critical Design Review
(CDR).
6.2.4.1 Abstraction
Abstraction is the intellectual tool that allows us to deal with concepts apart from
particular instances of those concepts. During design, abstraction permits separation
of the conceptual aspects of the system from the implementation details. We can,
for example, specify the FIFO property
of a queue or the LIFO property of a stack without concern for the representation
the functional characteristics of the routines that manipulate data structures (e.g.,
NEW, PUSH, POP, TOP, EMPTY) without concern for the algorithmic details of
the routines.
data stores have been established. Structural considerations are then addressed
complexity that must be dealt with at any particular point in the design process.
and structural attributes of the system are emphasized. During detailed design,
of subprograms, i.e. groups. Within a group, certain routines have the visible
property: they can be called by routines outside the group, while routines
without the visible property are hidden from other groups. A group thus provides
a functional abstraction: the visible routines form the interface with other
groups, and the hidden routines exist to support the visible ones.
Data abstraction involves specifying a data type or data object by specifying legal
operations on objects of that type; representation and manipulation details are
suppressed. Several modern programming languages provide
mechanisms for creating abstract data types. For example, the Ada package is a
programming language mechanism that provides support for both data and
procedural abstraction, and a generic package can define a
generic data structure from which other data structures can be instantiated.
Control abstraction is a third mechanism commonly used in
software design. Like procedural and data abstraction, control abstraction implies
a program control mechanism without specifying its internal details. In a construct
such as "for all I in S", the nature of S, the nature of the files, and how "for all I in S" is
to be handled are hidden inside the construct. At the architectural design level, control
abstraction permits the specification of sequential subprograms, exception handlers,
and co-routines and concurrent program units without concern for the exact
details of implementation.
Information hiding is a design principle proposed by Parnas. When a
software system is designed using the information hiding approach, each module
in the system hides the internal details of its processing activities, and modules
communicate only through well-defined interfaces. Design should begin with a list of
difficult design decisions and design decisions that are likely to change. Each
module is designed to hide such a decision from the other modules. Because
these design decisions transcend execution time, design modules may not
¾ A data structure, its internal linkage, and the implementation details of the
details
Information hiding can be used as the principal design technique for architectural
design
6.2.4.3 Modularity
There are many definitions of the term "module." They range from "a module is a
each functional abstraction, each data abstraction, and each control abstraction
handles a local aspect of the problem being solved. Modular systems consist of
¾ Functions share global data selectively. It is easy to identify all routines that
system.
system. Depending on the criteria used, different system structures may result.
Commonly used criteria include: the conventional criterion, in which each module
and its sub-modules correspond to a processing step in the execution sequence;
the information hiding criterion, in which each module hides a difficult or
changeable design decision from the other modules; the data abstraction
criterion, in which each module hides the representation details of a major data
structure behind functions that access and modify the data structure; levels of
abstraction, in which modules are arranged in layers of increasing abstraction;
coupling-cohesion, in which the system is structured to maximize the cohesion of
the elements within each module and to minimize
the coupling between modules; and problem modeling, in which the modular
structure of the system matches the structure of the problem being solved. There
are two versions of problem modeling: either the data structures match the
problem structure and the visible functions manipulate the data structures, or the
good heuristic for achieving this goal involves the concepts of coupling and
cohesion.
6.2.5.1 Coupling
Coupling is a measure of the interdependence among modules. Keeping coupling low
minimizes the paths along which changes and errors can propagate into other
parts of the system ('ripple effect'). The use of global variables can result in an
enormous amount of coupling between modules. The
degree of coupling between two modules is a function of several factors: (1) How
complicated the connection is, (2) Whether the connection refers to the module
itself or something inside it, and (3) What is being sent or received. Coupling is
usually contrasted with cohesion. Low coupling often correlates with high
cohesion, and vice versa. Coupling can be "low" (also "loose" and "weak") or
"high" (also "tight" and "strong"). Low coupling means that one module does not
interacts with another module with a stable interface. With low coupling, a
change in one module will not require a change in the implementation of another
for lower coupling; the gains in the software development process are greater
The concepts are usually related: low coupling implies high cohesion and vice
¾ Data coupling
¾ Stamp coupling
¾ Control coupling
¾ External coupling
¾ Common coupling
¾ Content coupling
Data Coupling
Two modules are data coupled if they communicate by passing parameters, each
parameter being an elementary piece of data.
Stamp Coupling
Two modules are stamp coupled if one passes to the other a composite piece of data
modules share a composite data structure, each module not knowing which part
of the data structure will be used by the other (e.g. passing a whole student record to a
function that needs only the student's name).
Control Coupling
Two modules are control coupled if one passes to other a piece of information
intended to control the internal logic of the other. In control coupling, one module
controls the logic of the other by passing it information on what to do (a
what-to-do flag).
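To make the difference concrete, here is a small hypothetical C sketch (the pay-
calculation functions are invented for illustration): the first function is data coupled
with its caller, while the second is control coupled because of the what-to-do flag.

    #include <stdio.h>

    /* Data coupling: the caller passes only the elementary data the callee needs. */
    double gross_pay(double hours, double rate) {
        return hours * rate;
    }

    /* Control coupling: the caller passes a "what-to-do" flag that steers the
       callee's internal logic -- a weaker design.                               */
    double pay(double hours, double rate, int overtime_flag) {
        if (overtime_flag)                 /* flag dictates the internal logic */
            return hours * rate * 1.5;
        return hours * rate;
    }

    int main(void) {
        printf("%.2f\n", gross_pay(40.0, 12.0));   /* data coupling    */
        printf("%.2f\n", pay(10.0, 12.0, 1));      /* control coupling */
        return 0;
    }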
External coupling
External coupling occurs when two modules share an externally imposed data
format, communication protocol, or device interface.
Common coupling
Two modules are common coupled if they refer to the same global data area.
Content coupling
Two modules exhibit content coupled if one refers to the inside of the other in any
way (if one module ‘jumps’ inside another module), e.g. branching into the middle
of another module. Content coupling violates design principles like abstraction,
information hiding and modularity, and should be avoided.
coupling between a parent class and its child. The parent has no connection to
the child class, so the connection is one way (i.e. the parent is a sensible class
on its own). The coupling is hard to classify as low or high; it can depend on the
situation.
We aim for a ‘loose’ coupling. We may come across a (rare) case of module A
calling module B, but no parameters passed between them (neither send, nor
coupling (lower than Normal Coupling itself). Two modules A &B are normally
types of coupling (Common and content) are abnormal coupling and not desired.
between is large.
(The meaning of composite data is the way it is used in the application, NOT the
way it is represented in a program.)
¾ “What-to-do flags” are not desirable when it comes from a called module
In general, use of tramp data and hybrid coupling is not advisable. When data is
passed up and down merely to send it to a desired module, the data will have no
meaning at various levels. This will lead to tramp data. Hybrid coupling will result
when different parts of flags are used (misused?) to mean different things in
coupled in more than one way. In such cases, their coupling is defined by the
is connected to another.
included.
communicate less with other modules, and this has the virtuous side-effect of
6.2.5.2 Cohesion
Designers should aim for loosely coupled and highly cohesive modules. Coupling
is reduced when the relationships among elements not in the same module are
minimized; cohesion is increased when the elements within a module are strongly
related to one another. Modules with low cohesion tend to be difficult to maintain,
difficult to test, difficult to reuse, and even difficult to understand. The types of
cohesion, from the lowest (worst) to the highest (best), are:
1. Coincidental Cohesion
2. Logical Cohesion
3. Temporal Cohesion
4. Procedural Cohesion
5. Communicational Cohesion
6. Sequential Cohesion
7. Functional Cohesion
Coincidental cohesion
Coincidental cohesion is when parts of a module are grouped arbitrarily; the parts
have no significant relationship to one another.
Logical cohesion
Logical cohesion is when parts of a module are grouped because they are of the
same logical category, even though they are different by nature (e.g. using control
coupling to decide which part of a module to use, such as a module that handles
all input regardless of its source).
Temporal cohesion
Temporal cohesion is when parts of a module are grouped by when they are
processed - the parts are executed at a particular time in the program's execution
(e.g. an initialization module).
Procedural cohesion
Procedural cohesion is when parts of a module are grouped because they
always follow a certain sequence of execution (e.g. a function which checks file
permissions and then opens the file).
Communicational cohesion
on a student record, but the actions which the method performs are not clear).
Sequential cohesion
Sequential cohesion is when parts of a module are grouped because the output
from one part is the input to another part (e.g. a function which reads data from a
file and then processes that data).
Functional cohesion
Functional cohesion is when parts of a module are grouped because they all
contribute to a single well-defined task of the module.
Since cohesion is a ranking type of scale, the ranks do not indicate a steady
Constantine and Edward Yourdon as well as others indicate that the first two
types of cohesion are much inferior to the others and that module with
cohesion. The seventh type, functional cohesion, is considered the best type.
cohesion for a software module, it may not actually be achievable. There are
many cases where communicational cohesion is about the best that can be
types of cohesion are associated with modules of lower lines of code per module
with the source code focused on a particular functional objective with less
variety of conditions.
As an example of functional cohesion, consider a function average() that computes
the average marks of the students in a class:
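The original listing of average() is not reproduced in these notes; the following
minimal C sketch (the names x, n and the marks array are assumptions) illustrates
the kind of single-purpose routine being described.

    #include <stdio.h>

    /* All elements of this routine serve one task: computing the average. */
    double average(const double x[], int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += x[i];          /* accumulate the marks */
        return sum / n;
    }

    int main(void) {
        double marks[] = { 65.0, 72.5, 88.0, 54.5 };
        printf("average = %.2f\n", average(marks, 4));
        return 0;
    }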
In average() above, all of the elements are related to the performance of a single
function, so the module has functional cohesion.
Suppose we need to calculate the standard deviation also in the above problem. If
we fold both computations into a single routine calc_stat() { ... } that operates on
the same marks data, the resulting module exhibits
communicational cohesion.
calc_stat() {
    for i = 1 to N
        read(x[i])
    ...
    a = sum / N
    s = ...        // formula to calculate SD
    print a, s
}
binding activities through control flow. calc-stat() has made two statistical
Figure 6.1
In a logically cohesive module, to use the module we may have to send a flag to
indicate what we want (forcing various activities to share a single interface).
Examples are a module that performs all input and output operations for a program.
The activities in a logically cohesive module usually fall into the same category
(validate all input or edit all data), leading to an awkward interface. For instance, if
a module calc_all_stat() performs several statistical computations and we
want to calculate only the average, the call to it would look like calc_all_stat(x[],
flag). The flag is used to indicate our intent, i.e. if flag = 0 then the function will return
only the average; otherwise it computes the other statistics as well. The interface is
calc_all_stat(m, flag)
limiting the physical size of modules, structuring the system to improve the
minimizing page faults in a virtual memory machine, and reducing the call return
overhead of excessive subroutine calls. For each software product, the designer
must weigh these factors and develop a consistent set of modularization criteria
Efficiency is another concern that frequently arises when decomposing a system into
modules. A large
number of small modules having data coupling and functional cohesion implies a
large execution time overhead for establishing run-time linkages between the
first design and implements the system in a highly modular fashion. System
and recombining modules, and by hand coding certain critical linkages and
is spent in 20 percent or less of the code. Furthermore, the region of code where
the majority of time is spent is usually not predictable until the program is
reconfigure and recombine small modules into larger units if necessary for better
performance.
1) Modular decomposition
¾ It starts from the functions that are to be implemented and explains how each
will be carried out.
2) Event-oriented decomposition
3) Object-oriented design
¾ It starts with object types and then explores object attributes and actions.
Good notation can clarify the interrelationships and interactions of
interest, while poor notation can complicate and interfere with good design
practice.
During the design phase, there are two things of interest: the design of the
system, producing which are the basic objective of this phase, and the process of
designing itself. It is for the latter that principles and methods are needed. In
addition, while designing, a designer needs to record his thoughts and decisions,
to represent the design so that he can view it and play with it. For this, design
Design notations are largely meant to be used during the process of design and
are used to represent design or design decisions. They are meant largely for the
designer so that he can quickly represent his decisions in a compact manner that
Once the designer is satisfied with the design he has produced, the design is to
be documented in a design document, the production of which is the
ultimate objective of the design phase. The purpose of this design document is
quite different from that of the design notation. Whereas a design represented
Here, we first describe a design notation structure charts that can be used to
language to specify a design. Though the design document, the final output of
the design activity, typically also contains other things like design decisions taken
and background, its primary purpose is to document the design itself. We will
Every computer program has a structure, and given a program, its structure can
be determined. In a structure chart, a module is represented by a box with the module
name written in the box. An arrow from module A to module B represents that
module A invokes module B. The parameters that A passes to B are shown as
input and the parameters returned by B as output, with the direction of flow of the
input and output parameters represented by small arrows. The parameters can
at the tail).
generally decision boxes are not there. However, there are situations where the
structure chart. For example, let us consider a situation where module A has
subordinates B, C, and D, and A repeatedly calls the modules C and D. This can
and D to A, as shown in figure 6.2. All the subordinate modules activated within a
[Figure 6.2: two structure charts for module A with subordinates B, C and D - one
using the looping-arrow notation for repeated invocation and one using the diamond
notation for selective invocation.]
represented by a small diamond in the box for A, with the arrows joining C and D
Modules in a structure chart can be categorized by their role. Input modules are
modules that obtain information from their subordinates and then pass it to their
super-ordinate, while output modules take information from their super-ordinate
and pass it on to their subordinates. As the names suggest, the input and output
modules are,
typically, used for input and output of data, from and to the environment. The
input modules get the data from the sources and get it ready to be processed,
and the output modules take the output produced and prepare it for proper
presentation to the environment. Then, there are modules that exist solely for the
sake of transforming data into some other form. Such a module is called a
transform module; most computational modules fall into this
category. Finally, there are modules whose primary concern is managing the flow
of data to and from different subordinates. Such modules are called coordinate
modules.
A module can perform functions of more than one type of module. For example,
the composite module in Figure 6.3 is an input module from the point of view of
its super-ordinate, since it supplies data to it; internally, however, it acts as a
coordinate module and views its job as getting data X from one subordinate and
having it transformed into Y by another.
[Figure 6.3: a composite module built from an input module and a transform module
working under a coordinate module - the composite gets data x from the input
module, has it transformed into y, and passes y up to its super-ordinate.]
A structure chart is a convenient representation for a design based on
functional abstraction. It shows the modules and their call hierarchy, the
interfaces between the modules, and what information passes between modules.
It is a convenient and compact notation that is very useful while creating the
design. That is, a designer can make effective use of structure charts to
represent the model while he is creating the design. However, a structure chart by
itself is not sufficient for representing the final design, as it does not give all the
information needed about the design. For example, it does not specify the scope and
structure of data, or the specification of each module.
We have seen how to determine the structure of an existing program. But, once
the program is written, its structure is fixed; little can be done about altering the
structure. However, for a given set of requirements many different programs can
be written to satisfy the requirements, and each program can have a different
structure. That is, although the structure of a given program is fixed, for a given
set of requirements many different structures are possible, and the goal of design
is to find a good one.
A DFD is a directed graph in which nodes specify processes and arcs specify data
flows. A DFD does not indicate the decision logic or the conditions under which
various processing nodes in the diagram are activated. DFDs can be used during
the design phase to specify the external and top-level internal design specifications.
The basic symbols used in a DFD are:
Data Flow
Process
Data Stores
Data Source/Sink
[Figure 6.4: an example DFD - inputs A and B pass through "Transform A" and
"Transform B", the results A' and B' feed two "Compute" processes producing C',
which is output as D.]
Figure 6.4
6.2.7.3 Pseudocode
Pseudocode uses a fixed set of keywords such as If-Then-Else,
While-Do, and End. Keywords and indentation describe the flow of control, while
the remaining text is written in natural language.
OPEN files
--------
--------
PRINT ……
CLOSE file
6.3 Summary
Design principles and concepts establish a foundation for the creation of the design
model. A good design is modular, with modules that are independent
and modifiable. Two important criteria for modularity are coupling and cohesion.
Coupling is a measure of the interdependence among modules, whereas cohesion
represents how tightly bound the internal elements of the module are to one
another. The greater the cohesion of each module in the system, the lower the
coupling between modules tends to be. The design notations
discussed in this chapter are data flow diagrams, structure charts and
pseudocode. A structure chart represents a program's
structure. In a structure chart, a box represents a module with the module name
written in the box. An arrow from module A to module B represents that module A
invokes module B.
1. Define design. What are the desirable qualities of a good design? Explain.
examples.
6. What can be the other criteria for modularization apart from coupling and
cohesion?
Pressman, McGraw-Hill.
7.0 Objectives
The objective of this lesson is to get the students acquainted with the design
activities, to provide a systematic approach for the derivation of the design - the
blueprint from which software is constructed. This lesson will help them in
carrying out the design activity systematically.
It also introduces them to the notations which may be used to represent
a function-oriented design.
7.1 Introduction
sub-systems that provide some related set of services. The initial design process
of identifying these sub-systems and establishing a framework for sub-system
control and communication is called architectural design. During this
process the activities carried out are system structuring (the system is decomposed
into principal sub-systems and the communications between them are identified),
control modeling, and modular decomposition.
The flow of information during design is shown in the following figure (7.1). The
data design transforms the data model created during analysis into the data
structures that will be required to implement the software. The architectural design
defines the relationships among the major structural
elements of the software and the design patterns that can be used to achieve the
requirements that have been defined for the system.
Figure 7.1
Broadly, High Level Design includes Architectural Design, Interface Design and
Data Design.
Architectural design associates the system capabilities with the system components
(like modules) that will implement them. The architecture of a system is a
comprehensive framework that describes its form and structure, its components
and how they fit together. It addresses the functions that the system provides, the
hardware and network that are used to develop and operate it, and the software that
is used to develop and operate it. Commonly used architectural styles include:
¾ Pipes and filters
¾ Call-and-return systems
¾ Object-oriented systems
¾ Layered systems
¾ Data-centered systems
¾ Distributed systems
• Client/Server architecture
In Pipes and Filters, each component (filter) reads streams of data on its inputs
and produces streams of data on its output. Pipes are the connectors that
transmit output from one filter to another. E.g. Programs written in UNIX shell.
In call-and-return systems, the architecture has a
control hierarchy where a “main” program invokes (via procedure calls) a number
of program components, which in turn may invoke still other components, e.g. a
main program calling subprograms.
In Layered systems, each layer provides service to the one outside it, and acts
as a client to the layer inside it. They are arranged like an “onion ring”. E.g. OSI
ISO model.
In data-centered systems, a central data store is accessed by other components
that operate on the central data store. In a traditional database, the transactions,
in the form of queries, trigger operations on the data. In a client/server architecture,
the server system responds to the requests for actions / services made by client
systems.
¾ Security
¾ Audit Trails
¾ Restart / Recovery
¾ User Interface
analysis
¾ Convert each unit into a good structure chart by means of transform analysis
The transaction is identified by studying the discrete event types that drive the
system. For example, with respect to railway reservation, a customer may give
any one of several different stimuli (figure 7.2).
Figure 7.2
The three transaction types here are: Check Availability (an enquiry), Reserve
Ticket (booking) and Cancel Ticket (cancellation). On any given time we will get
situation, any one stimulus may be entered through a particular terminal. The
human user would inform the system her preference by selecting a transaction
types and draw the first level breakup of modules in the structure chart, by
Figure 7.3
The Main ( ) which is a over-all coordinating module, gets the information about
and print a text menu and prompt the user to select a choice and return this
choice to Main ( ). It will not affect any other components in our breakup, even
when this module is changed later to return the same input through graphical
and Transaction3 ( ) are the coordinators of transactions one, two and three
levels of abstraction.
chart of all input screens that are needed to get various transaction stimuli from
exactly the same way as seen before), for all identified transaction centers.
level 2 or level 3, etc.) for all the identified transaction centers. In case the given
system has only one transaction (like a payroll system), then we can start
directly with transform analysis.
5. Verify that the final structure chart meets the requirements of the original
DFD.
The central transform is the portion of the DFD that contains the essential functions
of the system and is independent of the particular implementation of the input
and output. One way of identifying the central transform is to identify the centre of the
DFD by pruning off its afferent and efferent branches. Afferent stream is traced
from outside of the DFD to a flow point inside, just before the input is being
stream is a flow point from where output is formatted for better presentation.
Figure 7.4
The processes between afferent and efferent stream represent the central
transform (marked within dotted lines above in figure 7.4). In the above example,
processes are P2, P3, P4 & P5 - which transform the given input into some form
of output.
module. A boss module can be one of the central transform processes. Ideally,
Figure 7.5
In the above illustration in figure 7.5, we have a dummy boss module “Produce
Payroll” – which is named in a way that indicates what the program is about.
Having established the boss module, the afferent stream processes are moved to
left most side of the next level of structure chart; the efferent stream process on
the right most side and the central transform processes in the middle. Here, we
moved a module to get valid timesheet (afferent process) to the left side. The two
central transform processes are move in the middle. By grouping the other two
the next level – one to get the selection and another to calculate. Even after this
change, the “Calculate Deduction” module would return the same value.
Expand the structure chart further by using the different levels of DFD. Factor
down till you reach to modules that correspond to processes that access source /
sink or data stores. Once this is ready, other features of the software like error
handling, security, etc. has to be added. A module name should not be used for
two different modules. If the same module is to be used in more than one place,
it will be demoted down such that “fan in” can be done from the higher levels.
Ideally, the name should sum up the activities done by the module and its sub-
ordinates.
Because of the orientation towards the end-product, the software, the finer
details of how data gets originated and stored (as appeared in DFD) is not
explicit in the Structure Chart. Hence a DFD may still be needed along with the
Structure Chart to document the design.
Some characteristics of the structure chart as a whole would give some clues
about the quality of the system. Page-Jones suggest following guidelines for a
module can affect only those modules which comes under it’s control (All sub-
Increase fan-in (number of immediate bosses for a module). High fan-ins (in a
The design of user interfaces draws heavily on the experience of the designer.
1. General interaction
2. Information display
3. Data entry
Guidelines for general interaction often cross the boundary into information
/display, data entry and overall system control. They are, therefore, all
encompassing and are ignored at great risk. The following guidelines focus on
general interaction.
¾ Be consistent: Use a consistent format for menu selection, command input,
data display and the myriad other functions that occur in a HCI.
¾ Offer meaningful feedback: Provide the user with visual and auditory
feedback to ensure that two-way communication (between the user and the
interface) is established.
¾ Ask for verification of any nontrivial destructive action: If a user requests the
deletion of a file or the overwriting of important data, an "Are you sure?" message
should appear.
¾ Forgive mistakes: The system should protect itself from user errors that might
cause it to fail.
accordingly: One of the key benefits of the pull down menu is the ability to
lengthy command name is more difficult to recognize and recall. It may also
Information can be presented in
many different ways: with text, pictures and sound; by placement, motion and
size, using color, resolution, and even omission. The following guidelines focus
on information display.
¾ Display only information that is relevant to the current context: The user
should not have to wade through extraneous data, menus and graphics to
obtain information relevant to a specific system function.
¾ Don’t bury the user with data; use a presentation format that enables rapid
assimilation of information: Graphs or charts should replace voluminous
tables.
¾ Use consistent labels, standard abbreviations and predictable colors: The
meaning of a display should be obvious without reference to some outside
source of information.
¾ Allow the user to maintain visual context: If computer graphics displays are
scaled up and down, the original image should be displayed constantly (in
reduced form at the corner of the display) so that the user understands the
relative location of the portion of the image that is currently being viewed.
¾ Use upper and lower case, indentation, and text grouping to aid in
understanding.
¾ Use windows to compartmentalize different types of information: Windows
enable the user to "keep" many different types of information within easy
reach.
This would provide the user with both absolute and relative information.
¾ Consider the available geography of the display screen and use it efficiently:
Much of the user's time is spent picking commands, typing data and otherwise
providing system input. In many applications, the keyboard remains the primary
input medium, but the mouse, digitizer and even voice recognition systems are
rapidly becoming powerful alternatives. The following guidelines focus on data
input:
¾ Minimize the number of input actions required of the user: Reduce the
amount of typing that is required, for example by using the mouse to select from
predefined sets of input, using a "sliding scale" to
specify input data across a range of values, and using "macros" that enable a
single keystroke to be transformed into a more complex collection of input
data.
¾ Maintain consistency between information display and data input: The visual
characteristics of the display (e.g., text size, color, and placement) should be
carried over to the input domain.
¾ Allow the user to customize the input: An expert user might decide to create
custom commands or dispense with some types of warning messages and
action verification.
¾ Interaction should be flexible but also tuned to the user’s preferred mode of
input: The user model will assist in determining which mode of input is
preferred. A clerical worker might be very happy with keyboard input, while a
manager might be more comfortable using a point and pick device such as a
mouse.
¾ Deactivate commands that are inappropriate in the context of current actions:
This protects the user from attempting some action that could result in an
error.
¾ Let the user control the interactive flow: The user should be able to skip
unnecessary actions, change the order of required actions where possible, and
recover from error conditions without exiting the program.
¾ Eliminate unnecessary input: Do not ask the user to specify units for
engineering input (unless there may be ambiguity). Do not require the user to type
.00 for whole number dollar amounts, provide default values whenever
possible, and never require the user to enter information that can be acquired
automatically or computed within the program.
Procedural design occurs after data, architectural, and interface designs have
been established. Compared with the earlier design activities,
detailed design is more concerned with semantic issues and less concerned with
overall structure. Detailed design should be carried to a level where each statement
in the design can be translated into only a few lines of source code.
If the entry conditions are correct, but the exit conditions are wrong, the bug
must be in the block. This is not true if execution is allowed to jump into a block.
The bug might be anywhere in the program. Debugging under these conditions
is much harder.
figure 7.6. In flow-charting terms, a box with a single entry point and single exit
point is structured. This may look obvious, but that is the idea. Structured
A sequence of blocks is correct if the exit conditions of each block match the
entry conditions of the following block. Execution enters each block at the
block's entry point, and leaves through the block's exit point. The whole
sequence can be regarded as a single block, with an entry point and an exit
point.
arranged as in the flowchart at right, then there is one entry point (at the top) and
one exit point (at the bottom). The structure should be coded so that if the entry
conditions are satisfied, then the exit conditions are fulfilled (just like a code
block).
For example, an entry condition might be: register $8
contains a signed integer. The exit condition might be: register $8 contains the
absolute value of the signed integer. The branch structure is used to fulfill the exit
condition.
Iteration (while-loop) is arranged as at right. It also has one entry point and one
exit point. The entry point has conditions that must be satisfied and the exit
point has conditions that will be fulfilled. There are no jumps into the structure
from outside it.
Figure 7.9 Rule 4: The iteration of a code block is structured (the flowchart has a
single entry point, a condition test whose TRUE branch loops through the block and
whose FALSE branch leads to the single exit point).
Structure Rule Five: Nesting Structures
In flowcharting terms, any code block can be expanded into any of the
structures. Or, going the other direction, if there is a portion of the flowchart that
has a single entry point and a single exit point, it can be summarized as a single
code block.
Rule 5 of Structured Programming: A structure (of any size) that has a single
entry point and a single exit point is equivalent to a code block.
For example, say that you are designing a program to go through a list of signed
integers calculating the absolute value of each one. You might (1) first regard
the program as one block, then (2) sketch in the iteration required, and finally
(3) put in the details of the loop body, as shown in figure 7.10.
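As a small illustration of these rules, the following C sketch (the list of integers is
invented) implements the absolute-value program described above using only
single-entry, single-exit sequence, selection, and iteration.

    #include <stdio.h>

    int main(void) {
        int values[] = { -3, 7, -12, 0, 5 };          /* assumed input list */
        int n = sizeof values / sizeof values[0];

        /* One entry point, one exit point: an iteration whose body is itself
           a single-entry/single-exit branch structure.                      */
        int i = 0;
        while (i < n) {
            int abs_value;
            if (values[i] < 0)
                abs_value = -values[i];
            else
                abs_value = values[i];
            printf("%d -> %d\n", values[i], abs_value);
            i++;
        }
        return 0;
    }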
Or, you might go the other way: once the absolute value code is working, you can
regard it as a single code block and use it as a component of a larger
program.
You might think that these rules are OK for ensuring stable code, but that they
are too restrictive. Some power must be lost. But nothing is lost. Two
researchers, Böhm and Jacopini, proved that any program can be written using
only these basic structures. The other control structures you may know, such as case,
do-until, do-while, and for, are not strictly needed; however, they are sometimes
convenient, and are generally regarded as part of structured programming as well.
is "a pidgin language, in that, it uses the vocabulary of one language (i.e.,
At first glance, PDL looks like a modern programming language. The difference
between PDL and a real programming language lies in the use of narrative text
(e.g., English) embedded directly within PDL statements. Given the use of
narrative text, a PDL cannot be compiled (at least not yet). However, PDL tools
currently exist to translate PDL into graphical representations of the design. A
design language should have the following characteristics:
¾ A fixed syntax of keywords that provide for all structured constructs, data
declaration, and modularity characteristics
¾ Data declaration facilities that should include both simple (scalar, array) and
complex (linked list or tree) data structures
¾ Subprogram definition and calling techniques that support various modes of
interface description.
A basic PDL syntax should include constructs for subprogram definition, interface
description, data declaration, block structuring, conditions, repetition, and I/O.
Examples of PDL for some of these constructs are presented in the section that follows.
It should be noted that PDL can be extended to include keywords for multitasking
and many other features. The application design for which PDL is to be used
should dictate the final form of the design language.
7.3 Summary
The architecture of a system is a comprehensive
framework that describes its form and structure, its components and how they
fit together. There are several commonly used
architectural styles: (i) Pipes and Filters, (ii) Call-and-return systems, (iii) Object-
oriented systems, (iv) Layered systems, (v) Data-centered systems, and (vi)
Distributed systems. In the data flow oriented design, the DFD representing the
information flow is converted into the structure chart. The design of user
guidelines (i) General interaction, (ii) Information display, (iii) Data entry.
Procedural design occurs after data, architectural, and interface designs have
entry and single-exit constructs. Using structured programming facilitates the
understanding and verification of the design. For expressing the
detailed design PDL is a good tool. PDL resembles a programming language that
uses of narrative text (e.g., English) embedded directly within PDL statements. It
language.
7.4 Keywords
information requirements.
Architectural design: It defines the relationships among the major structural
elements of the software and the design patterns that can be used to achieve the
requirements defined for the system.
PDL: A notation for expressing
the algorithms instead of directly writing the program using a high level
language.
Explain.
Pressman, McGraw-Hill.
8.0 Objectives
2. Programming Style
8.1 Introduction
Coding is concerned with translating design specifications into source code. Good programming should ensure ease of debugging, testing and modification. Simplicity, clarity and elegance are the hallmarks of good programs; obscurity, cleverness, and trickery are the marks of poor ones. Every programmer should know his or her duties and responsibilities and should be provided with a well defined job description.
problem domain
parameters
Programming style refers to the style used in writing the source code for a computer program. Good style helps programmers to quickly read and understand the program as well as to avoid making errors. Good programming style can compensate for many shortcomings of a language, while poor style can defeat the intent of an excellent language. The programming style used in a particular program may be derived from the coding conventions of the organization or of the individual programmer. Programming styles are often designed for a specific programming language (or language family) and are not used in whole for other languages; style considered good in C source code may not be appropriate for BASIC source code, and so on. So there is no single set of rules that can be applied in every situation; however, there are general guidelines that help improve readability.
Use the standard control constructs such as if-then-else, while-do, for loops, etc. If the implementation language does not provide such constructs, simulate them in a standard way agreed upon by all programmers. This will make coding style more uniform, with the result that programs become easier to read and understand.
The best time to use a GOTO statement is never. In all the modern programming languages, constructs are available which help you avoid the use of GOTO statements, so a good programmer can do without them. Where GOTOs are used at all, acceptable uses are almost always forward transfers of control within a local region of code.
domain
Use of distinct data types makes it possible for humans to distinguish between entities from the problem domain. All the modern programming languages allow user-defined data types. For example, if a variable is to be used to represent the month of a year, then instead of using an integer data type an enumerated type can be declared as shown below:
enum month {jan, feb, march, april, may, june, july, aug, sep, oct, nov, dec};
enum month x;
Variable x is declared of type month. Using such types makes the program much more understandable: the assignment
x = july;
is far clearer than
x = 7;
taken in data encapsulation, wherein data structures and its accessing routines
Appropriate choices for variable names are seen as the keystone for good style.
Poorly-named variables make code harder to read and understand. For example, consider the following pseudocode:
get a b c
if a < 24 and b < 60 and c < 60
return true
else
return false
Because of the choice of variable names, the function of the code is difficult to work out. However, if the variable names are made more descriptive:
get hours minutes seconds
if hours < 24 and minutes < 60 and seconds < 60
return true
else
return false
the intent of the code (checking whether a time of day is valid) becomes much easier to see; each name approaches being a "self-documenting identifier".
Programming styles commonly deal with the appearance of source code, with the
goal of improving the readability of the program. However, with the advent of
software that formats source code automatically, the focus on appearance will
practical point, using a computer to format source code saves time, and it is
Indenting
Indent styles assist in identifying control flow and blocks of code. In programming
languages that use indentation to delimit logical blocks of code, good indentation
style directly affects the behavior of the resulting program. In other languages,
such as those that use brackets to delimit code blocks, the indent style does not
directly affect the product. Instead, using a logical and consistent indent style makes the code easier to read. Compare:
if (hours < 24 && minutes < 60 && seconds < 60) {
    return true;
} else {
    return false;
}
or
if (hours < 24 && minutes < 60 && seconds < 60)
    return true;
else
    return false;
with something like
if (hours < 24 && minutes < 60 && seconds < 60) {return true;} else {return false;}
The first two examples are much easier to read because they are indented well,
and logical blocks of code are grouped and displayed together more clearly.
This example is somewhat contrived, of course - all of the above are more complex than simply writing:
return hours < 24 && minutes < 60 && seconds < 60;
Spacing
Free-format languages often completely ignore white space, so making good use of spacing is left to the programmer and can greatly improve readability. Compare
int count;
printf("%d",count*count+count);
with
int count;
printf("%d", count * count + count);
It is also generally recommended not to use tab characters in the middle of a line, as different text editors render their width differently.
required. By doing this, the need for bracketing with curly braces ({ and }) is
eliminated, and readability is improved while not interfering with common coding
styles. However, this frequently leads to problems where code is copied and
Some programmers think decision structures such as the above, where the result of the decision is merely the computation of a Boolean value, are overly verbose, and prefer to write the whole check as a single expression, like this:
return hours < 24 && minutes < 60 && seconds < 60;
The use of logical control structures for looping also adds to good programming style. Compare the following while loop:
count = 0
while count < 5
print count * 2
count = count + 1
endwhile
The above snippet obeys the two aforementioned style guidelines, but the often used "for each element in a range" pattern, available in many languages as a "for" construct, makes the code much easier to read:
for count = 0 to 5
print count * 2
The number of formal parameters passed to routines should be kept small. Use of more than five formal parameters gives a feeling that the interface of the routine is probably too complex; the limit of five is not arbitrary, since it is well known that human beings can deal with only a handful of items at a time. Above all, we should strive to keep our programs simple. By making use of tricks and showing cleverness, we only make the program obscure. For example:
A=A+B;
B=A-B;
A=A-B;
You can observe the obscurity in the above code. The better approach can be:
T=A;
A=B;
B=T;
The second version to swap the values of two integers is clearer and simpler.
Avoid null-then statements, i.e. statements of the form
If B then ; else S;
which is equivalent to
If (not B) then S;
If(A>B) then
if(X>Y) then
A=X
Else
B=Y
Endif
Else
A=B
Endif
Then_if statements tend to obscure the conditions under which various actions are performed. The above code can be rewritten more clearly as:
If(A<B) then
A=B
Elseif (X>Y) then
B=Y
Else
A=X
endif
8.2.1.2.4 Don’t nest too deeply
While X loop
If Y then
While Y loop
While Z loop
If W then S
In the above code, it is difficult to identify the conditions under which statement S is executed.
Each identifier should be used for a single purpose only; this helps make the program understandable. This is not possible if the identifier is used for multiple purposes.
Verification and validation (V&V) is the process of ensuring that the software being built satisfies the user's requirements (validation) and that each step in the process of building the software yields the right products (verification). The differences between verification and validation are summarised below. Reviews and tests are done at the end of each phase of the development process to ensure software requirements are complete and testable and that design, code, and documentation satisfy those requirements.
Validation asks: am I accessing the right data (in terms of the data required to satisfy the requirement)? It is the determination of the correctness of the final software product, by a development project, with respect to the user needs and requirements.
Verification asks: am I accessing the data right (in the right place, in the right way)? It is the demonstration of the consistency, completeness, and correctness of the software at each stage, and between each stage, of the development life cycle.
Activities
The two major V&V activities are reviews, including inspections and
Reviews are conducted during and at the end of each phase of the life cycle to
personnel who have not been directly involved in the development of the
a review panel and provides and/or presents the material to be reviewed. The
Formal reviews are conducted at the end of each life cycle phase. The acquirer
of the software appoints the formal review panel or board, who may make or
affect a go/no-go decision to proceed to the next step of the life cycle. Formal
Design Review, the Software Critical Design Review, and the Software Test
Readiness Review.
assurance.
8.2.2.3 Testing
demonstrate that a product satisfies its requirements and, if it does not, to identify
the specific differences between expected and actual results. There are varied
levels of software tests, ranging from unit or element testing through integration
Informal tests are done by the developer to measure the development progress.
"Informal" in this case does not mean that the tests are done in a casual manner,
just that the acquirer of the software is not formally involved, that witnessing of
the testing is not required, and that the prime purpose of the tests is to find
errors. Unit, component, and subsystem integration tests are usually informal
tests.
driven or black box testing is done by selecting the input data and other
reactions of the software. Black box testing can be done at any level of
¾ Computational correctness.
Design-driven or white box testing is the process where the tester examines the
data and other parameters based on the internal logic paths that are to be
¾ All paths through the code. For most software products, this can be
Formal testing demonstrates that the software is ready for its intended use. A
formal test should include an acquirer- approved test plan and procedures,
Each software development project should have at least one formal test, the
acceptance test that concludes the development activities and demonstrates that
In addition to the final acceptance test, other formal testing may be done on a
matter, any contractually required test is usually considered a formal test; others
are "informal."
acceptance tests to ensure that the change did not disturb functions that have
Cycle
The V&V Plan should cover all V&V activities to be performed during all phases
of the life cycle. The V&V Plan Data Item Description (DID) may be rolled out of
the Product Assurance Plan DID contained in the SMAP Management Plan
system is to be reviewed and tested. Simple projects may compress the life cycle
steps; if so, the reviews may have to be compressed. Test concepts may involve
adequate V&V concept and plan, the cost, schedule, and complexity of the
project may be poorly estimated due to the lack of adequate test capabilities and
data.
satisfied.
generators.
verification matrix.
approved.
are integrated into subsystems and then the final system. Activities during this
Any V&V activities conducted during the prior seven phases are conducted
during this phase as they pertain to the revision or update of the software.
Independent verification and validation (IV&V) is performed by an agent that is neither the developer nor the acquirer of the software. The IV&V agent should have no stake in the success or failure of the software. The IV&V agent's only interest should be to make sure that the software is thoroughly tested against its complete set of requirements.
The IV&V activities duplicate the V&V activities step-by-step during the life cycle,
with the exception that the IV&V agent does no informal testing. If there is an
IV&V agent, the formal acceptance testing may be done only once, by the IV&V
agent. In this case, the developer will do a formal demonstration that the
Perhaps more tools have been developed to aid the V&V of software (especially
testing) than any other software activity. The tools available include code
simulations, and emulations. Some tools are essential for testing any significant
set of software, and, if they have to be developed, may turn out to be a significant
An especially useful technique for finding errors is the formal inspection, originally developed by Michael Fagan at IBM. Inspections resemble walkthroughs; however, they are significantly different from walkthroughs and are significantly more effective. A team, each member of which has a specific role, does inspections. A moderator, who is formally trained in the inspection process, leads the team. The team also includes reviewers, who look for faults in the item; a recorder, who notes the faults; and a reader, who presents the item being inspected.
This formal, highly structured inspection process has been extremely effective in
finding and eliminating errors. It can be applied to any product of the software
important side benefits has been the direct feedback to the developer/author, and
8.3 Summary
A good program is easy to read and understand. Clarity of source code eases debugging, testing and modification, and helps projects stay within their schedules and budgets. An important property of good code is linearity of control flow, which is assured by the use of single entry and single exit constructs; such constructs also ease later modification of the source code. Several guidelines for good programming style were presented, and the dos and don'ts of good programming style were also illustrated.
8.4 Keywords
Programming style: It refers to the style used in writing the source code for a
computer program.
Verification: It is the process of ensuring that each step in the process of building the software yields the right products.
Testing: It is the process of executing a program with the intent to find errors.
structured programming?
5. What do you understand by programming style? What are the dos and
them? Explain.
Pressman, McGraw-Hill.
9.0 Objectives
The objective of this lesson is to make the students familiar with the concepts
and activities carried out during testing phase. After studying this lesson the
¾ Testing Fundamentals
¾ Psychology of testing
¾ Unit testing
¾ Integration Testing
¾ Acceptance testing
9.1 Introduction
Until 1956 was the debugging-oriented period, when testing was often equated with debugging. From 1957 to 1978 there was the demonstration-oriented period, in which debugging and testing were now distinguished; the aim in this period was to show that the software satisfies its specification. The years up to 1982 are known as the destruction-oriented period, where the goal was to find errors. 1983-1987 was the evaluation-oriented period, and since then testing has increasingly been seen as a way to prevent faults.
Software testing is the process of executing a program or application with the intent of finding errors. Quality is not an absolute; it is value to some person. With that in mind, testing can never completely establish the correctness of a program; it furnishes a criticism or comparison that compares the state and behaviour of the product against a specification.
There are many approaches to software testing, but effective testing of complex
operations the tester attempts to execute with the product, and the product
answers with its behavior in reaction to the probing of the tester. Although most
inspection, the word testing is connoted to mean the dynamic analysis of the
product—putting the product through its paces. The quality of the application can,
and normally does, vary widely from system to system but some of the common
one which reveals an error; however, more recent thinking suggests that a good
test is one which reveals information of interest to someone who matters within
In case of a failure, the software does not do what the user expects. A fault is a
programming bug that may or may not actually manifest as a failure. A fault can
program. A fault will become a failure if the exact computation conditions are
met, one of them being that the faulty portion of computer software executes on
the CPU. A fault can also turn into a failure when the software is ported to a
extended.
The term error is used to refer to the discrepancy between a computed, observed
or measured value and the true, specified or theoretically correct value. Basically
it refers to the difference between the actual output of a program and the correct
output.
functions.
Whenever there is a failure, there must be a fault in the system, but the converse may not be true; that is, sometimes there is a fault in the software but a failure is not observed. A fault is just like an infection in the body: whenever there is fever there is an infection, but sometimes the body has an infection without any fever being observed.
view on software and its development. They examine and change the software
engineering process itself to reduce the amount of faults that end up in the code
or deliver faster.
Regardless of the methods used or level of formality involved the desired result
confident that the software has an acceptable defect rate. What constitutes an
product can be very large, and the number of configurations of the product larger
still. Bugs that occur infrequently are difficult to find in testing. A rule of thumb is
that a system that is expected to function without faults for a certain length of
time must have already been tested for at least that length of time. This has
group of testers after the functionality is developed but before it is shipped to the
customer. This practice often results in the testing phase being used as project
testing at the same moment the project starts and it is a continuous process until
testing suites to ensure that future updates to the software don't repeat any of the
known mistakes.
Relative cost of fixing a defect, by the time it was introduced and the time it was detected:

Time Introduced \ Time Detected   Requirements  Architecture  Construction  System Test  Post-Release
Requirements                      1             3             5-10          10           10-100
Architecture                      -             1             10            15           25-100
Construction                      -             -             1             10           10-25
It is commonly believed that the earlier a defect is found the cheaper it is to fix it.
An increasingly common practice is the "test-driven software development" model. In this process unit tests are written first, by
the programmers. Of course these tests fail initially; as they are expected to.
Then as code is written it passes incrementally larger portions of the test suites.
The test suites are continuously updated as new failure conditions and corner
cases are discovered, and they are integrated with any regression tests that are
developed.
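A minimal sketch of this test-first flow, using plain C assert calls rather than any particular test framework (the function name and cases are illustrative assumptions):

#include <assert.h>

/* The tests below are written first and fail until is_leap_year() is
   implemented correctly; they are then kept as regression tests. */
int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

int main(void)
{
    assert(is_leap_year(2000) == 1);   /* divisible by 400 */
    assert(is_leap_year(1900) == 0);   /* divisible by 100 but not 400 */
    assert(is_leap_year(2024) == 1);   /* divisible by 4 */
    assert(is_leap_year(2023) == 0);   /* corner case added later */
    return 0;
}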
Automated tests are generally integrated into the build process (with inherently interactive tests being at least partially manual). The software, tools, samples of data input and output, and configurations are all referred to collectively as a test harness.
Testing is the process of finding the differences between the expected behavior
specified by system models and the observed behavior of the system. Software
finite set of test cases, suitably selected from the usually infinite executions
2. Design Analysis: During the design phase, testers work with developers in
5. Test Execution: Testers execute the software based on the plans and
make final reports on their test effort and whether or not the software
Glenn Myers states a number of rules that can serve as testing objectives:
¾ Testing is a process of executing a program with the intent of finding an error.
¾ A good test case is one that has the high probability of finding an as-yet
undiscovered error.
¾ Testing should begin “in the small” and progress toward testing “in the
large”.
¾ To be most effective, testing should be conducted by an independent third party.
“Testing cannot show the absence of defects, it can only show that software
errors are present”. So devising a set of test cases that will guarantee that all
errors will be detected is not feasible. Moreover, there are no formal or precise
methods for selecting test cases. Even though, there are a number of heuristics
and rules of thumb for deciding the test cases, selecting test cases is still a
creative activity that relies on the ingenuity of the tester. Due to this reason, the
The aim of testing is often to demonstrate that a program works by showing that
it has no errors. This is the opposite of what testing should be viewed as. The
basic purpose of the testing phase is to detect the errors that may be present in
the program. Hence, one should not start testing with the intent of showing that a
program works; but the intent should be to show that a program does not work.
With this in mind, we define testing as follows: testing is the process of executing a program with the intent of finding errors.
This emphasis on the proper intent of testing is not a trivial matter, because test cases are designed by human beings, and human beings have a tendency to perform actions to achieve the goal they have in mind. So, if the goal is to demonstrate that a program works, we may subconsciously select test cases that will try to demonstrate that goal, and that will defeat the basic purpose of testing. On the other hand, if the intent is to show that the program does not work, we will challenge our intellect to find test cases towards that end, and we are likely to detect more errors. Testing is the one step in the software engineering process that can be viewed as destructive rather than constructive. In it the engineer creates a set of test cases that are intended to demolish the software. There may always be some as-yet undetected error in the program, and our goal while designing test cases should be to uncover as many errors as possible.
Due to these reasons, it is said that the creator of a program (i.e. the programmer) should not be its only tester: testing should also be done by people who were not involved with developing the program, before finally delivering it to the customer. Another reason for independent testing is that sometimes errors occur because the programmer did not understand the specifications clearly. Testing of a program by its programmer will not detect such errors, whereas independent testing is likely to find them.
¾ Unit testing: It tests the minimal software item that can be tested. Each
meets its requirements. It is concerned with validating that the system meets
(Figure: Levels of testing — unit testing, module testing, sub-system testing, system testing, and acceptance testing.)
System testing involves two kinds of activities: integration testing and acceptance
include the bottom-up strategy, the top-down strategy, and the sandwich
will be available for integration into the evolving software product when needed.
The integration strategy dictates the order in which modules must be available,
and thus exerts a strong influence on the order in which modules are written,
performance tests, and stress tests to verify that the implemented system
satisfies its requirements. Acceptance tests are typically performed by the quality
sections.
There are two important variants of integration testing: (a) bottom-up integration and (b) top-down integration. Bottom-up integration starts with unit testing, followed by subsystem testing, and then testing of the entire system. Unit testing has the goal of discovering errors in the individual modules of the system. Modules are tested in isolation from one another in an artificial test environment.
module has been tested. Unit testing is eased by a system structure that is
between modules in the subsystem. Both control and data interfaces must be
tested. Large software may require several levels of subsystem testing; lower-
test cases must be carefully chosen to exercise the interfaces in the desired
manner.
the extent and nature of system testing to be performed and to establish criteria
Disadvantages of bottom-up testing include the necessity to write and debug test
harnesses for the modules and subsystems, and the level of complexity that
results from combining modules and subsystems into larger and larger units. The
extreme case of complexity results when each module is unit tested in isolation
and all modules are then linked and executed in one single integration run. This
Test harnesses provide data environments and calling sequences for the
routines and subsystems that are being tested in isolation. Test harness
preparation can amount to 50 percent or more of the coding and debugging effort
Top-down integration starts with the main routine and one or two immediately
subordinate routines in the system structure. After this top-level "skeleton" has
been thoroughly tested, it becomes the test harness for its immediately subordinate routines. Program stubs are used to simulate the effect of lower-level routines that are called by those being tested.
Figure 9.2: A module hierarchy, with MAIN at the top and subordinate modules (such as SUB1 and SUB2) below it.
1. Test the MAIN module; stubs for GET, PROC, and PUT are required (a sketch of such a stub is shown after this list).
3. The top-level routines provide a natural test harness for lower-Level routines.
4. Errors are localized to the new modules and interfaces that are being added.
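A hedged sketch of what such a stub might look like in C (the module name GET follows the figure; the canned record it returns is an assumption made for illustration):

/* Stub standing in for the real GET module while MAIN is tested top-down.
   It hands back a fixed, known record instead of reading real input, so
   that MAIN's control logic can be exercised before GET is written. */
int GET_stub(char *buffer, int size)
{
    const char canned[] = "TEST RECORD 001";
    int i;
    for (i = 0; i < size - 1 && canned[i] != '\0'; i++)
        buffer[i] = canned[i];
    buffer[i] = '\0';
    return i;               /* number of characters "read" */
}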
While it may appear that top-down integration is always preferable, there are drawbacks associated with it and with the design and integration strategy it imposes. For example, it may be difficult to find top-level input data that will exercise a lower level module in a particular desired manner. Also, the evolving system may be very expensive to run as a test harness for new routines; it may not be cost effective to re-run a system of, say, 100 routines each time a new routine is added. Significant amounts of machine time can often be saved by testing subsystems in isolation before inserting them into the evolving top-down structure. In some cases, it may not be possible to use program stubs to simulate modules below the current level (e.g. device drivers or interrupt handlers), and it may then be necessary to test such low-level modules first.
Sandwich integration is predominately top-down, but bottom-up techniques are used on some modules and subsystems. This mix alleviates many of the problems encountered in pure top-down testing while retaining the advantages of bottom-up integration at the lower levels.
In integration testing also, each time a module is added the software changes. New data flow paths are established, new I/O may occur, and new control logic is invoked. Sometimes, as a result, software that previously worked as desired stops working or no longer works in the same way. This reemergence of faults is quite common. Sometimes it occurs because a fix gets lost through poor revision control practices (or simple human error in revision control), but just as often a fix for a problem will be "fragile" - i.e. if some other change is made to the program, the fix no longer works. Finally, it has often been the case that when a feature is redesigned, the same mistakes are made in the redesign that were made in the original implementation. Therefore, in most situations it is considered good practice that when a bug is located and fixed, a test that exposes the bug is recorded and regularly retested after subsequent changes to the program. Although this may be done manually, it is often done using automated testing tools. Such a 'test suite' contains software tools that allow the testing environment to execute all the regression test cases automatically; some projects even set up automated systems to re-run all regression tests at specified intervals and report any regressions.
Common strategies are to run such a system after every successful compile (for
Regression testing can be used not only for testing the correctness of a program,
but it is also often used to track the quality of its output. For instance in the
design of a compiler, regression testing should track the code size, simulation
System testing is a series of different tests and each test has a different purpose
but all work to verify that all system elements have been properly integrated and
Many systems must recover from faults and resume processing within a specified time. Recovery testing is a system test that forces the software to fail in a variety of ways and verifies that recovery is properly performed. Stress tests are designed to confront programs with abnormal situations: stress testing executes a system in a manner that demands resources in abnormal quantity, frequency, or volume. For example, a test case may be designed that demands far more memory, or far more frequent interrupts, than the system would normally encounter.
For real time and embedded systems, performance testing is essential. In these
integrated system.
performance tests, and stress tests in order to demonstrate that the implemented
system satisfies its requirements. Stress tests are performed to test the
integration testing. Additional test cases are added to achieve the desired level of
Beta testing comes after alpha testing. Versions of the software, known as beta
software is released to groups of people so that further testing can ensure the
product has few faults or bugs. Sometimes, beta versions are made available to
the open public to increase the feedback field to a maximal number of future
users.
9.3 Summary
A high quality software product satisfies user needs, conforms to its
testing, acceptance testing etc. Testing plays a critical role in quality assurance
for software. Testing is a dynamic method for verification and validation. In it the
system is executed and the behavior of the system is observed. Due to this, testing observes the failures of the system, from which the presence of faults can be deduced.
The goal of the testing is to detect the errors so there are different levels of
testing. Unit testing focuses on the errors of a module while integration testing
tests the system design. There are a number of approaches of integration testing
with their own merits and demerits such as top-down integration, and bottom up
integration. The goal of the acceptance testing is to test the system against the user's requirements.
The primary goal of verification and validation is to improve the quality of all the
Fault: It is a programming bug that may or may not actually manifest as a failure.
its specification.
Unit testing: It tests the minimal software item that can be tested.
outside of the company for further testing to ensure the product has few faults or
bugs.
1. Differentiate between
3. Does simply presence of fault mean software failure? If no, justify your
6. Explain why regression testing is necessary and how automated testing tools
objective way.
8. What do you understand by error, fault, and failure? Explain using suitable
examples.
Pressman, McGraw-Hill.
10.0 Objectives
The objective of this lesson is to make the students familiar with the process of
test case design, to show them how program structure analysis can be used in
the testing process. After studying this lesson the students will have the
knowledge of test case design using functional and structural testing techniques.
10.1 Introduction
Testing of the software is the most time- and effort-consuming activity; on average it accounts for a significant share of the total development effort. In this lesson several strategies are discussed to design the test cases. According to Myers, a good test is one that reveals the presence of defects in the software being tested. So if a test suite does not detect defects, this means that the tests chosen have not exercised the system in such a way that defects are revealed; it does not mean that program defects do not exist.
A test case is usually a single step, and its expected result, along with various
one expected result or expected outcome. The optional fields are a test case ID,
category, author, and check boxes for whether the test is automatable and has
been automated. Larger test cases may also contain prerequisite states or steps,
and descriptions. A test case should also contain a place for the actual result.
able to see past test results and who generated the results and the system
The most common term for a collection of test cases is a test suite. The test suite
often also contains more detailed instructions or goals for each collection of test
cases. It definitely contains a section where the tester identifies the system
configuration used during testing. A group of test cases may also contain
Collections of test cases are sometimes incorrectly termed a test plan. They
There are two basic approaches to test case design: functional (black box) and
structural (white box). In functional testing, the structure of the program is not
considered. Structural testing, on the other hand, is concerned with testing the
White box and black box testing are terms used to describe the point of view a test engineer takes when designing test cases: black box is an external view of the test object, while white box is an internal view.
In recent years the term grey box testing has come into common usage. The
environment, like seeding a database, and can view the state of the product after
her actions, like performing a SQL query on the database to be certain of the
who has to manipulate XML files (DTD or an actual XML file) or configuration
files directly. It can also be used of testers who know the internal workings or
algorithm of the software under test and can write tests specifically for the
involves loading the target database with information, and verifying the
correctness of data population and loading of data into the correct tables.
White box testing (also known as clear box testing, glass box testing or structural
testing) uses an internal perspective of the system to design test cases based on
internal structure. It requires programming skills to identify all paths through the
software. The tester chooses test case inputs to exercise all paths and determines the appropriate outputs.
Since the tests are based on the actual implementation, if the implementation
changes, the tests probably will need to also. For example ICT needs updates if
component values change, and needs modified/new fixture if the circuit changes.
This adds financial resistance to the change process, thus buggy products may
stay buggy. Automated optical inspection (AOI) offers similar component level
correctness checking without the cost of ICT fixtures, however changes still
While white box testing is applicable at the unit, integration and system levels, it is typically applied to the unit. So while it normally tests paths within a unit, it can also test paths between units during integration, and between subsystems during a system level test. Though this method of test design can uncover an overwhelming number of test cases, it might not detect unimplemented parts of the specification or missing requirements. But you can be sure that all paths through the test object are executed.
The most common structure based criteria are based on the control flow of the program. In these criteria, a control flow graph of the program is constructed and coverage of the graph is measured. The control flow graph of a program consists of nodes and edges. A node in the graph represents a block of statements that is always executed together. An edge from node i to node j represents a possible transfer of control after executing the last statement of node i to the first statement of node j. Three common forms of code coverage used by testers are statement (or
line) coverage, branch coverage, and path coverage. Line coverage reports on
the execution footprint of testing in terms of which lines of code were executed to
complete the test. According to this criterion each statement of the program to be
tested should be executed at least once. Using branch coverage as the test criterion, the tester attempts to find a set of test cases that will execute each branching statement in each direction at least once. Using path coverage, the tester attempts to find a set of test cases that ensure the traversal of each logical path through the
program. CFGs consist of all the typical building blocks of any flow diagrams.
There is always a start node, an end node, and flows (or arcs) between nodes.
Each node is labeled in order for it to be identified and associated correctly with
CFGs allow for constructs to be nested in order to represent nested loops in the
Figure 10.1: Control flow graphs for the if, while, and do-while constructs (each graph has a start node, numbered internal nodes, and an end node).
In programs where while loops exist, there are potentially an infinite number of
unique paths through the program. Every path through a program has a set of
associated conditions. Finding out what these conditions are allows for test data
of variable, which change through the execution of the code. At any point in the
Statements in the code such as "x = x + 1" alter the state of the program by
Infeasible paths are those paths, which cannot be executed. Infeasible paths
Example:
input a,b,c;
max=a;
if (b>max) max=b;
if (c>max) max=c;
output max;
The control flow graph of this program is given in figure 10.2. In this graph each node represents a block of statements; the nodes are numbered 1 to 5.
Figure 10.2: Control flow graph of the example program.
To ensure statement coverage, a single test case with a=5, b=10 and c=15 is sufficient.
To ensure Branch coverage [1, 3, 5] and [1, 2, 3, 4, 5], two test cases are
required (i) a=5, b=10, c=15 and (ii) a=15, b=10, and c=5.
To ensure Path coverage ([1,2,3,4,5], [1,3,5], [1,2,3,5], and [1,3,4,5]), four test cases are required.
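As a hedged illustration (the concrete input values below are assumptions chosen to drive each path; they are not from the text), the four paths can be exercised as follows:

#include <stdio.h>

/* The example program expressed as a function: returns the max of a, b, c. */
int max3(int a, int b, int c)
{
    int max = a;
    if (b > max) max = b;   /* node 2 taken when b > a   */
    if (c > max) max = c;   /* node 4 taken when c > max */
    return max;
}

int main(void)
{
    /* One test case per path through nodes 1..5 (illustrative values): */
    printf("%d\n", max3(5, 10, 15));  /* path 1-2-3-4-5 */
    printf("%d\n", max3(15, 10, 5));  /* path 1-3-5     */
    printf("%d\n", max3(5, 10, 7));   /* path 1-2-3-5   */
    printf("%d\n", max3(10, 5, 15));  /* path 1-3-4-5   */
    return 0;
}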
Because path coverage criteria can lead to a potentially infinite number of paths, some efforts have been made to limit the number of paths to be tested. One such approach is McCabe's basis path testing, based on the cyclomatic complexity of the program. In this example the cyclomatic complexity is three, so three test cases are sufficient, as these exercise the independent basis paths.
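As a brief aside (the node and edge counts are read off the control flow graph of figure 10.2, and the formula is McCabe's standard definition, not quoted from this text):

V(G) = E - N + 2 = 6 - 5 + 2 = 3

where N = 5 is the number of nodes and E = 6 is the number of edges of the control flow graph, which confirms that three basis paths (and hence three test cases) suffice.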
The data flow testing is based on the information about where the variables are
defined and where the definitions are used. During testing the definitions of
variables and their subsequent use is tested. Data flow testing looks at how data
moves within a program. There are a number of associated test criteria and
these should complement the control-flow criteria. Data flow occurs through the assignment of a value to a variable in one place and the subsequent use of that value in another place.
To illustrate the data flow based testing; let us assume that each statement in the
program has been assigned a unique statement number and that each function
does not modify its parameters or global variables. For a statement with S as its statement number, the following two sets are defined:
DEF(S) = {X | statement S contains a definition of X}
USE(S) = {X | statement S contains a use of X}
If statement S is an if or loop statement, its DEF set is empty and its USE set is based on the condition of statement S. The definition of variable X at statement S is said to be live at statement S' if there exists a path from statement S to statement S' that does not contain any other definition of X. A Definition Use chain (DU chain)
of variable X is of the form [X, S, S’], where S and S’ are statement numbers, X is
in DEF(S) and USE(S’), and the definition of X in the statement S is live at the
statement S’.
One simple data flow testing strategy is to require that every DU chain be covered at least once.
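A hedged illustration in C (the statement labels S1..S4 and the variable names are assumptions added for this example, not taken from the text):

/* DEF/USE sets and DU chains for variable x, annotated per statement. */
int add_magnitude(int x, int total)
{
    /* S1: DEF = {x}  (definition of x on entry, via the parameter)   */
    if (x < 0)            /* S2: USE = {x}, DEF = {}                  */
        x = -x;           /* S3: DEF = {x}, USE = {x}                 */
    total = total + x;    /* S4: DEF = {total}, USE = {x, total}      */
    return total;
    /* DU chains for x include [x, S1, S2], [x, S1, S4] (only along the
       path that skips S3, since S3 redefines x) and [x, S3, S4].
       Covering every DU chain therefore requires both a negative and a
       non-negative value of x. */
}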
Loops are very important constructs in virtually all algorithms. Loop testing is a white box technique that focuses exclusively on the validity of loop constructs. Four different types of loops are: simple loops, concatenated loops, nested loops, and unstructured loops.
Figure 10.3: Types of loops — simple, nested, concatenated, and unstructured.
Nested loops:
- Start at the innermost loop. Set all other loops to minimum values.
- Conduct the simple loop test for the innermost loop while holding the outer loops at their minimum iteration parameter values.
Concatenated loops: These can be tested using the approach of simple loops if
each loop is independent of other. However, if the loop counter of loop 1 is used
as the initial value for loop 2 then approach of nested loop is to be used.
Unstructured loops: This class of loops should be redesigned to reflect the use of the structured programming constructs.
Black box testing takes an external perspective of the test object to derive test cases. The test designer selects valid and invalid inputs and determines the correct output, without any knowledge of the test object's internal structure. This method of test design is applicable to all levels of software testing: unit, integration, system and acceptance. The higher the level, and hence the bigger and more complex the box, the more we are forced to use black box testing to simplify the task. While black box testing can uncover unimplemented parts of the specification, you can't be sure that all existent paths are tested. Some common black box test design techniques are discussed below.
The equivalence partitions are usually derived from the specification of the
component's behaviour. An input has certain ranges which are valid and other
ranges which are invalid. This may be best explained at the following example of
a function which has the pass parameter "month" of a date. The valid range for
the month is 1 to 12, standing for January to December. This valid range is called
a partition. In this example there are two further partitions of invalid ranges. The
first invalid partition would be <= 0 and the second invalid partition would be >=
13.
The testing theory related to equivalence partitioning says that only one test case
of each partition is needed to evaluate the behaviour of the program for the
related partition. In other words it is sufficient to select one test case out of each
partition to check the behaviour of the program. To use more or even all test
cases of a partition will not find new faults in the program. The values within one
partition are considered to be "equivalent". Thus the number of test cases can be
reduced considerably.
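A minimal sketch of this idea in C, assuming a hypothetical validate_month() function that returns 1 for a valid month and 0 otherwise; exactly one representative value is drawn from each partition:

#include <assert.h>

/* Hypothetical function under test: valid months are 1..12. */
int validate_month(int month)
{
    return month >= 1 && month <= 12;
}

int main(void)
{
    assert(validate_month(6)  == 1);   /* representative of valid partition 1..12 */
    assert(validate_month(0)  == 0);   /* representative of invalid partition <= 0  */
    assert(validate_month(13) == 0);   /* representative of invalid partition >= 13 */
    return 0;
}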
An additional effect by applying this technique is that you also find the so called
"dirty" test cases. An inexperienced tester may be tempted to use as test cases
the input data 1 to 12 for the month and forget to select some out of the invalid partitions. That would mean, on the one hand, a huge number of unnecessary test cases, and on the other hand a lack of test cases for the dirty ranges.
the subject there are cases where it applies to the white box testing as well.
like in the example above. However internally the function may have a
differentiation of values between 1 and 6 and the values between 7 and 12.
Depending on the input value the software internally will run through different
paths to perform slightly different actions. Regarding the input and output
interfaces to the component this difference will not be noticed, however in your
white-box testing you would like to make sure that both paths are examined. To
would not be needed for black-box testing. For this example this would be:
(Diagram: the input range split into Invalid Partition 1 (<= 0), a Valid Partition 1..12 — internally divided into partitions P1 and P2 — and Invalid Partition 2 (>= 13).)
To check for the expected results you would need to evaluate some internal
applied to select the most effective test cases out of these partitions.
Boundary value analysis designs test cases at the boundaries of component input ranges. Testing experience has shown that these boundaries are especially error prone. A programmer who has to implement e.g. the range 1 to 12 at an input, which e.g. stands for the months January to December in a date, typically has in his code a line checking for this range. But a common programming error is to check a wrong range, e.g. starting the range at 0 by writing:
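The original listing is not reproduced in the text; the following C sketch (the function name is a hypothetical assumption) shows what the intended check and the off-by-one mistake might look like:

/* Intended check: accept only months 1..12 (returns 1 if valid). */
int month_is_valid(int month)
{
    return month > 0 && month < 13;
}

/* Common boundary mistake: the range is started at 0, so the invalid
   value 0 is wrongly accepted. Test cases at 0 and 1 expose the error. */
int month_is_valid_buggy(int month)
{
    return month >= 0 && month < 13;
}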
For more complex range checks in a program, this may be a problem which is not so easily spotted as in the above simple example.
To set up boundary value analysis test cases, you first have to determine which boundaries exist at the interface of a software component; this is why boundary value analysis and equivalence partitioning are inevitably linked together. For the example of the month in a date you would have the following partitions: the invalid partition <= 0, the valid partition 1 to 12, and the invalid partition >= 13.
Applying boundary value analysis you have to select now a test case at each
side of the boundary between two partitions. In the above example this would be
0 and 1 for the lower boundary as well as 12 and 13 for the upper boundary.
Each of these pairs consists of a "clean" and a "dirty" test case. A "clean" test
case should give you a valid operation result of your program. A "dirty" test case
should lead to a correct and specified input error treatment, such as the limiting of values; on a user interface it has to lead to a warning and a request to enter correct data. In general, for a range with a lower and an upper limit, boundary value analysis can use six test cases: n-1, n, and n+1 for the upper limit, and the corresponding three values around the lower limit.
A further set of boundaries has to be considered when you set up your test
cases. A solid testing strategy also has to consider the natural boundaries of the
data types used in the program. If you are working with signed values this is
especially the range around zero (-1, 0, +1). Similar to the typical range check faults, programmers often mishandle the region around zero. E.g. this could be a division-by-zero problem when a zero value can occur although the programmer always assumed the range started at 1. It could be a sign problem when a value turns out to be negative in some rare cases although the programmer always expected it to be positive. It is therefore a good idea to add additional test cases checking the range around zero. A further natural boundary is the natural lower and upper limit of the data type itself; e.g. an unsigned 8-bit value has the natural boundaries 0 and 255.
The tendency is to relate boundary value analysis more to the so called black
look on the subject there are cases where it applies also to white box testing.
After determining the necessary test cases with equivalence partitioning and the
One weakness with the equivalence class partitioning and boundary value
methods is that they consider each input separately. That is, both concentrate on
the conditions and classes of one input; they do not consider combinations of input circumstances that may form interesting situations that should be tested. One way to exercise combinations is to consider all valid combinations of the equivalence classes of input conditions, but this simple approach will result in an unusually large number of test cases, many of which will not be useful for revealing any new errors. For example, if there are n different input conditions, such that any combination of the input conditions is valid, we will have 2^n test cases. Cause-effect graphing is a technique that aids in selecting combinations of input conditions in a systematic way, such that the number of test cases does not become unmanageably large. The technique starts with identifying causes and effects of the system under test. A cause is a distinct input condition, and an effect is a distinct output condition. Each condition forms a node in the cause-
effect graph. The conditions should be stated such that they can be set to either
true or false. For example, an input condition can be "file is empty," which can be
set to true by having an empty input file, and false by a nonempty file. After
identifying the causes and effects, for each effect we identify the causes that can
produce that effect and how the conditions have to be combined to make the
effect true. Conditions are combined using the Boolean operators "and", "or", and
"not", which are represented in the graph by Λ, V and zigzag line respectively.
Then, for each effect, all combinations of the causes that the effect depends on
which will make the effect true, are generated (the causes that the effect does
not depend on are essentially "don't care"). By doing this, we identify the
combinations of conditions that make different effects true. A test case is then
generated for each combination of conditions, which make some effect true.
Let us illustrate this technique with a small example. Suppose that for a bank
The requirements are that if the command is credit and the acct-number is valid,
then the account is credited. If the command is debit, the acct-number is valid,
and the transaction_amount is valid (less than the balance), then the account is
debited. If the command is not valid, the account number is not valid, or the debit amount is not valid, a suitable error message is printed.
Cause:
Effects:
The cause effect of this is shown in following Figure 10.4. In the graph, the
cause-effect relationship of this example is captured. For all effects, one can
easily determine the causes each effect depends on and the exact nature of the
dependency. For example, according to this graph, the effect E5 depends on the
causes c2, c3, and c4 in a manner such that the effect E5 is enabled only when all of c2, c3, and c4 are true. From this graph, a list of test cases can be generated. The basic strategy is to
set an effect to 1 (true) and then set the causes that enable this condition; that combination of causes forms the test case. A cause may be set to false, true, or don't care (in the case when the effect does not depend at all on the cause). Doing this for all effects gives a decision table, and each column of conditions in the table for an effect is a test case. Together, these condition
combinations check for various effects the software should display. For example,
to test for the effect E3, both c2 and c4 have to be set. That is, to test the effect
"Print debit amount not valid," the test case should be: Command is debit
(setting: c2 to True), the account number is valid (setting c3 to False), and the
Figure 10.4: Cause-effect graph for the bank example — causes C1 to C4 connected through Boolean operators to effects E1 to E5.
SNo.   1   2   3   4   5
C1     0   1   x   x   1
C2     0   x   1   1   x
C3     x   0   1   1   1
E1     1
E2         1
E3             1
E4                 1
E5                     1
(x = don't care)
Cause-effect graphing, beyond generating high-yield test cases, also aids the
understanding of the functionality of the system, because the tester must identify
the distinct causes and effects. There are methods of reducing the number of test
cases generated by proper traversing of the graph. Once the causes and effects
are listed and their dependencies specified, much of the remaining work can also
be automated.
White box testing is concerned only with testing the software product; it cannot
guarantee that the complete specification has been implemented. Black box
testing is concerned only with testing the specification; it cannot guarantee that
all parts of the implementation have been tested. Thus black box testing is
testing against the specification and will discover faults of omission, indicating
that part of the specification has not been fulfilled. White box testing is testing
against the implementation and will discover faults of commission, indicating that
part of the implementation is faulty. In order to fully test a software product both
source code to be produced before the tests can be planned and is much more
laborious in the determination of suitable input data and the determination if the
software is or is not correct. The advice given is to start test planning with a black
box test approach as soon as the specification is available. White box planning
should commence as soon as all black box tests have been successfully passed,
with the production of flow graphs and determination of paths. The paths should
then be checked against the black box test plan and any additional required test
The consequences of test failure at this stage may be very expensive. A failure of
a white box test may result in a change which requires all black box testing to be
repeated and the re-determination of the white box paths. The cheaper option is to regard testing as a matter of quality assurance rather than quality control. The intention is that sufficient quality will be put into all previous
design and production stages so that it can be expected that testing will confirm
that there are very few faults present, quality assurance, rather than testing being
relied upon to discover any faults in the software, quality control. A combination
of black box and white box test considerations is still not a completely adequate
well tested if all simple faults are predicted and removed; complex faults are
faults.
Mutation testing is used to test the quality of your test suite. This is done by
mutating certain statements in your source code and checking if your test code
is able to find the errors. However, mutation testing is very expensive to run, especially on large applications. There is a tool, Jester, which can be used to run mutation tests on Java code. Jester looks at specific
areas of your source code, for example: forcing a path through an if statement,
The idea of mutation testing was introduced as an attempt to solve the problem
of not being able to measure the accuracy of test suites. The thinking goes as
follows: Let’s assume that we have a perfect test suite, one that covers all
possible cases. Let’s also assume that we have a perfect program that passes
this test suite. If we change the code of the program (this process is called
mutation) and we run the mutated program (the mutant) against the test suite, we will have one of two possible outcomes:
1. The results of the program were affected by the code change and the test suite detects it. If this happens, the mutant is called a killed mutant.
2. The results of the program are not changed and the test suite does not detect the mutation; the mutant is said to have survived (a live mutant).
The ratio of killed mutants to the total mutants created measures how sensitive
the program is to the code changes and how accurate the test suite is.
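This ratio is commonly written as a mutation score (a standard formulation, not quoted from the text):

mutation score = (number of killed mutants) / (total number of mutants generated)

A score close to 1 suggests the test suite is sensitive to small code changes.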
Mutation operators are small, systematic syntactic changes (for example, replacing one arithmetic or relational operator with another) which may cause the program to function incorrectly. For example, a simple condition check may behave incorrectly because of a small change to an operator or to the condition, as illustrated by the sketch below.
Mutation Operators
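The original code fragment is not reproduced in the text; the following C sketch (the function names and scenario are illustrative assumptions) shows a typical relational-operator mutation:

/* Original condition: a withdrawal up to the full balance is allowed. */
int can_withdraw(int balance, int amount)
{
    return balance >= amount;
}

/* Mutant produced by a relational-operator mutation (>= changed to >).
   A test suite that never tries balance == amount cannot distinguish
   the mutant from the original, so the mutant survives. */
int can_withdraw_mutant(int balance, int amount)
{
    return balance > amount;
}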
Summary
A high quality software product satisfies user needs and conforms to its requirements and design specifications. Testing plays a critical role in quality assurance for software. Testing is a dynamic method for verification and validation: in it the system is executed and the behavior of the system is observed. Due to this, testing observes the failures of the system, from which the presence of faults can be deduced.
There are two basic approaches to testing: white box testing and black box testing. In white box testing the structure of the program, i.e. its internal logic, is considered to decide the test cases, while in black box testing the test cases are derived from the functional specification; examples of black box techniques are boundary value analysis and equivalence partitioning.
The goal of the testing is to detect errors, so there are different levels of testing. Unit testing focuses on the errors of a module, while integration testing tests the system design. The goal of the acceptance testing is to test the system against the user's requirements.
The primary goal of verification and validation is to improve the quality of all the
Key words
White box testing: It uses an internal perspective of the system to design test
certain statements in your source code and checking if your test code is able to
ranges.
Black box: It takes an external perspective of the test object to derive test cases.
test cases to a necessary minimum and to select the right test cases to cover all
possible scenarios. By dividing the input ranges into equivalent partition and then
Self-Assessment Questions
9. Differentiate between
11. What is equivalence class partitioning? What are the advantages of using this
testing technique?
12. What do you understand by structural testing? Using an example show that
coverage criteria.
technique.
14. Why the programmer of a program is not supposed to be its tester? Explain.
15. What types of errors are detected by boundary value analysis and
17. Does simply presence of fault mean software failure? If no, justify your
18. What do you understand by regression testing and where do we use it?
19. Define testing. What characteristics are to be there in a good test case?
20. What do you understand by loop testing? Write a program for bubble sort and
21. What is cyclomatic complexity? How does cyclomatic complexity number help
References/Suggested readings
37. Software Engineering concepts by Richard Fairley, Tata McGraw Hill.
Pressman, McGraw-Hill.
11.0 Objectives
11.1 Introduction
With the advent of the computer age, computers, as well as the software running
on them, are playing a vital role in our daily lives. We may not have noticed, but
having their analog and mechanical parts replaced by CPUs and software. The
design, flexible handling, rich features and competitive cost. Like machinery-
parts are quickly pushing their mechanical counterparts out of the market.
Unlike a mechanical part or an electronic component such as a capacitor, software will stay "as is" unless there are problems in hardware that change the storage content or data path. Software does not age, rust, or wear out; it has no shape, color, material, or mass. It cannot be seen or touched, yet it has a physical existence and is crucial to system functionality.
Without being proven wrong, optimistic people would think that once software runs correctly, it will be correct forever. A series of tragedies and chaos caused by software proves this to be wrong. These events will always be remembered. The notorious Therac-25 accidents around the year 1986, caused by the software not being able to detect a race condition, warn us that it is dangerous to abandon our old but well understood mechanical safety controls and surrender our lives completely to software controlled safety mechanisms.
Software can make decisions, but it can be just as unreliable as human beings. The British destroyer Sheffield was sunk because the radar system identified an incoming missile as "friendly". Defense systems have matured to the point that they will not mistake the rising moon for incoming missiles, but gas-field fires, descending space junk, etc., are examples of things that can still be misidentified as hostile targets.
Software can also have small unnoticeable errors or drifts that can culminate into a disaster. On February 25, 1991, during the Gulf War, an accumulated chopping (truncation) error in the Patriot missile battery's internal clock caused it to fail to intercept an incoming Scud missile, and 28 lives were lost.
Fixing problems may not necessarily make the software more reliable. On the
contrary, new serious problems may arise. In 1991, after changing three lines of
code in a signaling program which contains millions of lines of code, the local
telephone systems in California and along the Eastern seaboard came to a stop.
Once perfectly working software may also break if the running environment
changes. After the success of Ariane 4 rocket, the maiden flight of Ariane 5
ended up in flames while design defects in the control software were unveiled by
the changed flight environment. There are many more scary stories to tell. This makes us wonder whether we can still trust software in safety-critical
embedded applications. You can hardly ruin your clothes if the embedded software in your washing machine issues erroneous commands, and there is perhaps a 50% chance that you will be happy if the ATM machine miscalculates your money; but in safety-critical applications software failures can easily claim people's lives. With processors and software permeating the safety-critical embedded world, the reliability of software is simply a matter of life and death.
11.2.1 Definition
Although software reliability is defined as a probabilistic function and comes with the notion of time, we must note that it differs from traditional hardware reliability. Electronic and mechanical parts may become "old" and wear out with time and usage, but software will not wear out during its life cycle. Software will not change over time unless it is intentionally changed or upgraded.
Software reliability is difficult to attain because the complexity of software tends to be high. While any system with a
high degree of complexity, including software, will be hard to reach a certain level
of reliability, system developers tend to push complexity into the software layer,
with the rapid growth of system size and ease of doing so by upgrading the
software. For example, large next-generation aircraft will have over one million
source lines of software on-board; next-generation air traffic control systems will
contain between one and two million lines; the upcoming international Space
Station will have over two million lines on-board and over ten million lines of
ground support software; several major life-critical defense systems will have
over five million source lines of software. While the complexity of software is
Reliability, software and hardware have basic differences that make them
different in failure mechanisms. Hardware faults are mostly physical faults, while
software faults are design faults, which are harder to visualize, classify, detect,
and correct. Design faults are closely related to fuzzy human factors and the
faults may also exist, but physical faults usually dominate. In software, we can
place does not count. Therefore, the quality of software will not change once it is
uploaded into the storage and start running. Trying to achieve higher reliability by
simply duplicating the same software modules will not work, because voting
listed below:
¾ Wear-out: Software does not have energy related wear-out phase. Errors can
problems.
operational time.
statements.
standard parts will help improve maintainability and reliability. But in software
industry, we have not observed this trend. Code reuse has been around for
some time, but to a very limited extent. Strictly speaking there are no
Over time, hardware exhibits the failure characteristics shown in following Figure
11.1, known as the bathtub curve. Periods A, B and C stand for the burn-in phase, the useful-life phase and the end-of-life (wear-out) phase respectively. Software reliability, however, does not show the same characteristics if we plot software reliability on the same axes. There are two major differences between the hardware and software curves. One difference is that in the last phase, software does not have an increasing failure rate as hardware does: in this phase software is approaching obsolescence, there is little motivation for further upgrades or changes to the software, and therefore the failure rate will not change.
The second difference is that in the useful-life phase, software will experience a
drastic increase in failure rate each time an upgrade is made. The failure rate then levels off gradually, partly because of the defects found and fixed after the upgrades.
The upgrades in the figure imply feature upgrades, not upgrades for reliability. With a feature upgrade the complexity of the software is likely to be increased, since the functionality of the software is enhanced. Even bug fixes may be a reason for more software failures, if the bug fix induces other defects into the software.
Following Figure shows the testing results of fifteen POSIX compliant operating
systems. From the graph we see that for QNX and HP-UX, robustness failure
rate increases after the upgrade. But for SunOS, IRIX and Digital UNIX,
robustness is one aspect of software reliability, this result indicates that the
Figure 11.3
Software reliability models have appeared as people try to understand the characteristics of how and why software fails, and try to quantify software reliability. Over 200 models have been developed since the early 1970s,
but how to quantify software reliability still remains largely unsolved. As many
models as there are, and many more emerging, none of them can capture a satisfying amount of the complexity of software; constraints and assumptions have to be made for the quantifying process. Therefore, there is no single model that suits every situation. One model may work well for one set of software but may be completely off track for other kinds of problems.
Most software models contain the following parts: assumptions, factors, and a
mathematical function that relates the reliability with the factors. The models fall into two broad categories: prediction modeling and estimation modeling. Both kinds of modeling techniques are based on observing and accumulating failure data and analyzing it with statistical inference. The major differences between the two kinds of models are shown in the following Table
11.1.
Prediction models are usually made prior to the development or test phases (they can be applied as early as the concept phase), whereas estimation models are usually made later in the life cycle, after some failure data have been collected. Representative prediction models include Musa's Execution Time Model, Putnam's Model, and the Rome Laboratory models TR-92-51 and TR-92-15, etc. Representative estimation models include exponential distribution
models and Weibull distribution model are usually named as classical fault
count/fault rate estimation models, while Thompson and Chelson's model belong
The field has matured to the point that software models can be applied in
practical situations and give meaningful results and, second, that there is no one
model that is best in all situations. Because of the complexity of software, any
model has to have extra assumptions. Only limited factors can be put into
so, complexity is reduced and abstraction is achieved, however, the models tend
the problems. We have to carefully choose the right model that suits our specific
case. Furthermore, the modeling results cannot be blindly believed and applied.
never ceased. Until now, we still have no good way of measuring software
reliability.
what aspects are related to software reliability. We cannot find a suitable way to
measure software reliability directly. Even the most obvious product metrics, such as software size, do not have a uniform definition.
initial approach to measuring software size. But there is not a standard way of
other non-executable statements are not counted. This method cannot faithfully
compare software not written in the same language. The advent of new
technologies of code reuse and code generation technique also cast doubt on
inquires, and interfaces. The method can be used to estimate the size of a
Test coverage metrics are a way of estimating faults and reliability by performing tests on the software product, based on the assumption that software reliability is a function of the portion of software that has been successfully verified or tested.
Process metrics reflect the quality of the development process and the ability to complete projects on time and within the desired quality objectives. Based on the assumption that the quality of the product is a direct function of the process, process metrics can be used to estimate, monitor and improve the reliability and quality of software.
Fault and failure metrics aim to determine when the software is approaching failure-free execution. Minimally, both the faults found during testing (i.e., before delivery) and the failures (or other problems) reported by users after delivery are collected, summarized and analyzed to achieve this goal. Test strategy is highly related to the effectiveness of fault metrics, because if the testing scenario does not cover the full functionality of the software, the software may pass all tests and yet be prone to failure once delivered. Usually, failure metrics are based upon customer information regarding failures found after release of the software. The failure data collected are used to calculate failure density and the reliability metrics described below.
MTTF (Mean Time To Failure)
It is the
time expected until the first failure of a piece of equipment. MTTF is a statistical
value and is meant to be the mean over a long period of time and large number
of units. For constant failure rate systems, MTTF is the inverse of the failure rate.
MTBF (Mean Time Between Failures)
It is defined as the number of hours that pass before a component, assembly, or system fails.
MTBF can be calculated as the inverse of the failure rate for constant failure rate
systems. For example, if a component has a failure rate of 2 failures per million hours, its MTBF is the inverse of that rate, i.e. 500,000 hours.
Actually, MTBF is the sum of MTTF and MTTR (Mean Time To Repair):
MTBF=MTTF+MTTR
Availability
It is a measure of the time during which the system is available. It may be stated as:
Availability = MTTF / (MTTF + MTTR) = MTTF / MTBF
POFOD (Probability Of Failure On Demand)
It is defined as the probability that the system will fail when a service is requested.
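To make the relationships between these metrics concrete, the following short Python sketch computes MTTF, MTBF, availability and a POFOD estimate for a component with a constant failure rate. All numeric values are invented for illustration only.

# Sketch: relating a constant failure rate to MTTF, MTBF, availability and POFOD.
# The failure rate, repair time and demand counts below are assumed values.

failure_rate = 2e-6          # failures per hour (2 failures per million hours)
mttr_hours = 24.0            # assumed mean time to repair

mttf_hours = 1.0 / failure_rate          # constant failure rate: MTTF = 1 / rate
mtbf_hours = mttf_hours + mttr_hours     # MTBF = MTTF + MTTR, as defined above
availability = mttf_hours / mtbf_hours   # fraction of time the system is usable

failed_demands, total_demands = 3, 10_000
pofod = failed_demands / total_demands   # probability of failure on demand

print(f"MTTF = {mttf_hours:,.0f} h, MTBF = {mtbf_hours:,.0f} h")
print(f"Availability = {availability:.6f}")
print(f"POFOD estimate = {pofod:.4f}")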
Before deployment of a software product, testing, verification and validation are necessary steps. Software testing is heavily used to trigger, locate and remove software defects. Software testing is still in its infant stage; testing is often crafted to suit the specific needs of a particular project in an ad hoc manner. Various analysis tools such as trend analysis, fault-tree analysis,
Orthogonal Defect classification and formal methods, etc, can also be used to
minimize the possibility of defect occurrence after release and therefore improve
software reliability.
After deployment of the software product, field data can be gathered and analyzed to study the behavior of software defects in actual use.
Software Fault Tolerance
Software fault tolerance is the ability of software to detect and recover from a fault that is happening, or has already happened, in either the software or the hardware on which the software is running, so that the required service is still delivered. It is a necessary ingredient of highly available and reliable computing systems. To appreciate the techniques involved, one must first understand the nature of the problem that software fault tolerance is supposed to solve. Software faults are all design faults; software manufacturing, the reproduction of software, is considered to be perfect. The source of the problem being solely design faults is very different from almost any other system in which fault tolerance is a desired property. This inherent issue, that software faults are the result of human design errors, shapes the approaches described below.
Current software fault tolerance methods are based on traditional hardware fault tolerance. The deficiency with this approach is that traditional hardware fault tolerance was designed to conquer manufacturing faults primarily, and environmental and other faults secondarily. Design diversity was not a concept applied to the solutions to hardware fault tolerance; N-way redundant systems solved many single errors by replicating the same hardware. Software fault tolerance tries to apply the experience of hardware fault tolerance to solve a different problem, but by doing so it creates a need for design diversity: programmers must create software versions which have different enough designs that they do not share similar failure modes. Design diversity and independent failure modes have been shown to be a particularly difficult problem, though. The issue still remains that, for a complex problem, getting humans to solve that problem error-free is not easily achievable.
Simple replication is good for errors which are not caused by design faults; however, replicating a design fault in multiple places will not aid in complying with a specification. It is also important to note the emphasis placed on the specification as the final arbiter of what is an error and what is not. Design diversity increases pressure on the specification writers to produce equivalent variants of the specification, so that programmers can build functionally equivalent but differently designed algorithms for the necessary redundancy. The classical definition of fault tolerance itself may no longer be appropriate for the type of problems that current fault tolerance is trying to solve. Randell argues that the difference between fault tolerance and exception handling is that exception handling deviates from the specification, while fault tolerance attempts to keep providing service in compliance with it. In this view, fault-tolerant software will accomplish its task under adverse conditions, while robust software will be able to indicate a failure correctly (hopefully without the entire system failing).
Software Bugs
Software faults are most often caused by design faults. Design faults occur when a designer either misunderstands a specification or simply makes a mistake. Software faults are common for the simple reason that the complexity of modern systems is often pushed into the software part of the system. It is estimated that 60-90% of current computer errors are from software faults. Software faults may also be triggered by hardware; these faults are usually transitory in nature and can be masked by a combination of current software and hardware fault tolerance techniques.
Recovery Blocks
The recovery block method was developed by Randell from what was observed as somewhat current practice at the time. In the recovery block approach, the system view is broken down into fault-recoverable blocks, and the entire system is constructed of these fault-tolerant blocks. Each block contains at least a primary, a secondary and exceptional-case code, along with an adjudicator. The adjudicator is the component which determines the correctness of the various alternates to try.
The adjudicator should be kept simple in order to maintain execution speed and aid correctness. Upon first entering a unit, the adjudicator executes the primary alternate. (There may be N alternates in a unit which the adjudicator may try.) If the adjudicator determines that the primary block failed, it then rolls back the state of the system and tries the secondary alternate. If the adjudicator does not accept the results of any of the alternates, it then invokes the exception handler, which indicates the fact that the software could not perform the requested operation.
Recovery block operation still has the same dependency which most software fault tolerance systems have: design diversity. The recovery block method increases the pressure on the specification to be precise enough to allow several different alternates that are functionally the same. This issue is discussed further for the N-version method below. The recovery block system is also complicated by the fact that it requires the ability to roll back the state of the system after trying an alternate. This may be accomplished in a variety of ways, including hardware support for these operations. This try-and-rollback ability has the effect of making the software appear transactional: only an accepted result is committed to the system. There are advantages to a system built with a transactional nature, the largest of which is the difficulty of getting such a system into an incorrect or unstable state. This property, in combination with check-pointing and recovery, may aid in constructing a distributed, hardware-fault-tolerant system.
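The following is a minimal Python sketch of the recovery block idea described above. The alternate routines, the acceptance test and the dictionary-based state rollback are invented for illustration; they are not a production implementation.

import copy

# Sketch of a recovery block: try the primary alternate, check the result with
# an acceptance test (the adjudicator), roll back and try the next alternate on
# failure, and raise an exception if every alternate is rejected.

def recovery_block(state, alternates, acceptance_test):
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)      # save state so it can be rolled back
        try:
            result = alternate(state)
            if acceptance_test(result):        # the adjudicator accepts the result
                return result
        except Exception:
            pass                               # treat a crash as a rejected result
        state.clear()
        state.update(checkpoint)               # roll back to the checkpoint
    raise RuntimeError("recovery block failed: no alternate was accepted")

# Hypothetical usage: compute a square root with a faulty primary routine.
def primary(state):
    return state["x"] ** 0.5 - 1.0             # contains a design fault

def secondary(state):
    return state["x"] ** 0.5                   # independently designed alternate

accept = lambda result: abs(result * result - 16.0) < 1e-6
print(recovery_block({"x": 16.0}, [primary, secondary], accept))   # prints 4.0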
N-Version Software
The N-version method parallels the traditional hardware fault tolerance concept of N-way redundancy: each module is implemented in up to N different versions. Each variant accomplishes the same task, but hopefully in a different way. Each version then submits its answer to a voter or decider, which determines the correct answer and returns it as the result of the module. This system can hopefully overcome the design faults present in most software by relying upon design diversity. The system could even include multiple types of hardware running multiple versions of software; the goal is to increase the diversity in order to avoid common-mode failures. N-version software can only be successful, and successfully tolerate faults, if the required design diversity is met. The specification must be specific enough that the versions are fully interoperable, so that the decider may choose equally between them, but it cannot be so limiting that the programmers are not free to create diverse designs. The flexibility in the specification to encourage design diversity, yet maintain the compatibility between versions, is a difficult balance to strike.
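A minimal sketch of N-version operation with a majority-vote decider is shown below. The three versions and the voting rule are hypothetical and only illustrate the structure of the technique.

from collections import Counter

# Sketch of N-version programming: run N independently designed versions on the
# same input and let a decider (here a simple majority vote) pick the answer.

def n_version(versions, decider, *args):
    answers = [version(*args) for version in versions]
    return decider(answers)

def majority_vote(answers):
    value, count = Counter(answers).most_common(1)[0]
    if count <= len(answers) // 2:
        raise RuntimeError("decider could not find a majority")
    return value

# Three hypothetical versions of an "absolute value" module; version_c is faulty.
def version_a(x): return x if x >= 0 else -x
def version_b(x): return abs(x)
def version_c(x): return x            # design fault: forgets the negative case

print(n_version([version_a, version_b, version_c], majority_vote, -5))   # prints 5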
The differences between the recovery block method and the N-version method are not too numerous, but they are important. In traditional recovery blocks, each alternate is executed serially until an acceptable solution is found, as determined by the adjudicator; the recovery block method has been extended to include concurrent execution of the alternates, whereas the N-version method has always been designed for concurrent execution. Another important difference between the two methods is the difference between an adjudicator and the decider. The recovery block method requires that each module build a specific adjudicator; in the N-version method, a single decider may be used. The recovery block method, assuming that the programmer can create a sufficiently simple adjudicator, will create a system which is difficult to drive into an incorrect state. The engineering trade-offs of the two approaches differ, and it is important for the engineer to explore the design space to decide on what the best method is for the system at hand.
Reliability Growth Models
A reliability growth model predicts how software reliability should improve over time as faults are discovered and repaired. These models help the manager decide how much effort should be devoted to testing; the goal of the project manager is to test and debug the system until the required level of reliability is reached. There are various models which have been derived from reliability experiments and are fitted to testing data. The quality of their results depends upon the input data: the better the data, the better the outcome, and the more data points available, the better the model will perform. When using calendar time for large projects, you need to verify the homogeneity of the testing effort.
The Jelinski-Moranda model is based on the following assumptions:
1. At the beginning of testing, there are u0 faults in the software code, with u0 being an unknown but fixed number.
2. Each fault that causes a failure is removed instantaneously and without introducing any new fault into the software.
3. The program hazard rate after the removal of the (i−1)st fault is proportional to the number of faults remaining in the software, with the hazard rate of one fault, za(t) = φ, being the constant of proportionality:
z(ti) = φ [u0 − (i − 1)]    (1)
The Jelinski-Moranda model belongs to the binomial type of models. For these models, the failure intensity function is the product of the inherent number of faults and the probability density of the time until activation of a single fault, fa(t), i.e.:
dμ(t)/dt = u0 fa(t)    (2)
where, for the Jelinski-Moranda model, the per-fault activation time is exponentially distributed:
fa(t) = φ exp(−φ t)    (3)
It can easily be seen from equations (2) and (3) that the failure intensity can also be expressed as
dμ(t)/dt = φ [u0 − μ(t)]    (4)
i.e., it is proportional to the expected number of faults remaining in the software.
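As an illustration of equation (1), the short Python sketch below prints the Jelinski-Moranda hazard rate and the expected time to the next failure for the first few failures. The parameter values u0 and φ are assumed for illustration, not estimated from real failure data.

# Sketch of the Jelinski-Moranda hazard rate z_i = phi * (u0 - (i - 1)).
# The parameter values are illustrative only.

u0 = 50       # assumed number of faults present at the start of testing
phi = 0.02    # assumed per-fault hazard rate (failures per unit time per fault)

for i in range(1, 6):                      # the first five failures
    hazard = phi * (u0 - (i - 1))          # failure rate before the i-th failure
    print(f"before failure {i}: z = {hazard:.3f}, "
          f"expected time to next failure = {1.0 / hazard:.2f}")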
The Goel-Okumoto model is based on the following assumptions:
1. The number of failures experienced by time t follows a Poisson distribution with mean value function μ(t). This mean value function has the boundary conditions μ(0) = 0 and lim t→∞ μ(t) = N < ∞.
2. The expected number of software failures occurring in a small time interval is proportional to the expected number of undetected faults, N − μ(t); the constant of proportionality is φ.
3. For any finite collection of times t1 < t2 < · · · < tn the number of failures occurring in each of the disjoint intervals (0, t1), (t1, t2), ..., (tn−1, tn) is independent.
4. Whenever a failure has occurred, the fault that caused it is removed instantaneously and without introducing any new fault into the software.
Since each fault is perfectly repaired after it has caused a failure, the number of inherent faults in the software at the beginning of testing is equal to the number of failures that will have occurred after an infinite amount of testing. According to assumption 1, this number follows a Poisson distribution with expected value N. This expected (rather than fixed) number of initial faults stands in contrast to the fixed but unknown actual number of initial software faults u0 in the Jelinski-Moranda model; indeed, this is the main difference between the two models.
Just like in the Jelinski-Moranda model, the failure intensity is the product of the constant hazard rate of an individual fault and the expected number of faults remaining in the software:
dμ(t)/dt = φ [N − μ(t)]
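The sketch below evaluates the Goel-Okumoto mean value function μ(t) = N (1 − exp(−φ t)) and the corresponding failure intensity for assumed values of N and φ.

import math

# Sketch of the Goel-Okumoto mean value function and failure intensity.
# Parameter values are illustrative only.

N, phi = 50.0, 0.02    # expected initial faults and per-fault hazard rate

def mu(t):
    return N * (1.0 - math.exp(-phi * t))

def intensity(t):
    return phi * (N - mu(t))          # equals N * phi * exp(-phi * t)

for t in (0, 25, 50, 100):
    print(f"t = {t:3}: expected failures = {mu(t):5.2f}, "
          f"failure intensity = {intensity(t):.3f}")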
Musa's basic execution time model is based on an execution time model, i.e., the
time taken during modeling is the actual CPU execution time of the software
being modeled. This model is simple to understand and apply, and its predictive
value has been generally found to be good. The model focuses on failure
intensity while modeling reliability. It assumes that the failure intensity decreases
with time, that is, as (execution) time increases, the failure intensity decreases.
This assumption is generally true as the following is assumed about the software
testing activity, during which data is being collected: during testing, if a failure is
observed, the fault that caused that failure is detected and the fault is removed.
Even if a specific fault removal action might be unsuccessful, overall failures lead to faults being removed, so the failure intensity keeps decreasing as testing proceeds.
In the basic model, it is assumed that each failure causes the same amount of
decrement in the failure intensity. That is, the failure intensity decreases with a
constant rate with the number of failures. (In the more sophisticated logarithmic model of Musa and Okumoto, each successive failure causes a smaller decrement in the failure intensity.) Both models explicitly require that the time measurements be the actual CPU time utilized in executing the application under test.
Although it was not originally formulated like that, the model can be classified by three characteristics:
¾ the number of failures that can be experienced in infinite time is finite;
¾ the distribution of the number of failures observed by time t is of Poisson type;
¾ the functional form of the failure intensity in terms of time is of the exponential type.
It shares these attributes with the Goel-Okumoto model, and the two models are mathematically equivalent. The main difference lies in the interpretation of the constant per-fault hazard rate φ. Musa split φ up into two constant factors, the linear execution frequency f and the so-called fault exposure ratio K:
dμ(t)/dt = f K [N − μ(t)]
The linear execution frequency is the average object-instruction execution rate of the computer, r, divided by the number of source code instructions of the application under test, IS, times the average number of object instructions per source code instruction, Qx: f = r / (IS Qx). The fault exposure ratio relates the fault velocity f [N − μ(t)], the speed with which defective parts of the code would be passed if all the statements were executed consecutively, to the failure intensity experienced; it can therefore be interpreted as the average number of failures occurring per fault remaining in the code during one linear execution of the program.
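A small sketch of this parameterization is given below; every numeric value (r, IS, Qx, K and N) is an assumed, illustrative figure rather than a measured one.

import math

# Sketch of Musa's basic execution time model with the per-fault hazard rate
# split into the linear execution frequency f = r / (IS * Qx) and the fault
# exposure ratio K. All parameter values below are illustrative assumptions.

r = 4.0e6       # object instructions executed per CPU second (assumed)
IS = 20_000     # source instructions in the program under test (assumed)
Qx = 4.0        # object instructions per source instruction (assumed)
K = 4.2e-7      # fault exposure ratio (assumed)
N = 120         # expected number of inherent faults (assumed)

f = r / (IS * Qx)          # linear execution frequency (program executions / s)
phi = f * K                # equivalent per-fault hazard rate

def mu(tau):               # expected failures after tau seconds of CPU time
    return N * (1.0 - math.exp(-phi * tau))

for tau in (0, 1_000, 10_000, 100_000):
    lam = f * K * (N - mu(tau))        # failure intensity d mu / d tau
    print(f"tau = {tau:>7} s: mu = {mu(tau):6.2f}, lambda = {lam:.5f} failures/s")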
Markov Models
Markov modeling provides another way of analyzing the reliability and availability of a system. The analysis yields results for both the time-dependent evolution of the system and the steady state of the system. For example, in reliability engineering the operation of the system can be described by a state diagram, which represents the states and rates of a dynamic system. This diagram consists of nodes (representing the possible states of the system) connected by arrows (representing the rate at which the system transitions from one state to another). Markov models are used to compute various performance measures describing the operation of the system. These performance measures include the following:
¾ System reliability.
¾ Availability.
¾ Mean time to repair (maintainability).
¾ The average number of visits to a given state within a given time period.
The name Markov model is derived from one of the assumptions which allows this system to be analyzed, namely the Markov property. The Markov property states: given the current state of the system, the future evolution of the system is independent of its past history. The assumptions of the Markov model may be relaxed, and the model may be generalized to cases in which the transition rates change with time, as may be the case with a mechanical system. For example, the mechanical wear of a component can be modeled by a non-homogeneous Markov process, with the transition rates being time-dependent. Markov models can also be extended to handle other time-dependent events.
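As a simple illustration, the sketch below analyzes a two-state (working/failed) Markov model with constant failure and repair rates, giving both the steady-state availability and the time-dependent probability of being in the working state. The rates are assumed values.

import math

# Sketch of a two-state Markov availability model: state "up" (working) and
# state "down" (failed), with constant failure rate lam and repair rate mu.
# The rates below are illustrative assumptions.

lam = 1.0 / 1000.0     # failures per hour  (MTTF = 1000 h)
mu = 1.0 / 10.0        # repairs per hour   (MTTR = 10 h)

# Steady-state availability from the balance equation lam * P_up = mu * P_down.
steady_state_availability = mu / (lam + mu)

# Time-dependent probability of being "up", starting from the working state:
# P_up(t) = mu/(lam+mu) + lam/(lam+mu) * exp(-(lam+mu) * t)
def p_up(t_hours):
    return (mu + lam * math.exp(-(lam + mu) * t_hours)) / (lam + mu)

print(f"steady-state availability = {steady_state_availability:.4f}")
for t in (1, 10, 100):
    print(f"P(up after {t:3} h) = {p_up(t):.4f}")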
11.3 Summary
Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment.
Reliability of the software depends on the faults in the software. To assess the
reliability of software, reliability models are required. To use the model, data is
collected about the software. Most reliability models are based on the failure data collected during the testing of the software.
Software reliability modeling has matured to the point that meaningful results can be obtained by applying suitable models to the problem. Many models exist, but no single model can capture all of the necessary characteristics of the software being modeled.
Development process, faults and failures found are all factors related to software
reliability.
Software reliability improvement is hard. The difficulty of the problem stems from an insufficient understanding of software reliability and, in general, of the characteristics of software. Until now there is no good way to conquer the complexity problem of software: complete testing of a moderately complex software module is infeasible, and defect-free software cannot be assured. In addition, the realistic constraints of time and budget severely limit the effort put into software reliability improvement.
As more and more software is creeping into embedded systems, we must make
sure they don't embed disasters. If not considered carefully, software reliability
can be the reliability bottleneck of the whole system. Ensuring software reliability
is no easy task. As hard as the problem is, promising progress is still being made toward more reliable software, as more standard components and better processes are introduced into software engineering practice.
11.4 Keywords
Reliability growth model: A model which predicts how software reliability should improve over time as faults are discovered and repaired.
MTBF (Mean Time Between Failures): It is defined as the number of hours that pass before a component, assembly, or system fails.
POFOD: It is defined as the probability that the system will fail when a service is
requested.
[2] Differentiate between fault, error and failure. Does testing observe faults or
failures?
[4] What are the assumptions made in the Jelinski-Moranda model? Explain the J-M model.
[5] What is the difference between software reliability and hardware reliability?
Explain.
[6] What do you understand by software fault tolerance? Discuss the recovery block method.
[8] What are the differences between the JM model, the GO model, and Musa’s basic execution time model?
12.0 Objectives
The objective of this lesson is to make the students familiar with object oriented
design. Earlier in chapter 6 and 7 function oriented design was discussed. This
chapter is intended to impart the knowledge of object modeling, functional
modeling, and dynamic modeling. The important objective of this lesson is to get
the student acquainted with OMT, a methodology for object oriented design.
12.1 Introduction
Figure: OMT notation for an object, a rounded box containing the class name in parentheses, (ClassName), above the object name.
12.2.2.3 Derived object: It is defined as a function of one or more objects. It is
completely determined by the other objects. Derived object is redundant but can
be included in the object model.
12.2.2.4 Derived attribute: A derived attribute is that which is derived from other
attributes. For example, age can be derived from date of birth and current date.
12.2.2.5 Class: A class describes a group of objects with similar properties,
operations and relationships to other objects. Classes are represented by the
rectangular symbol and may be divided into three parts. The top part contains the
name of the class, middle part attributes and bottom part operations. An attribute
is a data value held by the objects in a class. For example, person is a class;
Mayank is an object while name, age, and sex are its attributes. Operations are
functions or transformations that may be applied to or by objects in a class. For
example push and pop are operations in stack class.
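To connect the class concept with code, here is a minimal Python sketch of the stack class mentioned above; the attribute and method bodies are illustrative. The general OMT class-box notation follows the sketch.

# Sketch of a class with an attribute and operations: a Stack class whose
# attribute is the list of stored items and whose operations are push and pop.

class Stack:
    def __init__(self):
        self.items = []            # attribute: the values currently held

    def push(self, value):         # operation: add a value on top
        self.items.append(value)

    def pop(self):                 # operation: remove and return the top value
        return self.items.pop()

s = Stack()        # s is an object (an instance) of the class Stack
s.push(10)
s.push(20)
print(s.pop())     # prints 20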
ClassName
Attribute-name1 : data-type1 = default-val1
Attribute-name2 : data-type2 = default-val2
Operation-name1(arguments1) : result-type1
Operation-name2(arguments2) : result-type2
12.2.2.6 Links and Associations: A link is a physical or conceptual connection
between object instances. For example Mayank flies Jaguar. So ‘flies’ is a link
between Mayank and Jaguar.
An association describes a group of links with common structure and common
semantics. For example a pilot flies an airplane. So here ‘flies’ is an association
between pilot and airplane. All the links in an association connect objects from
the same classes.
Associations are bidirectional in nature. For example, a pilot flies an airplane or
an airplane is flown by a pilot.
Associations may be binary, ternary or higher order. For example, ‘a programmer develops a project in a programming language’ represents a ternary association among programmer, project and programming language. Links and associations are represented by a line between the objects or classes, labelled with the association name (such as ‘drives’), as shown in the diagram below:
12.2.2.7 Multiplicity: It specifies how many instances of one class may relate to
a single instance of an associated class. Multiplicity is described in the following
manner:
¾ Line without any ball indicates one-to-one association.
¾ Hollow ball indicates zero or one.
¾ Solid ball indicates zero, one or more.
¾ 1,2,6 indicates 1 or 2 or 6.
¾ 1+ indicates 1 or more
12.2.2.8 Link attributes: It is a property of the links in an association. For
example, ‘accessible by’ is an association between class File and class User.
‘Access permission’ is a link attribute.
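A small Python sketch of this example is given below. The link attribute is held by a separate Access object representing one link of the association; everything beyond the class and attribute names taken from the text is assumed for illustration.

from dataclasses import dataclass, field
from typing import List

# Sketch of the File / User association "accessible by" with the link attribute
# "access permission" stored on the link itself, not on File or User.

@dataclass
class User:
    name: str

@dataclass
class File:
    name: str
    accesses: List["Access"] = field(default_factory=list)

@dataclass
class Access:                 # one link of the association, with its attribute
    user: User
    file: File
    permission: str           # link attribute: e.g. "read" or "read-write"

mayank = User("Mayank")
report = File("report.txt")
link = Access(mayank, report, permission="read")
report.accesses.append(link)

print(f"{link.user.name} may {link.permission} {link.file.name}")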
Figure: the ‘accessible by’ association between File and User, with ‘access permission’ as a link attribute.
Figure: role names on a ‘works for’ association.
Figure: an association constrained as {Ordered}: the windows ‘visible on’ a screen.
Figure: a qualified association.
12.2.2.12 Aggregation: Aggregation is a form of association. It is the “part-
whole” or “a-part-of” relationship in which objects representing the component of
something are associated with an object representing the entire assembly. A
hollow diamond is attached to the end of the path to indicate the aggregation. For
example, a team is aggregation of players.
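The sketch below shows this aggregation in Python; the Team object merely refers to Player objects that exist independently of it. The names are illustrative.

# Sketch of aggregation ("a-part-of"): a Team is an aggregate of Player objects.
# The players exist independently of the team that refers to them.

class Player:
    def __init__(self, name):
        self.name = name

class Team:
    def __init__(self, name, players):
        self.name = name
        self.players = list(players)   # the "whole" refers to its "parts"

p1, p2 = Player("Anu"), Player("Ravi")
team = Team("Tigers", [p1, p2])
print([p.name for p in team.players])  # prints ['Anu', 'Ravi']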
Figure: aggregation, with a Team as an aggregation of Players.
Figure: recursive aggregation, with a Program composed of Blocks, and a Block composed of compound statements and simple statements (the aggregation is recursive).
Figure: multiple inheritance, among Vehicle, Boat, House and BoatHouse; the class BoatHouse inherits from both Boat and House.
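A minimal Python sketch of multiple inheritance matching the figure; the method names are invented for illustration.

# Sketch of multiple inheritance: BoatHouse inherits features from both Boat
# and House.

class Boat:
    def float_on_water(self):
        return "floats"

class House:
    def shelter(self):
        return "shelters people"

class BoatHouse(Boat, House):   # inherits from more than one superclass
    pass

bh = BoatHouse()
print(bh.float_on_water(), "and", bh.shelter())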
12.2.2.15 Metadata: Metadata is data about data. For example, the definition of
a class is metadata. Models, catalogs, blueprints, dictionary etc. are all examples
of metadata.
12.2.2.16 Grouping Constructs: There are two grouping constructs: module
and sheet.
Module is logical construct for grouping classes, associations and
generalizations. An object model consists of one or more modules. The module
name is usually listed at the top of each sheet.
A sheet is a single printed page. Sheet is the mechanism for breaking a large
object model into a series of pages. Each module is contained in one or more
sheets. Sheet numbers or sheet names inside circle contiguous to a class box
indicate other sheets that refer to a class.
12.2.3 Dynamic Modeling
Figure: a state in a state diagram is drawn as a rounded box containing the state name.
12.2.4 Functional Modeling
The functional model is represented by data flow diagrams.
Processes: A process transforms input data values into output data values. It is drawn as an ellipse containing its name; for example, a Multiply process takes a Multiplier as input and produces a Product.
Actors: An actor is an active object that drives the data flow diagram by
producing or consuming values. Actors are attached to the inputs and outputs of
a data flow diagram. Actors are also called as terminators as they act as source
and sink for data. An actor is represented by rectangle.
Figure: actors in a data flow diagram (labels in the original figure: select, name, Customer, update, request).
Data Stores: It stores data for later use. It does not generate any operation on its
own but can respond to request. So it is a passive object in a data flow diagram.
It is represented by a pair of parallel lines containing the name of store as shown
in figure below. Input arrow indicates storing data in the data store and output
arrow indicates accessing of data from data store.
Figure: a data store in a data flow diagram (labels in the original figure: Customer, name, request balance, update).
Data Flows: A data flow connects the output of an object or process to the input
of another object or process. An arrow between the producer and the consumer
of the data value represents a data flow. Arrow is labeled with description of data.
Sometimes an aggregate value is split into its constituents, each of which goes to
a different process. A fork in the path as shown below can show this.
Figure: a fork splitting the aggregate value Address into its Street, City and State components.
Figure: a control flow example (labels in the original figure: Amount, Customer, Update, Cash).
Figure: building the models during analysis; inputs include the problem statement, user interviews and domain knowledge (other labels in the original figure: managers, developers).
12.2.5.1.1 Object Model
It consists of object diagrams and a data dictionary. The following guidelines are used to construct the object model:
¾ Identify objects and classes
12.3 Summary
This chapter focused on how a software system can be designed using objects and classes, while in chapter 6 and chapter 7 the focus was on function oriented design. In the object oriented approach, an object is the basic design unit. During design the classes for the objects are identified. A class represents the type for the object and defines the possible state space for the objects of that class, together with the operations that can be applied to them. Objects do not exist in isolation but are related to each other, and one of the goals of design is to identify the relationships between the objects of different classes.
The structural, dynamic and functional behaviors of the system are described by the object model, the dynamic model and the functional model. The object model describes the static, structural and data aspects of a system. The dynamic model describes the temporal, behavioral and control aspects, while the functional model describes the transformational aspects. The OMT methodology first builds the object model for the system, and then refines it through dynamic and functional modeling.
Inheritance: a mechanism that allows the definition of one class based on the definition of an existing class. If a class inherits features from more than one superclass, it is called multiple inheritance.
Dynamic model: describes those aspects of the system that change with time.
6. What is the difference between object model, dynamic model and functional
model?
construct it?
8. What do you understand by a DFD? What are the different symbols used to construct a DFD? Explain with examples.
aggregation.
11. If an association between classes has some attributes of its own, how will you represent these attributes in the object model?