
Cocomo (Constructive Cost Model)

✓ COCOMO is based on LOC, i.e., the number of Lines of Code.


✓ It was proposed by Barry Boehm in 1981 and is based on a study of 63
projects, which makes it one of the best-documented models.
The key parameters that define the quality of any software product, and which
are also an outcome of COCOMO, are primarily Effort and Schedule:
• Effort
• Schedule
Boehm’s definition of organic, semidetached, and embedded systems:
• Organic
• Semi-detached
• Embedded

Types of Models: COCOMO consists of a hierarchy of three increasingly
detailed and accurate forms. Any of the three forms can be adopted according
to our requirements. These are the types of COCOMO model:
1. Basic COCOMO Model
2. Intermediate COCOMO Model
3. Detailed COCOMO Model
Estimation of Effort: Calculations –
1. Basic Model –

   E = a * (KLOC)^b        (Effort in Person-Months)
   T = c * (E)^d           (Development Time in Months)
   P = E / T               (Persons Required)

The above formulas are used for cost estimation in the Basic COCOMO Model,
and are also used in the subsequent models. The constant values a, b, c, and d for
the Basic Model for the different categories of system are:

Software Projects     a      b      c      d

Organic               2.4    1.05   2.5    0.38
Semi-detached         3.0    1.12   2.5    0.35
Embedded              3.6    1.20   2.5    0.32

The effort is measured in Person-Months and, as evident from the formula, is
dependent on Kilo-Lines of Code (KLOC).
The development time is measured in Months.
These formulas are used as such in the Basic Model calculations.
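
As a concrete illustration, here is a minimal Python sketch that applies the Basic Model formulas with the constant table above. The 32-KLOC project size is an arbitrary example value, not one taken from the notes.

# Basic COCOMO estimation: a minimal sketch using the constants above.
# E = a * (KLOC)^b  (person-months), T = c * (E)^d  (months), P = E / T.

COEFFICIENTS = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, category):
    a, b, c, d = COEFFICIENTS[category]
    effort = a * kloc ** b     # person-months
    time = c * effort ** d     # development time in months
    staff = effort / time      # average persons required
    return effort, time, staff

# Example: a 32-KLOC organic project (illustrative size only).
e, t, p = basic_cocomo(32, "organic")
print(f"Effort: {e:.1f} PM, Time: {t:.1f} months, Staff: {p:.1f}")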
THE ECONOMICS OF SOFTWARE MAINTENANCE
IN THE TWENTY-FIRST CENTURY

Version 3 – February 14, 2006

Abstract

All large companies utilize software in significant amounts. Some companies exceed
1,000,000 function points in the total volume of their corporate software portfolios.
Much of this software is now more than 10 years old, and some applications are more
than 25 years old. Maintenance of aging software tends to become more difficult year by
year since updates gradually destroy the original structure of the applications.

Starting at the end of the twentieth century a series of enormous maintenance problems
began to occur. The first of these problems consisted of the software updates necessary to
support the unified European currency or Euro. The second problem consisted of the
software updates to repair or minimize the impact of the Year 2000 software bug in
existing portfolios. Two similar problems that will occur later in the century will be the
need to add digits to U.S. telephone numbers and to add digits to social security numbers.

The resources devoted to the Euro and Y2K problems caused delays in many other
projects. Mass-update and other maintenance projects will potentially absorb almost 70%
of the world’s software professionals during much of the 21st century. Mass-update
software projects could top five trillion dollars in overall costs before the middle of the
twenty-first century. It is obvious that better maintenance tools and technologies are an
urgent global priority.

Capers Jones, Chief Scientist Emeritus
Software Productivity Research, Inc.

Email CJones@SPR.com
Web http://www.spr.com

Copyright 1998-2006 by Capers Jones. All Rights Reserved.
THE ECONOMICS OF SOFTWARE MAINTENANCE
IN THE TWENTY-FIRST CENTURY

INTRODUCTION

As the twenty-first century advances, more than 50% of the global software population is
engaged in modifying existing applications rather than writing new applications. This
fact by itself should not be a surprise: whenever an industry has more than 50 years of
product experience, the personnel who repair existing products tend to outnumber the
personnel who build new products. For example, there are more automobile mechanics
in the United States who repair automobiles than there are personnel employed in
building new automobiles.

At the end of the twentieth century, software maintenance grew rapidly during 1997-2000
under the impact of two “mass updates” that between them required modifications to
about 85% of the world’s supply of existing software applications.

The first of these mass updates was the set of changes needed to support the new unified
European currency, or Euro, which rolled out in January of 1999. About 10% of the total
volume of world software needed to be updated in support of the Euro. However, within
the European Monetary Union, at least 50% of the information systems required
modification in support of the Euro.

The second mass-update to software applications was the “Y2K” or year 2000 problem.
This widely discussed problem was caused by the use of only two digits for storing
calendar dates. Thus the year 1998 would have been stored as 98. When the century
ended, the use of 00 for the year 2000 would violate normal sorting rules and hence cause
many software applications to fail or to produce incorrect results unless updated.

The year 2000 problem affected as many as 75% of the installed software applications
operating throughout the world. Unlike the Euro, the year 2000 problem also affected
some embedded computers inside physical devices such as medical instruments,
telephone switching systems, oil wells, and electric generating plants.

Although these two problems were taken care of, the work required for handling them
triggered delays in other kinds of software projects and hence made software backlogs
larger than normal.

Under the double impact of the Euro conversion work and year 2000 repair work, it
appeared that more than 65% of the world’s professional software engineering population
was engaged in various maintenance and enhancement activities during 1999 and 2000.

Although the Euro and the Y2K problem are behind us, they are not the only mass-update
problems that we will face. For example, it may be necessary to add one or more digits to
U.S. telephone numbers by about the year 2015. The UNIX calendar expires in the year
2038 and could be as troublesome as the year 2000 problem. On an even larger scale, it
may be necessary to add at least one digit to U.S. social security numbers by about the
year 2050.

The imbalance between software development and maintenance is opening up new
business opportunities for software outsourcing groups. It is also generating a significant
burst of research into tools and methods for improving software maintenance
performance.

What is Software Maintenance?

The word “maintenance” is surprisingly ambiguous in a software context. In normal
usage it can span some 21 forms of modification to existing applications. The two most
common meanings of the word maintenance are: 1) defect repairs; 2) enhancements,
or adding new features to existing software applications.

Although software enhancements and software maintenance in the sense of defect repairs
are usually funded in different ways and have quite different sets of activity patterns
associated with them, many companies lump these disparate software activities together
for budgets and cost estimates.

The author does not recommend the practice of aggregating defect repairs and
enhancements, but this practice is very common. Consider some of the basic differences
between enhancements or adding new features to applications and maintenance or defect
repairs as shown in table 1:

Table 1: Key Differences Between Maintenance and Enhancements

                          Enhancements        Maintenance
                          (New features)      (Defect repairs)

Funding source            Clients             Absorbed
Requirements              Formal              None
Specifications            Formal              None
Inspections               Formal              None
User documentation        Formal              None
New function testing      Formal              None
Regression testing        Formal              Minimal

Because the general topic of “maintenance” is so complicated and includes so many
different kinds of work, some companies merely lump all forms of maintenance together
and use gross metrics such as the overall percentage of annual software budgets devoted
to all forms of maintenance summed together.

This method is crude, but it can convey useful information. Organizations which are
proactive in using geriatric tools and services can spend less than 30% of their annual
software budgets on various forms of maintenance, while organizations that have not
used any of the geriatric tools and services can spend more than 60% of their annual
budgets on various forms of maintenance.

Although the use of the word “maintenance” as a blanket term for more than 20 kinds of
update activity is not very precise, it is useful for overall studies of national software
populations. Table 2 shows the estimated software population of the United States
between 1950 and 2025, divided into “development” and “maintenance” segments.

In this table the term “development” implies creating brand new applications or adding
major new features to existing applications. The term “maintenance” implies fixing bugs
or errors, mass updates such as the Euro and Year 2000, statutory or mandatory changes
such as rate changes, and minor augmentation such as adding features that require less
than a week of effort.

Table 2: U.S. Software Populations in Development and Maintenance

Year    Development    Maintenance    Total        Maintenance
        Personnel      Personnel      Personnel    Percent

1950    1,000          100            1,100        9.09%
1955    2,500          250            2,750        9.09%
1960    20,000         2,000          22,000       9.09%
1965    50,000         10,000         60,000       16.67%
1970    125,000        25,000         150,000      16.67%
1975    350,000        75,000         425,000      17.65%
1980    600,000        300,000        900,000      33.33%
1985    750,000        500,000        1,250,000    40.00%
1990    900,000        800,000        1,700,000    47.06%
1995    1,000,000      1,100,000      2,100,000    52.38%
2000    750,000        2,000,000      2,750,000    72.73%
2005    775,000        2,500,000      3,275,000    76.34%
2010    800,000        3,000,000      3,800,000    78.95%
2015    1,000,000      3,500,000      4,500,000    77.78%
2020    1,100,000      3,750,000      4,850,000    77.32%
2025    1,250,000      4,250,000      5,500,000    77.27%

Notice that under the double impact of the Euro and the Year 2000, so many development
projects were delayed or cancelled that the population of software developers in the
United States actually shrank below the peak year of 1995. The burst of mass-update
maintenance work is one of the main reasons why there is such a large shortage of
software personnel.

As can be seen from table 2, the work of fixing errors and dealing with mass updates to
aging legacy applications has become the dominant form of software engineering. This
tendency will continue indefinitely so long as maintenance work remains labor-intensive.

Before proceeding, let us consider 21 discrete topics that are often coupled together under
the generic term “maintenance” in day-to-day discussions, but which are actually quite
different in many important respects:

Table 3: Major Kinds of Work Performed Under the Generic Term “Maintenance”

1. Major Enhancements (new features of > 20 function points)
2. Minor Enhancements (new features of < 5 function points)
3. Maintenance (repairing defects for goodwill)
4. Warranty repairs (repairing defects under formal contract)
5. Customer support (responding to client phone calls or problem reports)
6. Error-prone module removal (eliminating very troublesome code segments)
7. Mandatory changes (required or statutory changes)
8. Complexity analysis (quantifying control flow using complexity metrics)
9. Code restructuring (reducing cyclomatic and essential complexity)
10. Optimization (increasing performance or throughput)
11. Migration (moving software from one platform to another)
12. Conversion (Changing the interface or file structure)
13. Reverse engineering (extracting latent design information from code)
14. Reengineering (transforming legacy application to client-server form)
15. Dead code removal (removing segments no longer utilized)
16. Dormant application elimination (archiving unused software)
17. Nationalization (modifying software for international use)
18. Year 2000 Repairs (date format expansion or masking)
19. Euro-currency conversion (adding the new unified currency to financial applications)
20. Retirement (withdrawing an application from active service)
21. Field service (sending maintenance members to client locations)

Although the 21 maintenance topics are different in many respects, they all have one
common feature that makes a group discussion possible: They all involve modifying an
existing application rather than starting from scratch with a new application.

Although the 21 forms of modifying existing applications have different reasons for being
carried out, it often happens that several of them take place concurrently. For example,
enhancements and defect repairs are very common in the same release of an evolving
application. There are also common sequences or patterns to these modification
activities. For example, reverse engineering often precedes reengineering and the two
occur so often together as to almost comprise a linked set. For releases of large
applications and major systems, the author has observed from six to 10 forms of
maintenance all leading up to the same release!

Nominal Default Values for Maintenance and Enhancement Activities

The nominal default values for exploring these 21 kinds of maintenance are shown in
table 4. However, each of the 21 has a very wide range of variability and reacts to a
number of different technical factors, and also to the experience levels of the maintenance
personnel. Let us consider some generic default estimating values for these various
maintenance tasks using two useful metrics: “assignment scopes” and “production rates.”

The term “assignment scope” refers to the amount of software one programmer can keep
operational in the normal course of a year, assuming routine defect repairs and minor
updates. Assignment scopes are usually expressed in terms of function points and the
observed range is from less than 300 function points to more than 5,000 function points.

The term “production rate” refers to the number of units that can be handled in a standard
time period such as a work month, work week, day, or hour. Production rates are usually
expressed in terms of either “function points per staff month” or the similar and
reciprocal metric, “work hours per function point.”

We will also include “Lines of code per staff month” with the caveat that the results are
merely based on an expansion of 100 statements per function point, which is only a
generic value and should not be used for serious estimating purposes.
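
Because the three production-rate metrics are mutually convertible, the relationships can be made explicit in a short Python sketch. The figure of roughly 132 work hours per staff month is inferred from the table’s reciprocal values (e.g., 25 function points per month versus 5.28 hours per function point), and the 100-LOC-per-function-point expansion is the generic value mentioned above; both are assumptions, not universal constants.

# Converting between the three production-rate metrics used in table 4.
# ASSUMPTIONS (inferred from the surrounding text, not universal constants):
#   - roughly 132 effective work hours per staff month, implied by the
#     table's reciprocals (e.g., 25 FP/month <-> 5.28 work hours per FP)
#   - a generic expansion of 100 logical source statements per function point

WORK_HOURS_PER_MONTH = 132.0
LOC_PER_FUNCTION_POINT = 100

def rates_from_fp_per_month(fp_per_month):
    hours_per_fp = WORK_HOURS_PER_MONTH / fp_per_month
    loc_per_month = fp_per_month * LOC_PER_FUNCTION_POINT
    return hours_per_fp, loc_per_month

# Example: minor enhancements at 25 function points per staff month.
hours_per_fp, loc_per_month = rates_from_fp_per_month(25)
print(f"{hours_per_fp:.2f} work hours per function point")   # ~5.28
print(f"{loc_per_month:,} LOC per staff month")              # 2,500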

Table 4: Default Values for Maintenance Assignment Scopes and Production Rates

                                Assignment     Production      Production         Production
                                Scopes         Rates           Rates              Rates
                                in Function    (Funct. Pts.    (Work Hours        (LOC per
                                Points         per Month)      per Funct. Pt.)    Staff Month)

Customer support                5,000          3,000           0.04               300,000
Code restructuring              5,000          1,000           0.13               100,000
Complexity analysis             5,000          500             0.26               50,000
Reverse engineering             2,500          125             1.06               12,500
Retirement                      5,000          100             1.32               10,000
Field service                   10,000         100             1.32               10,000
Dead code removal               750            35              3.77               3,500
Enhancements (minor)            75             25              5.28               2,500
Reengineering                   500            25              5.28               2,500
Maintenance (defect repairs)    750            25              5.28               2,500
Warranty repairs                750            20              6.60               2,000
Migration to new platform       300            18              7.33               1,800
Enhancements (major)            125            15              8.80               1,500
Nationalization                 250            15              8.80               1,500
Conversion to new interface     300            15              8.80               1,500
Mandatory changes               750            15              8.80               1,500
Performance optimization        750            15              8.80               1,500
Year 2000 repairs               2,000          15              8.80               1,500
Euro-currency conversion        1,500          15              8.80               1,500
Error-prone module removal      300            12              11.00              1,200

Average                         2,080          255             5.51               25,450

Each of these forms of modification or support activity has wide variations, but these
nominal default values at least show the ranges of possible outcomes for all of the major
activities associated with support of existing applications.

Table 5 shows some of the factors and ranges that are associated with assignment scopes,
or the amount of software that one programmer can keep running in the course of a
typical year.

In table 5 the term “experienced staff” means that the maintenance team has worked on
the applications being modified for at least six months and is quite familiar with the
available tools and methods.

The term “good structure” means that the application adheres to the basic tenets of
structured programming; has clear and adequate comments; and has cyclomatic
complexity levels that are below a value of 10.

The term “full maintenance tools” implies the availability of most of these common
forms of maintenance tools: 1) Defect tracking and routing tools; 2) Change control
tools; 3) Complexity analysis tools; 4) Code restructuring tools; 5) Reverse engineering
tools; 6) Reengineering tools; 7) Maintenance “workbench” tools; 8) Test coverage
tools.

The term “high level language” implies a fairly modern programming language that
requires less than 50 statements to encode 1 function point. Examples of such languages
include most object-oriented languages such as Smalltalk, Eiffel, and Objective C.

By contrast, “low-level languages” implies languages requiring more than 100 statements
to encode 1 function point. Obviously assembly language would be in this class, since it
usually takes from 200 to 300 assembly statements per function point. Other
languages that top 100 statements per function point include many mainstream languages
such as C, Fortran, and COBOL.

In between the high-level and low-level ranges are a variety of mid-level languages that
require roughly 70 statements per function point, such as Ada83, PL/I, and Pascal.

The variations in maintenance assignment scopes are significant in understanding why so
many people are currently engaged in maintenance of aging legacy applications. If a
company owns a portfolio of 100,000 function points maintained by generalists, many
more people will be required than if maintenance specialists are used. If the portfolio
consists of poorly structured code written in low-level languages, then the assignment
scope might be less than 500 function points, implying a staff of 200 maintenance
personnel.

If the company has used complexity analysis tools and code restructuring tools, and has a
staff of highly trained maintenance specialists, then the maintenance assignment scope
might top 3,000 function points. This implies that only 33 maintenance experts are
needed, as opposed to 200 generalists. Table 5 illustrates how maintenance assignment
scopes vary in response to four different factors, when each factor switches from “worst
case” to “best case.” Table 5 assumes Version 4.1 of the International Function Point
Users Group (IFPUG) counting practices manual.
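
The staffing arithmetic in the preceding paragraphs reduces to a single division; here is a minimal Python sketch using the 100,000-function-point portfolio from the text.

# Maintenance staffing is portfolio size divided by assignment scope.
def maintenance_staff(portfolio_fp, assignment_scope_fp):
    # The text quotes rounded headcounts (e.g., "only 33 maintenance experts").
    return round(portfolio_fp / assignment_scope_fp)

portfolio = 100_000  # function points, from the example in the text

print(maintenance_staff(portfolio, 500))    # 200 generalists, poorly structured code
print(maintenance_staff(portfolio, 3_000))  # 33 trained maintenance specialists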

Table 5: Variations in Maintenance Assignment Scopes Based on Four Key Factors
(Data expressed in terms of function points per maintenance team member)

                           Worst     Average    Best
                           Case      Case       Case

Inexperienced staff         100       200        350
Poor structure
Low-level language
No maintenance tools

Inexperienced staff         150       300        500
Poor structure
High-level language
No maintenance tools

Inexperienced staff         225       400        600
Poor structure
Low-level language
Full maintenance tools

Inexperienced staff         300       500        750
Good structure
Low-level language
No maintenance tools

Experienced staff           350       575        900
Poor structure
Low-level language
No maintenance tools

Inexperienced staff         450       650      1,100
Good structure
High-level language
No maintenance tools

Inexperienced staff         575       800      1,400
Good structure
Low-level language
Full maintenance tools

Experienced staff           700     1,100      1,600
Good structure
Low-level language
No maintenance tools

Inexperienced staff         900     1,400      2,100
Poor structure
High-level language
Full maintenance tools

Experienced staff         1,050     1,700      2,400
Poor structure
Low-level language
Full maintenance tools

Experienced staff         1,150     1,850      2,800
Poor structure
High-level language
No maintenance tools

Experienced staff         1,600     2,100      3,200
Good structure
High-level language
No maintenance tools

Inexperienced staff       1,800     2,400      3,750
Good structure
High-level language
Full maintenance tools

Experienced staff         2,100     2,800      4,500
Poor structure
High-level language
Full maintenance tools

Experienced staff         2,300     3,000      5,000
Good structure
Low-level language
Full maintenance tools

Experienced staff         2,600     3,500      5,500
Good structure
High-level language
Full maintenance tools

Average                   1,022     1,455      2,278
None of the values in table 5 are sufficiently rigorous by themselves for formal cost
estimates, but they are sufficient to illustrate some of the typical trends in various kinds of
maintenance work. Obviously, adjustments for team experience, complexity of the
application, programming languages, and many other local factors are needed as well.

Metrics Problems With Small Maintenance Projects

There are several difficulties in exploring software maintenance costs with accuracy. One
of these difficulties is the fact that maintenance tasks are often assigned to development
personnel who interleave both development and maintenance as the need arises. This
practice makes it difficult to distinguish maintenance costs from development costs
because the programmers are often rather careless in recording how time is spent.

Another very significant problem is the fact that a great deal of software maintenance
consists of making very small changes to software applications. Quite a few bug repairs
may involve fixing only a single line of code. Adding minor new features such as
perhaps a new line-item on a screen may require less than 50 source code statements.

These small changes are below the effective lower limit for counting function point
metrics. The function point metric includes weighting factors for complexity, and even if
the complexity adjustments are set to the lowest possible point on the scale, it is still
difficult to count function points below a level of perhaps 15 function points.

Quite a few maintenance tasks involve changes that are either a fraction of a function
point, or may at most be less than 10 function points or about 1000 COBOL source code
statements. Although normal counting of function points is not feasible for small
updates, it is possible to use the “backfiring” method of converting counts of logical
source code statements into equivalent function points. For example, suppose an update
requires adding 100 COBOL statements to an existing application. Since it usually takes
about 105 COBOL statements in the procedure and data divisions to encode 1 function
point, it can be stated that this small maintenance project is “about 1 function point in
size.”

If the project takes one work day consisting of six hours, then at least the results can be
expressed using common metrics. In this case, the results would be roughly “6 staff
hours per function point.” If the reciprocal metric “function points per staff month” is
used, and there are 20 working days in the month, then the results would be “20 function
points per staff month.”
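
The backfiring arithmetic in this example is easy to automate. The Python sketch below uses the 105-statements-per-function-point COBOL expansion, the six-hour work day, and the 20-day work month quoted above; the small differences from the text’s rounded figures come from treating the task as roughly 0.95 rather than exactly 1 function point.

# Backfiring: approximate function points from logical source statements,
# then express a small maintenance task in standard rate metrics.
# The COBOL expansion (105 statements per FP), the 6-hour work day, and the
# 20-day work month are all taken from the example in the text above.

COBOL_STATEMENTS_PER_FP = 105

def backfire_function_points(statements, statements_per_fp=COBOL_STATEMENTS_PER_FP):
    return statements / statements_per_fp

fp = backfire_function_points(100)   # ~0.95, i.e. "about 1 function point"
hours_per_fp = 6.0 / fp              # one 6-hour work day for the whole task
fp_per_month = fp * 20               # 20 working days per month at this pace

print(f"{hours_per_fp:.1f} staff hours per function point")   # ~6.3
print(f"{fp_per_month:.0f} function points per staff month")  # ~19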

Best and Worst Practices in Software Maintenance

Because maintenance of aging legacy software is very labor intensive, it is quite important
to explore the best and most cost-effective methods available for dealing with the millions
of applications that currently exist. The sets of best and worst practices are not
symmetrical. For example, the practice that has the most positive impact on maintenance
productivity is the use of trained maintenance experts. However, the factor that has the
greatest negative impact is the presence of “error-prone modules” in the application that
is being maintained.

Table 6 illustrates a number of factors which have been found to exert a beneficial
impact on the work of updating aging applications, and shows the percentage of
improvement compared to average results:

Table 6: Impact of Key Adjustment Factors on Maintenance
(Sorted in order of maximum positive impact)

Maintenance Factors                     Plus Range

Maintenance specialists                    35%
High staff experience                      34%
Table-driven variables and data            33%
Low complexity of base code                32%
Y2K and special search engines             30%
Code restructuring tools                   29%
Reengineering tools                        27%
High-level programming languages           25%
Reverse engineering tools                  23%
Complexity analysis tools                  20%
Defect tracking tools                      20%
Y2K “mass update” specialists              20%
Automated change control tools             18%
Unpaid overtime                            18%
Quality measurements                       16%
Formal base code inspections               15%
Regression test libraries                  15%
Excellent response time                    12%
Annual training of > 10 days               12%
High management experience                 12%
HELP desk automation                       12%
No error-prone modules                     10%
On-line defect reporting                   10%
Productivity measurements                   8%
Excellent ease of use                       7%
User satisfaction measurements              5%
High team morale                            5%

Sum                                       503%

At the top of the list of maintenance “best practices” is the utilization of full-time, trained
maintenance specialists rather than turning over maintenance tasks to untrained
generalists. The positive impact from utilizing maintenance specialists is one of the
reasons why maintenance outsourcing has been growing so rapidly. The maintenance
productivity rates of some of the better maintenance outsource companies are roughly
twice those of their clients prior to the completion of the outsource agreement. Thus even
if the outsource vendor costs are somewhat higher, there can still be useful economic
gains.

Let us now consider some of the factors which exert a negative impact on the work of
updating or modifying existing software applications. Note that the top-ranked factor
which reduces maintenance productivity, the presence of error-prone modules, is very
asymmetrical. The absence of error-prone modules does not speed up maintenance work,
but their presence definitely slows down maintenance work.

Error-prone modules were discovered by IBM in the 1960s, when IBM’s quality
measurements began to track errors or bugs down to the levels of specific modules. For
example, it was discovered that IBM’s IMS data base product contained 425 modules, but
more than 300 of these were zero-defect modules that never received any bug reports.
About 60% of all reported errors were found in only 31 modules, and these were very
buggy indeed.

When this form of analysis was applied to other products and used by other companies, it
was found to be a very common phenomenon. In general, more than 80% of the bugs in
software applications are found in less than 20% of the modules. Once these modules are
identified, they can be inspected, analyzed, and restructured to reduce their error
content down to safe levels.

Table 7 summarizes the major factors that degrade software maintenance performance.
Not only are error-prone modules troublesome, but many other factors can degrade
performance too. For example, very complex “spaghetti code” is quite difficult to
maintain safely. It is also troublesome to have maintenance tasks assigned to generalists
rather than to trained maintenance specialists.

A very common situation which often degrades performance is the lack of suitable
maintenance tools, such as defect tracking software, change management software, test
library software, and so forth. In general it is very easy to botch up maintenance and
make it such a labor-intensive activity that few resources are left over for development
work. The simultaneous arrival of the year 2000 and Euro problems has basically
saturated the available maintenance teams, and is also drawing developers into the work
of making mass updates. This situation can be expected to last for many years, and may
introduce permanent changes into software economic structures.

Table 7: Impact of Key Adjustment Factors on Maintenance
(Sorted in order of maximum negative impact)

Maintenance Factors                     Minus Range

Error-prone modules                       -50%
Embedded variables and data               -45%
Staff inexperience                        -40%
High complexity of base code              -30%
No Y2K or special search engines          -28%
Manual change control methods             -27%
Low-level programming languages           -25%
No defect tracking tools                  -24%
No Y2K “mass update” specialists          -22%
Poor ease of use                          -18%
No quality measurements                   -18%
No maintenance specialists                -18%
Poor response time                        -16%
Management inexperience                   -15%
No base code inspections                  -15%
No regression test libraries              -15%
No HELP desk automation                   -15%
No on-line defect reporting               -12%
No annual training                        -10%
No code restructuring tools               -10%
No reengineering tools                    -10%
No reverse engineering tools              -10%
No complexity analysis tools              -10%
No productivity measurements               -7%
Poor team morale                           -6%
No user satisfaction measurements          -4%
No unpaid overtime                          0%

Sum                                      -500%

Given the enormous amount of effort that is now being applied to software maintenance,
and which will be applied in the future, it is obvious that every corporation should
attempt to adopt maintenance “best practices” and avoid maintenance “worst practices” as
rapidly as possible.
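
Tables 6 and 7 lend themselves to rough what-if analysis. The Python sketch below treats selected percentages as multiplicative adjustments to a nominal production rate from table 4; the article does not say how multiple factors combine, so the multiplicative treatment is purely an assumption for illustration.

# Rough what-if: adjust a nominal maintenance production rate by selected
# factors from tables 6 and 7. ASSUMPTION: factors combine multiplicatively;
# the article only lists individual percentage impacts.

FACTORS = {
    "maintenance_specialists": +0.35,    # table 6
    "code_restructuring_tools": +0.29,   # table 6
    "error_prone_modules": -0.50,        # table 7
    "low_level_languages": -0.25,        # table 7
}

def adjusted_rate(nominal_fp_per_month, active_factors):
    rate = nominal_fp_per_month
    for name in active_factors:
        rate *= 1.0 + FACTORS[name]
    return rate

# Example: defect repairs at the nominal 25 FP/month (table 4) in a shop with
# trained specialists but a legacy code base full of error-prone modules.
print(adjusted_rate(25, ["maintenance_specialists", "error_prone_modules"]))
# -> 16.875 FP per staff month under these assumptions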

Software Entropy and Total Cost of Ownership

The word “entropy” means the tendency of systems to detstabilize and become more
chaotic over time. Entropy is a term from physics and is not a software-related word.
However entropy is true of all complex systems, including software.: All known
compound objects decay and become more complex with the passage of time unless
effort is exerted to keep them repaired and updated. Software is no exception. The
accumulation of small updates over time tends to gradually degrade the initial structure of
applications and makes changes grow more difficult over time.

For software applications entropy has long been a fact of life. If applications are
developed with marginal initial quality control they will probably be poorly structured
and contain error-prone modules. This means that every year, the accumulation of defect
repairs and maintenance updates will degrade the original structure and make each change
slightly more difficult. Over time, the application will destabilize and “bad fixes” will
increase in number and severity. Unless the application is restructured or fully
refurbished, eventually it will become so complex that maintenance can only be
performed by a few experts who are more or less locked into the application.

By contrast, leading applications that are well structured initially can delay the onset of
entropy. Indeed, well-structured applications can achieve declining maintenance costs
over time. This is because updates do not degrade the original structure, as happens in
the case of “spaghetti bowl” applications where the structure is almost unintelligible
when maintenance begins.

The total cost of ownership of a software application is the sum of four major expense
elements: 1) the initial cost of building the application; 2) the cost of enhancing the
application with new features over its lifetime; 3) the cost of repairing defects and bugs
over the application’s lifetime; and 4) the cost of customer support for fielding and
responding to queries and customer-reported defects.

Table 8 illustrates the total cost of ownership of three similar software applications under
three alternate scenarios. Assume the applications are nominally 1000 function points in
size. (To simplify the table, only a 5-year ownership period is illustrated.)

The “lagging” scenario in the left column of table 8 assumes inadequate quality control,
poor code structure, up to a dozen severe error-prone modules, and significant “bad fix”
injection rates of around 20%. Under the lagging scenario maintenance costs will
become more expensive every year due to entropy and the fact that the application never
stabilizes.

The “average” scenario assumes marginal quality control, reasonable initial code
structure, one or two error-prone modules, and an average bad-fix injection rate of around
7%. Here too entropy will occur. But the rate at which the application’s structure
degrades is fairly slow. Thus maintenance costs increase over a five-year period, but not
at a very significant annual rate.

The “leading” scenario assumes excellent quality control, very good code structure at the
initial release, zero error-prone modules, and a very low bad-fix injection rate of 1% or
less. Under the leading scenario, maintenance costs can actually decline over the five-
year ownership period. Incidentally, well-structured applications of this type are most
likely to be found among systems software and defense applications produced by
companies at or above Level 3 on the Software Engineering Institute (SEI) capability
maturity model (CMM) scale.

Table 8: Five-Year Cost of Software Application Ownership
(Costs are in Dollars per Function Point)

                  Lagging       Average       Leading
                  Projects      Projects      Projects

DEVELOPMENT       $1,200.00     $1,000.00     $800.00

Year 1            $192.00       $150.00       $120.00
Year 2            $204.00       $160.00       $112.00
Year 3            $216.00       $170.00       $104.00
Year 4            $240.00       $180.00       $96.00
Year 5            $264.00       $200.00       $80.00

MAINTENANCE       $1,116.00     $860.00       $512.00

TOTAL COST        $2,316.00     $1,860.00     $1,312.00

Difference        $456.00       $0.00         -$548.00
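
Table 8’s totals are straightforward sums; the short Python sketch below reproduces the table’s arithmetic to make the cost structure explicit.

# Reproducing table 8: five-year total cost of ownership per function point.
SCENARIOS = {
    "lagging": {"development": 1200.0,
                "maintenance": [192.0, 204.0, 216.0, 240.0, 264.0]},
    "average": {"development": 1000.0,
                "maintenance": [150.0, 160.0, 170.0, 180.0, 200.0]},
    "leading": {"development": 800.0,
                "maintenance": [120.0, 112.0, 104.0, 96.0, 80.0]},
}

def total_cost_of_ownership(scenario):
    s = SCENARIOS[scenario]
    return s["development"] + sum(s["maintenance"])

baseline = total_cost_of_ownership("average")
for name in SCENARIOS:
    tco = total_cost_of_ownership(name)
    print(f"{name:8s} ${tco:,.2f}/FP (vs. average: ${tco - baseline:+,.2f})")
# lagging  $2,316.00/FP (vs. average: $+456.00)
# average  $1,860.00/FP (vs. average: $+0.00)
# leading  $1,312.00/FP (vs. average: $-548.00)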

Under the lagging scenario, the five-year maintenance costs for the application (which
include defect repairs, support, and enhancements) are greater than the original
development costs. Indeed, the economic value of lagging applications is questionable
after about three to five years. The degradation of initial structure and the increasing
difficulty of making updates without “bad fixes” tends toward negative returns on
investment (ROI) within a few years.

For applications in COBOL there are code restructuring tools and maintenance
workbenches available that can extend the useful economic lives of aging legacy
applications. But for many languages such as assembly language, Algol, Bliss, CHILL,
CORAL, and PL/I there are few maintenance tools and no commercial restructuring tools.
Thus for poorly structured applications in many languages, the ROI may be marginal or
negative within less than a 10 year period. Of course if the applications are vital or
mission critical (such as air traffic control or the IRS income tax applications) there may
be no choice but to keep the applications operational regardless of cost or difficulty.

Under the average scenario, the five-year maintenance costs for the application are
slightly below the original development costs. Most average applications have a mildly
positive ROI for up to 10 years after initial deployment.

Under the leading scenario with well-structured initial applications, the five-year
maintenance costs are only about half as expensive as the original development costs.
Yet the same volume of enhancements is assumed in all three cases. For leading
applications, the ROI can stay positive for 10 to 20 years after initial deployment. This is
due to the low entropy and the reduced bad-fix injection rate of the leading scenario. In
other words, if you build applications properly at the start, you can get many years of
useful service. If you build them poorly at the start, you can expect high initial
maintenance costs that will grow higher as time passes. You can also expect a rapid
decline in return on investment (ROI).

The same kind of phenomena can be observed outside of software. If you buy an
automobile that has a high frequency of repair as shown in Consumer Reports and you
skimp on lubrication and routine maintenance, you will fairly soon face some major
repair problems – probably before 50,000 miles.

By contrast, if you buy an automobile with a low frequency of repair as shown in
Consumer Reports and you are scrupulous in maintenance, you should be able to drive
the car more than 100,000 miles without major repair problems.

Summary and Conclusions

In every industry, maintenance tends to require more personnel than the building of new
products. For the software industry, the number of personnel required to perform
maintenance is unusually large and may soon top 75% of all technical software workers.
The main reasons for the high maintenance effort in the software industry are the
intrinsic difficulties of working with aging software and the growing impact of “mass
updates” that began with the roll-out of the Euro and the arrival of the year 2000 problem.
Similar mass updates will occur in the future as we run out of telephone numbers and
social security numbers.

Given the enormous efforts and costs devoted to software maintenance, every company
should evaluate and consider best practices for maintenance, and should avoid worst
practices if at all possible.


Software Engineering Economics

BARRY W. BOEHM

Manuscript received April 26, 1983; revised June 28, 1983.
The author is with the Software Information Systems Division, TRW Defense Systems
Group, Redondo Beach, CA 90278.

Abstract - This paper summarizes the current state of the art and recent trends in
software engineering economics. It provides an overview of economic analysis
techniques and their applicability to software engineering and management. It surveys
the field of software cost estimation, including the major estimation techniques available,
the state of the art in algorithmic cost models, and the outstanding research issues in
software cost estimation.

Index Terms - Computer programming costs, cost models, management decision aids,
software cost estimation, software economics, software engineering, software
management.

Definitions

The dictionary defines "economics" as "a social science concerned chiefly with
description and analysis of the production, distribution, and consumption of goods and
services." Here is another definition of economics which I think is more helpful in
explaining how economics relates to software engineering:

Economics is the study of how people make decisions in resource-limited situations.

This definition of economics fits the major branches of classical economics very well.
Macroeconomics is the study of how people make decisions in resource-limited
situations on a national or global scale. It deals with the effects of decisions that national
leaders make on such issues as tax rates, interest rates, foreign and trade policy.

Microeconomics is the study of how people make decisions in resource-limited
situations on a more personal scale. It deals with the decisions that individuals and
organizations make on such issues as how much insurance to buy, which word processor
to buy, or what prices to charge for their products or services.
Economics and Software Engineering Management

If we look at the discipline of software engineering, we see that the microeconomics
branch of economics deals more with the types of decisions we need to make as software
engineers or managers.

Clearly, we deal with limited resources. There is never enough time or money to cover
all the good features we would like to put into our software products. And even in these
days of cheap hardware and virtual memory, our more significant software products must
always operate within a world of limited computer power and main memory. If you have
been in the software engineering field for any length of time, I am sure you can think of a
number of decision situations in which you had to determine some key software product
feature as a function of some limiting critical resource.
Throughout the software life cycle,(1) there are many decision situations involving
limited resources in which software engineering economics techniques provide useful
assistance. To provide a feel for the nature of these economic decision issues, an example
is given below for each of the major phases in the software life cycle.

Feasibility Phase: How much should we invest in information system analyses (user
questionnaires and interviews, current-system analysis, workload characterizations,
simulations, scenarios, prototypes) in order that we converge on an appropriate definition
and concept of operation for the system we plan to implement?

Plans and Requirements Phase: How rigorously should we specify requirements? How
much should we invest in requirements validation activities (automated completeness,
consistency, and traceability checks, analytic models, simulations, prototypes) before
proceeding to design and develop a software system?

Product Design Phase: Should we organize the software to make it possible to use a
complex piece of existing software which generally but not completely meets our
requirements?

Programming Phase: Given a choice between three data storage and retrieval schemes
which are primarily execution time-efficient, storage-efficient, and easy-to-modify,
respectively, which of these should we choose to implement?

Integration and Test Phase: How much testing and formal verification should we
perform on a product before releasing it to users?

Maintenance Phase: Given an extensive list of suggested product improvements, which
ones should we implement first?

Phaseout: Given an aging, hard-to-modify software product, should we replace it with a
new product, restructure it, or leave it alone?

(1) Economic principles underlie the overall structure of the software life cycle, and its
primary refinements of prototyping, incremental development, and advancemanship. The
primary economic driver of the life-cycle structure is the significantly increasing cost of
making a software change or fixing a software problem, as a function of the phase in
which the change or fix is made. See [11, ch. 4].

Outline of This Paper

The economics field has evolved a number of techniques (cost-benefit analysis, present
value analysis, risk analysis, etc.) for dealing with decision issues such as the ones above.
Section II of this paper provides an overview of these techniques and their applicability
to software engineering.

One critical problem which underlies all applications of economic techniques to
software engineering is the problem of estimating software costs. Section III contains
three major sections which summarize this field:

III-A: Major Software Cost Estimation Techniques
III-B: Algorithmic Models for Software Cost Estimation
III-C: Outstanding Research Issues in Software Cost Estimation.

Section IV concludes by summarizing the major benefits of software engineering
economics, and commenting on the major challenges awaiting the field.

Overview of Relevant Techniques

The microeconomics field provides a number of techniques for dealing with software
life-cycle decision issues such as the ones given in the previous section. Fig. 1 presents
an overall master key to these techniques and when to use them.(2)

As indicated in Fig. 1, standard optimization techniques can be used when we can find a
single quantity such as dollars (or pounds, yen, cruzeiros, etc.) to serve as a "universal
solvent" into which all of our decision variables can be converted. Or, if the nondollar
objectives can be expressed as constraints (system availability must be at least 98
percent; throughput must be at least 150 transactions per second), then standard
constrained optimization techniques can be used. And if cash flows occur at different
times, then present-value techniques can be used to normalize them to a common point in
time.

(2) The chapter numbers in Fig. 1 refer to the chapters in [11], in which those techniques
are discussed in further detail.
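
As a concrete illustration of present-value normalization, here is a minimal Python sketch; the 8 percent discount rate and the cash-flow figure are arbitrary assumptions for illustration, not values from the paper.

# Present-value normalization of cash flows arriving at different times:
# a minimal sketch. The 8% discount rate is an arbitrary illustration.

def present_value(cash_flow, years_from_now, discount_rate=0.08):
    return cash_flow / (1.0 + discount_rate) ** years_from_now

# $100K received three years from now, expressed in today's dollars:
print(f"${present_value(100_000, 3):,.0f}")  # ~$79,383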
[Fig. 1. Master key to software engineering economics decision analysis techniques.]

[Fig. 2. Cost-effectiveness comparison, transaction processing system options
(development cost vs. throughput in transactions per second).]

More frequently, some of the resulting benefits from the software system are not
expressible in dollars. In such situations, one alternative solution will not necessarily
dominate another solution.

An example situation is shown in Fig. 2, which compares the cost and benefits (here, in
terms of throughput in transactions per second) of two alternative approaches to
developing an operating system for a transaction processing system.

Option A: Accept an available operating system. This will require only $80K in software
costs, but will achieve a peak performance of 120 transactions per second, using five
$10K minicomputer processors, because of a high multiprocessor overhead factor.

Option B: Build a new operating system. This system would be more efficient and
would support a higher peak throughput, but would require $180K in software costs.

The cost-versus-performance curves for these two options are shown in Fig. 2. Here,
neither option dominates the other, and various cost-benefit decision-making techniques
(maximum profit margin, cost/benefit ratio, return on investments, etc.) must be used to
choose between Options A and B.
In general, software engineering decision problems are even more complex than Fig. 2,
as Options A and B will have several important criteria on which they differ (e.g.,
robustness, ease of tuning, ease of change, functional capability). If these criteria are
quantifiable, then some type of figure of merit can be defined to support a comparative
analysis of the preferability of one option over another. If some of the criteria are
unquantifiable (user goodwill, programmer morale, etc.), then some techniques for
comparing unquantifiable criteria need to be used. As indicated in Fig. 1, techniques for
each of these situations are available, and discussed in [11].

Analyzing Risk, Uncertainty, and the Value of Information

In software engineering, our decision issues are generally even more complex than those
discussed above. This is because the outcome of many of our options cannot be
determined in advance. For example, building an operating system with a significantly
lower multiprocessor overhead may be achievable, but on the other hand, it may not. In
such circumstances, we are faced with a problem of decision making under uncertainty,
with a considerable risk of an undesired outcome.
The main economic analysis techniques available to support us in resolving such
problems are the following.

1) Techniques for decision making under complete uncertainty, such as the maximax
rule, the maximin rule, and the Laplace rule [38]. These techniques are generally
inadequate for practical software engineering decisions.

2) Expected-value techniques, in which we estimate the probabilities of occurrence of
each outcome (successful or unsuccessful development of the new operating system) and
compute the expected payoff of each option:

   Expected payoff = Prob(success) x Payoff(success) + Prob(failure) x Payoff(failure)

These techniques are better than decision making under complete uncertainty, but they
still involve a great deal of risk if the Prob(failure) is considerably higher than our
estimate of it.
3) Techniques in which we reduce uncertainty by buying information. For example,
prototyping is a way of buying information to reduce our uncertainty about the likely
success or failure of a multiprocessor operating system; by developing a rapid prototype
of its high-risk elements, we can get a clearer picture of our likelihood of successfully
developing the full operating system.
In general, prototyping and other options for buying information(3) are most valuable
aids for software engineering decisions. However, they always raise the following
question: "how much information-buying is enough?"
In principle, this question can be answered via statistical decision theory techniques
involving the use of Bayes' Law, which allows us to calculate the expected payoff from a
software project as a function of our level of investment in a prototype or other
information-buying option. (Some examples of the use of Bayes' Law to estimate the
appropriate level of investment in a prototype are given in [11, ch. 20].)
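
To make the expected-value comparison concrete, here is a minimal Python sketch. Every payoff, probability, and prototype cost in it is invented for illustration, and the prototype step simply re-estimates the success probability; it is not a full Bayesian value-of-information calculation of the kind the paper describes.

# Expected-value decision making under uncertainty: a minimal sketch.
# All payoffs and probabilities below are invented for illustration only.

def expected_payoff(p_success, payoff_success, payoff_failure):
    return p_success * payoff_success + (1 - p_success) * payoff_failure

# Option B (build the new operating system) with an uncertain outcome:
ev_build = expected_payoff(p_success=0.6,
                           payoff_success=300_000,
                           payoff_failure=-180_000)

# Option A (accept the available operating system), assumed risk-free:
ev_accept = 100_000

# Buying information: a hypothetical prototype costing $20K that sharpens
# our estimate of success (say, to 0.8 if its results are favorable).
ev_build_after_prototype = expected_payoff(0.8, 300_000, -180_000) - 20_000

print(ev_build, ev_accept, ev_build_after_prototype)
# 108000.0 100000 184000.0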
In practice, the use of Bayes' Law involves the estimation of a number of conditional
probabilities which are not easy to estimate accurately. However, the Bayes' Law
approach can be translated into a number of value-of-information guidelines, or
conditions under which it makes good sense to decide on investing in more information
before committing ourselves to a particular course of action.

(3) Other examples of options for buying information to support software engineering
decisions include feasibility studies, user surveys, simulation, testing, and mathematical
program verification techniques.
Condition 1: There exist attractive alternatives whose payoff varies greatly, depending on
some critical states of nature. If not, we can commit ourselves to one of the attractive
alternatives with no risk of significant loss.

Condition 2: The critical states of nature have an appreciable probability of occurring. If
not, we can again commit ourselves without major risk. For situations with extremely
high variations in payoff, the appreciable probability level is lower than in situations with
smaller variations in payoff.

Condition 3: The investigations have a high probability of accurately identifying the
occurrence of the critical states of nature. If not, the investigations will not do much to
reduce our risk of loss due to making the wrong decision.

Condition 4: The required cost and schedule of the investigations do not overly curtail
their net value. It does us little good to obtain results which cost more than they can save
us, or which arrive too late to help us make a decision.

Condition 5: There exist significant side benefits derived from performing the
investigations. Again, we may be able to justify an investigation solely on the basis of its
value in training, team-building, customer relations, or design validation.

Some Pitfalls Avoided by Using the Value-of-Information Approach

The guideline conditions provided by the value-of-information approach provide us with
a perspective which helps us avoid some serious software engineering pitfalls. The
pitfalls below are expressed in terms of some frequently expressed but faulty pieces of
software engineering advice.
Pitfall 1: Always use a simulation to investigate the feasibil-
ity of complex realtime software. Simulations are often ex-
tremely valuable in such situations. However, there have been
a good many simulations developed which were largely an ex-
pensive waste of effort, frequently under conditions that would
have been picked up by the guidelines above. Some have been
relatively useless because, once they were built, nobody could
tell whether a given set of inputs was realistic or not (picked
up by Condition 3). Some have taken so long to develop
that they produced their first results the week after the pro-
posal was sent out, or after the key design review was com-
pleted (picked up by Condition 4).
Pitfall 2: Always build the software twice. The guidelines
indicate that the prototype (or build-it-twice) approach is often
valuable, but not in all situations. Some prototypes have been
built of software whose aspects were all straightforward and
familiar, in which case nothing much was learned by building
them (picked up by Conditions 1 and 2).
Pitfall 3: Build the software purely top-down. When inter-
preted too literally, the top-down approach does not concern
itself with the design of low level modules until the higher
levels have been fully developed. If an adverse state of nature
makes such a low level module (automatically forecast sales
volume, automatically discriminate one type of aircraft from
another) impossible to develop, the subsequent redesign will
generally require the expensive rework of much of the higher
level design and code. Conditions 1 and 2 warn us to temper
our top-down approach with a thorough top-to-bottom soft-
ware risk analysis during the requirements and product design
phases.
Pitfall 4: Every piece of code should be proved correct.
Correctness proving is still an expensive way to get informa-
tion on the fault-freedom of software, although it strongly
satisfies Condition 3 by giving a very high assurance of a pro-
gram's correctness. Conditions 1 and 2 recommend that proof
techniques be used in situations where the operational cost of
a software fault is very large, that is, loss of life, compromised
national security, major financial losses. But if the operational
cost of a software fault is small, the added information on
fault-freedom provided by the proof will not be worth the in-
vestment (Condition 4).
Pitfall 5: Nominal-case testing is sufficient. This pitfall is
just the opposite of Pitfall 4. If the operational cost of poten-
tial software faults is large, it is highly imprudent not to per-
form off-nominal testing.

Summary: The Economic Value of Information

Let us step back a bit from these guidelines and pitfalls. Put
simply, we are saying that, as software engineers:
"It is often worth paying for information because it
helps us make better decisions."
If we look at the statement in a broader context, we can see
that it is the primary reason why the software engineering field
exists. It is what practically all of our software customers say
when they decide to acquire one of our products: that it is
worth paying for a management information system, a weather
forecasting system, an air traffic control system, an inventory
control system, etc., because it helps them make better decisions.
Usually, software engineers are producers of management
information to be consumed by other people, but during the
software life cycle we must also be consumers of management
information to support our own decisions. As we come to ap-
preciate the factors which make it attractive for us to pay for
processed information which helps us make better decisions as
software engineers, we will get a better appreciation for what
our customers and users are looking for in the information
processing systems we develop for them.
III. SOFTWARE COST ESTIMATION

Introduction
All of the software engineering economics decision analysis
techniques discussed above are only as good as the input data
we can provide for them. For software decisions, the most
critical and difficult of these inputs to provide are estimates
of the cost of a proposed software project. In this section,
we will summarize:
1) the major software cost estimation techniques avail-
able, and their relative strengths and difficulties;
2) algorithmic models for software cost estimation;
3) outstanding research issues in software cost estimation.

A. Major Software Cost Estimation Techniques


Table I summarizes the relative strengths and difficulties of
the major software cost estimation methods in use today.
1) Algorithmic Models: These methods provide one or
more algorithms which produce a software cost estimate as a
function of a number of variables which are considered to be
the major cost drivers.
2) Expert Judgment: This method involves consulting one
or more experts, perhaps with the aid of an expert-consensus
mechanism such as the Delphi technique.
3) Analogy: This method involves reasoning by analogy
with one or more completed projects to relate their actual
costs to an estimate of the cost of a similar new project.
4) Parkinson: A Parkinson principle ("work expands to
fill the available volume") is invoked to equate the cost esti-
mate to the available resources.
5) Price-to-Win: Here, the cost estimate is equated to the
price believed necessary to win the job (or the schedule be-
lieved necessary to be first in the market with a new product,
etc.).
6) Top-Down: An overall cost estimate for the project is
derived from global properties of the software product. The
total cost is then split up among the various components.
7) Bottom-Up: Each component of the software job is
separately estimated, and the results aggregated to produce
an estimate for the overall job.
The main conclusions that we can draw from Table I are
the following.
None of the alternatives is better than the others from
all aspects.
The Parkinson and price-to-win methods are unaccept-
able and do not produce satisfactory cost estimates.
The strengths and weaknesses of the other techniques
are complementary (particularly the algorithmic models versus
expert judgment and top-down versus bottom-up).
Thus, in practice, we should use combinations of the
above techniques, compare their results, and iterate on them
where they differ.

TABLE I
STRENGTHS AND WEAKNESSES OF SOFTWARE
COST-ESTIMATION METHODS

Method | Strengths | Weaknesses
Algorithmic model | Objective, repeatable, analyzable formula; efficient, good for sensitivity analysis; objectively calibrated to experience | Subjective inputs; assessment of exceptional circumstances; calibrated to past, not future
Expert judgment | Assessment of representativeness, interactions, exceptional circumstances | No better than participants; biases; incomplete recall
Analogy | Based on representative experience | Representativeness of experience
Parkinson | Correlates with some experience | Reinforces poor practice
Price to win | Often gets the contract | Generally produces large overruns
Top-down | System-level focus; efficient | Less detailed basis; less stable
Bottom-up | More detailed basis; more stable; fosters individual commitment | May overlook system-level costs; requires more effort
Fundamental Limitations of Software Cost Estimation
Techniques
Whatever the strengths of a software cost estimation tech-
nique, there is really no way we can expect the technique to
compensate for our lack of definition or understanding of the
software job to be done. Until a software specification is fully
defined, it actually represents a range of software products,
and a corresponding range of software development costs.
This fundamental limitation of software cost estimation
technology is illustrated in Fig. 3, which shows the accuracy
within which software cost estimates can be made, as a func-
tion of the software lifecycle phase (the horizontal axis), or of
the level of knowledge we have of what the software is in-
tended to do. This level of uncertainty is illustrated in Fig. 3
with respect to a human-machine interface component of
the software.

[Fig. 3. Software cost estimation accuracy versus phase. Milestones along the horizontal axis: concept of operation; requirements specifications; product design specifications; detailed design specifications; accepted software. Phases: feasibility; plans and requirements; product design; detailed design; development and test.]


When we first begin to evaluate alternative concepts for a
new software application, the relative range of our software
cost estimates is roughly a factor of four on either the high or
low side.4 This range stems from the wide range of uncertainty
we have at this time about the actual nature of the product.
For the human-machine interface component, for example,
we do not know at this time what classes of people (clerks,
computer specialists, middle managers, etc.) or what classes of
data (raw or pre-edited, numerical or text, digital or analog) the
system will have to support. Until we pin down such uncer-
tainties, a factor of four in either direction is not surprising as
a range of estimates.
The above uncertainties are indeed pinned down once we
complete the feasibility phase and settle on a particular con-
cept of operation. At this stage, the range of our estimates di-
minishes to a factor of two in either direction. This range is
reasonable because we still have not pinned down such issues
as the specific types of user query to be supported, or the spe-
cific functions to be performed within the microprocessor in
the intelligent terminal. These issues will be resolved by the
time we have developed a software requirements specification,
at which point, we will be able to estimate the software costs
within a factor of 1.5 in either direction.
By the time we complete and validate a product design
specification, we will have resolved such issues as the internal
data structure of the software product and the specific tech-
niques for handling the buffers between the terminal micro-
processor and the central processors on one side, and between
the microprocessor and the display driver on the other. At this
point, our software estimate should be accurate to within a
factor of 1.25, the discrepancies being caused by some remain-
ing sources of uncertainty such as the specific algorithms to be
used for task scheduling, error handling, abort processing, and
the like. These will be resolved by the end of the detailed de-
sign phase, but there will still be a residual uncertainty about
10 percent based on how well the programmers really under-
stand the specifications to which they are to code. (This factor
also includes such considerations as personnel turnover uncer-
tainties during the development and test phases.)

4 These ranges have been determined subjectively, and are intended
to represent 80 percent confidence limits, that is, "within a factor of
four on either side, 80 percent of the time."
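
Since these ranges recur whenever early estimates are quoted, it can help to keep them in executable form. The sketch below just encodes the subjective 80 percent confidence factors quoted above; the milestone names are paraphrases of the Fig. 3 milestones, not an official taxonomy.

    # Subjective 80 percent confidence multipliers on a software cost estimate,
    # keyed by the most recent milestone completed (factors quoted in the text;
    # milestone names are paraphrased from Fig. 3).
    UNCERTAINTY_FACTOR = {
        "concept evaluation": 4.0,
        "concept of operation": 2.0,
        "requirements specification": 1.5,
        "product design specification": 1.25,
        "detailed design complete": 1.10,
    }

    def estimate_band(point_estimate_mm, milestone):
        """Return the (low, high) 80 percent band around a point estimate."""
        f = UNCERTAINTY_FACTOR[milestone]
        return point_estimate_mm / f, point_estimate_mm * f

    print(estimate_band(100, "requirements specification"))  # about (67, 150)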

B. Algorithmic Models for Software Cost Estimation


Algorithmic Cost Models: Early Development
Since the earliest days of the software field, people have
been trying to develop algorithmic models to estimate soft-
ware costs. The earliest attempts were simple rules of thumb,
such as:
on a large project, each software performer will provide
an average of one checked-out instruction per man-hour (or
roughly 150 instructions per man-month);
each software maintenance person can maintain four
boxes of cards (a box of cards held 2000 cards, or roughly
2000 instructions in those days of few comment cards).
Somewhat later, some projects began collecting quantita-
tive data on the effort involved in developing a software
product, and its distribution across the software life cycle. One
of the earliest of these analyses was documented in 1956 in [8].
It indicated that, for very large operational software products on
the order of 100,000 delivered source instructions (100 KDSI),
the overall productivity was more like 64 DSI/man-month;
that another 100 KDSI of support software would be required;
that about 15 000 pages of documentation would be produced
and 3000 hours of computer time consumed; and that the dis-
tribution of effort would be as follows:

Program Specs: 10 percent


Coding Specs: 30 percent
Coding: 10 percent
Parameter Testing: 20 percent
Assembly Testing: 30 percent

with an additional 30 percent required to produce operational


specs for the system. Unfortunately, such data did not become
well known, and many subsequent software projects went
through a painful process of rediscovering them.
During the late 1950's and early 1960's, relatively little
progress was made in software cost estimation, while the fre-
quency and magnitude of software cost overruns was becom-
ing critical to many large systems employing computers. In
1964, the U.S. Air Force contracted with System Develop-
ment Corporation for a landmark project in the software cost
estimation field. This project collected 104 attributes of 169
software projects and treated them to extensive statistical anal-
ysis. One result was the 1965 SDC cost model [41] which was
the best possible statistical 13-parameter linear estimation
model for the sample data:

MM = -33.63
+9.15 (Lack of Requirements) (0-2)
+10.73 (Stability of Design) (0-3)
+0.51 (Percent Math Instructions)
+0.46 (Percent Storage/Retrieval Instructions)
+0.40 (Number of Subprograms)
+7.28 (Programming Language) (0-1)
-21.45 (Business Application) (0-1)
+13.53 (Stand-Alone Program) (0-1)
+12.35 (First Program on Computer) (0-1)
+58.82 (Concurrent Hardware Development) (0-1)
+30.61 (Random Access Device Used) (0-1)
+29.55 (Difference Host, Target Hardware) (0-1)
+0.54 (Number of Personnel Trips)
-25.20 (Developed by Military Organization) (0-1).

The numbers in parentheses refer to ratings to be made by the
estimator.
When applied to its database of 169 projects, this model
produced a mean estimate of 40 MM and a standard deviation
of 62 MM; not a very accurate predictor. Further, the applica-
tion of the model is counterintuitive; a project with all zero
ratings is estimated at minus 33 MM; changing language from a
higher order language to assembly language adds 7 MM, inde-
pendent of project size. The most conclusive result from the
SDC study was that there were too many nonlinear aspects of
software development for a linear cost-estimation model to
work very well.
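
To see how such a linear model behaves in practice, the sketch below evaluates the SDC equation as reconstructed above. The attribute identifiers are names I have introduced for illustration, not SDC's own; the coefficients are the published ones, and the all-zero case reproduces the counterintuitive minus 33 MM noted in the text.

    # Coefficients of the 1965 SDC linear model, as reconstructed above.
    # The attribute identifiers are illustrative names, not SDC's own.
    SDC_TERMS = [
        ("lack_of_requirements",       9.15),   # rated 0-2
        ("stability_of_design",       10.73),   # rated 0-3
        ("pct_math_instructions",      0.51),
        ("pct_storage_retrieval",      0.46),
        ("num_subprograms",            0.40),
        ("programming_language",       7.28),   # rated 0-1
        ("business_application",     -21.45),   # rated 0-1
        ("stand_alone_program",       13.53),   # rated 0-1
        ("first_program_on_computer", 12.35),   # rated 0-1
        ("concurrent_hw_development", 58.82),   # rated 0-1
        ("random_access_device",      30.61),   # rated 0-1
        ("different_host_target",     29.55),   # rated 0-1
        ("num_personnel_trips",        0.54),
        ("developed_by_military",    -25.20),   # rated 0-1
    ]

    def sdc_effort_mm(ratings):
        """Man-months: the constant term plus the weighted ratings."""
        return -33.63 + sum(c * ratings.get(name, 0) for name, c in SDC_TERMS)

    print(sdc_effort_mm({}))  # all-zero project: -33.63 MM, as noted above
    print(sdc_effort_mm({"num_subprograms": 40,
                         "concurrent_hw_development": 1}))  # about 41 MM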
Still, the SDC effort provided a valuable base of information
and insight for cost estimation and future models. Its cumula-
tive distribution of productivity for 169 projects was a valu-
able aid for producing or checking cost estimates. The estima-
tion rules of thumb for various phases and activities have been
very helpful, and the data have been a major foundation for
some subsequent cost models.
In the late 1960's and early 1970's, a number of cost models
were developed which worked reasonably well for a certain re-
stricted range of projects to which they were calibrated. Some
of the more notable examples of such models are those de-
scribed in [3], [54], [57].
The essence of the TRW Wolverton model [57] is shown in
Fig. 4, which shows a number of curves of software cost per
object instruction as a function of relative degree of difficulty
(0 to 100), novelty of the application (new or old), and type
of project. The best use of the model involves breaking the
software into components and estimating their cost individu-
ally. Thus, a 1000 object-instruction module of new data man-
agement software of medium (50 percent) difficulty would be
costed at $46/instruction, or $46,000.

[Fig. 4. TRW Wolverton model: Cost per object instruction versus relative degree of difficulty (percent of total sample experiencing this rate or less), for new and old software in six categories: C = control, I = input/output, P = report processor, A = algorithm, D = data management, T = time-critical processor. Sample range excludes upper and lower 20 percentiles.]
This model is well-calibrated to a class of near-real-time
government command and control projects, but is less ac-
curate for some other classes of projects. In addition, the
model provides a good breakdown of project effort by phase
and activity.
In the late 1970's, several software cost estimation models
were developed which established a significant advance in the
state of the art. These included the Putnam SLIM Model [44],
the Doty Model [27], the RCA PRICE S model [22], the
COCOMO model [11], the IBM-FSD model [53], the Boeing
model [9], and a series of models developed by GRC [15]. A
summary of these models, and the earlier SDC and Wolverton
models, is shown in Table II, in terms of the size, program,
computer, personnel, and project attributes used by each
model to determine software costs. The first four of these
models are discussed below.
The Putnam SLIM Model [44], [45]

The Putnam SLIM Model is a commercially available (from
Quantitative Software Management, Inc.) software product
based on Putnam's analysis of the software life cycle in terms
of the Rayleigh distribution of project personnel level versus
time. The basic effort macro-estimation model used in SLIM is

    Ss = Ck K^(1/3) td^(4/3)

where

    Ss = number of delivered source instructions
    K = life-cycle effort in man-years
    td = development time in years
    Ck = a "technology constant."

Values of Ck typically range between 610 and 57,314. The
current version of SLIM allows one to calibrate Ck to past
projects or to estimate it as a function of a project's use of
modern programming practices, hardware constraints, personnel
experience, interactive development, and other factors. The
required development effort, DE, is estimated as roughly
40 percent of the life-cycle effort for large
systems. For smaller systems, the percentage varies as a func-
tion of system size.

TABLE II
FACTORS USED IN VARIOUS COST MODELS

Models compared: SDC (1965), TRW Wolverton (1971), Putnam SLIM, Doty, RCA PRICE S, IBM, Boeing (1977), GRC (1979), COCOMO, SOFCOST, DSN, Jensen. The table marks, for each model, which of the following factors it uses to determine software costs:

Size attributes: source instructions; object instructions; number of routines; number of data items; number of output formats; documentation; number of personnel.
Program attributes: type; complexity; language; reuse; required reliability; display requirements.
Computer attributes: time constraint; storage constraint; hardware configuration; concurrent hardware development; interfacing equipment, S/W.
Personnel attributes: personnel capability; personnel continuity; hardware experience; applications experience; language experience.
Project attributes: tools and techniques; customer interface; requirements definition; requirements volatility; schedule; security; computer access; travel/rehosting/multi-site; support software maturity.
Other: calibration factor; effort equation (MM_NOM = c(KDSI)^x, with x ranging from 0.91 to 1.2 across the models); schedule equation (TDEV = c(MM)^x, with x ranging from about 0.31 to 0.38).
The SLIM model includes a number of useful extensions to
estimate such quantities as manpower distribution, cash flow,
major-milestone schedules, reliability levels, computer time,
and documentation costs.
The most controversial aspect of the SLIM model is its
tradeoff relationship between development effort K and
development time td. For a software product of a given
size, the SLIM software equation above gives

    K = constant / td^4.

For example, this relationship says that one can cut the
cost of a software project in half, simply by increasing its de-
velopment time by 19 percent (e.g., from 10 months to 12
months). Fig. 5 shows how the SLIM tradeoff relationship com-
pares with those of other models; see [11, ch. 27] for further
discussion of this issue.
On balance, the SLIM approach has provided a number
of useful insights into software cost estimation, such as the
Rayleigh-curve distribution for one-shot software efforts, the
explicit treatment of estimation risk and uncertainty, and the
cube-root relationship defining the minimum development time
achievable for a project requiring a given amount of effort.
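
The controversial tradeoff is easy to verify numerically. A minimal sketch, assuming the reconstructed software equation above and illustrative values for the size and technology constant (the chosen Ck merely lies in the 610-57,314 range quoted earlier): solving the equation for K gives K = Ss^3 / (Ck^3 td^4), so stretching td by 19 percent cuts K roughly in half.

    def slim_life_cycle_effort(ss, ck, td):
        """K in man-years from Ss = Ck * K**(1/3) * td**(4/3), solved for K."""
        return (ss / (ck * td ** (4.0 / 3.0))) ** 3

    SS, CK = 100_000, 10_000        # illustrative size (DSI) and technology constant
    k1 = slim_life_cycle_effort(SS, CK, 2.00)   # nominal 2-year schedule
    k2 = slim_life_cycle_effort(SS, CK, 2.38)   # schedule stretched 19 percent
    print(k1, k2, k2 / k1)          # ratio is (1/1.19)**4, about 0.50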

The Doty Model [27]

This model is the result of an extensive data analysis activ-
ity, including many of the data points from the SDC sample.
A number of models of similar form were developed for dif-
ferent application areas. As an example, the model for general
application is

    MM = 5.288 (KDSI)^1.047,          for KDSI >= 10
    MM = 2.060 (KDSI)^1.047 (∏ fj),   for KDSI < 10.

[Fig. 5. Comparative effort-schedule tradeoff relationships: relative effort MM/MM_NOM versus relative schedule T_DESIRED/T_NOM for the various models.]

The effort multipliers fj are shown in Table III. This model has
a much more appropriate functional form than the SDC
model, but it has some problems with stability, as it exhibits a
discontinuity at KDSI = 10, and produces widely varying esti-
mates via the f factors (answering "yes" to "first software de-
veloped on CPU" adds 92 percent to the estimated cost).
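
A small sketch, assuming every fj = 1, makes the discontinuity at the KDSI = 10 boundary visible:

    def doty_mm(kdsi, f_product=1.0):
        """Doty general-application model; f_product is the product of the
        yes/no multipliers fj, which apply only below 10 KDSI."""
        if kdsi >= 10:
            return 5.288 * kdsi ** 1.047
        return 2.060 * kdsi ** 1.047 * f_product

    print(doty_mm(9.99))   # about 22.9 MM
    print(doty_mm(10.0))   # about 58.9 MM -- the estimate jumps at the boundary

Even before any f factors are applied, the two branches disagree by a factor of 5.288/2.060, about 2.6, at the boundary.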
TABLE III
DOTY MODEL FOR SMALL PROGRAMS

    MM = 2.060 (KDSI)^1.047 (∏ fj)

(A set of yes/no cost factors fj, covering items such as special displays, detail and change of operational requirements, real-time operation, CPU memory and time constraints, first software developed on the CPU, concurrent ADP hardware development, timeshare versus batch development, development at an operational site, and a development computer different from the target computer.)

The RCA PRICE S Model [22]

PRICE S is a commercially available (from RCA, Inc.)
macro cost-estimation model developed primarily for embed-
ded system applications. It has improved steadily with experi-
ence; earlier versions with a widely varying subjective complex-
ity factor have been replaced by versions in which a number of
computer, personnel, and project attributes are used to modu-
late the complexity rating.
PRICE S has extended a number of cost-estimating relation-
ships developed in the early 1970's such as the hardware con-
straint function shown in Fig. 6 [10]. It was primarily devel-
oped to handle military software projects, but now also in-
cludes rating levels to cover business applications.
PRICE S also provides a wide range of useful outputs on
gross phase and activity distributions analyses, and monthly
project cost-schedule-expected progress forecasts. PRICE S uses
a two-parameter beta distribution rather than a Rayleigh curve
to calculate development effort distribution versus calendar
time.
PRICE S has recently added a software life-cycle support
cost estimation capability called PRICE SL [34]. It involves
the definition of three categories of support activities.
[Fig. 6. RCA PRICE S model: Effect of hardware constraints (normalized cost and schedule versus utilization of available speed and memory).]

Growth: The estimator specifies the amount of code to
be added to the product. PRICE SL then uses its standard
techniques to estimate the resulting life-cycle-effort distribu-
tion.
Enhancement: PRICE SL estimates the fraction of the
existing product which will be modified (the estimator may
provide his own fraction), and uses its standard techniques to
estimate the resulting life-cycle effort distribution.
Maintenance: The estimator provides a parameter indi-
cating the quality level of the developed code. PRICE SL uses
this to estimate the effort required to eliminate remaining er-
rors.
The Constructive Cost Model (COCOMO) [11]
The primary motivation for the COCOMO model has been
to help people understand the cost consequences of the de-
cisions they will make in commissioning, developing, and sup-
porting a software product. Besides providing a software cost
estimation capability, COCOMO therefore provides a great
deal of material which explains exactly what costs the model
is estimating, and why it comes up with the estimates it does.
Further, it provides capabilities for sensitivity analysis and
tradeoff analysis of many of the common software engineering
decision issues.
COCOMO is actually a hierarchy of three increasingly de-
tailed models which range from a single macroestimation
scaling model as a function of product size to a microestima-
tion model with a three-level work breakdown structure and
a set of phase-sensitive multipliers for each cost driver attri-
bute. To provide a reasonably concise example of a current
state of the art cost estimation model, the intermediate level
of COCOMO is described below.
Intermediate COCOMO estimates the cost of a proposed
software product in the following way.
1) A nominal development effort is estimated as a func-
tion of the product's size in delivered source instructions in
thousands (KDSI) and the project's development mode.
2) A set of effort multipliers are determined from the
product's ratings on a set of 15 cost driver attributes.
3) The estimated development effort is obtained by mul-
tiplying the nominal effort estimate by all of the product's
effort multipliers.
4) Additional factors can be used to determine dollar
costs, development schedules, phase and activity distributions,
computer costs, annual maintenance costs, and other elements
from the development effort estimate.
Step 1-Nominal Effort Estimation: First, Table IV is used
to determine the project's development mode. Organic-mode
projects typically come from stable, familiar, forgiving, rela-
tively unconstrained environments, and were found in the
COCOMO data analysis of 63 projects to have a different scaling
equation from the more ambitious, unfamiliar, unforgiving,
tightly constrained embedded mode. The resulting scaling
equations for each mode are given in Table V; these are used
to determine the nominal development effort for the project
in man-months as a function of the project's size in KDSI
and the project's development mode.
For example, suppose we are estimating the cost to develop
the microprocessor-based communications processing software
for a highly ambitious new electronic funds transfer network
with high reliability, performance, development schedule, and
interface requirements. From Table IV, we determine
that these characteristics best fit the profile of an
embedded-mode project.
We next estimate the size of the product as 10,000 delivered
source instructions, or 10 KDSI. From Table V, we then deter-
mine that the nominal development effort for this Embedded-
mode project is
MM_NOM = 2.8(10)^1.20 = 44 man-months (MM).

TABLE IV
COCOMO SOFTWARE DEVELOPMENT MODES

Feature | Organic | Semidetached | Embedded
Organizational understanding of product objectives | Thorough | Considerable | General
Experience in working with related software systems | Extensive | Considerable | Moderate
Need for software conformance with preestablished requirements | Basic | Considerable | Full
Need for software conformance with external interface specifications | Basic | Considerable | Full
Concurrent development of associated new hardware and operational procedures | Some | Moderate | Extensive
Need for innovative data processing architectures, algorithms | Minimal | Some | Considerable
Premium on early completion | Low | Medium | High
Product size range | <50 KDSI | <300 KDSI | All sizes
Examples | Batch data reduction; scientific models; business models; familiar OS, compiler; simple inventory, production control | Most transaction processing systems; new OS, DBMS; ambitious inventory, production control; simple command-control | Large, complex transaction processing systems; ambitious, very large OS; avionics; ambitious command-control

TABLE V
COCOMO NOMINAL EFFORT AND SCHEDULE EQUATIONS

Development Mode | Nominal Effort | Schedule
Organic | MM_NOM = 3.2(KDSI)^1.05 | TDEV = 2.5(MM_DEV)^0.38
Semidetached | MM_NOM = 3.0(KDSI)^1.12 | TDEV = 2.5(MM_DEV)^0.35
Embedded | MM_NOM = 2.8(KDSI)^1.20 | TDEV = 2.5(MM_DEV)^0.32

(KDSI = thousands of delivered source instructions)

Step 2-Determine Effort Multipliers: Each of the 15 cost
driver attributes in COCOMO has a rating scale and a set of ef-
fort multipliers which indicate by how much the nominal ef-
fort estimate must be multiplied to account for the project's
having to work at its rating level for the attribute.
These cost driver attributes and their corresponding effort
multipliers are shown in Table VI. The summary rating scales
for each cost driver attribute are shown in Table VII, except
for the complexity rating scale which is shown in Table VIII
(expanded rating scales for the other attributes are provided
in [11]).
The results of applying these tables to our microprocessor
communications software example are shown in Table IX. The
effect of a software fault in the electronic fund transfer system
could be a serious financial loss; therefore, the project's RELY
rating from Table VII is High. Then, from Table VI, the effort
multiplier for achieving a High level of required reliability is
1.15, or 15 percent more effort than it would take to develop
the software to a nominal level of required reliability.
TABLE VI
INTERMEDIATE COCOMO SOFTWARE DEVELOPMENT EFFORT
MULTIPLIERS

Cost Driver | Very Low | Low | Nominal | High | Very High | Extra High
Product attributes:
RELY Required software reliability | .75 | .88 | 1.00 | 1.15 | 1.40 |
DATA Data base size | | .94 | 1.00 | 1.08 | 1.16 |
CPLX Product complexity | .70 | .85 | 1.00 | 1.15 | 1.30 | 1.65
Computer attributes:
TIME Execution time constraint | | | 1.00 | 1.11 | 1.30 | 1.66
STOR Main storage constraint | | | 1.00 | 1.06 | 1.21 | 1.56
VIRT Virtual machine volatility* | | .87 | 1.00 | 1.15 | 1.30 |
TURN Computer turnaround time | | .87 | 1.00 | 1.07 | 1.15 |
Personnel attributes:
ACAP Analyst capability | 1.46 | 1.19 | 1.00 | .86 | .71 |
AEXP Applications experience | 1.29 | 1.13 | 1.00 | .91 | .82 |
PCAP Programmer capability | 1.42 | 1.17 | 1.00 | .86 | .70 |
VEXP Virtual machine experience | 1.21 | 1.10 | 1.00 | .90 | |
LEXP Programming language experience | 1.14 | 1.07 | 1.00 | .95 | |
Project attributes:
MODP Use of modern programming practices | 1.24 | 1.10 | 1.00 | .91 | .82 |
TOOL Use of software tools | 1.24 | 1.10 | 1.00 | .91 | .83 |
SCED Required development schedule | 1.23 | 1.08 | 1.00 | 1.04 | 1.10 |

*For a given software product, the underlying virtual machine is the complex of hardware and software (OS, DBMS, etc.) it calls on to accomplish its tasks.
The effort multipliers for the other cost driver attributes


are obtained similarly, except for the Complexity attribute,
which is obtained via Table VIII. Here, we first determine that
communications processing is best classified under device-de-
pendent operations (column 3 in Table VIII). From this col-
umn, we determine that communication line handling typi-
cally has a complexity rating of Very High; from Table VI,
then, we determine that its corresponding effort multiplier is
1.30.
Step 3-Estimate Development Effort: We then compute
the estimated development effort for the microprocessor com-
munications software as the nominal development effort (44
MM) times the product of the effort multipliers for the 15 cost
driver attributes in Table IX (1.35). The resulting estimated
effort for the project is then

    (44 MM) (1.35) = 59 MM.

TABLE VIII
COCOMO MODULE COMPLEXITY RATINGS VERSUS TYPE OF
MODULE

[Rates module complexity from Very Low to Extra High for four types of operation: control, computational, device-dependent, and data management. For example, straightline code with a few nonnested structured-programming operators (DOs, CASEs, IF-THEN-ELSEs) and simple predicates rates Very Low on control operations; I/O done at the GET/PUT level, with no cognizance needed of particular processor or I/O device characteristics, rates Low on device-dependent operations; routines for interrupt diagnosis, servicing, and masking, and communication line handling, rate Very High on device-dependent operations; and multiple-resource scheduling with dynamically changing priorities, or microcode-level control, rates Extra High on control operations.]

TABLE IX
COCOMO COST DRIVER RATINGS: MICROPROCESSOR
COMMUNICATIONS SOFTWARE

Cost Driver | Situation | Rating | Effort Multiplier
RELY | Serious financial consequences of software faults | High | 1.15
DATA | 20,000 bytes | Low | .94
CPLX | Communications processing | Very High | 1.30
TIME | Will use 70% of available time | High | 1.11
STOR | 45K of 64K store (70%) | High | 1.06
VIRT | Based on commercial microprocessor hardware | Nominal | 1.00
TURN | Two-hour average turnaround time | Nominal | 1.00
ACAP | Good senior analysts | High | .86
AEXP | Three years | Nominal | 1.00
PCAP | Good senior programmers | High | .86
VEXP | Six months | Low | 1.10
LEXP | Twelve months | Nominal | 1.00
MODP | Most techniques in use over one year | High | .91
TOOL | At basic minicomputer tool level | Low | 1.10
SCED | Nine months | Nominal | 1.00

Effort adjustment factor (product of effort multipliers): 1.35
Step 4-Estimate Related Project Factors: COCOMO has
additional cost estimating relationships for computing the re-
sulting dollar cost of the project and for the breakdown of
cost and effort by life-cycle phase (requirements, design, etc.)
and by type of project activity (programming, test planning,
management, etc.). Further relationships support the estima-
tion of the project's schedule and its phase distribution. For
example, the recommended development schedule can be ob-
tained from the estimated development man-months via the
embedded-mode schedule equation in Table V:

    TDEV = 2.5(59)^0.32 = 9 months.
As mentioned above, COCOMO also supports the most com-
mon types of sensitivity analysis and tradeoff analysis involved
in scoping a software project. For example, from Tables VI
and VII, we can see that providing the software developers
with an interactive computer access capability (Low turn-
around time) reduces the TURN effort multiplier from 1.00 to
0.87, and thus reduces the estimated project effort from 59
MM to
(59 MM) (0.87) = 51 MM.
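
The four steps above are compact enough to script. The following is a minimal sketch, not the full model: it hard-codes the Table V coefficients and the Table IX multiplier values for this one example, and rounds intermediate results the same way the worked example does, reproducing the 44 MM nominal estimate, the 59 MM adjusted estimate, the 9-month schedule, and the 51 MM interactive-access case.

    # Table V coefficients: MM_NOM = a * KDSI**b; TDEV = 2.5 * MM**c.
    MODES = {"organic":      (3.2, 1.05, 0.38),
             "semidetached": (3.0, 1.12, 0.35),
             "embedded":     (2.8, 1.20, 0.32)}

    # Table IX effort multipliers for the communications-processing example.
    RATINGS = {"RELY": 1.15, "DATA": 0.94, "CPLX": 1.30, "TIME": 1.11,
               "STOR": 1.06, "VIRT": 1.00, "TURN": 1.00, "ACAP": 0.86,
               "AEXP": 1.00, "PCAP": 0.86, "VEXP": 1.10, "LEXP": 1.00,
               "MODP": 0.91, "TOOL": 1.10, "SCED": 1.00}

    def intermediate_cocomo(kdsi, mode, multipliers):
        a, b, c = MODES[mode]
        mm_nominal = round(a * kdsi ** b)        # Step 1: 2.8 * 10**1.20 -> 44 MM
        eaf = 1.0
        for m in multipliers.values():           # Step 2: product -> 1.35
            eaf *= m
        mm = round(mm_nominal * round(eaf, 2))   # Step 3: 44 * 1.35 -> 59 MM
        tdev = 2.5 * mm ** c                     # Step 4: 2.5 * 59**0.32 -> ~9 months
        return mm, tdev

    print(intermediate_cocomo(10, "embedded", RATINGS))              # (59, ~9.2)
    print(intermediate_cocomo(10, "embedded",
                              dict(RATINGS, TURN=0.87))[0])          # 51 MM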
The COCOMO model has been validated with respect to a
sample of 63 projects representing a wide variety of business,
scientific, systems, real-time, and support software projects.
For this sample, Intermediate COCOMO estimates come
within 20 percent of the actuals about 68 percent of the time
(see Fig. 7). Since the residuals roughly follow a normal
distribution, this is equivalent to a standard deviation of
roughly 20 percent of the project actuals. This level of accu-
racy is representative of the current state of the art in soft-
ware cost models. One can do somewhat better with the aid
of a calibration coefficient (also a COCOMO option), or within
a limited applications context, but it is difficult to improve
significantly on this level of accuracy while the accuracy of
software data collection remains in the "±20 percent" range.

[Fig. 7. Intermediate COCOMO estimates versus project actuals.]
A Pascal version of COCOMO is available for a nominal dis-
tribution charge from the Wang Institute, under the name WI-
COMO [18].
Recent Software Cost Estimation Models

Most of the recent software cost estimation models tend to
follow the Doty and COCOMO models in having a nominal
scaling equation of the form MM_NOM = c (KDSI)^x and a set
of multiplicative effort adjustment factors determined by a
number of cost driver attribute ratings. Some of them use the
Rayleigh curve approach to estimate distribution across the
software life-cycle, but most use a more conservative effort-
schedule tradeoff relation than the SLIM model. These aspects
have been summarized for the various models in Table II and
Fig. 5.
The Bailey-Basili meta-model [4] derived the scaling equa-
tion

    MM_NOM = 5.5 + 0.73 (KDSI)^1.16

and used two additional cost driver attributes (methodology
level and complexity) to model the development effort of 18
projects in the NASA-Goddard Software Engineering Labora-
tory to within a standard deviation of 15 percent. Its accuracy
for other project situations has not been determined.
The Grumman SOFCOST Model [19] uses a similar but un-
published nominal effort scaling equation, modified by 30
multiplicative cost driver variables rated on a scale of 0 to 10.
Table II includes a summary of these variables.
The Tausworthe Deep Space Network (DSN) model [50]
uses a linear scaling equation (MM_NOM = a(KDSI)^1.0) and a
similar set of cost driver attributes, also summarized in Table
II. It also has a well-considered approach for determining the
equivalent KDSI involved in adapting existing software within
a new product. It uses the Rayleigh curve to determine the
phase distribution of effort, but uses a considerably more con-
servative version of the SLIM effort-schedule tradeoff relation-
ship (see Fig. 5).
The Jensen model [30], [31] is a commercially available
model with a similar nominal scaling equation, and a set of cost
driver attributes very similar to the Doty and COCOMO models
(but with different effort multiplier ranges); see Table II. Some
of the multiplier ranges in the Jensen model vary as functions
of other factors; e.g., increasing access to computer resources
widens the multiplier ranges on such cost drivers as personnel
capability and use of software tools. It uses the Rayleigh curve
for effort distribution, and a somewhat more conservative ef-
fort-schedule tradeoff relation than SLIM (see Fig. 5). As with
the other commercial models, the Jensen model produces a
number of useful outputs on resource expenditure rates, prob-
ability distributions on costs and schedules, etc.
C. Outstanding Research Issues in Software Cost Estimation

Although a good deal of progress has been made in software
cost estimation, a great deal remains to be done. This section
updates the state-of-the-art review published in [11], and sum-
marizes the outstanding issues needing further research:
1) Software size estimation;
2) Software size and complexity metrics;
3) Software cost driver attributes and their effects;
4) Software cost model analysis and refinement;
5) Quantitative models of software project dynamics;
6) Quantitative models of software life-cycle evolution;
7) Software data collection.
1) Software Size Estimation: The biggest difficulty in us-
ing today's algorithmic software cost models is the problem of
providing sound sizing estimates. Virtually every model re-
quires an estimate of the number of source or object instruc-
tions to be developed, and this is an extremely difficult quan-
tity to determine in advance. It would be most useful to have
some formula for determining the size of a software product in
terms of quantities known early in the software life cycle, such
as the number and/or size of the files, input formats, reports,
displays, requirements specification elements, or design specifi-
cation elements.
Some useful steps in this direction are the function-point
approach in [2] and the sizing estimation model of [29], both
of which have given reasonably good results for small-to-medium
sized business programs within a single data processing organiza-
tion. Another more general approach is given by DeMarco in
[17]. It has the advantage of basing its sizing estimates on the
properties of specifications developed in conformance with

DeMarco's paradigm models for software specifications and de-
signs: number of functional primitives, data elements, input
elements, output elements, states, transitions between states,
relations, modules, data tokens, control tokens, etc. To date,
however, there has been relatively little calibration of the for-
mulas to project data. A recent IBM study [14] shows some
correlation between the number of variables defined in a state-
machine design representation and the product size in source
instructions.
Although some useful results can be obtained on the soft-
ware sizing problem, one should not expect too much. A wide
range of functionality can be implemented beneath any given
specification element or I/O element, leading to a wide range
of sizes (recall the uncertainty ranges of this nature in Fig. 3).
For example, two experiments, involving the use of several
teams developing a software program to the same overall
functional specification, yielded size ranges of factors of 3 to
5 between programs (see Table X).

TABLE X
SIZE RANGES OF SOFTWARE PRODUCTS PERFORMING
SAME FUNCTION

Experiment | Product | No. of Teams | Size Range (source-instr.)
Weinberg & Schulman [55] | Simultaneous linear equations | 6 | 33-165
Boehm, Gray, & Seewaldt [13] | Interactive cost model | 7 | 1514-4606

The primary implication of this situation for practical soft-
ware sizing and cost estimation is that there is no royal road to
software sizing. There is no magic formula that will provide an
easy and accurate substitute for the process of thinking
through and fully understanding the nature of the software
product to be developed. There are still a number of useful
things that one can do to improve the situation, including the
following.
Use techniques which explicitly recognize the ranges of
variability in software sizing. The PERT estimation technique
[56] is a good example (see the sketch after this list).
Understand the primary sources of bias in software
sizing estimates. See [11, ch. 21].
Develop and use a corporate memory on the nature and
size of previous software products.
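
The sketch below applies the standard three-point PERT formula (expected size (a + 4m + b)/6, standard deviation (b - a)/6) to a hypothetical component; whether [56] prescribes exactly this variant is not shown here, but it illustrates how a sizing range, rather than a single number, is carried forward.

    def pert_size(lowest, most_likely, highest):
        """Three-point (beta-distribution) size estimate in KDSI."""
        expected = (lowest + 4 * most_likely + highest) / 6.0
        std_dev = (highest - lowest) / 6.0
        return expected, std_dev

    # Hypothetical component: 8 KDSI at best, 12 KDSI most likely, 25 at worst.
    print(pert_size(8, 12, 25))   # -> (13.5, ~2.8)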
2) Software Size and Complexity Metrics: Delivered source
instructions (DSI) can be faulted for being too low-level a
metric for use in early sizing estimation. On the other hand,
DSI can also be faulted for being too high-level a metric for
precise software cost estimation. Various complexity metrics
have been formulated to more accurately capture the relative
information content of a program's instructions, such as the
Halstead Software Science metrics [24], or to capture the rela-
tive control complexity of a program, such as the metrics for-
mulated by McCabe in [39]. A number of variations of these
metrics have been developed; a good recent survey of them is
given in [26].
However, these metrics have yet to exhibit any practical
superiority to DSI as a predictor of the relative effort required
to develop software. Most recent studies [48], [32] show a
reasonable correlation between these complexity metrics and
development effort, but no better a correlation than that be-
tween DSI and development effort.
Further, the recent [25] analysis of the software science re-
sults indicates that many of the published software science
"successes" were not as successful as they were previously con-
sidered. It indicates that much of the apparent agreement be-
tween software science formulas and project data was due to
factors overlooked in the data analysis: inconsistent defini-
tions and interpretations of software science quantities, unreal-
istic or inconsistent assumptions about the nature of the proj-
ects analyzed, overinterpretation of the significance of statisti-
cal measures such as the correlation coefficient, and lack of in-
vestigation of alternative explanations for the data. The software
science use of psychological concepts such as the Stroud num-
ber has also been seriously questioned in [16].
The overall strengths and difficulties of software science are
summarized in [47]. Despite the difficulties, some of the soft-
ware science metrics have been useful in such areas as identify-
ing error-prone modules. In general, there is a strong intuitive
argument that more definitive complexity metrics will eventu-
ally serve as better bases for definitive software cost estimation
than will DSI. Thus, the area continues to be an attractive one
for further research.
3) Software Cost Driver Attributes and Their Effects: Most
of the software cost models discussed above contain a selec-
tion of cost driver attributes and a set of coefficients, func-
tions, or tables representing the effect of the attribute on soft-
ware cost (see Table II). Chapters 24-28 of [11] contain
summaries of the research to date on about 20 of the most
significant cost driver attributes, plus statements of nearly 100
outstanding research issues in the area.
Since the publication of [11] in 1981, a few new results
have appeared. Lawrence [35] provides an analysis of 278
business data processing programs which indicate a fairly uni-
form development rate in procedure lines of code per hour,
some significant effects on programming rate due to batch
turnaround time and level of experience, and relatively little
effect due to use of interactive operation and modern pro-
gramming practices (due, perhaps, to the relatively repetitive
nature of the software jobs sampled). Okada and Azuma [42]
analyzed 30 CAD/CAM programs and found some significant
effects due to type of software, complexity, personnel skill
level, and requirements volatility.
4) Software Cost Model Analysis and Refinement: The
most useful comparative analysis of software cost models to
date is the Thibodeau [52] study performed for the U.S. Air
Force. This study compared the results of several models (the
Wolverton, Doty, PRICE S, and SLIM models discussed earlier,
plus models from the Boeing, SDC, Tecolote, and Aerospace
corporations) with respect to 45 project data points from
three sources.
Some generally useful comparative results were obtained,
but the results were not definitive, as models were evaluated
with respect to larger and smaller subsets of the data. Not too
surprisingly, the best results were generally obtained using
models with calibration coefficients against data sets with few
points. In general, the study concluded that the models with
calibration coefficients achieved better results, but that none
of the models evaluated were sufficiently accurate to be used
as a definitive Air Force software cost estimation model.
Some further comparative analyses are currently being con-
ducted by various organizations, using the database of 63 soft-
ware projects in [11], but to date none of these have been
published.
In general, such evaluations play a useful role in model re-
finement. As certain models are found to be inaccurate in cer-
tain situations, efforts are made to determine the causes, and
to refine the model to eliminate the sources of inaccuracy.
Relatively less activity has been devoted to the formulation,
evaluation, and refinement of models to cover the effects of
more advanced methods of software development (prototyp-
ing, incremental development, use of application generators,
etc.) or to estimate other software-related life-cycle costs (con-
version, maintenance, installation, training, etc.). An exception
is the excellent work on software conversion cost estimation
performed by the Federal Conversion Support Center [28].
An extensive model to estimate avionics software support
costs using a weighted-multiplier technique has recently been
developed [49]. Also, some initial experimental results have
been obtained on the quantitative impact of prototyping in
[13] and on the impact of very high level nonprocedural lan-
guages in [58]. In both studies, projects using prototyping and
very high level languages were completed with significantly
less effort.
5) Quantitative Models of Software Project Dynamics: Cur-
rent software cost estimation models are limited in their abil-
ity to represent the internal dynamics of a software project,
and to estimate how the project's phase distribution of effort
and schedule will be affected by environmental or project
management factors. For example, it would be valuable to
have a model which would accurately predict the effort and
schedule distribution effects of investing in more thorough
design verification, of pursuing an incremental development
strategy, of varying the staffing rate or experience mix, of re-
ducing module size, etc.
Some current models assume a universal effort distribution,
such as the Rayleigh curve [44] or the activity distributions in
[57], which are assumed to hold for any type of project situa-
tion. Somewhat more realistic, but still limited are models
with phase-sensitive effort multipliers such as PRICE S [22]
and Detailed COCOMO [11].
Recently, some more realistic models of software project
dynamics have begun to appear, although to date none of
them have been calibrated to software project data. The Phister
phase-by-phase model in [43] estimates the effort and schedule
required to design, code, and test a software product as a func-
tion of such variables as the staffing level during each phase,
the size of the average module to be developed, and such
factors as interpersonal communications overhead rates and
error detection rates. The Abdel-Hamid-Madnick model [1],
based on Forrester's System Dynamics world-view, estimates
the time distribution of effort, schedule, and residual defects
as a function of such factors as staffing rates, experience mix,
training rates, personnel turnover, defect introduction rates,
and initial estimation errors. Tausworthe [51] derives and
calibrates alternative versions of the SLIM effort-schedule
tradeoff relationship, using an intercommunication-overhead
model of project dynamics. Some other recent models of
software project dynamics are the Mitre SWAP model and
the Duclos [21] total software life-cycle model.
6) Quantitative Models of Software Life-Cycle Evolution:
Although most of the software effort is devoted to the soft-
ware maintenance (or life-cycle support) phase, only a few sig-
nificant results have been obtained to date in formulating
quantitative models of the software life-cycle evolution proc-
ess. Some basic studies by Belady and Lehman analyzed data
on several projects and derived a set of fairly general "laws of
program evolution" [7], [37]. For example, the first of these
laws states:

"A program that is used and that as an implementation
of its specification reflects some other reality, undergoes
continual change or becomes progressively less useful.
The change or decay process continues until it is judged
more cost effective to replace the system with a re-
created version."

Some general quantitative support for these laws was obtained
in several studies during the 1970's and in more recent studies
such as [33]. However, efforts to refine these general laws into
a set of testable hypotheses have met with mixed results. For
example, the Lawrence [36] statistical analysis of the Belady-
Lehman data showed that the data supported an even stronger
form of the first law ("systems grow in size over their useful
life"); that one of the laws could not be formulated precisely
enough to be tested by the data; and that the other three laws
did not lead to hypotheses that were supported by the data.
However, it is likely that variant hypotheses can be found
that are supported by the data (for example, the operating
system data supports some of the hypotheses better than does
the applications data). Further research is needed to clarify
this important area.
7) Software Data Collection: A fundamental limitation to
significant progress in software cost estimation is the lack of
unambiguous, widely-used standard definitions for software
data. For example, if an organization reports its "software
development man-months," do these include the effort de-
voted to requirements analysis, to training, to secretaries, to
quality assurance, to technical writers, to uncompensated
overtime? Depending on one's interpretations, one can easily
cause variations of over 20 percent (and often over a factor
of 2) in the meaning of reported "software development man-
months" between organizations (and similarly for "delivered
instructions," "complexity," "storage constraint," etc.) Given
such uncertainties in the ground data, it is not surprising that
software cost estimation models cannot do much better than
"within 20 percent of the actuals, 70 percent of the time."
Some progress towards clear software data definitions has
been made. The IBM-FSD database used in [53] was carefully
collected using thorough data definitions, but the detailed
data and definitions are not generally available. The NASA-
Goddard Software Engineering Laboratory database [5], [6],
[40] and the COCOMO database [11] provide both clear
data definitions and an associated project database which are
available for general use (and reasonably compatible). The re-
cent Mitre SARE report [59] provides a good set of data defi-
nitions.
But there is still no commitment across organizations to
establish and use a set of clear and uniform software data defi-
nitions. Until this happens, our progress in developing more
precise software cost estimation methods will be severely lim-
ited.
IV. SOFTWARE ENGINEERING ECONOMICS BENEFITS AND
CHALLENGES

This final section summarizes the benefits to software engi-
neering and software management provided by a software engi-
neering economics perspective in general and by software cost
estimation technology in particular. It concludes with some
observations on the major challenges awaiting the field.
Benefits of a Software Engineering Economics Perspective

The major benefit of an economic perspective on software
engineering is that it provides a balanced view of candidate
software engineering solutions, and an evaluation framework
which takes account not only of the programming aspects of
a situation, but also of the human problems of providing the
best possible information processing service within a resource-
limited environment. Thus, for example, the software engi-
neering economics approach does not say, "we should use
these structured structures because they are mathematically
elegant" or "because they run like the wind" or "because
they are part of the structured revolution." Instead, it says
"we should use these structured structures because they pro-
vide people with more benefits in relation to their costs
than do other approaches." And besides the framework, of
course, it also provides the techniques which help us to arrive
at this conclusion.

Benefits of Software Cost Estimation Technology


The major benefit of a good software cost estimation model
is that it provides a clear and consistent universe of discourse
within which to address a good many of the software engineer-
ing issues which arise throughout the software life cycle. It can
help people get together to discuss such issues as the following.
Which and how many features should we put into the
software product?
Which features should we put in first?
How much hardware should we acquire to support the
software product's development, operation, and maintenance?
How much money and how much calendar time should
we allow for software development?
How much of the product should we adapt from existing
software?
How much should we invest in tools and training?
Further, a well-defined software cost estimation model can
help avoid the frequent misinterpretations, underestimates,
overexpectations, and outright buy-ins which still plague the
software field. In a good cost-estimation model, there is no
way of reducing the estimated software cost without changing
some objectively verifiable property of the software project.
This does not make it impossible to create an unachievable
buy-in, but it significantly raises the threshold of credibility.
A related benefit of software cost estimation technology
is that it provides a powerful set of insights on how a software
organization can improve its productivity. Many of a software
cost model's cost-driver attributes are management controlla-
bles: use of software tools and modern programming prac-
tices, personnel capability and experience, available computer
speed, memory, and turnaround time, software reuse. The cost
model helps us determine how to adjust these management
controllables to increase productivity, and further provides an
estimate of how much of a productivity increase we are likely
to achieve with a given level of investment. For more informa-
tion on this topic, see [11, ch. 33], [12] and the recent plan
for the U.S. Department of Defense Software Initiative [20].
Finally, software cost estimation technology provides an
absolutely essential foundation for software project planning
and control. Unless a software project has clear definitions of
its key milestones and realistic estimates of the time and
money it will take to achieve them, there is no way that a
project manager can tell whether his project is under control
or not. A good set of cost and schedule estimates can provide
realistic data for the PERT charts, work breakdown structures,
manpower schedules, earned value increments, etc., necessary
to establish management visibility and control.
Note that this opportunity to improve management visibil-
ity and control requires a complementary management com-
mitment to define and control the reporting of data on software
progress and expenditures. The resulting data are therefore
worth collecting simply for their management value in compar-
ing plans versus achievements, but they can serve another valu-
able function as well: they provide a continuing stream of cali-
bration data for evolving more accurate and refined software
cost estimation models.
Software Engineering Economics Challenges
The opportunity to improve software project management
decision making through improved software cost estimation,
planning, data collection, and control brings us back full-circle
to the original objectives of software engineering economics:
to provide a better quantitative understanding of how software
people make decisions in resource-limited situations.
The more clearly we as software engineers can understand
the quantitative and economic aspects of our decision situa-
tions, the more quickly we can progress from a pure seat-of-
the-pants approach on software decisions to a more rational
approach which puts all of the human and economic decision
variables into clear perspective. Once these decision situations
are more clearly illuminated, we can then study them in more
detail to address the deeper challenge: achieving a quantitative
understanding of how people work together in the software
engineering process.
Given the rather scattered and imprecise data currently
available in the software engineering field, it is remarkable how
much progress has been made on the software cost estimation
problem so far. But, there is not much further we can go until
better data becomes available. The software field cannot hope
to have its Kepler or its Newton until it has had its army of
Tycho Brahes, carefully preparing the well-defined observa-
tional data from which a deeper set of scientific insights may
be derived.
REFERENCES

[1] T. K. Abdel-Hamid and S. E. Madnick, "A model of software project management dynamics," in Proc. IEEE COMPSAC 82, Nov. 1982, pp. 539-554.
[2] A. J. Albrecht, "Measuring application development productivity," in SHARE-GUIDE, 1979, pp. 83-92.
[3] J. D. Aron, "Estimating resources for large programming systems," NATO Sci. Committee, Rome, Italy, Oct. 1969.
[4] J. J. Bailey and V. R. Basili, "A meta-model for software development resource expenditures," in Proc. 5th Int. Conf. Software Eng., IEEE/ACM/NBS, Mar. 1981, pp. 107-116.
[5] V. R. Basili, "Tutorial on models and metrics for software management and engineering," IEEE Cat. EHO-167-7, Oct. 1980.
[6] V. R. Basili and D. M. Weiss, "A methodology for collecting valid software engineering data," Univ. Maryland Technol. Rep. TR-1235, Dec. 1982.
[7] L. A. Belady and M. M. Lehman, "Characteristics of large systems," in Research Directions in Software Technology, P. Wegner, Ed. Cambridge, MA: MIT Press, 1979.
[8] H. D. Benington, "Production of large computer programs," in Proc. ONR Symp. Advanced Programming Methods for Digital Computers, June 1956, pp. 15-27.
[9] R. K. D. Black, R. P. Curnow, R. Katz, and M. D. Gray, "BCS software production data," Boeing Comput. Services, Inc., Final Tech. Rep., RADC-TR-77-116, NTIS AD-A039852, Mar. 1977.
[10] B. W. Boehm, "Software and its impact: A quantitative assessment," Datamation, pp. 48-59, May 1973.
[11] -, Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[12] B. W. Boehm, J. F. Elwell, A. B. Pyster, E. D. Stuckle, and R. D. Williams, "The TRW software productivity system," in Proc. IEEE 6th Int. Conf. Software Eng., Sept. 1982.
[13] B. W. Boehm, T. E. Gray, and T. Seewaldt, "Prototyping vs. specifying: A multi-project experiment," IEEE Trans. Software Eng., to be published.
[14] R. N. Britcher and J. E. Gaffney, "Estimates of software size from state machine designs," in Proc. NASA-Goddard Software Eng. Workshop, Dec. 1982.
[15] W. M. Carriere and R. Thibodeau, "Development of a logistics software cost estimating technique for foreign military sales," General Res. Corp., Rep. CR-3-839, June 1979.
[16] N. S. Coulter, "Software science and cognitive psychology," IEEE Trans. Software Eng., pp. 166-171, Mar. 1983.
[17] T. DeMarco, Controlling Software Projects. New York: Yourdon, 1982.
[18] M. Demshki, D. Ligett, B. Linn, G. McCluskey, and R. Miller, "Wang Institute cost model (WICOMO) tool user's manual," Wang Inst. Graduate Studies, Tyngsboro, MA, June 1982.
[19] H. F. Dircks, "SOFCOST: Grumman's software cost estimating model," in IEEE NAECON 1981, May 1981.
[20] L. E. Druffel, "Strategy for DoD software initiative," RADC/DACS, Griffiss AFB, NY, Oct. 1982.
[21] L. C. Duclos, "Simulation model for the life-cycle of a software product: A quality assurance approach," Ph.D. dissertation, Dep. Industrial and Syst. Eng., Univ. Southern California, Dec. 1982.
[22] F. R. Freiman and R. D. Park, "PRICE software model-Version 3: An overview," in Proc. IEEE-PINY Workshop on Quantitative Software Models, IEEE Cat. TH0067-9, Oct. 1979, pp. 32-41.
[23] R. Goldberg and H. Lorin, The Economics of Information Processing. New York: Wiley, 1982.
[24] M. H. Halstead, Elements of Software Science. New York: Elsevier, 1977.
[25] P. G. Hamer and G. D. Frewin, "M. H. Halstead's software science - A critical examination," in Proc. IEEE 6th Int. Conf. Software Eng., Sept. 1982, pp. 197-205.
[26] W. Harrison, K. Magel, R. Kluczney, and A. DeKock, "Applying software complexity metrics to program maintenance," Computer, pp. 65-79, Sept. 1982.
[27] J. R. Herd, J. N. Postak, W. E. Russell, and K. R. Stewart, "Software cost estimation study - Study results," Doty Associates, Inc., Rockville, MD, Final Tech. Rep. RADC-TR-77-220, vol. I (of two), June 1977.
[28] C. Houtz and T. Buschbach, "Review and analysis of conversion cost-estimating techniques," GSA Federal Conversion Support Center, Falls Church, VA, Rep. GSA/FCSC-81/001, Mar. 1981.
[29] M. Itakura and A. Takayanagi, "A model for estimating program size and its evaluation," in Proc. IEEE 6th Int. Conf. Software Eng., Sept. 1982, pp. 104-109.
[30] R. W. Jensen, "An improved macrolevel software development resource estimation model," in Proc. 5th ISPA Conf., Apr. 1983, pp. 88-92.
[31] R. W. Jensen and S. Lucas, "Sensitivity analysis of the Jensen software model," in Proc. 5th ISPA Conf., Apr. 1983, pp. 384-389.
[32] B. A. Kitchenham, "Measures of programming complexity," ICL Tech. J., pp. 298-316, May 1981.
[33] -, "Systems evolution dynamics of VME/B," ICL Tech. J., pp. 43-57, May 1982.
[34] W. W. Kuhn, "A software lifecycle case study using the PRICE model," in Proc. IEEE NAECON, May 1982.
[35] M. J. Lawrence, "Programming methodology, organizational environment, and programming productivity," J. Syst. Software, pp. 257-270, Sept. 1981.
[36] -, "An examination of evolution dynamics," in Proc. IEEE 6th Int. Conf. Software Eng., Sept. 1982, pp. 188-196.
[37] M. M. Lehman, "Programs, life cycles, and laws of software evolution," Proc. IEEE, pp. 1060-1076, Sept. 1980.
[38] R. D. Luce and H. Raiffa, Games and Decisions. New York: Wiley, 1957.
[39] T. J. McCabe, "A complexity measure," IEEE Trans. Software Eng., pp. 308-320, Dec. 1976.
[40] F. E. McGarry, "Measuring software development technology: What have we learned in six years," in Proc. NASA-Goddard Software Eng. Workshop, Dec. 1982.
[41] E. A. Nelson, "Management handbook for the estimation of computer programming costs," Syst. Develop. Corp., AD-A648750, Oct. 31, 1966.
[42] M. Okada and M. Azuma, "Software development estimation study - A model from CAD/CAM system development experiences," in Proc. IEEE COMPSAC 82, Nov. 1982, pp. 555-564.
[43] M. Phister, Jr., "A model of the software development process," J. Syst. Software, pp. 237-256, Sept. 1981.
[44] L. H. Putnam, "A general empirical solution to the macro software sizing and estimating problem," IEEE Trans. Software Eng., pp. 345-361, July 1978.
[45] L. H. Putnam and A. Fitzsimmons, "Estimating software costs," Datamation, pp. 189-198, Sept. 1979; continued in Datamation, pp. 171-178, Oct. 1979 and pp. 137-140, Nov. 1979.
[46] L. H. Putnam, "The real economics of software development," in The Economics of Information Processing, R. Goldberg and H. Lorin, Eds. New York: Wiley, 1982.
[47] V. Y. Shen, S. D. Conte, and H. E. Dunsmore, "Software science revisited: A critical analysis of the theory and its empirical support," IEEE Trans. Software Eng., pp. 155-165, Mar. 1983.
[48] T. Sunohara, A. Takano, K. Uehara, and T. Ohkawa, "Program complexity measure for software development management," in Proc. IEEE 5th Int. Conf. Software Eng., Mar. 1981, pp. 100-106.
[49] SYSCON Corp., "Avionics software support cost model," USAF Avionics Lab., AFWAL-TR-1173, Feb. 1, 1983.
[50] R. C. Tausworthe, "Deep space network software cost estimation model," Jet Propulsion Lab., Pasadena, CA, 1981.
[51] -, "Staffing implications of software productivity models," in Proc. 7th Annu. Software Eng. Workshop, NASA/Goddard, Greenbelt, MD, Dec. 1982.
[52] R. Thibodeau, "An evaluation of software cost estimating models," General Res. Corp., Rep. T10-2670, Apr. 1981.
[53] C. E. Walston and C. P. Felix, "A method of programming measurement and estimation," IBM Syst. J., vol. 16, no. 1, pp. 54-73, 1977.
[54] G. F. Weinwurm, Ed., On the Management of Computer Programming. New York: Auerbach, 1970.
[55] G. M. Weinberg and E. L. Schulman, "Goals and performance in computer programming," Human Factors, vol. 16, no. 1, pp. 70-77, 1974.
[56] J. D. Wiest and F. K. Levy, A Management Guide to PERT/CPM. Englewood Cliffs, NJ: Prentice-Hall, 1977.
[57] R. W. Wolverton, "The cost of developing large-scale software," IEEE Trans. Comput., pp. 615-636, June 1974.
[58] E. Harel and E. R. McLean, "The effects of using a nonprocedural computer language on programmer productivity," UCLA Inform. Sci. Working Paper 3-83, Nov. 1982.
[59] R. L. Dumas, "Final report: Software acquisition resource expenditure (SARE) data collection methodology," MITRE Corp., MTR 9031, Sept. 1983.

Barry W. Boehm received the B.A. degree in mathematics from Harvard University, Cambridge, MA, in 1957 and the M.A. and Ph.D. degrees from the University of California, Los Angeles, in 1961 and 1964, respectively.

From 1978 to 1979 he was a Visiting Professor of Computer Science at the University of Southern California. He is currently a Visiting Professor at the University of California, Los Angeles, and Chief Engineer of TRW's Software Information Systems Division. He was previously Head of the Information Sciences Department at The Rand Corporation, and Director of the 1971 Air Force CCIP-85 study. His responsibilities at TRW include direction of TRW's internal software R&D program, of contract software technology projects, of the TRW software development policy and standards program, of the TRW Software Cost Methodology Program, and of the TRW Software Productivity Program. His most recent book is Software Engineering Economics (Prentice-Hall).

Dr. Boehm is a member of the IEEE Computer Society and the Association for Computing Machinery, and an Associate Fellow of the American Institute of Aeronautics and Astronautics.
International Journal of Engineering & Technology (iJET), ISSN: 2049-3444, Vol. 2, No. 5, 2012
http://iet-journals.org/archive/2012/may_vol_2_no_5/255895133318216.pdf

A Simulation Model for the Waterfall
Software Development Life Cycle

Youssef Bassil

LACSC – Lebanese Association for Computational Sciences
Registered under No. 957, 2011, Beirut, Lebanon
youssef.bassil@lacsc.org

ABSTRACT
Software development life cycle or SDLC for short is a methodology for designing, building, and maintaining information and
industrial systems. So far, there exist many SDLC models, one of which is the Waterfall model which comprises five phases to
be completed sequentially in order to develop a software solution. However, SDLC of software systems has always encountered
problems and limitations that resulted in significant budget overruns, late or suspended deliveries, and dissatisfied clients. The
major reason for these deficiencies is that project directors are not wisely assigning the required number of workers and
resources on the various activities of the SDLC. Consequently, some SDLC phases with insufficient resources may be delayed;
while, others with excess resources may be idled, leading to a bottleneck between the arrival and delivery of projects and to a
failure in delivering an operational product on time and within budget. This paper proposes a simulation model for the Waterfall
development process using the Simphony.NET simulation tool whose role is to assist project managers in determining how to
achieve the maximum productivity with the minimum number of expenses, workers, and hours. It helps maximize the
utilization of development processes by keeping all employees and resources busy all the time to keep pace with the arrival of
projects and to decrease waste and idle time. As future work, other SDLC models such as the spiral and incremental models are to be
simulated, giving project executives the choice to use a diversity of software development methodologies.

Keywords: Software Engineering, SDLC, Waterfall Model, Computer Simulation, Simphony.NET

1. INTRODUCTION

The process of building computer software and information systems has always been dictated by different development methodologies. A software development methodology refers to the framework that is used to plan, manage, and control the process of developing an information system [1]. Formally, a software development methodology is known as SDLC, short for Software Development Life Cycle, and is widely used in several engineering and industrial fields such as systems engineering, software engineering, mechanical engineering, computer science, computational sciences, and applied engineering [2]. In effect, SDLC has been studied and investigated by many researchers and practitioners all over the world, and numerous models have been proposed, each with its own acknowledged strengths and weaknesses. The Waterfall, spiral, incremental, rational unified process (RUP), rapid application development (RAD), agile software development, and rapid prototyping models are a few successful SDLC models, to mention only some. In one way or another, all SDLC models suggested so far share basic properties: they all consist of a sequence of phases or steps that must be followed and completed by system designers and developers in order to attain some results and deliver a final product. For instance, the Waterfall model, one of the earliest SDLC models, comprises five consecutive phases, respectively: business analysis, design, implementation, testing, and maintenance. On the other hand, the incremental model has seven phases, respectively: planning, requirements, analysis, implementation, deployment, testing, and evaluation [3].

Due to the success of the Waterfall model, many software development firms and industrial manufacturers have adopted it as their prime development framework and SDLC to plan, build, and maintain their products [4]. Additionally, these firms went to the extreme of establishing several departments, each of which is run by a team of experts totally responsible for and dedicated to handling a particular phase of the Waterfall model. This includes, for instance, a business and requirements analysis department, a software engineering department, a development and programming department, a quality assurance (QA) department, and a technical support department.

However, assigning the exact and appropriate number of resources for each phase of the Waterfall model, including people, equipment, processes, time, effort, and budget, has been a dilemma for project managers and directors seeking to achieve the maximum productivity with the minimum number of expenses, workers, and hours. In that sense, it is vital to find the optimal number of resources that should be assigned in order to complete a specific task or phase. For instance, project managers need to find out the number of system analysts that should be hired to work on the business analysis phase. They also need to know how many computers are required for the implementation phase, and how many testers should be acquired to cover all possible test cases during the testing phase. In order to answer all these questions, a simulation of the SDLC is needed so as to estimate the appropriate number of resources necessary to fulfill a certain project of a certain scale.

Relatedly, a computer simulation is a computer program that tries to simulate an abstract model of a particular system. In practice, simulations can be employed to discover the behavior, to estimate the outcome, and to analyze the operation of systems [5].

This paper proposes a simulation model to simulate and mimic the Waterfall SDLC development process from the analysis phase to the maintenance phase using the Simphony.NET computer simulation tool. The model simulates the different stakeholders involved in the Waterfall model which are essential throughout the whole development process. They include the software solution to design and develop; the employees, such as designers and programmers; the different Waterfall phases; and the workflow of every Waterfall task. Furthermore, the proposed simulation takes into consideration three different types of software solutions based on their complexity and scale. The simulation also measures the rate of project arrival, the rate of project delivery, and the utilization of the various resources during every phase and task.

The goal of the proposed simulation is to identify the optimal number of resources needed to keep the company up with the continuous flow of incoming projects using the minimal amount of workers, time, and budget.

2. THE WATERFALL SDLC MODEL

The Waterfall SDLC model is a sequential software development process in which progress is regarded as flowing steadily downwards (similar to a waterfall) through a list of phases that must be executed in order to successfully build computer software. Originally, the Waterfall model was proposed by Winston W. Royce in 1970 to describe a possible software engineering practice [6]. The Waterfall model defines several consecutive phases that must be completed one after the other, moving to the next phase only when its preceding phase is completely done. In addition, the Waterfall model is recursive in that each phase can be repeated until it is perfected. Fig. 1 depicts the different phases of the SDLC Waterfall model.

Fig. 1 The Waterfall model

Essentially, the Waterfall model comprises five phases: analysis, design, implementation, testing, and maintenance.

Analysis Phase: Often known as the Software Requirements Specification (SRS), the output of this phase is a complete and comprehensive description of the behavior of the software to be developed. It involves system and business analysts defining both functional and non-functional requirements. Usually, functional requirements are defined by means of use cases, which describe the users' interactions with the software. They include such requirements as purpose, scope, perspective, functions, software attributes, user characteristics, functionality specifications, interface requirements, and database requirements. In contrast, the non-functional requirements refer to the various criteria, constraints, limitations, and requirements imposed on the design and operation of the software rather than on particular behaviors. They include such properties as reliability, scalability, testability, availability, maintainability, performance, and quality standards.

Design Phase: This is the process of planning and problem solving for a software solution. It involves software developers and designers defining the plan for a solution, which includes algorithm design, software architecture design, database conceptual schema and logical diagram design, concept design, graphical user interface design, and data structure definition.

Implementation Phase: This refers to the realization of business requirements and design specifications into a concrete executable program, database, website, or software component through programming and deployment. This phase is where the real code is written and compiled into an operational application, and where the database and text files are created. In other words, it is the process of converting the whole requirements and blueprints into a production environment.

Testing Phase: Also known as verification and validation, this is a process for checking that a software solution meets the original requirements and specifications and that it accomplishes its intended purpose. Verification is the process of evaluating software to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase, while validation is the process of evaluating software during or at the end of the development process to determine whether it satisfies specified requirements [7]. Moreover, the testing phase is the outlet to perform debugging, in which bugs and system glitches are found, corrected, and refined accordingly.

Maintenance Phase: This is the process of modifying a software solution after delivery and deployment to refine output, correct errors, and improve performance and quality. Additional maintenance activities can be performed in this phase, including adapting the software to its environment, accommodating new user requirements, and increasing software reliability [8].

3. RELATED WORK

[9] proposed a simulation planning process that must be completed prior to starting any development effort. Its purpose is to identify the structure of the project development plan and to classify what must be simulated, the degree of simulation, and how to use the simulation results for future planning. Moreover, the approach takes into consideration such issues as configuration requirements, design constraints, development criteria, problem reporting and resolution, and the analysis of input and output data sets. [10] described three types of simulation methodologies. The first is called "simulation as software engineering" and revolves around simulating the delivery of a product. This comprises the use of large simulation models to represent a real system in the production environment. The second is called "simulation as a process of organizational change" and revolves around the delivery of a service. This comprises the use of temporary small-scale models to simulate small-scale tasks and processes.

The third is called "simulation as facilitation" and revolves around understanding and debating about a problem situation. This comprises using "quick-and-dirty" very small-scale models to simulate minute-by-minute processes. [11] proposed the use of simulation as facilitation based on system dynamics. The model proposes the simulation of three development stages: the conceptualization stage, which simulates the problem situation and system objectives; the development stage, which simulates the coding, verification, validation, and calibration processes; and the facilitation stage, which simulates group learning around the model, project findings, and project recommendations. [12] proposed a guideline to be followed for performing a simulation study for software development life cycles. It is composed of ten processes, ten phases, and thirteen reliability evaluation stages. Its purpose is to assess the credibility of every stage after simulation and match it with the initial requirements and specifications. The model provides one of the most documented descriptions for simulating life cycles in the software engineering field [13]. [14] proposed a software engineering process simulation model called SEPS for the dynamic simulation of software development life cycles. It is based on using feedback principles of system dynamics to simulate communications and interactions among the different SDLC phases and activities from a dynamic systems perspective. Basically, SEPS is a planning tool meant to improve the decision-making of managers in controlling the project outcome in terms of cost, time, and functionality. [15] proposed a discrete open-source event simulation model for simulating the programming and testing stages of a software development process using MatLab. The model investigates the results of adopting different tactics for coding and testing a new software system. It is oriented toward pair programming, in which a programmer writes the code and the simulation acts as an observer which reviews the code and returns feedback to the original programmer. In effect, this approach automates the testing and reviewing processes and promotes best programming practices to deliver the most reliable and accurate code. [16] proposed an intelligent computerized tool for simulating the different phases of a generic SDLC. It is intended to help managers and project directors in better planning, managing, and controlling the development process of medium-scale software projects. The model is based on system dynamics to simulate the dynamic interaction between the different phases of the development process, taking into consideration the existence of imprecise parameters that are treated as fuzzy-logic variables.

4. PROBLEM DEFINITION & MOTIVATIONS

In practice, software development projects have regularly encountered problems and shortcomings that resulted in noteworthy delays and cost overruns, as well as occasional total failures [17]. In effect, the software development life cycle of software systems has been plagued by budget overruns, late or postponed deliveries, and disappointed customers [18]. A deep investigation of this issue conducted by the Standish Group [19] showed that many projects do not deliver on time, do not deliver on budget, and do not deliver as expected or required. The major reason for this is that project managers are not intelligently assigning the required number of employees and resources to the various activities of the SDLC. For this reason, some SDLC phases may be delayed due to an insufficient number of workers, while other dependent phases may stay idle, doing nothing but waiting for other phases to get completed. Consequently, this produces a bottleneck between the arrival and delivery of projects which leads to a failure in delivering a functional product on time, within budget, and at an agreed level of quality.

The proposed simulation for the Waterfall model is aimed at finding the trade-offs of cost, schedule, and functionality for the benefit of the project outcome. It helps maximize the utilization of development processes by keeping all employees and resources busy all the time to keep pace with the incoming projects and reduce waste and idle time. As a result, the optimal productivity is reached with the least possible number of employees and resources, delivering projects on schedule, within budget, and in conformance with the initial business needs and requirements.

5. THE SIMULATION MODEL

This paper proposes a simulation model to simulate the different phases of the Waterfall SDLC model including all related resources, inputs, workflow, and outputs. The simulation process is carried out using a simulation tool called Simphony.NET [20] which provides an adequate environment to create, manage, and control the different simulation entities. The purpose of this simulation is to guarantee that the interval between each project arrival is equal to the interval between each project delivery. In other words, if a new project is emerging every 10 days, a project must be delivered every 10 days, taking into consideration that the optimal number of employees should be assigned to every project; that is, the number of idle and busy resources should be kept as low as possible.

Generally speaking, the proposed simulation process consists of the following steps (a toy sketch of this loop follows the second list below):
1. Run the simulation and examine the data it produces,
2. Find changes to be made to the model based on the analysis of the data produced by the simulation,
3. Repeat as much as it takes to reach the optimal results.

Technically speaking, the simulation process of the Waterfall model consists of the following steps:
1. Divide the Waterfall model into independent phases,
2. Understand the concept and the requirements that lie behind every phase,
3. Define the resources, tasks, entities, and workflow of every phase,
4. Simulate each phase apart and record the results,
5. Integrate the whole phases together, simulate the system, and record the results.
5.1. Assumptions and Specifications

Prior to simulating the Waterfall model, a number of assumptions and specifications must be clearly made.

Basically, projects arrive randomly at the software firm with inter-arrival times drawn from a Triangular distribution with a
lower limit of a = 30 days, an upper limit of b = 40 days, and a mode of c = 35 days. The probability density function is then given as:

    f(x) = 2(x - a) / ((b - a)(c - a))   for a <= x <= c,
    f(x) = 2(b - x) / ((b - a)(b - c))   for c < x <= b,
    f(x) = 0                             otherwise.
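As a quick illustration (not part of the paper, which samples inside Simphony.NET), Python's standard library exposes the same distribution directly; note that random.triangular takes the mode as its third argument:

    import random

    # Ten project inter-arrival times (in days) from Triangular(low=30, high=40, mode=35).
    arrivals = [random.triangular(30, 40, 35) for _ in range(10)]
    print([round(days, 1) for days in arrivals])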

Projects can be divided into three groups based on their complexity and scale: 70% of the projects are small-scale projects, 25% are medium-scale projects, and 5% are large-scale projects.

Each project will require a different mix of specialists, employees, and resources to be delivered, based on the scale of the project (these staffing assumptions are sketched in code after the task list below):
• Small-scale projects require 1 business analyst, 1 designer, 2 programmers, 2 testers, and 1 maintenance man.
• Medium-scale projects require 2 business analysts, 2 designers, 4 programmers, 6 testers, and 2 maintenance men.
• Large-scale projects require 5 business analysts, 5 designers, 10 programmers, 20 testers, and 5 maintenance men.

Assuming that the resources available at the software firm are the following:
• 5 business analysts
• 5 designers
• 10 programmers
• 20 testers
• 5 maintenance men

And assuming that there exist the following tasks:
• Business analysis
• Design
• Implementation
• Testing
• Maintenance
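The project mix and the per-scale staffing assumptions above reduce to a small lookup table keyed by scale; a minimal sketch (the names are illustrative, not the paper's):

    import random

    # Staffing mix per project scale, as assumed above.
    REQUIREMENTS = {
        "small":  {"analysts": 1, "designers": 1, "programmers": 2,  "testers": 2,  "maintenance": 1},
        "medium": {"analysts": 2, "designers": 2, "programmers": 4,  "testers": 6,  "maintenance": 2},
        "large":  {"analysts": 5, "designers": 5, "programmers": 10, "testers": 20, "maintenance": 5},
    }

    # Draw a project scale according to the assumed 70/25/5 mix.
    scale = random.choices(["small", "medium", "large"], weights=[0.70, 0.25, 0.05])[0]
    print(scale, REQUIREMENTS[scale])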
And assuming that the duration for every phase to be completed is defined as follows (a sampling sketch follows these assumptions):

The business analysis phase requires a Uniform distribution with a lower limit of 3 days and an upper limit of 5 days.

The design phase requires a Uniform distribution with a lower limit of 5 days and an upper limit of 10 days.

The implementation phase requires a Uniform distribution with a lower limit of 15 days and an upper limit of 20 days.

The testing phase requires a Uniform distribution with a lower limit of 5 days and an upper limit of 10 days.

The maintenance phase requires a Uniform distribution with a lower limit of 1 day and an upper limit of 3 days.
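Sampling one realization of these phase durations is then direct; a sketch using the stated Uniform ranges:

    import random

    # Uniform duration ranges (in days) for each phase, as assumed above.
    PHASE_DURATIONS = {
        "business analysis": (3, 5),
        "design":            (5, 10),
        "implementation":    (15, 20),
        "testing":           (5, 10),
        "maintenance":       (1, 3),
    }

    durations = {phase: random.uniform(lo, hi) for phase, (lo, hi) in PHASE_DURATIONS.items()}
    print({phase: round(days, 1) for phase, days in durations.items()})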
And assuming that each phase upon completion is subject to the following errors (a rework sketch follows the list):
• There is a 10% probability that a small-scale project will have an error.
• There is a 20% probability that a medium-scale project will have an error.
• There is a 30% probability that a large-scale project will have an error.
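A minimal sketch of the error assumption: after a phase completes, a random draw against the scale's error probability decides whether the work must be redone. (In the full model described in Section 5.2 below, an error loops the flow back through the preceding task; this sketch simplifies that to repeating the phase itself.)

    import random

    ERROR_PROB = {"small": 0.10, "medium": 0.20, "large": 0.30}

    def attempts_for_phase(scale):
        """Count how many times a phase runs before completing without error."""
        attempts = 1
        while random.random() < ERROR_PROB[scale]:  # error detected: redo the work
            attempts += 1
        return attempts

    print(attempts_for_phase("large"))  # usually 1; occasionally 2 or more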

5.2. The Simphony Model

The proposed simulation model is built using the Simphony.NET simulation tool [20]. In fact, Simphony.NET consists of a working environment and a foundation library that allow the development of new simulation scenarios in an easy and efficient manner. A project in Simphony.NET is made out of a collection of modeling elements linked to each other by logical relationships.

Essentially, the proposed model consists of a set of resource, queue, task, probability branch, capture, release, and counter modeling elements. The resources are the basic employees and workers assigned to work on the phases of the Waterfall model. Each resource has a FIFO queue which accumulates and stores processing events to be processed later. Fig. 2 depicts the resource modeling elements along with their counts and queues. They are respectively the business analyst, the designer, the programmer, the tester, and the maintenance man.

Fig. 2 Resource modeling elements

On the other hand, the Waterfall phases are modeled as a set of task modeling elements, each with capture and release elements. The capture element binds a particular resource to a particular task, and the release element releases the resource from the task when it is completed. Additionally, several probability branch elements exist between the different tasks of the model, whose purpose is to simulate the error probability that a Waterfall task might exhibit after completion. The probability element has two branches: branch 1 with Prob=0.1 denotes that 10% of the small-scale projects are subject to errors, and branch 2 with Prob=0.9 denotes that 90% of the small-scale projects will not exhibit errors after the completion of every phase. These branches simulate the recursive property of the Waterfall model: the flow loops back over the preceding task if an error was found in the current task.

Moreover, another probability branch element exists at the beginning of every project development cycle, whose purpose is to simulate the scale of the projects under development. It has three branches: branch 1 with Prob=0.7 denotes that 70% of the incoming projects are small-scale; branch 2 with Prob=0.25 denotes that 25% of the incoming projects are medium-scale; and branch 3 with Prob=0.05 denotes that 5% of the incoming projects are large-scale.

The model starts with a new entity element, which sets the number of incoming projects, and a counter that counts the number of projects being received; it ends with another counter that counts the number of projects being delivered. Fig. 3 shows the simulation model for the different phases of the Waterfall development process without going deeply into modeling every type of project, while Fig. 4 shows the different modeling elements for simulating small-scale projects.

Fig. 3 Simulation model for the Waterfall SDLC

Fig. 4 Simulation model for small-scale type projects
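Simphony.NET models are built graphically, so the paper contains no code; as a rough textual analogue of the same structure (FIFO resource queues, capture/release around each task, probability branches for scale and rework), the following sketch uses the SimPy discrete-event library. This is an illustrative assumption, not the paper's implementation, and it simplifies staffing by capturing one worker per phase instead of the scale-dependent crew sizes:

    import random
    import simpy  # assumed available: a common Python discrete-event simulation library

    SCALES, WEIGHTS = ["small", "medium", "large"], [0.70, 0.25, 0.05]
    ERROR_PROB = {"small": 0.10, "medium": 0.20, "large": 0.30}
    # phase -> (resource role, uniform duration range in days)
    PHASES = [("analysis", "analyst", (3, 5)), ("design", "designer", (5, 10)),
              ("implementation", "programmer", (15, 20)), ("testing", "tester", (5, 10)),
              ("maintenance", "maintainer", (1, 3))]

    def project(env, name, resources, delivered):
        scale = random.choices(SCALES, weights=WEIGHTS)[0]     # scale branch
        for _, role, (lo, hi) in PHASES:
            done = False
            while not done:
                with resources[role].request() as req:         # capture element
                    yield req                                  # wait in the FIFO queue
                    yield env.timeout(random.uniform(lo, hi))  # perform the task
                # resource released here; error branch decides whether to rework
                done = random.random() >= ERROR_PROB[scale]
        delivered.append((name, scale, env.now))               # delivery counter

    def arrivals(env, resources, delivered):
        i = 0
        while True:
            yield env.timeout(random.triangular(30, 40, 35))   # inter-arrival time
            env.process(project(env, f"P{i}", resources, delivered))
            i += 1

    env = simpy.Environment()
    capacities = {"analyst": 5, "designer": 5, "programmer": 10, "tester": 20, "maintainer": 5}
    resources = {role: simpy.Resource(env, capacity=c) for role, c in capacities.items()}
    delivered = []
    env.process(arrivals(env, resources, delivered))
    env.run(until=4 * 365)  # four simulated years
    print(len(delivered), "projects delivered")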

5.3. Running the Simulation

The simulation model was executed 5 times, for 1500 milliseconds (2.5 minutes), with 50 incoming projects, using the Simphony.NET environment. Table 1 delineates the obtained statistics, including the number of projects received and delivered, in addition to the ArT mean time. Table 2 delineates the average utilization of every resource after the completion of the simulation. Furthermore, a graphical representation of resource utilization is plotted in Fig. 5 for the programmer resource and in Fig. 6 for the designer resource.

TABLE I
STATISTICS OBTAINED FOR SIMULATING THE WATERFALL MODEL

Projects received       Count   ArT Mean
Small-scale             35      52.09
Medium-scale            10      130.45
Large-scale             5       426.29
Total received          50      34.46 (average)

Projects delivered      Count   ArT Mean
Small-scale             35      53.37
Medium-scale            10      134.84
Large-scale             5       448.23
Total delivered         50      35.55 (average)

TABLE II
SIMULATED RESOURCES WITH THEIR AVERAGE UTILIZATION

Resource            Average Utilization
Business Analysts   5.2
Designers           11.6
Programmers         21.02
Testers             7.4
Maintenance Men     2.09

Fig. 5 Utilization of the programmer resource

Fig. 6 Utilization of the designer resource

5.4. Results Interpretation

The results obtained after running the simulation several times using the Simphony.NET simulator clearly showed that the system reached the optimal state when the total number of projects received was equal to the total number of projects delivered. In fact, 50 projects were delivered out of 50 without any loss in time or schedule. Additionally, the results helped in pinpointing the optimal number of resources needed to handle the different phases of the Waterfall model. The optimal number of required analysts is 5.2, the optimal number of required designers is 11.6, the optimal number of required programmers is 21.02, the optimal number of required testers is 7.4, and the optimal number of required maintenance men is 2.09. These numbers of resources are considered to be the necessary number of workers needed to keep the company up with the continuous flow of incoming projects, in this particular case dispatching and producing exactly 50 projects on time and within budget.
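The paper leaves the exact definition of "average utilization" implicit; since several values exceed the pool sizes assumed in Section 5.1, the figures read most plausibly as a time-weighted average demand for each worker type rather than as an occupancy ratio. A sketch of such a time-weighted average, under that assumption and with a hypothetical interval-log format:

    def time_weighted_average(intervals, horizon):
        """intervals: (start, end, count) tuples; horizon: total simulated time."""
        return sum((end - start) * count for start, end, count in intervals) / horizon

    # Toy log: 2 programmers needed for 100 days, then 30 needed for 200 days, over 300 days.
    print(round(time_weighted_average([(0, 100, 2), (100, 300, 30)], 300), 2))  # 20.67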
6. CONCLUSIONS & FUTURE WORK

This paper proposed a simulation model for simulating the Waterfall software development life cycle using the Simphony.NET simulator tool. It consists of simulating all entities of the Waterfall model including the software solutions to be developed, operational resources, employees, tasks, and phases. Its aim was to assist project managers in determining the optimal number of resources required to produce a particular project within the allotted schedule and budget. Experiments showed that the proposed model proved to be accurate, as it correctly calculated the number of optimal resources required to accomplish a particular software solution based on their utilization metric.

As future work, other SDLC models such as the spiral and incremental models are to be simulated, allowing project managers to select among a diversity of software development methodologies to support their decision-making and planning needs.

ACKNOWLEDGMENT

This research was funded by the Lebanese Association for Computational Sciences (LACSC), Beirut, Lebanon, under the "Simulation & Testing Research Project – STRP2012".

REFERENCES

[1] Ian Sommerville, Software Engineering, Addison Wesley, 9th ed., 2010.
[2] Richard H. Thayer and Barry W. Boehm, "Software engineering project management," Computer Society Press of the IEEE, p. 130, 1986.
[3] Craig Larman and Victor Basili, "Iterative and Incremental Development: A Brief History," IEEE Computer, 2003.
[4] N. Munassar and A. Govardhan, "A Comparison Between Five Models Of Software Engineering," IJCSI International Journal of Computer Science Issues, vol. 7, no. 5, 2010.
[5] P. Humphreys, Extending Ourselves: Computational Science, Empiricism, and Scientific Method, Oxford University Press, 2004.
[6] Royce, W., "Managing the Development of Large Software Systems," Proceedings of IEEE WESCON 26, pp. 1-9, 1970.
[7] IEEE-STD-610, A Compilation of IEEE Standard Computer Glossaries, IEEE Standard Computer Dictionary, 1991.
[8] Andrew Stellman and Jennifer Greene, Applied Software Project Management, O'Reilly Media, 2005.
[9] Jim Ledin, "Simulation Planning," PE, Ledin Engineering, 2000.
[10] Robinson, S., "Modes of simulation practice: approaches to business and military simulation," Proceedings in Simulation Modeling Practice and Theory, vol. 10, pp. 513-523, 2002.

[11] Robinson, S., "Soft with a hard centre: discrete-event simulation in facilitation," Journal of the Operational Research Society, vol. 52, pp. 905-915, 2001.
[12] Balci, O., "Guidelines for successful simulation studies," Proceedings of the Simulation Conference, pp. 25-32, New Orleans, LA, 1990.
[13] R. Sargent, R. Nance, C. Overstreet, S. Robinson, and J. Talbot, "The simulation project life-cycle: models and realities," Proceedings of the Winter Simulation Conference, 2006.
[14] Chi Y. Lin, Tarek Abdel-Hamid, and Joseph S. Sherif, "Software-Engineering Process Simulation model (SEPS)," Journal of Systems and Software, vol. 38, no. 3, pp. 263-277, 1997.
[15] Shmuel Ur, Elad Yom-Tov, and Paul Wernick, "An Open Source Simulation Model of Software Development and Testing," Hardware and Software, Verification and Testing, Lecture Notes in Computer Science, Springer, vol. 4383, pp. 124-137, 2007.
[16] Reuven R. Levary and Chi Y. Lin, "Modeling the Software Development Process Using an Expert Simulation System Having Fuzzy Logic," Software: Practice and Experience, vol. 21, no. 2, pp. 133-148, 1991.
[17] B. Boehm and K. J. Sullivan, "Software Economics: Status and Prospects," Special Millennium Issue, Information and Software Technology, 2000.
[18] Leung, H., and Fan, Z., "Software Cost Estimation," Handbook of Software Engineering, Hong Kong Polytechnic University, 2002.
[19] Extreme Chaos (2001), Standish Group. [Online]. Available: http://standishgroup.com/sample_research/extreme_chaos.pdf
[20] Simphony.NET (2005), University of Alberta. [Online]. Available: http://irc.construction.ualberta.ca/html/research/software/simphony.net.html
