1987
Chris F. Kemerer
Massachusetts Institute of Technology
Recommended Citation
Banker, Rajiv D. and Kemerer, Chris F., "FACTORS AFFECTING SOFTWARE MAINTENANCE PRODUCTIVITY: AN
EXPLORATORY STUDY" (1987). ICIS 1987 Proceedings. 27.
http://aisel.aisnet.org/icis1987/27
This material is brought to you by the International Conference on Information Systems (ICIS) at AIS Electronic Library (AISeL). It has been accepted
for inclusion in ICIS 1987 Proceedings by an authorized administrator of AIS Electronic Library (AISeL). For more information, please contact
elibrary@aisnet.org.
FACTORS AFFECTING SOFTWARE MAINTENANCE
PRODUCTIVITY: AN EXPLORATORY STUDY
Rajiv D. Banker
Srikant M. Datar
Carnegie Mellon University
Chris F. Kemerer
Sloan School of Management
Massachusetts Institute of Technology
ABSTRACT
Systems developers and researchers have long been interested in the factors that affect
software development productivity. Identification of factors as either aiding or hindering
productivity enables management to take steps to encourage the positive influences and to
eliminate the negative ones. This research has explored the possibility of developing an
estimable model of software development productivity using a frontier estimation method. The
approach taken is based upon output metrics for the entire project life-cycle, and includes
project quality metrics. A large number of factors potentially affecting software maintenance
productivity were included in this initial investigation. The empirical analysis of a pilot data
set indicated that high project quality did not necessarily reduce project productivity.
Significant factors in explaining positive variations in productivity included project team
capability and good system response (turnaround) time. Factors significantly associated with
negative variations in productivity included lack of team application experience and high
project staff loading. The use of a new structured analysis and design methodology also
resulted in lower short term productivity. These preliminary results have suggested a number
of new research directions and have prompted the data-site to begin a full scale data collection
effort in order to validate a model of software maintenance productivity.
under the control of the project leader, and that superior project leaders will make productive
use of their staffs in ways that do not sacrifice quality (Lambert 1984; Mohanty 1981). An
example of this is the use of software tools, such as data dictionaries or code generators, that
both relieve project team members of some of the more mundane tasks while improving quality by
ensuring consistency. A second research question that we address is the relationship between
quality and productivity on a software maintenance project.

The general approach of this research is to model software development as a microeconomic
production process utilizing inputs and producing products. This approach is suggested by the
work of Kriebel and Raviv (1980; 1982) and Stabell (1982). This general model is best
represented by the simple diagram shown in Figure 1.
2. SOFTWARE DEVELOPMENT INPUTS AND PRODUCTS

The critical input to software development that we focus on is the amount of professional
work-hours expended by the project team. Personnel costs constitute at least 45% to 50% of a
data processing department's budget (Grammas 1985), and 80% of the department's costs at the
current data-site. Since professional data processing staff time is the most expensive and
scarce input resource in software development, work-hours has been the variable of interest in
most previous studies. Furthermore, the cost of the other major input, hardware (i.e., CPU
time, disk storage, etc.), continues to decline, increasing the ratio of personnel cost to
machine cost.

The identification of consistent, quantifiable products from the software development process
is probably the single biggest challenge in the field of software metrics. As the final product
of any systems development project is a coded program or programs, the traditional measure has
been the count of the number of written source lines of code (SLOC).4 SLOC has the advantage of
being easily countable by automated means, in addition to apparently representing the amount of
work required to build a system. The SLOC metric, however, is not without its weaknesses. Two
common problems are comparing programs written in different languages and comparing the results
of studies that

SLOC, however, is actually the product of only one phase of the project, the programming phase.
For new development projects SLOC is generally considered to be an accurate surrogate for all
project activities since larger systems typically require both more analysis and more
programming than smaller systems. In the case of maintenance projects, this assumption will
not, in general, hold. It is easy to imagine a project in a maintenance environment with large
amounts of effort expended in analysis and design that result in relatively few additions or
changes to lines of code. Therefore, while SLOC is an adequate measure of the size of the
coding and testing phase, it is inadequate with respect to the size and complexity of analysis
and design on a maintenance project.

We address this problem using Albrecht's Function Point metric (Albrecht and Gaffney 1983). The
Function Point metric first counts the number of unique input types, output types, logical
files, external interface files, and external queries handled by an application. These counts
are then weighted depending upon difficulty and further modified by fourteen "complexity
factors" defined by Albrecht.5 Function Points thus capture the magnitude and complexity of the
analysis and design task of various projects.

The use of Function Points as a measure of the product of software development has been
validated or suggested by Behrens (1983), Vacca (1985), Jones (1986), Kemerer (1987), Albrecht
(1985), Gaffney (1986), and Lambert (1984). In a recent Delphi-type survey by the Quality
Assurance Institute (Perry 1986), Function Points per man-month was selected as the leading
productivity measurement by a number of Fortune 500 level firms. In summary, the inputs of the
general model were implemented with work-hours and the products with Function Points and Source
Lines of Code. (See Figure 2.)

Figure 2. Specific Production Process Model

3. ENVIRONMENTAL VARIABLES SELECTION

Previous research on new development has identified a large number of factors which may have an
impact on productivity. In addition, detailed discussions with managers at the data-site led to
the identification of other factors believed to be important sources of productivity variation.
These factors are summarized into the following four categories: personnel, project management,
user, and technical environment. A brief discussion of each follows. An additional factor that
may influence productivity is the overall quality of the
Table 1. Personnel Factor Variables
a b c d e f g h i j S BDK
DP Experien X X X X X X 6 X
Appl Expern XXX X XX 6 X
S/W Expernc XXX X 4
H/W Expernc X X X 3
Facility Expr X 1 X
Capability X 1 X
Education X 1 X
Inhouse % X 1 X
Parttime % X 1
Prog Partic X 1
Age X 1
Morale X 1
product produced. This variable has generally not been included in previous empirical studies
of productivity and is discussed separately in Section 3.5.

3.1 Personnel Variables

Personnel variables are widely believed to be critical in affecting the productivity
performance of a project team. Table 1 is the first of four tables showing the selection of
productivity variables by other researchers. Each row represents a variable, and each column a
researcher. An "X" indicates that the variable was used by the researcher. A summary column
shows the number of previous researchers using a particular variable, and the last column
("BDK") designates whether it was used in the current research. From Table 1, it is apparent
that the experience of project team members, measured along one or more dimensions, is believed
to be a critical element. For this study, each project team member's total data processing
experience, his data processing experience at this facility, and his experience with each
application were recorded. As all of the projects were written in COBOL for IBM mainframes, the
facility experience data obviated the need to collect software and hardware experience.

Capability, or some measure of talent, is often discussed but rarely used in research of this
type due to difficulties in measurement. In this research, staff capability was captured
through the use of the personnel review system at the data-site. Each staff member is given a
yearly review that is summarized in a numerical score ranging from 1 (best) to 5 (worst). These
data are used as a measure of capability or skill.

We also collected information on the highest level of education and the amount of in-house
versus outside contractor staffing. However, the hiring and personnel policies at the data-site
generated a very homogeneous dataset, and therefore education and in-house percentage were
dropped as potential variables. The remaining variables were each used by only one of the ten
previous researchers, and were not felt to be either important or measurable at the data-site.
Therefore, the variables included in the model were capability, application experience, and
data processing experience.
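
The personnel variables above are recorded per staff member, while the analysis is carried out per project. The paper does not state how individual scores were combined, so the following is only a minimal sketch under one assumption: each person's score is weighted by the work-hours he or she charged to the project. The class `StaffHours`, the function `hours_weighted_mean`, and the sample team are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class StaffHours:
    hours: float       # work-hours this person charged to the project
    capability: int    # yearly review score, 1 (best) to 5 (worst)
    dp_years: float    # total data processing experience, in years
    app_years: float   # experience with this application, in years


def hours_weighted_mean(team, attribute):
    """Average one personnel attribute over a team, weighted by charged hours.

    ASSUMPTION: hours weighting is an illustrative choice, not the
    aggregation rule documented in the paper.
    """
    total_hours = sum(member.hours for member in team)
    if total_hours <= 0:
        raise ValueError("project has no charged hours")
    weighted = sum(member.hours * getattr(member, attribute) for member in team)
    return weighted / total_hours


# Hypothetical two-person project.
team = [
    StaffHours(hours=600, capability=2, dp_years=10, app_years=4),
    StaffHours(hours=400, capability=4, dp_years=2, app_years=0.5),
]
for name in ("capability", "dp_years", "app_years"):
    print(name, round(hours_weighted_mean(team, name), 2))
```

Weighting by hours keeps a briefly involved novice from dominating the project-level score; other schemes, such as the share of hours charged by staff above a threshold, would be equally defensible.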
Table 2. Project Management Factor Variables
a b c d e f g h i j S BDK
Schd Constr X X 2 X
Staff Load X X 2 X
Travel X X 2
Communictn X 2
3.2 Project Management Variables

Project management variables, including schedule constraints, staff loading, travel
requirements, and project communication are less well represented in the literature. (See
Table 2.)

Schedule variables are among the most critical that may be under a project manager's direct
control. This research also recorded the calendar duration of the project in order that the
loading, or work-months per calendar month, could be calculated. None of the projects in the
dataset required any travel, and their small size in terms of the number of staff (average
project size for the 65 projects was 2.6 people) meant that intra-project communication was not
a critical issue. Therefore, the variables considered were deadline pressure and manpower
loading.

3.3 User Variables

Although Lientz and Swanson (1981) have discussed the potential importance of user variables,
Table 3 shows that user variables have played only a limited role in previous empirical studies.
Table 3. User Factor Variables
a b c d e f g h i j S BDK
High Reliab X X X 3 X
Reqmt Volat X X X 3 X
User Partic X X 2 X
# User Orgn X 1 X
Usr DP Knwl X 1 X
Usr Appl Kn X 1 X
Table 4. Technical Environment Factor Variables
a b c d e f g h i j S BDK
Mod Prog Pr X X X X X 5 X
Tools X X X X 4 X
Respons Time X X X - 4 X
Language X X X X 3 X
Volatility X X 2 X
Reusbl Code X 1
Classified X 1
Distance X 1
Exist Docum 0
Earlier research suggested two important user variables: high user-required reliability (the
importance placed on avoiding system failure) and requirements volatility (the degree to which
the user-stated requirements changed over the course of the project). Additionally, discussion
with staff members at the data-site indicated the perceived importance of the following
variables: the degree of user participation in the project, number of user organizations having
signoff responsibility, the user's data processing knowledge, and the user's application
knowledge. Information on these variables was obtained from the project leader and validated by
his or her section head. Controls on this data collection are described in Section 5.

3.4 Technical Environment Variables

Technical environment variables, shown in Table 4, have a long history of inclusion in
productivity models. This likely reflects practitioners' hopes for technological solutions to
the productivity problem and researchers' attempts to find those solutions. Five variables of
interest suggested by the literature are the use of modern programming practices, use of
software tools, response time, choice of language and hardware/software volatility.
Hardware/software volatility reflects the amount of change in the underlying environment in
which the application is being written.

Use of reusable code was in its infancy at the data-site during the period when data were being
collected, and insufficient data were available on its use. None of the work at the data-site
was classified, and distance to the machine room has ceased to be a variable of interest with
modern teleprocessing. We included one new variable, good quality documentation, that is
perceived to be significant in a maintenance environment. The quality of the documentation was
rated by the project leader. Of the variables measured, tools and language were dropped due to
lack of variance in the data. Therefore, four technical environment variables were included in
the model: use of modern structured analysis, design, and programming practices; presence of
good interactive response or batch turnaround time; hardware or software volatility; and good
documentation.

3.5 Measurement of Project Quality

While the emphasis on measuring the size of a system is clearly critical to productivity
measurement, it could be argued that a size metric, such as SLOC, without a measure of the
quality of those
lines, is insufficient. Two important dimensions of program quality are adherence to
specifications and freedom from defects.

A significant body of literature exists on the construct of user satisfaction. Unfortunately,
the focus of that research is general user satisfaction with a data processing department,
whereas our focus is on user satisfaction across projects within the same department. However,
a survey instrument for measuring user project satisfaction has been developed by Powers (1971)
and was later validated by McKeen (1983). This instrument was used in the current research.

The converse problem exists in the software quality research literature; that is, many of the
metrics developed in this area have generally been too specific, at the level of a line of code
or groups of lines of code within a program.6 The data required for this micro level of detail
were not available at the data-site. However, a recent survey (Perry 1986) by the Quality
Assurance Institute suggests three quality metrics as the most widely accepted in industry.
These are user perceived functional quality, user software satisfaction, and production jobs
processed without incident. The first two of these are covered by the Powers instrument. The
third idea was developed into a site-specific quality metric that rated projects as average,
above average, or below average with respect to the problems encountered after the project's
software was turned operational. This metric is described in Section 5.2.3.

4. EXPLORATORY MODEL AND ESTIMATION

The primary purpose of this initial analysis is to investigate the potential impact of a number
of potential productivity factors. We model the actual input resources (labor hours) used as a
multiplicative function of the primary production correspondence and the environmental factors.
This general model is similar to others developed in the literature (Albrecht and Gaffney 1983;
Boehm 1981). It should be noted that since the product requirements are prespecified, we model
the primary production correspondence as the minimum amount of input resources required to
produce the prespecified product, which is described in terms of Function Points and SLOC.
Since our objective is to estimate the minimum (rather than the average) consumption of input
resources, we adopt an extremal or frontier estimation technique, Data Envelopment Analysis
(DEA). DEA uses a linear programming approach to identify the most efficient projects (Banker,
Charnes and Cooper 1984; Charnes, Cooper and Rhodes 1981).

DEA is an appropriate tool for this purpose for several reasons. First, since no preset
standards exist, productivity needs to be evaluated relative to other projects, which is the
basis of the DEA efficiency rating. Second, software development produces multiple products, so
that simple partial productivity ratio measures are insufficient. Third, DEA does not impose a
parametric form on the production function and only assumes a monotonic and convex relationship
between inputs and products. Given the limited knowledge about the production process
underlying software development, specifying a parametric form such as Cobb-Douglas (Stabell
1982) for the production correspondence is difficult to substantiate theoretically or validate
statistically, and it is not immediately apparent what restrictions these hypotheses, treated as
axioms in the econometric approach, impose on the production correspondence.

We next explore the average impact of different environmental factors on the DEA efficiency
rating. Since our objective is to identify the average impact of environmental factors on
productivity, we use multivariate regression analysis. The general idea is that two projects
could be identical in terms of their outputs, yet one may have environmental factors (such as
poor hardware response time) that cause that project to consume more labor hours. The latter
project will be rated as inefficient relative to the former, since both produced the same
outputs but the second required more inputs. The purpose of the analysis will be to isolate and
measure the factors that may have influenced the productivity ratings.

5. DATA COLLECTION

5.1 Data Source

Data for this research were collected at a large regional bank's data processing department.
The types of applications represented are typical financial transaction processing systems, and
are written in COBOL to run on IBM hardware. COBOL and IBM are the most widely used software
and hardware in commercial data processing and therefore this site is likely to be
representative of much of current business data processing.7 The
data processing department is divided into eighteen "sections," which are organized around
common sets of applications. Three of the sections were selected by the Bank as representative
of the department as a whole. Two criteria were used to select projects completed by these
three sections: size and recency. Selecting larger projects allows the examination of the
projects that consume the bulk of the Bank's resources. Project size is also important in that
the factors affecting productivity on short, one person projects are likely to be overwhelmed
by individual skill differences across project staff members (DeMarco 1982; Sackman, Erikson
and Grant 1968). We only considered "significant" projects at the Bank that cost a minimum of
$5,000 in internal dollars.

Project recency is important for two reasons. Since data were collected retrospectively, old
projects were not included because personnel turnover and lack of documentation retention made
data collection impossible. Second, using only recent projects legitimizes cross project
comparisons in that the technology and personnel involved are likely to be very similar. After
discussions with Bank staff, only projects completed within the 18 month period between
January 1, 1985 and July 1, 1986 were included in the study. Data were collected during the
summer of 1986. Due to a number of factors, including reorganizations, the conversion to a new
time reporting system, use of contractors, personnel turnover, and the elimination of a few
unsuitable (i.e., non-COBOL) projects, complete data were available for only 65 of the 84
potential projects. These 65 projects have the characteristics shown in Table 5.

5.2 Data Collection Methods and Controls

This section describes the main data types and how they were collected. When possible, we
attempted to use data already collected and employed by the Bank, rather than developing new
data collection instruments that would impose additional burden on Bank staff.

5.2.1 Professional work-hours

The key input variable was the number of work-hours charged by project by person. Previous
research has generally been satisfied with work-hours by project only. The limitation of that
approach is immediately apparent if a 1000 work-hour project staffed by a team of veteran
programmer/analysts who were also intimately familiar with the application being provided is
compared with one staffed by a team of novices. Intuition suggests that the former team is
likely to be more productive, yet much prior research has treated both of these simply as "two
1000 work-hour" projects. This paper characterizes the actual work-hours expended along a
number of dimensions, particularly experience and capability.

The characterization of work-hour data was accomplished via a personnel survey that requested
each project member to fill in data on his or her total data processing experience, data
processing experience at the Bank, application experience, and education. These forms were
matched to the records in time reporting via an employee number. Forms were not received from
all project members charging time, chiefly due to the fact that the
individual had transferred or left the Bank or had been an outside contractor. In the event
that the hours on a project could not be categorized, that project was dropped from the study.

5.2.2 Product size and environmental complexity factors

Product size and environmental complexity data were collected via a survey of project leaders.
The size data collection form captured data on

o Function Points
o New and modified source lines of code

while the environmental complexity form captured data on

o Function Point complexity
o Project management
o User factors
o Technical environment

Due to the broad nature of the phenomenon modeled, a large number of factors were identified as
possible variables. In order to make the data collection effort feasible at the field site,
most of the factors were measured in only one way. This raises the question of whether any
factors were not shown to be significant due to method variance. One control that was used to
mitigate this was to use questions drawn from previous research whenever possible.

A number of steps were taken to assure that the data collection forms would be filled out as
accurately as possible. A training session to walk through the data collection forms was held
for all project leaders and their section heads. In addition, a member of the research team was
on-site during the entire data collection process and provided ad hoc support to project
leaders. An automated tool was also available to aid in the counting of source lines of code.

The following controls were established to attempt to provide additional assurances of data
validity. After the project leader had completed the data collection form, it was first
reviewed by the section head. As each project was compared only to other projects within the
same section, the review by the section head also ensured consistency across projects. After
review by the section head, the data collection form was forwarded to the research team for a
second review for completeness and reasonableness.

5.2.3 Quality data collection

An important issue in measuring productivity is whether the products of efficient projects are
of the same quality as those of less efficient projects. This study addressed two questions as
adjuncts to the efficiency measure generated. These metrics should not be confused with general
measures of systems effectiveness.

The first quality concept is that of operational quality, whether the system operates smoothly
once it is implemented. This measure was generated by a staff section within the Bank from
three existing sources:

o daily abnormal end (ABEND) report
o weekly section status reports
o ad hoc user problem reports

Data from the two month period following implementation were compared with data from the
previous twelve months' trend. Significant deviations resulted in above or below average
operational quality ratings. The control for this measure was to forward the ratings for each
section to the appropriate section head for review.

The second quality concept is that of user project satisfaction. A survey, based on the Powers
instrument, was sent to the users who had requested the individual projects. These forms were
returned directly to the research team without review by the sections.

6. DATA ANALYSIS

There were two broad objectives to the data analysis: 1) to determine the appropriateness of
the general approach of using DEA for software development analysis, and 2) to identify which
factors in our pilot data sample seemed to be the most important and therefore merit further
investigation in future research. The results of these analyses are presented below.

6.1 DEA Efficiency Ratings

A DEA efficiency score for each project was developed using metrics for total work hours,
Function Points, and SLOC. Three separate primary
production functions were estimated using DEA (one for each section). This was done to ensure
that projects were being rated only against similar applications.

One interesting result was that all three sections showed wide variations in productivity, as
shown in Table 6.

Table 6. Summary of DEA Results

This is consistent with much of the literature on software development productivity,
particularly that dealing with individual differences. In addition, the distribution of
efficiency results within each section is consistent across sections, which supports the
pooling of the individual section results in the multivariate regression analysis.

We also estimated a linear regression model (consistent with that of other researchers) with
work-hours as the dependent variable and Function Points and total SLOC (new SLOC plus modified
SLOC) as the independent variables. This model (presented below) showed that these two measures
of size were excellent predictors of total effort.

    Actual work-hours = 355.0 + 3.49 (FP) + .03 (SLOC)
                        (.11)   (7.25)      (3.76)

    R² = 69.9% (adjusted R² = 68.9%)

One concern with this model might be multicollinearity, as previous research on new development
projects has shown a correlation of .94 and greater between Function Points and SLOC (Albrecht
and Gaffney 1983). Our earlier discussion suggests that Function Points and SLOC are not likely
to be as highly correlated in the case of maintenance projects. Indeed, the correlation between
Function Points and SLOC is .57 in this dataset.

An alternative model was also estimated, utilizing three independent variables: New SLOC,
Modified SLOC, and Function Points. This three variable model did not exhibit appreciably more
explanatory power than the two variable model (adjusted R² of .698 versus .689). The null
hypothesis of equality of the estimated coefficients for New SLOC and Modified SLOC could not
be rejected at the 10% level (Pindyck and Rubinfeld 1981). Therefore, the three variable model
was not pursued in the interests of parsimony. In summary, we conclude that the original
assumptions regarding the choice of metrics sufficiently represent the product of the software
maintenance process. We use the DEA efficiency ratings to examine the effects of the
environmental variables on productivity.

6.2 Multivariate Regression Results

The reciprocal of the DEA efficiency score is regressed against the environmental variables
described in Section 3 in a multivariate regression model. A summary of this model appears in
Table 7, and we discuss each of the significant variables in turn.

The dependent variable is the reciprocal of the DEA efficiency score. Therefore, the
interpretation of the signs of the coefficients is that positive (+) signs show reduced
productivity, while negative (-) signs show increased productivity. The R² for the entire
sixteen variable model is .53 (F-value of 3.32 is significant at the 1% level). The value of
the intercept was 1.9 (t=1.98). The Belsley-Kuh-Welsch (1980) test did not indicate any
multicollinearity problems.

It should be noted that the R² indicates the amount of variation in productivity explained.
Other researchers have explained the variation in hours, where the independent variables have
included size. For our dataset, an analogous regression model with hours as the dependent
variable and size and environmental complexity as the independent variables produces an
R² = .85 (F = 14.02).
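
The size regression reported in Section 6.1 can be worked through numerically. The sketch below first derives a Function Point value and then applies the published coefficients. Only the regression coefficients come from the paper. The weights in `AVERAGE_WEIGHTS` and the 0.65 + 0.01 × (sum of ratings) adjustment are the average-complexity values commonly quoted for Albrecht's method and are an assumption here, since the paper does not list the values it used; the function names and the sample project are hypothetical.

```python
# Average-complexity weights widely quoted for Albrecht's Function Points.
# ASSUMPTION: the paper does not list the weights it used.
AVERAGE_WEIGHTS = {
    "inputs": 4,
    "outputs": 5,
    "inquiries": 4,
    "logical_files": 10,
    "interface_files": 7,
}


def function_points(counts, complexity_ratings):
    """Weighted count adjusted by fourteen complexity ratings (each 0 to 5)."""
    if len(complexity_ratings) != 14:
        raise ValueError("expected fourteen complexity ratings")
    if any(not 0 <= rating <= 5 for rating in complexity_ratings):
        raise ValueError("each complexity rating must lie between 0 and 5")
    unadjusted = sum(weight * counts.get(kind, 0)
                     for kind, weight in AVERAGE_WEIGHTS.items())
    # ASSUMPTION: the usual adjustment factor, ranging from 0.65 to 1.35.
    adjustment = 0.65 + 0.01 * sum(complexity_ratings)
    return unadjusted * adjustment


def predicted_work_hours(fp, sloc):
    """Size regression from Section 6.1 (coefficients as reported in the paper)."""
    return 355.0 + 3.49 * fp + 0.03 * sloc


# Hypothetical maintenance project, invented for illustration only.
counts = {"inputs": 10, "outputs": 8, "inquiries": 5,
          "logical_files": 4, "interface_files": 2}
ratings = [3] * 7 + [2] * 7  # fourteen ratings summing to 35
fp = function_points(counts, ratings)
hours = predicted_work_hours(fp, sloc=5000)
print(f"Function Points: {fp:.1f}, predicted work-hours: {hours:.0f}")
```

Because the SLOC coefficient is small, a change of a few Function Points moves the prediction about as much as a hundred or more lines of code in this sketch.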
Table 7. Summary of the Regression Model
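
The dependent variable of this regression is built from a DEA efficiency score. As a hedged sketch, and assuming an input-oriented formulation with variable returns to scale in the spirit of Banker, Charnes and Cooper (1984), the score for project o among the n projects of its section could be obtained from the linear program below. The symbols are notation introduced here for illustration and do not appear in the paper: x_j is the work-hours of project j, y_{1j} its Function Points, y_{2j} its SLOC, and λ_j the weight placed on project j when forming the comparison benchmark.

```latex
\begin{aligned}
\theta_o^{*} = \min_{\theta,\,\lambda}\quad & \theta \\
\text{subject to}\quad
& \sum_{j=1}^{n} \lambda_j x_j \le \theta\, x_o, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro}, \qquad r = 1, 2, \\
& \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0 \quad (j = 1, \dots, n).
\end{aligned}
```

Under these assumptions the score lies between 0 and 1, with 1 marking a project on the efficient frontier. Its reciprocal is therefore at least 1 and grows as a project consumes more hours than its benchmark, which matches the sign convention in which positive coefficients indicate reduced productivity.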
total number of work-months divided by the total project duration in calendar months. Higher
loading indicates a greater amount of parallelism on the project, plus a possible increase in
the amount of project communications. This was found to have a negative impact on productivity
at the 5% level, and is consistent with the results of researchers on new software development.

A second significant variable is deadline pressure. Project leaders and their section heads
were asked on their surveys whether there was greater than average deadline pressure on the
project. This variable, TIGHTDEAD, was found to be a significant boost to productivity, at
least in the short run sense indicated by this measure. The explanation seems to be that
increased deadline pressure reduces, at least for the duration of the project, some amount of
the slack that is present in any organization. Whether an organization would want to pursue
this tactic as a long-term strategy is questionable, however, given the likely deleterious
effect on morale and the resulting increase in turnover. This is particularly important in
light of the significance of the application experience variable.

6.2.3 User variables

In general, the significance of the user variables was low.9 Given the Bank staff's a priori
suggestions, this was a surprising result. It may be that the impression that poor user
relationships leave with project leaders is greater than their actual effect on project
productivity.

The most significant of the user variables was INTERNAL. A survey question concerning the
number of user signoffs required was designed to identify those projects that needed to reach
agreement across multiple users. In terms of responses, however, very few projects had more
than one user, while a significant number of projects were internally generated, typically to
increase efficiency or throughput on an application. This variable was coded as a zero-one
dummy, where INTERNAL = 1 meant that no outside users were involved. The fact that these
projects may be more efficient is not surprising, since the removal of the need to communicate
specifications across departments and the likely reduced documentation burden would aid in
increasing efficiency of product development.

6.2.4 Technical environment variables

The most significant variable in the list of Technical Environment variables was GOODRESP, a
dummy variable indicating either an interactive development environment or good (< 4 hours)
batch turnaround. This had the effect of improving productivity, which is intuitive, and
consistent with some limited research in this area (Boehm 1981; Lambert 1984).

A second significant variable was the dummy variable STRCMETH, which indicated the use of a
structured analysis and design methodology based on the Gane/Sarson principles and tools.
Projects using this methodology were less productive than those that did not, an initially
eye-opening result for the managers at the Bank, but one that actually makes a good deal of
sense upon close scrutiny. What is being measured is a snapshot of short-term productivity, not
long-term productivity. Many of the benefits of using a detailed methodology that requires a
lot of documentation are not observed until the next project, when enhancement or repairs need
to be made to the system. In the short term, the extra effort is not necessarily going to show
any benefit, and the extra hours will show up as reduced productivity. Additionally, it should
be added that use of this methodology was new at the Bank, and was, at least for one of the
sections, exactly coincident with the projects collected in this dataset. Therefore, this
factor may also exhibit a learning curve.

VOLATLTY, defined as frequent changes to the hardware/software environment, either every few
weeks for major changes or every few days for minor changes, was only marginally significant.
This variable was also shown to reduce productivity (consistent with other researchers, see
Boehm 1981), although only at the 15% significance level.

Good documentation (GOODDOC) had been suggested by managers at the Bank as a potentially
important factor in explaining productivity. As shown in Table 7, it was not a significant
factor.

6.2.5 Quality as a productivity variable

One remaining question about these productivity measures is the relationship between the most
productive projects and quality: Do projects with high quality exhibit high productivity, or is
high productivity attained only by sacrificing quality?
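Table 8. Operational Quality Versus Productivity

                      Productivity
                  Low    Average    High
Low Quality         3       1         5
Med Quality        15      12        16
High Quality        3       8         2

Table 9. Above-Average User Project Satisfaction Versus Productivity

                      Productivity
                  Low    Average    High
No                  7       3         8
Yes                 6      12         9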
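The chi-squared values reported in endnotes 10 and 11 can be checked directly from the cell counts in Tables 8 and 9. The following is a minimal sketch, not the authors' procedure: it assumes a standard Pearson chi-squared test of independence, and the names `chi_squared`, `TABLE_8`, and `TABLE_9` are hypothetical, introduced only for illustration.

```python
def chi_squared(table):
    """Return the Pearson chi-squared statistic and the expected counts
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return stat, expected


# Cell counts copied from Tables 8 and 9 (rows as printed in the tables).
TABLE_8 = [[3, 1, 5], [15, 12, 16], [3, 8, 2]]
TABLE_9 = [[7, 3, 8], [6, 12, 9]]

stat8, expected8 = chi_squared(TABLE_8)
stat9, expected9 = chi_squared(TABLE_9)

# Count the Table 8 cells whose expected value falls below five.
small8 = sum(e < 5 for row in expected8 for e in row)

print(f"Table 8: chi-squared = {stat8:.2f}, cells with expected < 5: {small8} of 9")
print(f"Table 9: chi-squared = {stat9:.2f}")
```

Under these assumptions the statistics come out at about 7.93 and 3.89, and six of the nine Table 8 cells have expected counts below five, which matches the caveat in endnote 10 about the suitability of the test.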
A first attempt to answer this question would involve adding the two quality metrics, operational quality (QUALITY) and user project satisfaction (CUSTACCP), as independent variables in the multiple regression analysis. Unfortunately, only 45 of the 65 user surveys were returned, and therefore CUSTACCP was not employed as an independent variable. However, QUALITY was added and was not significant. (See Table 7.) A second approach was to cross-tabulate the data as shown in Tables 8 and 9.

Table 8 shows the operational quality data versus an aggregation of the productivity data. For low quality projects, there is an approximately equal chance of a low or high productivity rating, and similarly for high quality projects. This explains why the quality variable is not found to be significant in the exploratory multivariate regression analysis. Therefore, the data from this data-site do not support the hypothesis that achieving high productivity or quality requires sacrificing the other.10

Table 9 was constructed by converting the five-point scale on which user project satisfaction data were collected into a dummy variable indicating above-average user project satisfaction. As with operational quality, a relatively random spread of quality-productivity occurrences is seen.11

7. CONCLUSIONS AND FUTURE RESEARCH

This paper explored the potential of developing a DEA-based model of software maintenance productivity and sought to identify productivity factors that merit further study. The estimation of this model using pilot data collected from a large commercial bank has suggested several interesting insights. Our analysis indicates that the factors that affect new software development (particularly personnel experience and capability) also seem to play an important role in software maintenance. Our analysis suggests that high productivity appears to be possible in a maintenance environment without sacrificing quality, and that the quality/productivity relationship bears further investigation.

This research has raised many questions which suggest possible avenues for future research. An interesting methodological extension would be the simultaneous consideration of output and input variables as well as the environmental factors in a single model. Another methodological extension would involve estimating a stochastic frontier using techniques of Stochastic DEA. This involves a composed error formulation with a two-sided random component and a one-sided error caused by inefficiencies.
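As a rough illustration of a composed error, the general stochastic-frontier form can be written as below. This is a sketch in conventional notation, not a specification taken from this paper: the symbols y_j, x_j, f, v_j, and u_j are assumptions introduced here, and the frontier is written in output terms.

```latex
y_j = f(x_j) + \varepsilon_j, \qquad \varepsilon_j = v_j - u_j, \qquad u_j \ge 0
```

Here v_j plays the role of the two-sided random component and u_j the one-sided shortfall attributable to inefficiency. For a frontier that bounds inputs from below, the inefficiency term would instead enter with a positive sign.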
Another area for further work stems from the fact that the productivity measures used in this analysis are clearly short-term. The long-term impact on productivity of some of these factors (particularly the use of structured analysis and design methodologies) would be an interesting extension. The notion of long-term productivity is related to quality in that a better quality product today should result in less maintenance in the future. Research could be directed at modeling software quality as a primary goal, rather than as an adjunct as was done here. Finally, a larger and richer dataset could allow more detailed examination of the factors and their possible interplay, an exercise not really feasible with the limited amount of data available in this study.

ENDNOTES

1 This research was funded in part by the Center for the Management of Technology and Information in Organizations, Graduate School of Industrial Administration, Carnegie-Mellon University, and the International Business Machines Corporation.

Helpful comments from four anonymous referees are gratefully acknowledged.

2 The term "development" is used here in its most general sense, which includes maintenance programming. The term "new development" will be used in this paper to describe programming that is strictly the generation of new code.

3 Note that this is in contrast to many other production settings where the manager has fixed inputs and desires to maximize output.

4 Computer scientists have also developed what might be termed "micro" software metrics, those below the level of a source line of code. Examples of these would be the software science metrics of Halstead (1977) and the complexity measure of McCabe (1976). These metrics have generally not been applied to large scale software productivity due to difficulties in measurement.

5 These are data communications, distributed processing, application performance objectives, heavily used configuration, high transaction rate, online data entry, end user efficiency, online update, complex processing, reusability, installation ease, operational ease, multiple sites, and flexibility (see Albrecht 1984).

6 See, for example, the survey by Mohanty (1979).

7 Of course, while the dataset contains 65 projects, they were all gathered within one organization. Therefore, the external validity of the results remains to be demonstrated.

8 The correlation coefficient between LODPEXP and LOAPPEXP is .30.

9 We tested the possibility of high correlation among the user variables. The highest correlations between user variables (and the highest correlation of any independent variables) were between STAFFAPT and LOAGREE (.46) and between STAFFAPT and INTERNAL (.43). Six separate regressions were run with one user variable as the dependent variable and the other five as independents, varying the dependent variable on each run. The best fit (R² = .36) was for the STAFFAPT variable. While this degree of correlation was not believed to be large, a run of the main productivity model was made, omitting STAFFAPT. The t-statistic for LOAGREE improved, but not enough to make it a significant variable at the 10% level.

10 The chi-squared test value is 7.93, significant at the 10% level. The explanation for this relatively high value is that there seems to be some drift towards average productivity (neither high nor low) when the quality is high. Note that the chi-squared test may be inappropriate in this case, given that 2/3 of the cells have expected values less than five.

11 The chi-squared test value is 3.89, significant at the 15% level. The explanation of these results given in footnote 10 applies here as well, mutatis mutandis.

REFERENCES

Albrecht, A. J. AD/M Productivity Measurement and Estimate Validation. CIS & A Guideline 313, IBM Corporate Information Systems and Administration, November 1, 1984.

Albrecht, A. J. "Function Points Help Managers Assess Application, Maintenance Values." Computerworld Special Report on Software Productivity, CW Communications, 1985, pp. SR20-SR21.
Albrecht, A. J., and Gaffney, J. Jr. "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation." IEEE Transactions on Software Engineering, Vol. SE-9, No. 6, November 1983, pp. 639-648.

Banker, R.; Charnes A.; and Cooper, W. "Some Models for Estimating Technical and Scale Inefficiencies in DEA." Management Science, Vol. 30, No. 9, September 1984, pp. 1078-1092.

Behrens, C. A. "Measuring the Productivity of Computer Systems Development Activities with Function Points." IEEE Transactions on Software Engineering, Vol. SE-9, No. 6, November 1983, pp. 648-652.

Belsley, D.; Kuh, E.; and Welsch, R. Regression Diagnostics. John Wiley and Sons, New York, 1980.

Boehm, B. W. Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ, 1981.

Case, A. F. "Computer-Aided Software Engineering." Database, Vol. 17, No. 1, Fall 1985, pp. 35-43.

Charnes, A.; Cooper, W. W.; and Rhodes, E. "Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through." Management Science, Vol. 27, No. 6, June 1981, pp. 668-697.

Chrysler, E. "Some Basic Determinants of Computer Programming Productivity." Communications of the ACM, Vol. 21, No. 6, June 1978, pp. 472-483.

Curtis, B. "Substantiating Programmer Variability." Proceedings of the IEEE Conference, Vol. 69, No. 1, July 1981, p. 846.

DeMarco, T. Controlling Software Projects. Yourdon Press, New York, 1982.

Elshoff, J. L. "An Analysis of Some Commercial PL/I Programs." IEEE Transactions on Software Engineering, Vol. SE-2, No. 2, 1976, pp. 113-120.

Freedman, D. H. "Programming Without Tears." High Technology, Vol. 6, No. 4, April 1986, pp. 38-45.

Gaffney, J. E. "The Impact on Software Development Costs of Using HOL's." IEEE Transactions on Software Engineering, Vol. SE-12, No. 3, March 1986, pp. 496-499.

Gayle, J. B. "Multiple Regression Techniques for Estimating Computer Programming Costs." Journal of Systems Management, Vol. 22, No. 2, February 1971, pp. 13-16.

Grammas, G. W., and Klein, J. R. "Software Productivity as a Strategic Variable." Interfaces, Vol. 15, No. 3, May-June 1985, pp. 116-126.

Halstead, M. H. Elements of Software Science. Elsevier, New York, 1977.

Jeffery, D. R., and Lawrence, M. J. "Managing Programming Productivity." Journal of Systems and Software, Vol. 5, 1985, pp. 49-58.

Jones, C. Programming Productivity. McGraw-Hill Book Company, New York, 1986.

Kemerer, C. F. "An Empirical Validation of Software Cost Estimation Models." Communications of the ACM, Vol. 30, No. 5, May 1987, pp. 416-429.

Kolodziej, S. "Gaining Control of Maintenance." Computerworld Focus, Vol. 20, No. 7A, February 19, 1986, pp. 31-36.

Kriebel, C. H. "Evaluating the Quality of Information Systems." In N. Szyperski and E. Grochla (eds.), Design and Implementation of Computer Based Information Systems, Sijthoff & Noordhoff, The Netherlands, 1979, Chapter 2, pp. 29-43.

Kriebel, C. H., and Raviv, A. "An Economics Approach to Modeling the Productivity of Computer Systems." Management Science, Vol. 26, No. 3, March 1980, pp. 297-311.

Kriebel, C. H., and Raviv, A. "Application of a Productivity Model for Computer Systems." Decision Sciences, Vol. 13, April 1982, pp. 266-284.

Lambert, G. N. "A Comparative Study of System Response Time on Program Developer Productivity." IBM Systems Journal, Vol. 23, No. 1, 1984, pp. 36-43.

Lientz, B. P., and Swanson, E. B. Software Maintenance Management. Addison-Wesley, Reading, MA, 1980.
Lientz, B. P., and Swanson, E. B. "Problems in Application Software Maintenance." Communications of the ACM, Vol. 24, No. 11, November 1981, pp. 763-769.

McCabe, T. "A Complexity Measure." IEEE Transactions on Software Engineering, Vol. SE-2, December 1976, pp. 308-320.

McKeen, J. D. "Successful Development Strategies for Business Application Systems." MIS Quarterly, Vol. 7, No. 3, September 1983, pp. 47-65.

Mohanty, S. N. "Models and Measurement for Quality Assessment of Software." ACM Computing Surveys, Vol. 11, 1979, pp. 251-275.

Mohanty, S. "Software Cost Estimation: Present and Future." Software: Practice and Experience, Vol. 11, 1981, pp. 103-121.

Parikh, G. "Restructuring Your COBOL Programs." Computerworld Focus, Vol. 20, No. 7A, February 19, 1986, pp. 39-42.

Perry, W. E. The Best Measures for Measuring Data Processing Quality and Productivity. Quality Assurance Institute, 1986.

Pindyck, R. S., and Rubinfeld, D. L. Econometric Models and Economic Forecasts. McGraw-Hill Book Company, New York, 1981.

Powers, R. F. An Empirical Investigation of Selected Hypotheses Related to the Success of Management Information System Projects. Unpublished Ph.D. thesis, University of Minnesota, April 1971.

Putnam, L. H. "General Empirical Solution to the Macro Software Sizing and Estimating Problem." IEEE Transactions on Software Engineering, Vol. 4, 1978, pp. 345-361.

Rubin, H. A. Using ESTIMACS E. Management and Computer Services, Inc., Valley Forge, PA.

Sackman, H.; Erikson, W. J.; and Grant, E. E. "Exploratory Experimental Studies Comparing Online and Offline Programming Performance." Communications of the ACM, Vol. 11, No. 1, January 1968, pp. 3-11.

Scott, R. F., and Simmons, D. "Programmer Productivity and the Delphi Technique." Datamation, Vol. 20, No. 5, May 1974, pp. 71-73.

Stabell, C. B. "Office Productivity: A Microeconomic Framework for Empirical Research." Office Technology and People, Vol. 1, No. 1, 1982, pp. 91-106.

Vacca, J. "Function Points: The New Measure of Software." Computerworld XIX, No. 46, November 18, 1985, pp. 99-108.

Walston, C. E., and Felix, C. P. "A Method of Programming Measurement and Estimation." IBM Systems Journal, Vol. 16, No. 1, 1977, pp. 54-73.

Wolverton, W. R. "Cost of Developing Large Scale Software." IEEE Transactions on Computers, Vol. 23, June 1974, pp. 615-634.

Zavala, A. Research on Factors that Influence the Productivity of Software Development Workers. Final Report 4677-85-FR-68, SRI International, June, 1985.