Purdue University
Purdue e-Pubs
Computer Science Technical Reports
Department of Computer Science
1975
The Dependence of Operating System Size Upon
Allocatable Resources
Atilla Elci
Report Number:
75-172
Elci, Atilla, "The Dependence of Operating System Size Upon Allocatable Resources" (1975). Computer Science Technical Reports.
Paper 117.
http://docs.lib.purdue.edu/cstech/117
This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact epubs@purdue.edu for
additional information.
The Dependence of Operating System
Size Upon Allocatable Resources
Atilla Elci
Computer Science Department
Purdue University
West Lafayette, Indiana 47907
December 1975
CSD-TR 172
Atilla Elci
1
ABSTRACT
A detailed analysis of 40 operating systems demonstrates that the
number of instructions required to control their resources varies over
four orders of magnitude, but that 88% of this variance depends only upon
the number of unique types of resources controlled.
Differences in the
size contributions of different resource types were not detectable.
In addition, a software theoretic model is derived, and shown to be
in accord with the exponential nature of the data.
I. INTRODUCTION
In the past, the factors which contribute to the size of an operating
system have not been well understood.
In fact, even the efforts to obtain
procedures for estimating program sizes in general have not produced highly
accurate methods [1,2,3,4,5,6].
However, Brooks [2] offers some insight into the processes
which might be operating, in relation to programming team size,
as does Birkhoff [7] in relation to the computational
complexity of combinatorial problems.
Since a wide range of reliable data may often contribute to better
understanding of any phenomenon, it is the primary purpose of this paper to
present the results of a detailed analysis [8] of some 40 different operating
systems, ranging from the smallest which could be found to some of the
largest.
For each of these systems, the count of unique allocatable
resource types was obtained (as defined in Section II), together with a
count of those instructions which control them.
The high correlation observed between the number of instructions and the
number of unique resource types is shown in the statistical analysis of
Section III.
Borrowing some concepts and software relationships from a developing
research area designated as Algorithm Dynamics or Software Physics, Section IV
introduces the derivation of an equation relating the size to the number of
distinct resources.
II. DEFINITIONS. THE DATA BASE.
A. Definition of resource types.
In general, any facility of the computing system or the operating
system required by a job including hardware and software alike might be
recognized as a resource.
We revise this definition by a restriction which
will disqualify some facilities.
A facility can become a resource if the
operating system has control over it, or there is at least a mutually
responsive protocol established between the facility and the operating
system.
Thus, our definition of a resource type can be stated as follows:
A unique physical device or the program library whose control
is among the operating system's capabilities.
The following list includes all of the resource types encountered in this
study:
Central facilities:
1. Central processing unit,
2. Memory,
3. System timer (real-time clock),
4. System console,
5. The program library,
Storage or input/output devices:
6. Tape drive,
7. Disk unit,
8. Drum unit,
9. Cassette,
10. Card-random-access memory (CRAM),
11. Teletype terminal,
12. Display terminal,
13. Card reader,
14. Paper tape reader,
15. Keyboard,
16. RJE (CR/LP) station,
17. MICR/OCR,
18. Hand-held numeric/function keyboard,
19. Microscope terminal,
20. Scanning/measuring projector,
21. Buttonboard,
22. Precision CRT,
23. Photomultiplier,
24. Mirror assembly,
25. Line printer,
26. Plotter,
27. Paper tape punch,
28. Scanner,
29. Card punch,
Analog/digital facilities:
30. A/D channels,
31. Discrete output lines,
32. Discrete input lines,
33. Analog input lines,
34. Analog output lines,
35. A/D converter,
Remote processors:
36. CDC 6500,
37. CDC 6400,
38. IBM 360/40,
39. IBM 360/44,
40. IBM 360/22,
41. IBM 7094,
42. IBM 7750,
43. IBM 1401,
44. PDP 11/45,
45. PDP 1,
46. DGC SUPERNOVA,
47. XDS 9300,
48. MODCOMP IV,
49. AD-4,
50. Elbit 100,
51. Digital filter unit,
52. Polly (μPuP)
B. Definition of the size of a CFOS.
An operating system consists of two bodies of programs:
1. The control functions part (CFOS) is composed of the routines
implementing the control of the system resources, and the
communication with the environment. These routines can be collected
in three divisions: (i) Job management, (ii) Task management,
and (iii) Data management parts;
2. The processing programs part (PPOS) is composed of language
translators, service programs, and problem programs.
For the size of a CFOS, we will choose the number of instructions as
the unit of measure for no better reason than that programs implement
algorithms whose actions are expressed in instructions.
Then, the size
of a CFOS is the total additive number of machine instructions of the group
of routines which implement the CFOS functions, and the PPOS is ignored.
C. The data base.
Forty operating systems are included in our forty-point data base.
Each operating system is represented by the two variables defined above:
the number of its distinct resources, and the size of its CFOS.
Each
entry also includes additional information as to the identity of the
individual resources available.
The data base is displayed in Table 1.
Table 1. The data base collected.

                                  NUMBER OF  SIZE
NAME OF OS (MACHINE)              RESOURCES  (K instr.)  LIST OF RESOURCES*
POLLY (μPuP)                              5       0.016  1,2,22,23,46
TLMTR (IBM 7094)                          6       0.403  1,2,11,13,25,36
PILOT TSDS (UNIVAC 1108)                  7       0.474  1,2,3,4,5,13,25
DIGITAL FILTER LAB. (PDP 11/10)           8       1.034  1,2,4,7,11,14,18,51
SOS (DGC 800)                             9       2.816  1,2,4,5,6,11,14,25,26
SOS (DGC 800)                             9       2.816  1,2,4,5,9,11,14,25,26
DOS (IBM 360/30)                         10       3.219  1,2,3,4,5,6,7,13,25,29
SNOS (DGC SUPERNOVA)                     10       4.801  1,2,3,4,12,15,21,24,38,52
PCP (IBM 360/30)                         10       4.918  1,2,3,4,5,6,7,13,25,29
ABMS (PDP 11)                            10       5.000  1,2,3,11,14,27,31,32,33,34
RT-11 (PDP 11)                           10       5.250  1,2,3,4,5,7,11,12,26,30
TUMTR (IBM 7094)                         10       5.943  1,2,3,6,11,13,25,28,36,43
FOURIER OS (HP 2100S)                    10       6.042  1,2,4,5,11,12,14,15,26,35
RC 4000 MS (RC 4000)                     10       6.200  1,2,3,4,6,7,11,14,25,27
M2OS (MODCOMP II)                        10       8.058  1,2,3,11,13,25,26,36,40,44
M3OS (MODCOMP III)                       10       8.058  1,2,3,11,13,25,26,36,40,41
OS/MFT (IBM 370/145)                     10      21.312  1,2,3,4,5,6,7,13,25,29
VS (PDP 15/30)                           11      11.881  1,2,4,5,6,7,11,12,13,14,25
ZANEMAR (NCR 4130)                       11      12.000  1,2,4,5,10,11,13,14,25,27,50
OS/1621 (MICRODATA 1621)                 11      13.900  1,2,4,5,6,7,11,13,14,25,27
MINIMOP 2 (ICL 1909)                     11      14.000  1,2,3,4,5,6,7,11,14,25,27
DOS 15 (PDP 15/30)                       11      14.828  1,2,4,5,6,7,11,12,13,14,25
MAXIMOP (ICL 1909)                       11      15.000  1,2,3,4,5,6,7,11,14,25,27
SDC TSS (IBM AN/FSQ-32)                  11      16.384  1,2,5,6,7,8,12,13,29,43,45
SAS (NCR 4130)                           11      21.000  1,2,4,5,10,11,13,14,25,27,50
CREOPS (NCR 4130)                        11      28.000  1,2,4,5,10,11,13,14,25,27,50
PS/MFT (IBM 360/40)                      11      29.632  1,2,3,4,5,7,11,19,20,39,46
PS/MFT (IBM 360/44)                      11      31.309  1,2,3,4,5,6,7,13,25,29,38
GEORGE 3 (ICL 1904A)                     11      40.000  1,2,4,5,6,7,8,11,13,16,25
MAC SYSTEM (IBM 7094)                    12      32.000  1,2,3,5,6,7,8,12,13,25,29,42
CTSS (IBM 7094)                          12      32.700  1,2,3,5,6,7,8,12,13,25,29,42
CTSS (IBM 7094)                          12      32.768  1,2,3,5,6,7,8,11,12,13,25,29
OS/MVT (IBM 370/155)                     12      42.076  1,2,3,4,5,6,7,11,12,13,25,29
OS/MFT (IBM 370/155)                     12      48.094  1,2,3,4,5,6,7,11,12,13,25,29
RSX 15 (PDP 15/30)                       12      49.920  1,2,3,4,5,6,7,11,12,13,14,25
EXEC 8 (UNIVAC 1106)                     12      60.000  1,2,3,4,5,6,7,11,12,13,25,29
CP/67+CMS (IBM 360/67)                   13      64.363  1,2,3,4,5,6,7,8,11,12,13,25,29
OS/VS1 (IBM 370/145)                     13     109.300  1,2,3,4,5,6,7,11,12,13,17,25,29
OS/MVT (IBM 360/65)                      14      84.494  1,2,3,4,5,6,7,8,11,13,25,29,47,49
DUAL MACE (CDC 6500+6400)                15     129.943  1,2,3,4,5,6,7,13,14,25,27,29,37,41,48

*Note: The numbers refer to the list in Section II.A.
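Each entry of the data base pairs a resource count and a CFOS size with the list of resource identifiers from Section II.A. As a minimal sketch, a few rows of Table 1 can be encoded in a hypothetical `Entry` record (the class, field, and method names are illustrative and not part of the report) to check that the stated resource count agrees with the length of the resource list.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entry:
    """One row of the data base (illustrative structure, not from the report)."""
    name: str
    machine: str
    n_resources: int
    size_k_instr: float          # CFOS size in thousands of instructions
    resources: Tuple[int, ...]   # identifiers from the list in Section II.A

    def count_matches(self) -> bool:
        # The count should equal the number of distinct identifiers listed.
        return self.n_resources == len(set(self.resources))


# A handful of rows copied from Table 1.
ENTRIES = [
    Entry("TLMTR", "IBM 7094", 6, 0.403, (1, 2, 11, 13, 25, 36)),
    Entry("PILOT TSDS", "UNIVAC 1108", 7, 0.474, (1, 2, 3, 4, 5, 13, 25)),
    Entry("RC 4000 MS", "RC 4000", 10, 6.200,
          (1, 2, 3, 4, 6, 7, 11, 14, 25, 27)),
    Entry("DUAL MACE", "CDC 6500+6400", 15, 129.943,
          (1, 2, 3, 4, 5, 6, 7, 13, 14, 25, 27, 29, 37, 41, 48)),
]

if __name__ == "__main__":
    for entry in ENTRIES:
        print(entry.name, entry.n_resources, entry.count_matches())
```

Only the two variables `n_resources` and `size_k_instr` enter the analysis; the identifier list is carried along as supporting information.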
III. ANALYSIS
In search of a relationship between the size and the number of
resources, we performed a regression analysis on the data. Polynomial
regression produced the second-degree polynomial shown in Table II.
In spite of its rather high coefficient of correlation (0.94), this polynomial
has a serious drawback: it predicts negative sizes. Non-polynomial
behavior of the data is also indicated by the sign fluctuations in the
forward differences. Nonlinear regression on an exponential model
produced the exponential fit also reported in Table II.
Table II. Results of regression analysis on the size.

Model y(x)                    Second-degree polynomial   Exponential
Regression coefficients:      130.7178                   .03447
                              -36.9427                   .37489
                              2.4687
Coefficient of correlation    .938379                    .9128
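The negative-size drawback of the polynomial can be checked directly. The sketch below assumes that the three polynomial coefficients of Table II are listed in order of increasing power (constant, linear, quadratic) and that y is in K instructions; the table does not state the ordering, and `poly_size` is an illustrative name.

```python
# Assumption: coefficients are ordered constant, linear, quadratic,
# and the fitted size y(x) is expressed in K instructions.
A0, A1, A2 = 130.7178, -36.9427, 2.4687


def poly_size(x: float) -> float:
    """Second-degree polynomial fit of CFOS size against resource count x."""
    return A0 + A1 * x + A2 * x * x


# Location of the parabola's minimum (vertex).
x_min = -A1 / (2.0 * A2)

# Integer resource counts in the range of Table 1 with a negative prediction.
negative_counts = [x for x in range(5, 16) if poly_size(x) < 0]

if __name__ == "__main__":
    print("vertex at x =", round(x_min, 2), "y =", round(poly_size(x_min), 2))
    print("negative predictions for x in", negative_counts)
```

Under this reading the fitted curve dips below zero for 6 through 9 resources, a range in which Table 1 contains observed systems.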
IV. DERIVATION OF THE SIZE FORMULA
To derive a formula to calculate the size of a CFOS given the number
of resources it should control, we make use of concepts and formulae from
a developing research area designated as Algorithm Dynamics or Software
Physics [9-21].
In Software Physics $\eta_2^*$ represents the number of input/output parameters
to an algorithm. In relation to operating systems, we identify $\eta_2^*$ with the
number of distinct resources a CFOS manages.
By the Second Law of Software Physics, the internal quality of a pure
algorithm is defined as:
$$IQ = LV = V^* \qquad (1)$$
where
level, $L = \frac{2}{\eta_1}\,\frac{\eta_2}{N_2}$,
volume, $V = N \log_2 \eta$,
minimum volume, $V^* = (2 + \eta_2^*) \log_2 (2 + \eta_2^*)$,
length, $N = N_1 + N_2$,
vocabulary, $\eta = \eta_1 + \eta_2$,
and
$N_1$ is the total usage of operators,
$N_2$ is the total usage of operands,
$\eta_1$ is the number of distinct operators,
$\eta_2$ is the number of distinct operands.
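The quantities defined above are simple functions of four counts. As a minimal sketch, the following computes them for a made-up set of counts (the numbers 8, 8, 24, 24 are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative).

```python
import math


def software_measures(eta1: int, eta2: int, n1: int, n2: int) -> dict:
    """Length, vocabulary, volume, level and the product L*V for given counts."""
    length = n1 + n2                          # N = N1 + N2
    vocabulary = eta1 + eta2                  # eta = eta1 + eta2
    volume = length * math.log2(vocabulary)   # V = N log2(eta)
    level = (2.0 / eta1) * (eta2 / n2)        # L = (2/eta1)(eta2/N2)
    return {"N": length, "eta": vocabulary, "V": volume,
            "L": level, "LV": level * volume}


def minimum_volume(eta2_star: float) -> float:
    """V* = (2 + eta2*) log2(2 + eta2*)."""
    n = 2.0 + eta2_star
    return n * math.log2(n)


if __name__ == "__main__":
    # Hypothetical counts: equal numbers of distinct operators and operands,
    # and equal total usage of each.
    print(software_measures(eta1=8, eta2=8, n1=24, n2=24))
    print(minimum_volume(2))
```

With equal operator and operand counts the product L*V reduces to 4 log2(eta), which is 16 for the hypothetical counts above.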
Substituting $N \log_2 \eta$ for $V$, and $\frac{2}{\eta_1}\,\frac{\eta_2}{N_2}$ for $L$ above:
$$V^* = \frac{2}{\eta_1}\,\frac{\eta_2}{N_2}\, N \log_2 \eta \qquad (2)$$
Bulut's study [18] of the first fourteen algorithms from ACM Algorithms
shows that the mean $\eta_2/\eta_1$ ratio for machine language versions is .9875.
We will assume that $\eta_2/\eta_1 = 1$. In machine language, each instruction
carries an operator and an operand. Where the operand is indexed there
is one more operator-operand pair. Thus, the total count of operator
usage, $N_1$, goes hand in hand with the total count of operand usage, $N_2$.
However, branch instructions add only to $N_1$ but not to $N_2$. Similarly,
subroutine call parameters placed in-line add to $N_2$ but not to $N_1$, thus
partially compensating for branch instructions. Consequently, the $N_1/N_2$
ratio comes quite close to 1.0. Substituting these two assumptions into
2, and solving it for $\eta$ we obtain
$$\eta = 2^{V^*/4} \qquad (3)$$
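The step from equation 2 to equation 3 can be written out as follows, taking the two assumptions in their idealized forms $\eta_2/\eta_1 = 1$ and $N_1/N_2 = 1$ (so that $N = N_1 + N_2 = 2N_2$). This is only a sketch of the algebra implied by the text:

$$V^* = \frac{2}{\eta_1}\,\frac{\eta_2}{N_2}\,N \log_2 \eta = 2 \cdot \frac{\eta_2}{\eta_1} \cdot \frac{N_1 + N_2}{N_2} \cdot \log_2 \eta = 2 \cdot 1 \cdot 2 \cdot \log_2 \eta = 4 \log_2 \eta$$

$$\log_2 \eta = \frac{V^*}{4} \quad\Longrightarrow\quad \eta = 2^{V^*/4}$$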
It has been empirically validated [21,22,23] that
$$N = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2 \qquad (4)$$
Making use of the $\eta_1 = \eta_2$ assumption, and substituting 3 into 4 we obtain
$$N = \frac{V^* - 4}{4}\; 2^{V^*/4} \qquad (5)$$
Knuth's study [24] of FORTRAN programs shows that:
1. 78% of operands are variables, and
2. 42% of variables are indexed.
It is clear that these statistics are carried over to the machine language
versions of the FORTRAN programs studied. Assuming that the statistics
above hold for machine language programs, we will employ them to relate
$N$ to the number of instructions, $P$. Specifically,
$$N = 2\,(1 + .42 \times .78)\,P \approx \frac{8}{3}\,P \qquad (6)$$
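The constant in equation 6 can be unpacked as below. The reading offered here is an interpretation: the factor 2 counts one operator and one operand per instruction, and the product $.42 \times .78$ is the fraction of instructions whose operand is an indexed variable, each of which contributes one more operator-operand pair. The intermediate decimals are computed here rather than quoted from the report.

$$.42 \times .78 = .3276$$

$$N = 2\,(1 + .3276)\,P = 2.6552\,P \approx 2.667\,P = \frac{8}{3}\,P$$

Rounding 2.6552 up to 8/3 changes the constant by less than half a percent.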
Combining 5 and 6, we can solve for $P$ to obtain the number of instructions
$$P = \frac{3}{32}\,(V^* - 4)\; 2^{V^*/4} \qquad (7)$$
In the formula 7, $V^*$ can be replaced by $(2 + \eta_2^*) \log_2 (2 + \eta_2^*)$ to
calculate the size of a CFOS, $P$, given the number of
distinct resources, $\eta_2^*$, it should manage.
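As a minimal sketch, formula 7 can be evaluated directly from the number of distinct resources. The function names `minimum_volume` and `cfos_size` are illustrative; the result is in machine instructions, whereas Table 1 lists sizes in thousands of instructions.

```python
import math


def minimum_volume(resources: float) -> float:
    """V* = (2 + eta2*) log2(2 + eta2*), with eta2* the number of resources."""
    n = 2.0 + resources
    return n * math.log2(n)


def cfos_size(resources: float) -> float:
    """Formula 7: P = (3/32) (V* - 4) 2^(V*/4), in machine instructions."""
    v_star = minimum_volume(resources)
    return (3.0 / 32.0) * (v_star - 4.0) * 2.0 ** (v_star / 4.0)


if __name__ == "__main__":
    for r in (5, 6, 10, 12, 15):
        print(r, "resources ->", round(cfos_size(r)), "instructions")
```

For 10 resources this evaluates to roughly 6.3 thousand instructions, in line with the 6319 quoted for the RC 4000 system in Section VI (the small residual difference is presumably a matter of rounding).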
V. COMPARISON
In Figure 1 the observed and the theoretical sizes are plotted
against the number of resources. (The observed size is the size of a
CFOS as it is found in the data base. The theoretical size is the size,
$P$, obtained from the formula 7 above.) For ease of comparison the means
of the data at each number-of-resources group are connected. We observe
that the curve formed by the means of the data follows the theoretical-size
curve closely. To determine how close the agreement is we resort
to a regression analysis.
After replacing $P$ in formula 7 above by the size of a CFOS in our
data base, we can solve 7 for $\eta_2^*$ to obtain $\eta_2^*$-calculated, the calculated
number of resources. Together with the observed $\eta_2^*$, $\eta_2^*$-calculated forms
a couple for each of the points in our data base which we can correlate.
Simple linear regression of $\eta_2^*$-calculated versus $\eta_2^*$-observed produces
.94 as the coefficient of correlation.
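Formula 7 has no closed-form inverse, so solving it for the number of resources needs a numerical method. The report does not say which method was used; the sketch below assumes simple bisection, which works because the formula is increasing in the number of resources over the bracket chosen. The names `cfos_size` and `calculated_resources` and the bracket [1, 50] are illustrative assumptions.

```python
import math


def cfos_size(resources: float) -> float:
    """Formula 7, in machine instructions."""
    n = 2.0 + resources
    v_star = n * math.log2(n)
    return (3.0 / 32.0) * (v_star - 4.0) * 2.0 ** (v_star / 4.0)


def calculated_resources(size_instr: float,
                         lo: float = 1.0, hi: float = 50.0) -> float:
    """Solve cfos_size(x) = size_instr for x by bisection on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cfos_size(mid) < size_instr:
            lo = mid      # predicted size too small: move the lower bound up
        else:
            hi = mid      # predicted size too large: move the upper bound down
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # RC 4000 MS: observed size 6.200 K instructions, 10 observed resources.
    print(calculated_resources(6200.0))
```

Sizes from Table 1 must be multiplied by 1000 before being passed in. Applying `calculated_resources` to every row and correlating the results with the observed counts would reproduce the comparison described above.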
Figure 1. Observed and theoretical sizes versus number of resources.
VI. CONCLUSION AND DISCUSSION
In Section III, statistical analysis of the data indicated non-polynomial
characteristics. On the other hand, an exponential model
was shown to fit the data comfortably. The predictions of the theoretical-size
formula derived in Section IV, which is also exponential, compared
favorably with the data, yielding a high correlation.
Thus it may be concluded that:
1. The study has confirmed our hypothesis that as more and more
resources are incorporated into the configuration of a computing system,
the size of its CFOS increases exponentially;
2. The resources are of equal potency in determining the size of
a CFOS, and what counts is the presence of a resource in the configuration;
3. The size formula derived is a valid model for the size of a CFOS
given the number of resources the CFOS manages.
As a result of the study, a model for the size of a CFOS is derived
where none existed before. In general terms the model states that:
1. The size of a CFOS is determined primarily by the number of distinct
resources the CFOS manages;
2. The relationship between the size and the number of resources
is of exponential character;
3. In determining the size, the identity of the individual resources
is immaterial, and what matters is that the CFOS pays attention to the
individual resources simultaneously and in combination with all of the others.
The model provides a fairly basic understanding of the processes
involved in CFOS size estimation as well as a quantitative method.
However, the size formula is presented as an integral part of the model.
It should be employed in the context of the model and not as a substitute
for it.
We can identify two pitfalls regarding the usage of the size
formula.
First, since the formula is dependent on the number of resources,
a proper identification of distinct resources is crucially important.
Failure to do so may lead to exaggerated size estimates.
Secondly, the
size estimate given by the formula should not be taken as precise and
without its margin of variation.
If we let $P_n$ be the evaluation of the
size formula for $n$ resources, the margin of variation associated with $P_n$
can be stated as from $P_{n-1}$ to $P_{n+1}$. The importance of this context can
be vividly visualized if one realizes that the margin around $P_n$ is at
least three times $P_n$. (Note that the size formula indicates $P_i > 2.68\,P_{i-1}$.)
While other CFOS design methods may yet be invented to lower
the size from what the size formula predicts, in our data base, the
encountered reductions in size were not large enough to cut the size to
less than $P_{n-2}$ for any $n$-resourced CFOS. Similarly, the design strategy
used in Dijkstra's THE - Multiprogramming System [25], insofar as it is
fairly represented by Brinch Hansen's RC 4000 Multiprogramming System
[26,27], was found incapable of achieving any important reduction, since
in the latter case the observed length was 6200 instructions against 6319
predicted by equation 7.
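The growth factor between successive evaluations of the size formula can be tabulated to see how wide the margin of variation is. The sketch below re-implements formula 7 and computes the ratio of each prediction to the one before it over the 5 to 15 resource range covered by Table 1; the function names are illustrative.

```python
import math


def cfos_size(resources: float) -> float:
    """Formula 7, in machine instructions."""
    n = 2.0 + resources
    v_star = n * math.log2(n)
    return (3.0 / 32.0) * (v_star - 4.0) * 2.0 ** (v_star / 4.0)


def growth_ratios(first: int, last: int) -> dict:
    """Ratio P_i / P_(i-1) for each i from first to last inclusive."""
    return {i: cfos_size(i) / cfos_size(i - 1)
            for i in range(first, last + 1)}


if __name__ == "__main__":
    for i, ratio in growth_ratios(5, 15).items():
        print(f"P_{i} / P_{i - 1} = {ratio:.3f}")
```

Every ratio in this range stays above 2.68, so adding or miscounting a single resource changes the estimate by well over a factor of two. This is why the proper identification of distinct resources matters so much.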
ACKNOWLEDGEMENTS
I thank Professor M. H. Halstead for wisdom, guidance, and enduring
patience.
I also thank Professors S. D. Conte and H. D. Schwetman for
their constructive suggestions.
My appreciations are due to the many
individuals who helped us, both directly and through correspondence
in obtaining values and listings for our data base.
LIST OF REFERENCES
1. HIRSCH, R. E., "Programming performance: Monitoring, maximization,
and prediction", Proc 10th SIG CPR C 1972, pp 36-46.
2. BROOKS, F. P., Jr, The Mythical Man-Month, Essays on Software
Engineering, Addison-Wesley, 1975.
3. HENNEY, A., "Techniques for estimating programming effort and assessing
programmer performance", Data Systems (Sept 1969), pp 34-36.
4. SHELL, R. L., "Work measurement for computer programming operations",
Industrial Engineering, (October 1972), pp 32-36.
5. ARON, J. D., "Estimating resources for large programming systems",
Software Engineering Techniques, 1969, (Buxton and Randell, eds.),
pp 68-79.
6. PIETRASANTA, A. M., "Resource analysis of computer program system
development", in On The Management of Computer Programming,
(Weinwurm, G. F., ed.), Auerbach 1970.
7. BIRKHOFF, G., "Mathematics and Computer Science", American Scientist,
63, (January-February 1975), pp 83-91.
8. ELCI, A., Factors Affecting The Program Size of Control Functions of
Operating Systems, Ph.D. Thesis, Purdue University, December 1975.
9. HALSTEAD, M. H., A Thermodynamics of Algorithms?, CSD TR 66, Purdue
University, Computer Science Department, February 1972.
10. HALSTEAD, M. H., "Natural Laws controlling algorithm structures?",
ACM-SIGPLAN Notices, 7,2 (February 1972), pp 19-26.
11. HALSTEAD, M. H., A Theoretical Relationship between Mental Work
and Machine Language Programming, CSD TR 67, Purdue University,
Computer Science Department, February 1972.
12. HALSTEAD, M. H., "An experimental determination of the 'purity' of
a trivial algorithm", ACM-SIGME Performance Evaluation Review,
2,1 (March 1973), pp 10-15.
13. HALSTEAD, M. H., "Language level, a missing concept in Information
Theory", Ibid, pp 7-9.
14. BAYER, R., A Theoretical Study of Halstead's Software Phenomenon,
CSD TR 69, Purdue University, Computer Science Department,
May 1972.
15. HALSTEAD, M. H., and R. BAYER, "Algorithm Dynamics", Proc. ACM
National Conference 1973, pp 126-135.
16. ZWEBEN, S. H., Software Physics: Resolution of an Ambiguity in
The Counting Procedure, CSD TR 93, Purdue University, Computer
Science Department, April 1973.
17. HALSTEAD, M. H., and P. M. ZISLIS, Experimental Verification of
Two Theorems of Software Physics, CSD TR 97, Purdue University,
Computer Science Department, June 1973.
18. BULUT, N., Invariant Properties of Algorithms, Ph.D. Thesis,
Purdue University, August 1973.
19. ZWEBEN, S. H., The Internal Structure of Algorithms, Ph.D. Thesis,
Purdue University, May 1974.
20. HALSTEAD, M. H., and N. BULUT, "Impurities found in algorithm
implementations", ACM-SIGPLAN Notices, 9,3 (March 1974),
pp 9-12.
21. BULUT, N., M. H. HALSTEAD, and R. BAYER, "Experimental validation
of a structural property of FORTRAN algorithms", Proc ACM
National Conference 1974, pp 207-211.
22. ELSHOFF, J., "Measuring commercial PL/I programs using Halstead's
criteria", GM Research Report, and also in ACM SIGPLAN Notices,
February 1976.
23. BOHRER, R., "Halstead's criteria and statistical algorithms",
Proc. 8th Computer Science/Statistics Interface Symposium,
Los Angeles, February 1975.
24. KNUTH, D. E., "An empirical study of FORTRAN programs",
Software-Practice and Experience, 1,2, 1971.
25. DIJKSTRA, E. W., "The structure of THE - multiprogramming system",
CACM 11, 5 (May 1968), pp 341-346.
26. BRINCH HANSEN, Per, "The nucleus of a multiprogramming system",
CACM 13, 4 (April 1970), pp 238-241, 250.
27. BRINCH HANSEN, Per, Operating System Principles, Prentice-Hall,
1973. (Chapter 8).