
The Dependence of Operating System Size Upon Allocatable Resources

1975
Purdue University, Purdue e-Pubs. Computer Science Technical Reports, Department of Computer Science, 1975. The Dependence of Operating System Size Upon Allocatable Resources. Atilla Elci. Report Number: 75-172. This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact epubs@purdue.edu for additional information. Elci, Atilla, "The Dependence of Operating System Size Upon Allocatable Resources" (1975). Computer Science Technical Reports. Paper 117. http://docs.lib.purdue.edu/cstech/117
The Dependence of Operating System Size Upon Allocatable Resources
Atilla Elci
Computer Science Department
Purdue University
West Lafayette, Indiana 47907
December 1975
CSD-TR 172
ABSTRACT

A detailed analysis of 40 operating systems demonstrates that the number of instructions required to control their resources varies over four orders of magnitude, but that 88% of this variance depends only upon the number of unique types of resources controlled. Differences in the size contributions of different resource types were not detectable. In addition, a software theoretic model is derived, and shown to be in accord with the exponential nature of the data.

I. INTRODUCTION

In the past, the factors which contribute to the size of an operating system have not been well understood. In fact, even the efforts to obtain procedures for estimating program sizes in general have not produced highly accurate methods [1,2,3,4,5,6]. However, some insight into the processes which might be operating is offered by Brooks [2] in relation to programming team size, and by Birkhoff [7] in relation to the computational complexity of combinatorial problems.
Since a wide range of reliable data may often contribute to better understanding of any phenomenon, it is the primary purpose of this paper to present the results of a detailed analysis [8] of some 40 different operating systems, ranging from the smallest which could be found to some of the largest. For each of these systems, the count of unique allocatable resource types was obtained (as defined in Section II), together with a count of those instructions which control them. The high correlation observed between the number of instructions and the number of unique resource types is shown in the statistical analysis of Section III. Borrowing some concepts and software relationships from a developing research area designated as Algorithm Dynamics or Software Physics, Section IV introduces the derivation of an equation relating the size to the number of distinct resources.

II. DEFINITIONS. THE DATA BASE.

A. Definition of resource types.

In general, any facility of the computing system or the operating system required by a job, hardware and software alike, might be recognized as a resource. We revise this definition by a restriction which will disqualify some facilities. A facility can become a resource if the operating system has control over it, or there is at least a mutually responsive protocol established between the facility and the operating system. Thus, our definition of a resource type can be stated as follows: a unique physical device or the program library whose control is among the operating system's capabilities.

The following list includes all of the resource types encountered in this study:

Central facilities:
 1. Central processing unit
 2. Memory
 3. System timer (real-time clock)
 4. System console
 5. The program library

Storage or input/output devices:
 6. Tape drive
 7. Disk unit
 8. Drum unit
 9. Cassette
10. Card-random-access memory (CRAM)
11. Teletype terminal
12. Display terminal
13. Card reader
14. Paper tape reader
15. Keyboard
16. RJE (CR/LP) station
17. MICR/OCR
18. Hand-held numeric/function keyboard
19. Microscope terminal
20. Scanning/measuring projector
21. Buttonboard
22. Precision CRT
23. Photomultiplier
24. Mirror assembly
25. Line printer
26. Plotter
27. Paper tape punch
28. Scanner
29. Card punch

Analog/digital facilities:
30. A/D channels
31. Discrete output lines
32. Discrete input lines
33. Analog input lines
34. Analog output lines
35. A/D converter

Remote processors:
36. CDC 6500
37. CDC 6400
38. IBM 360/40
39. IBM 360/44
40. IBM 360/22
41. IBM 7094
42. IBM 7750
43. IBM 1401
44. PDP 11/45
45. PDP 1
46. DGC SUPERNOVA
47. XDS 9300
48. MODCOMP IV
49. AD-4
50. Elbit 100
51. Digital filter unit
52. Polly (μPuP)

B. Definition of the size of a CFOS.

An operating system consists of two bodies of programs:

1. The control functions part (CFOS) is composed of the routines implementing the control of the system resources and the communication with the environment. These routines can be collected in three divisions: (i) Job management, (ii) Task management, and (iii) Data management parts;

2. The processing programs part (PPOS) is composed of language translators, service programs, and problem programs.

For the size of a CFOS, we will choose the number of instructions as the unit of measure, for no better reason than that programs implement algorithms whose actions are expressed in instructions. Then, the size of a CFOS is the total number of machine instructions in the group of routines which implement the CFOS functions, and the PPOS is ignored.

C. The data base.

Forty operating systems are included in our forty-point data base. Each operating system is represented by the two variables defined above: the number of its distinct resources, and the size of its CFOS. Each entry also includes additional information as to the identity of the individual resources available.
The data base is displayed in Table 1.

Table 1. The data base collected.

NAME OF OS                       NUMBER OF   SIZE         LIST OF RESOURCES*
                                 RESOURCES   (K instr.)
POLLY (μPuP)                         5          .016      1,2,22,23,46
TLMTR (IBM 7094)                     6          .403      1,2,11,13,25,36
PILOT TSDS (UNIVAC 1108)             7          .474      1,2,3,4,5,13,25
DIGITAL FILTER LAB. (PDP 11/10)      8         1.034      1,2,4,7,11,14,18,51
SOS (DGC 800)                        9         2.816      1,2,4,5,6,11,14,25,26
SOS (DGC 800)                        9         2.816      1,2,4,5,9,11,14,25,26
DOS (IBM 360/30)                    10         3.219      1,2,3,4,5,6,7,13,25,29
SNOS (DGC SUPERNOVA)                10         4.801      1,2,3,4,12,15,21,24,38,52
PCP (IBM 360/30)                    10         4.918      1,2,3,4,5,6,7,13,25,29
ABMS (PDP 11)                       10         5.000      1,2,3,11,14,27,31,32,33,34
RT-11 (PDP 11)                      10         5.250      1,2,3,4,5,7,11,12,26,30
TUMTR (IBM 7094)                    10         5.943      1,2,3,6,11,13,25,28,36,43
FOURIER OS (HP 2100S)               10         6.042      1,2,4,5,11,12,14,15,26,35
RC 4000 MS (RC 4000)                10         6.200      1,2,3,4,6,7,11,14,25,27
M2OS (MODCOMP II)                   10         8.058      1,2,3,11,13,25,26,36,40,44
M3OS (MODCOMP III)                  10         8.058      1,2,3,11,13,25,26,36,40,41
OS/MFT (IBM 370/145)                10        21.312      1,2,3,4,5,6,7,13,25,29
VS (PDP 15/30)                      11        11.881      1,2,4,5,6,7,11,12,13,14,25
ZANEMAR (NCR 4130)                  11        12.000      1,2,4,5,10,11,13,14,25,27,50
OS/1621 (MICRODATA 1621)            11        13.900      1,2,4,5,6,7,11,13,14,25,27
MINIMOP 2 (ICL 1909)                11        14.000      1,2,3,4,5,6,7,11,14,25,27
DOS 15 (PDP 15/30)                  11        14.828      1,2,4,5,6,7,11,12,13,14,25
MAXIMOP (ICL 1909)                  11        15.000      1,2,3,4,5,6,7,11,14,25,27
SDC TSS (IBM AN/FSQ-32)             11        16.384      1,2,5,6,7,8,12,13,29,43,45
SAS (NCR 4130)                      11        21.000      1,2,4,5,10,11,13,14,25,27,50
CREOPS (NCR 4130)                   11        28.000      1,2,4,5,10,11,13,14,25,27,50
PS/MFT (IBM 360/40)                 11        29.632      1,2,3,4,5,7,11,19,20,39,46
PS/MFT (IBM 360/44)                 11        31.309      1,2,3,4,5,6,7,13,25,29,38
GEORGE 3 (ICL 1904A)                11        40.000      1,2,4,5,6,7,8,11,13,16,25
MAC SYSTEM (IBM 7094)               12        32.000      1,2,3,5,6,7,8,12,13,25,29,42
CTSS (IBM 7094)                     12        32.700      1,2,3,5,6,7,8,12,13,25,29,42
CTSS (IBM 7094)                     12        32.768      1,2,3,5,6,7,8,11,12,13,25,29
OS/MVT (IBM 370/155)                12        42.076      1,2,3,4,5,6,7,11,12,13,25,29
OS/MFT (IBM 370/155)                12        48.094      1,2,3,4,5,6,7,11,12,13,25,29
RSX 15 (PDP 15/30)                  12        49.920      1,2,3,4,5,6,7,11,12,13,14,25
EXEC 8 (UNIVAC 1106)                12        60.000      1,2,3,4,5,6,7,11,12,13,25,29
CP/67+CMS (IBM 360/67)              13        64.363      1,2,3,4,5,6,7,8,11,12,13,25,29
OS/VS1 (IBM 370/145)                13       109.300      1,2,3,4,5,6,7,11,12,13,17,25,29
OS/MVT (IBM 360/65)                 14        84.494      1,2,3,4,5,6,7,8,11,13,25,29,47,49
DUAL MACE (CDC 6500+6400)           15       129.943      1,2,3,4,5,6,7,13,14,25,27,29,37,41,48

*Note: The numbers refer to the list in Section II.A.

III. ANALYSIS

In search of a relationship between the size and the number of resources, we performed a regression analysis on the data. Polynomial regression produced the second-degree polynomial shown in Table II. In spite of its rather high coefficient of correlation (0.94), this polynomial has a serious drawback: it predicts negative sizes. Nonpolynomial behavior of the data is also indicated by the sign fluctuations in the forward differences. Nonlinear regression on an exponential model produced the results in Table II.

Table II. Results of regression analysis on the size.

Model y(x)                  Regression coefficients         Coefficient of correlation
Second-degree polynomial    130.7178, -36.9427, 2.4687      .938379
Exponential                 .03447, .37489                  .9128

IV. DERIVATION OF THE SIZE FORMULA

To derive a formula to calculate the size of a CFOS given the number of resources it should control, we make use of concepts and formulae from a developing research area designated as Algorithm Dynamics or Software Physics [9-21].

In Software Physics, $\eta_2^*$ represents the number of input/output parameters to an algorithm. In relation to operating systems, we identify $\eta_2^*$ with the number of distinct resources a CFOS manages.

By the Second Law of Software Physics, the internal quality of a pure algorithm is defined as:

$$IQ = LV = V^* \qquad (1)$$

where

level, $L = \dfrac{2}{\eta_1}\,\dfrac{\eta_2}{N_2}$
volume, $V = N \log_2 \eta$
minimum volume, $V^* = (2 + \eta_2^*) \log_2 (2 + \eta_2^*)$
length, $N = N_1 + N_2$
vocabulary, $\eta = \eta_1 + \eta_2$

and
$N_1$ is the total usage of operators,
$N_2$ is the total usage of operands,
$\eta_1$ is the number of distinct operators,
$\eta_2$ is the number of distinct operands.

Substituting $N \log_2 \eta$ for $V$, and $\dfrac{2}{\eta_1}\,\dfrac{\eta_2}{N_2}$ for $L$ above:

$$V^* = \frac{2}{\eta_1}\,\frac{\eta_2}{N_2}\, N \log_2 \eta \qquad (2)$$

Bulut's study [18] of the first fourteen algorithms from ACM Algorithms shows that the mean $\eta_2/\eta_1$ ratio for machine language versions is .9875. We will assume that $\eta_2/\eta_1 = 1$.

In machine language, each instruction carries an operator and an operand. Where the operand is indexed there is one more operator-operand pair. Thus, the total count of operator usage, $N_1$, goes hand in hand with the total count of operand usage, $N_2$. However, branch instructions add only to $N_1$ but not to $N_2$. Similarly, subroutine call parameters placed in-line add to $N_2$ but not to $N_1$, thus partially compensating for branch instructions. Consequently, the $N_1/N_2$ ratio comes quite close to 1.0.

Substituting these two assumptions into 2 gives $V^* = 4 \log_2 \eta$, and solving it for $\eta$ we obtain

$$\eta = 2^{V^*/4} \qquad (3)$$

It has been empirically validated [21,22,23] that

$$N = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2 \qquad (4)$$

Making use of the assumption $\eta_2/\eta_1 = 1$, and substituting 3 into 4, we obtain

$$N = \frac{V^* - 4}{4}\; 2^{V^*/4} \qquad (5)$$

Knuth's study [24] of FORTRAN programs shows that:

1. 78% of operands are variables, and
2. 42% of variables are indexed.

It is clear that these statistics are carried over to the machine language versions of the FORTRAN programs studied. Assuming that the statistics above hold for machine language programs, we will employ them to relate $N$ to the number of instructions, $P$.
Specifically,

$$N = 2\,(1 + .42 \times .78)\, P \approx \frac{8}{3}\, P \qquad (6)$$

Combining 5 and 6, we can solve for $P$ to obtain

$$P = \frac{3}{32}\,(V^* - 4)\; 2^{V^*/4} \qquad (7)$$

In formula 7, $V^*$ can be replaced by $(2 + \eta_2^*) \log_2 (2 + \eta_2^*)$ to calculate the size of a CFOS, $P$ (number of instructions), given the number of distinct resources, $\eta_2^*$, it should manage.

V. COMPARISON

In Figure 1 the observed and the theoretical sizes are plotted against the number of resources. (The observed size is the size of a CFOS as it is found in the data base. The theoretical size is the size, $P$, obtained from formula 7 above.) For ease of comparison, the means of the data at each number of resources are connected. We observe that the curve formed by the means of the data follows the theoretical-size curve closely.

To determine how close the agreement is, we resort to a regression analysis. After replacing $P$ in formula 7 above by the size of a CFOS in our data base, we can solve 7 for $\eta_2^*$ to obtain $\eta_2^*$-calculated, the calculated number of resources. Together with the observed $\eta_2^*$, $\eta_2^*$-calculated forms a pair for each of the points in our data base, which we can correlate. Simple linear regression of $\eta_2^*$-calculated versus $\eta_2^*$-observed produces .94 as the coefficient of correlation.

[Figure 1. Observed and theoretical sizes versus number of resources.]

VI. CONCLUSION AND DISCUSSION

In Section III, statistical analysis of the data indicated nonpolynomial characteristics. On the other hand, an exponential model was shown to fit the data comfortably. The predictions of the theoretical-size formula derived in Section IV, which is also exponential, compared favorably with the data, yielding a high correlation. Thus it may be concluded that:

1. The study has confirmed our hypothesis that as more and more resources are incorporated into the configuration of a computing system, the size of its CFOS increases exponentially;

2. The resources are of equal potency in determining the size of a CFOS, and what counts is the presence of a resource in the configuration;

3. The size formula derived is a valid model for the size of a CFOS given the number of resources the CFOS manages.

As a result of the study, a model for the size of a CFOS is derived where none existed before. In general terms the model states that:

1. The size of a CFOS is determined primarily by the number of distinct resources the CFOS manages;

2. The relationship between the size and the number of resources is of exponential character;

3. In determining the size, the identity of the individual resources is immaterial, and what matters is that the CFOS pays attention to the individual resources simultaneously and in combination with all of the others.

The model provides a fairly basic understanding of the processes involved in CFOS size estimation as well as a quantitative method. However, the size formula is presented as an integral part of the model. It should be employed in the context of the model and not as a substitute for it.

We can identify two pitfalls regarding the usage of the size formula. First, since the formula is dependent on the number of resources, a proper identification of distinct resources is crucially important. Failure to do so may lead to exaggerated size estimates. Secondly, the size estimate given by the formula should not be taken as precise and without its margin of variation. If we let $P_n$ be the evaluation of the size formula for $n$ resources, the margin of variation associated with $P_n$ can be stated as from $P_{n-1}$ to $P_{n+1}$. The importance of this context can be vividly visualized if one realizes that the margin around $P_n$ is at least three times $P_n$. (Note that the size formula indicates $P_i > 2.68\, P_{i-1}$.)

While other CFOS design methods may yet be invented to lower the size from what the size formula predicts, in our data base the encountered reductions in size were not large enough to cut the size to less than $P_{n-2}$ for any $n$-resourced CFOS. Similarly, the design strategy used in Dijkstra's THE - Multiprogramming System [25], insofar as it is fairly represented by Brinch Hansen's RC 4000 Multiprogramming System [26,27], was found incapable of achieving any important reduction, since in the latter case the observed length was 6200 instructions against 6319 predicted by equation 7.

ACKNOWLEDGEMENTS

I thank Professor M. H. Halstead for wisdom, guidance, and enduring patience. I also thank Professors S. D. Conte and H. D. Schwetman for their constructive suggestions. My appreciation is due to the many individuals who helped us, both directly and through correspondence, in obtaining values and listings for our data base.

LIST OF REFERENCES

1. HIRSCH, R. E., "Programming performance: Monitoring, maximization, and prediction", Proc 10th SIG CPR C 1972, pp 36-46.
2. BROOKS, F. P., Jr, The Mythical Man-Month, Essays on Software Engineering, Addison-Wesley, 1975.
3. HENNEY, A., "Techniques for estimating programming effort and assessing programmer performance", Data Systems (Sept 1969), pp 34-36.
4. SHELL, R. L., "Work measurement for computer programming operations", Industrial Engineering, (October 1972), pp 32-36.
5. ARON, J. D., "Estimating resources for large programming systems", Software Engineering Techniques, 1969, (Buxton and Randell, eds.), pp 68-79.
6. PIETRASANTA, A. M., "Resource analysis of computer program system development", in On The Management of Computer Programming, (Weinwurm, G. F., ed.), Auerbach 1970.
7. BIRKHOFF, G., "Mathematics and computer science", American Scientist, 63, (January-February 1975), pp 83-91.
8. ELCI, A., Factors Affecting The Program Size of Control Functions of Operating Systems, Ph.D. Thesis, Purdue University, December 1975.
9. HALSTEAD, M. H., A Thermodynamics of Algorithms?, CSD TR 66, Purdue University, Computer Science Department, February 1972.
10. HALSTEAD, M. H., "Natural Laws controlling algorithm structures?", ACM-SIGPLAN Notices, 7,2 (February 1972), pp 19-26.
11. HALSTEAD, M. H., A Theoretical Relationship between Mental Work and Machine Language Programming, CSD TR 67, Purdue University, Computer Science Department, February 1972.
12. HALSTEAD, M. H., "An experimental determination of the 'purity' of a trivial algorithm", ACM-SIGME Performance Evaluation Review, 2,1 (March 1973), pp 10-15.
13. HALSTEAD, M. H., "Language level, a missing concept in Information Theory", Ibid, pp 7-9.
14. BAYER, R., A Theoretical Study of Halstead's Software Phenomenon, CSD TR 69, Purdue University, Computer Science Department, May 1972.
15. HALSTEAD, M. H., and R. BAYER, "Algorithm Dynamics", Proc. ACM National Conference 1973, pp 126-135.
16. ZWEBEN, S. H., Software Physics: Resolution of an Ambiguity in The Counting Procedure, CSD TR 93, Purdue University, Computer Science Department, April 1973.
17. HALSTEAD, M. H., and P. M. ZISLIS, Experimental Verification of Two Theorems of Software Physics, CSD TR 97, Purdue University, Computer Science Department, June 1973.
18. BULUT, N., Invariant Properties of Algorithms, Ph.D. Thesis, Purdue University, August 1973.
19. ZWEBEN, S. H., The Internal Structure of Algorithms, Ph.D. Thesis, Purdue University, May 1974.
20. HALSTEAD, M. H., and N. BULUT, "Impurities found in algorithm implementations", ACM-SIGPLAN Notices, 9,3 (March 1974), pp 9-12.
21. BULUT, N., M. H. HALSTEAD, and R. BAYER, "Experimental validation of a structural property of FORTRAN algorithms", Proc ACM National Conference 1974, pp 207-211.
22. ELSHOFF, J., "Measuring commercial PL/I programs using Halstead's criteria", GM Research Report, and also in ACM SIGPLAN Notices, February 1976.
23. BOHRER, R., "Halstead's criteria and statistical algorithms", Proc. 8th Computer Science/Statistics Interface Symposium, Los Angeles, February 1975.
24. KNUTH, D. E., "An empirical study of FORTRAN programs", Software-Practice and Experience, 1,2, 1971.
25. DIJKSTRA, E. W., "The structure of THE - multiprogramming system", CACM 11, 5 (May 1968), pp 341-346.
26. BRINCH HANSEN, Per, "The nucleus of a multiprogramming system", CACM 13, 4 (April 1970), pp 238-241, 250.
27. BRINCH HANSEN, Per, Operating System Principles, Prentice-Hall, 1973. (Chapter 8).
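
As a numerical illustration of the size formula (equation 7 of Section IV), the following minimal Python sketch evaluates the predicted CFOS size for a given number of distinct resources. It is not part of the original report: the function names `minimum_volume` and `cfos_size` are assumptions introduced here, and the code only transcribes the formula as $P = \frac{3}{32}(V^* - 4)\,2^{V^*/4}$ with $V^* = (2 + \eta_2^*)\log_2(2 + \eta_2^*)$.

```python
import math


def minimum_volume(n_resources: int) -> float:
    """Minimum volume V* = (2 + n) * log2(2 + n), n = number of distinct resources."""
    vocabulary = 2 + n_resources
    return vocabulary * math.log2(vocabulary)


def cfos_size(n_resources: int) -> float:
    """Equation 7: predicted CFOS size P, in machine instructions."""
    v_star = minimum_volume(n_resources)
    return (3.0 / 32.0) * (v_star - 4.0) * 2.0 ** (v_star / 4.0)


if __name__ == "__main__":
    # Print the predicted size and the growth ratio P_n / P_(n-1)
    # for the range of resource counts that occurs in Table 1.
    for n in range(5, 16):
        ratio = cfos_size(n) / cfos_size(n - 1)
        print(f"{n:2d} resources: {cfos_size(n):12.0f} instructions, ratio {ratio:.3f}")
```

Under these assumptions, the value for 10 resources comes out within a few instructions of the 6319 quoted for the RC 4000 system in Section VI, and the ratio of successive predictions stays above 2.68 over the 5 to 15 resource range of the data base.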