Abstract
The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1972 as a five thousand line kernel to 2016 as a widely used 27 million line system. The 1.1 GB repository contains 496 thousand commits and 2,523 branch merges. The repository employs the commonly used Git version control system for its storage and is hosted on the popular GitHub archive. It has been created by synthesizing, with custom software, 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team; two legacy repositories; and the modern repository of the open source FreeBSD system. In total, 973 individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.
Notes
Updates may add or modify material. To ensure replicability the repository’s users are encouraged to fork it on GitHub or archive it.
The dates provided here are given by Salus (1994, p. 43).
References
Aho A V, Kernighan B W, Weinberger P J (1979) Awk—a pattern scanning and processing language. Softw Pract Exper 9(4):267–280
Babaoğlu O, Joy W (1981) Converting a swap-based system to do paging in an architecture lacking page-referenced bits. In: Proceedings of the Eighth ACM symposium on operating systems principles SOSP ’81. ACM, New York, pp 78–86
Bashkow TR (1972) Study of UNIX. Bell Laboratories memo MH-8234-TRB-mbh. Available online at http://bitsavers.informatik.uni-stuttgart.de/pdf/bellLabs/unix/PreliminaryUnixImplementationDocument_Jun72.pdf. Current September 2015
Bird C, Gourley A, Devanbu P, Gertz M, Swaminathan A (2006) Mining email social networks. In: Proceedings of the 2006 International Workshop on Mining Software Repositories, ACM, New York, NY, USA, MSR ’06, pp 137–143. doi:10.1145/1137983.1138016
Bourne S R (1978) The UNIX shell. Bell Syst Tech J 57(6):1971–1990
Bourne SR (1979) An introduction to the UNIX shell. In: UNIX programmer’s manual, volume 2—supplementary documents, 7th edn. Bell Telephone Laboratories. Murray Hill
Dolotta T A, Haight R C, Mashey J R (1978) The programmer’s workbench. Bell Syst Tech J 57(6):2177–2200
Feldman S I (1979) Make—a program for maintaining computer programs. Softw Pract Exper 9(4):255–265
FreeBSD (2015) FreeBSD Handbook. The FreeBSD Documentation Project, revision 47376 edn, available online, https://www.freebsd.org/doc/handbook/index.html
Gall H, Menzies T, Williams L, Zimmermann T (2014) Software Development Analytics (Dagstuhl Seminar 14261). Dagstuhl Reports 4(6):64–83. doi:10.4230/DagRep.4.6.64. http://drops.dagstuhl.de/opus/volltexte/2014/4763
Gehani N (2003) Bell labs: life in the crown jewel. Silicon Press, Summit
Johnson S C (1975) Yacc—yet another compiler-compiler. Computer Science Technical Report 32. Bell Laboratories, Murray Hill
Johnson S C (1977) Lint, a C program checker. Computer Science Technical Report 65. Bell Laboratories, Murray Hill
Johnson S C, Lesk M E (1978) Language development tools. Bell Syst Tech J 57(6):2155–2176
Johnson S C, Ritchie D M (1978) Portability of C programs and the UNIX system. Bell Syst Tech J 57(6):2021–2048
Jolitz W F, Jolitz L G (1991) Porting UNIX to the 386: a practical approach. Designing a software specification. Dr Dobb’s J 16(1)
Kernighan B, Lesk M, Ossanna J J (1978) UNIX time-sharing system: document preparation. Bell Syst Tech J 57(6):2115–2135
Kernighan B W (1982) A typesetter-independent TROFF. Computer Science Technical Report 97. Bell Laboratories, Murray Hill, available online at http://cm.bell-labs.com/cm/cs/cstr/97.ps.gz
Kernighan B W, Cherry L L (1974) A system for typesetting mathematics. Computer Science Technical Report 17. Bell Laboratories, Murray Hill
Kernighan BW, Ritchie DM (1979) The M4 macro processor. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill
Lesk M (1979a) Some applications of inverted indexes on the Unix system. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 4th edn. Bell Telephone Laboratories, Murray Hill
Lesk M E (1975) Lex—a lexical analyzer generator. Computer Science Technical Report 39. Bell Laboratories, Murray Hill
Lesk ME (1979b) TBL—a program to format tables. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill
Lewis A (1956) AT&T settles antitrust case; shares patents. New York Times 16:1
Libes D, Ressler S (1989) Life with UNIX. Prentice Hall, Englewood Cliffs
Lions J (1996) Lions’ commentary on Unix 6th edition with source code. Annabooks, Poway
Mashey JR, Smith DW (1976) Documentation tools and techniques. In: Proceedings of the 2nd international conference on software engineering ICSE ’76. IEEE Computer Society Press, Los Alamitos, pp 177–181
McIlroy M D, Pinson E N, Tague B A (1978) UNIX time-sharing system: foreword. Bell Syst Tech J 57(6):1899–1904
McKusick M K (1999) Twenty years of Berkeley Unix: from AT&T-owned to freely redistributable. In: DiBona C, Ockman S, Stone M (eds) Open sources: voices from the open source revolution, O’Reilly, pp 31–46
McKusick M K, Neville-Neil G V (2004) The design and implementation of the FreeBSD operating system. Addison-Wesley, Reading
McMahon LE (1979) SED—a non-interactive text editor. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill
Nowitz DA, Lesk ME (1979) A dial-up network of UNIX systems. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill
Ossanna JF (1979) NROFF/TROFF user’s manual. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill
Pike R, Kernighan B W (1984) Program design in the UNIX system environment. AT&T Bell Lab Tech J 63(8):1595–1606
Quarterman J S, Hoskins J C (1986) Notable computer networks. Commun ACM 29(10):932–971
Raymond ES (2003) The art of Unix programming. Addison-Wesley
Resnick P (2008) Internet message format. RFC 5322, RFC Editor. doi:10.17487/RFC5322. http://www.rfc-editor.org/rfc/rfc5322.txt
Ritchie D M (1978) A retrospective. Bell Syst Tech J 57(6):1947–1969
Ritchie D M (1984) The evolution of the UNIX time-sharing system. AT&T Bell Lab Tech J 63(8):1577–1593
Ritchie DM (1993) The development of the C language. ACM SIGPLAN Not 28(3):201–208. Preprints of the History of Programming Languages Conference (HOPL-II)
Ritchie D M, Thompson K (1974) The UNIX time-sharing system. Commun ACM 17(7):365–375
Ritchie D M, Thompson K (1978) The UNIX time-sharing system. Bell Syst Tech J 57(6):1905–1929
Ritchie D M, Johnson S C, Lesk M E, Kernighan B W (1978) The C programming language. Bell Syst Tech J 57(6)
Rochkind M J (1975) The source code control system. IEEE Trans Softw Eng SE 1(4):255–265
Rosler L (1984) The evolution of C — past and future. Bell Syst Tech J 63(8)
Salus P H (1994) A quarter century of UNIX. Addison-Wesley, Boston
Spinellis D (2015) A repository with 44 years of Unix evolution. In: MSR ’15: Proceedings of the 12th working conference on mining software repositories. IEEE, pp 462–465. doi:10.1109/MSR.2015.6. http://www.dmst.aueb.gr/dds/pubs/conf/2015-MSR-Unix-History/html/Spi15c.html. Best Data Showcase Award
Spinellis D, Louridas P, Kechagia M (2015) An exploratory study on the evolution of C programming in the Unix operating system. In: Wang Q, Ruhe G (eds) ESEM ’15: 9th International symposium on empirical software engineering and measurement. http://www.dmst.aueb.gr/dds/pubs/conf/2015-ESEM-CodeStyle/html/SLK15.html. IEEE, pp 54–57
Spinellis D, Louridas P, Kechagia M (2016) The evolution of C programming practices: a study of the Unix operating system. In: Visser W, Williams L (eds) ICSE ’16: Proceedings of the 38th international conference on software engineering. doi:10.1145/2884781.2884799, (to appear in print). Association for Computing Machinery, New York, pp 1973–2015
Stevens W R (1990) UNIX network programming. Prentice Hall, Englewood Cliffs
Stroustrup B (1984) Data abstraction in C. Bell Syst Tech J 63(8):1701–1732
Stroustrup B (1994) The design and evolution of C++. Addison-Wesley, Boston
Takahashi N, Takamatsu T (2013) UNIX license makes Linux the last missing piece of the puzzle. Ann Bus Admin Sci 12:123–137
Tichy WF (1982) Design, implementation, and evaluation of a revision control system. In: Proceedings of the 6th international conference on software engineering. IEEE
Toomey W (2009) The restoration of early UNIX artifacts. In: Proceedings of the 2009 USENIX annual technical conference USENIX’09. USENIX Association, Berkeley, pp 20–26
Toomey W (2010) First edition Unix: its creation and restoration. IEEE Ann Hist Comput 32(3):74–82. doi:10.1109/MAHC.2009.55
Wall L, Schwartz R L (1990) Programming Perl. O’Reilly and Associates, Sebastopol
Yoo A B, Jette M A, Grondona M (2003) SLURM: Simple Linux utility for resource management. In: Feitelson D, Rudolph L, Schwiegelshohn U (eds) JSSPP 03: 9th International workshop on job scheduling strategies for parallel processing. doi:10.1007/10968987_3. Lecture Notes in Computer Science Volume 2862. Springer, Berlin Heidelberg, pp 44–60
Acknowledgments
The author thanks the many individuals who contributed, directly or indirectly, to the effort. John Cowan, Brian W. Kernighan, Larry McVoy, Doug McIlroy, Jeremy C. Reed, Aharon Robbins, and Marc Rochkind helped with Bell Labs login identifiers. Clem Cole, John Cowan, Era Eriksson, Mary Ann Horton, Warner Losh, Kirk McKusick, Jeremy C. Reed, Ingo Schwarze, Anatole Shaw, and Norman Wilson helped with BSD login identifiers and code authorship information. The historical and current material used in the repository was made available thanks to efforts by the FreeBSD Project, Lynne Greer Jolitz, William F. Jolitz, Kirk McKusick, and the Unix Heritage Society. The early Unix editions were released under a BSD-style license thanks to the efforts of Bill Broderick, Paul Hatch, Dion L. Johnson II, Ransom Love, and Warren Toomey. The BSD SCCS import code is based on work by H. Merijn Brand and Jonathan Gray. The newoldar program is a result of work by Brandon Creighton and Dan Frasnelli. The First Research Edition Unix was restored by Johan Beiser, Tim Bradshaw, Brantley Coile, Christian David, Alex Garbutt, Hellwig Geisse, Cyrille Lefevre, Ralph Logan, James Markevitch, Doug Merritt, Tim Newsham, Brad Parker, and Warren Toomey.
Additional information
Communicated by: Romain Robbes, Martin Pinzger and Yasutaka Kamei
The work has been partially funded by the Research Centre of the Athens University of Economics and Business, under the Original Scientific Publications framework (project code EP-2279-01), and supported by computational time granted from the Greek Research & Technology Network (GRNET) in the National HPC facility ARIS under project id pa003005-cdolpot.
About this article
Cite this article
Spinellis, D. A repository of Unix history and evolution. Empir Software Eng 22, 1372–1404 (2017). https://doi.org/10.1007/s10664-016-9445-5
DOI: https://doi.org/10.1007/s10664-016-9445-5