Disco: running commodity operating systems on scalable multiprocessors

Published: 01 November 1997

Abstract

In this article we examine the problem of extending modern operating systems to run efficiently on large-scale shared-memory multiprocessors without a large implementation effort. Our approach brings back an idea popular in the 1970s: virtual machine monitors. We use virtual machines to run multiple commodity operating systems on a scalable multiprocessor. This solution addresses many of the challenges facing the system software for these machines. We demonstrate our approach with a prototype called Disco that runs multiple copies of Silicon Graphics' IRIX operating system on a multiprocessor. Our experience shows that the overheads of the monitor are small and that the approach provides scalability as well as the ability to deal with the nonuniform memory access time of these systems. To reduce the memory overheads associated with running multiple operating systems, virtual machines transparently share major data structures such as the program code and the file system buffer cache. We use the distributed-system support of modern operating systems to export a partial single system image to the users. The overall solution achieves most of the benefits of operating systems customized for scalable multiprocessors, yet it can be achieved with a significantly smaller implementation effort.
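
The transparent sharing mentioned in the abstract can be pictured with a toy model. The following is a minimal sketch under stated assumptions, not Disco's implementation: it assumes a monitor that keeps one machine page per disk block, maps that page into every virtual machine that reads the block, and gives a virtual machine a private copy on its first write. The `Monitor` class and all of its method and field names are hypothetical.

```python
class Monitor:
    """Toy monitor that shares disk-backed pages between virtual machines.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, disk):
        self.disk = disk          # block number -> bytes (stand-in disk image)
        self.pages = []           # machine pages, each a bytearray
        self.block_to_page = {}   # disk block -> machine page caching it
        self.cached = set()       # machine pages that back a disk block
        self.mappings = {}        # (vm, virtual page) -> machine page index

    def _alloc(self, data):
        # Allocate a new machine page holding a copy of `data`.
        self.pages.append(bytearray(data))
        return len(self.pages) - 1

    def read_block(self, vm, vpage, block):
        # Emulated disk read: reuse the in-memory copy if one exists,
        # so several VMs end up mapping the same machine page.
        page = self.block_to_page.get(block)
        if page is None:
            page = self._alloc(self.disk[block])
            self.block_to_page[block] = page
            self.cached.add(page)
        self.mappings[(vm, vpage)] = page
        return page

    def write(self, vm, vpage, offset, data):
        # Copy-on-write: a write to a shared, disk-backed page first
        # moves the writer onto a private copy.
        page = self.mappings[(vm, vpage)]
        if page in self.cached:
            page = self._alloc(self.pages[page])
            self.mappings[(vm, vpage)] = page
        self.pages[page][offset:offset + len(data)] = data
        return page

    def load(self, vm, vpage):
        # Return the current contents of a VM's page.
        return bytes(self.pages[self.mappings[(vm, vpage)]])


monitor = Monitor({7: b"kernel text"})
a = monitor.read_block("vm0", 0, 7)     # first reader loads the block
b = monitor.read_block("vm1", 0, 7)     # second reader shares the same page
c = monitor.write("vm1", 0, 0, b"K")    # writer gets a private copy
```

In this model two virtual machines that read the same block consume one machine page until one of them writes, which is the memory saving the abstract attributes to sharing program code and the buffer cache.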

Reviews

Armin B. Cremers

Virtual machine monitors, a popular operating systems approach in the 1970s, are investigated as a means of extending modern system software so it can run efficiently on large-scale shared-memory multiprocessors without a massive implementation effort. Virtual machines add an intermediate level between multiple copies of commodity operating systems and the scalable multiprocessor hardware in order to hide certain novel attributes of the machines, such as their size and aspects of the nonuniform memory architecture (NUMA). As part of the Stanford FLASH shared-memory multiprocessor project, a “NUMA-aware” prototype implementation called Disco has been developed to combine commodity operating systems into a new performant system software base. In a simulation-based experiment, this prototype was used to run multiple copies of SGI's IRIX operating system. The results show that many traditional problems of virtual machines do not occur with this approach. Disco succeeds in supporting a global buffer cache functionality that is transparently shared across all virtual machines. This is achieved by combining a suitable emulation of the DMA engine with standard distributed filesystem protocols. There are indications that many of the techniques also apply to more loosely coupled environments, such as networks of workstations.

Published In

ACM Transactions on Computer Systems, Volume 15, Issue 4
Nov. 1997
92 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/265924
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. scalable multiprocessors
  2. virtual machines

Qualifiers

  • Article
