Efficient synchronization primitives for large-scale cache-coherent multiprocessors

Published: 01 April 1989
    Abstract

    This paper proposes a set of efficient primitives for process synchronization in multiprocessors. The only assumptions made in developing the set of primitives are that hardware combining is not implemented in the interconnect, and (in one case) that the interconnect supports broadcast.
    The primitives make use of synchronization bits (syncbits) to provide a simple mechanism for mutual exclusion. The proposed implementation of the primitives includes efficient (i.e., local) busy-waiting for syncbits. In addition, a hardware-supported mechanism for maintaining a first-come, first-served queue of requests for a syncbit is proposed. This queueing mechanism allows for a very efficient implementation of, as well as fair access to, binary semaphores. We also propose to implement Fetch and Add with combining in software rather than hardware. This allows an architecture to scale to a large number of processors while avoiding the cost of hardware combining.
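To make the idea of combining Fetch and Add requests concrete, here is a minimal sequential sketch in Python. It is not the paper's algorithm, which the abstract does not spell out; it only illustrates the general combining rule, where two requests with increments a and b merge into one request for a + b, and the returned old value v is split into v for the first requester and v + a for the second. The names `combine_up`, `distribute_down`, and `combined_fetch_and_add` are hypothetical, and the shared variable is modeled as a one-element list.

```python
def combine_up(increments):
    """Pairwise-combine increments level by level; return every level of sums."""
    levels = [list(increments)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + 2]) for i in range(0, len(prev), 2)])
    return levels


def distribute_down(levels, old_value):
    """Split results on the way back down the combining tree.

    The left child of a combined pair sees the parent's base value; the
    right child sees that base plus the left child's increment.
    """
    bases = [old_value]
    for level in reversed(levels[:-1]):
        next_bases = []
        for i, base in enumerate(bases):
            next_bases.append(base)
            if 2 * i + 1 < len(level):
                next_bases.append(base + level[2 * i])
        bases = next_bases
    return bases


def combined_fetch_and_add(cell, increments):
    """Apply all increments to cell[0] with a single update.

    Returns the value each requester would have seen had the requests
    been served one at a time in list order.
    """
    if not increments:
        return []
    levels = combine_up(increments)
    old_value = cell[0]
    cell[0] += levels[-1][0]  # the only write to the shared variable
    return distribute_down(levels, old_value)


if __name__ == "__main__":
    cell = [10]
    print(combined_fetch_and_add(cell, [1, 2, 3, 4, 5]), cell[0])
```

The point of the sketch is that the shared location is touched once per batch rather than once per request, which is what relieves the hot spot; a real implementation would combine concurrent requests from different processors rather than a prepared list.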
    Scenarios for common synchronization events such as work queues and barriers are presented to demonstrate the generality and ease of use of the proposed primitives. The efficient implementation of the primitives is simpler if the multiprocessor has a hardware cache-consistency protocol. To illustrate this point, we outline how the primitives would be implemented in the Multicube multiprocessor [GoWo88].
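As one illustration of a barrier built from a Fetch and Add style primitive, the following Python sketch uses the well-known sense-reversing barrier. It is a stand-in under stated assumptions, not the scenario from the paper: the atomic operation is emulated with a `threading.Lock`, the spin loop only reads a flag (loosely analogous to local busy-waiting), and the names `AtomicCounter`, `SenseBarrier`, and `run_phases` are hypothetical.

```python
import threading
import time


class AtomicCounter:
    """Integer with an atomic fetch-and-add, emulated here with a lock."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def fetch_and_add(self, delta):
        with self._guard:
            old = self._value
            self._value += delta
            return old


class SenseBarrier:
    """Sense-reversing barrier for a fixed number of threads."""

    def __init__(self, parties):
        self.parties = parties
        self.count = AtomicCounter(0)
        self.sense = False  # flipped by the last arriver of each phase

    def wait(self, local_sense):
        local_sense = not local_sense
        if self.count.fetch_and_add(1) == self.parties - 1:
            self.count.fetch_and_add(-self.parties)  # reset for next phase
            self.sense = local_sense                 # release the waiters
        else:
            while self.sense != local_sense:         # read-only spin
                time.sleep(0)                        # yield to other threads
        return local_sense


def run_phases(num_threads=4, phases=5):
    """Run several barrier phases; return phases where a thread got ahead."""
    barrier = SenseBarrier(num_threads)
    arrived = AtomicCounter(0)
    violations = []

    def worker():
        sense = False
        for phase in range(phases):
            arrived.fetch_and_add(1)
            sense = barrier.wait(sense)
            # Every thread must have arrived before anyone passes the barrier.
            if arrived.fetch_and_add(0) < (phase + 1) * num_threads:
                violations.append(phase)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations


if __name__ == "__main__":
    print(run_phases())
```

Each thread keeps its own `sense` value and passes it back into `wait`, so the same barrier object can be reused across phases without a separate reset step.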

    References

    [1]
    Archibald, J., and J. L. Baer, "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model," ACM Transactions on Computer Systems, November 1986, pp. 273-298.
    [2]
    Baer, J. L., and W. H. Wang, "Architectural Choices for Multilevel Cache Hierarchies," Proceedings of the 1987 International Conference on Parallel Processing, August 1987, pp. 258-261.
    [3]
    Bell, C. G., "Multis: A New Class of Multiprocessor Computers," Science, April 26, 1985, pp. 462-467.
    [4]
    Bitar, P., and A. M. Despain, "Multiprocessor Cache Synchronization Issues, Innovations, Evolution," Proceedings of the 13th Annual International Symposium on Computer Architecture, June 1986, pp. 424-433.
    [5]
    Brantley, W. C., K. P. McAuliffe, and J. Weiss, "RP3 Processor-Memory Element," Proceedings of the 1985 International Conference on Parallel Processing, August 1985, pp. 782-789.
    [6]
    Brooks, E. D., "The Butterfly Barrier," International Journal of Parallel Programming, August 1986, pp. 295-307.
    [7]
    Goodman, J. R., M. D. Hill, and P. J. Woest, "Scalability and Its Application to Multicube," submitted to the 16th Annual International Symposium on Computer Architecture, May 1989.
    [8]
    Goodman, J. R., and P. J. Woest, "The Wisconsin Multicube: A New Large-Scale Cache-Coherent Multiprocessor," Proceedings of the 15th Annual International Symposium on Computer Architecture, June 1988, pp. 422-431.
    [9]
    Gottlieb, A., B. D. Lubachevsky, and L. Rudolph, "Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors," ACM Transactions on Programming Languages and Systems, April 1983, pp. 164-189.
    [10]
    Gottlieb, A., R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir, "The NYU Ultracomputer: Designing an MIMD, Shared Memory Parallel Machine," IEEE Transactions on Computers, February 1983, pp. 175-189.
    [11]
    Jordan, H. F., "Performance Measurements on HEP: a Pipelined MIMD Computer," Proceedings of the 10th Annual International Symposium on Computer Architecture, June 1983, pp. 207-212.
    [12]
    Leutenegger, S. T., and M. K. Vernon, "A Mean-Value Performance Analysis of a New Multiprocessor Architecture," Proceedings of the 1988 ACM SIGMETRICS Conference, May 1988, pp. 167-176.
    [13]
    Lundstrom, S. F., "Applications Considerations in the System Design of Highly Concurrent Multiprocessors," IEEE Transactions on Computers, November 1987, pp. 1292-1309.
    [14]
    Osterhaug, A., Guide to Parallel Programming on Sequent Computer Systems, 2nd ed., Sequent Computer Systems, Inc., Beaverton, Oregon, 1987.
    [15]
    Pfister, G. F., and V. A. Norton, "Hot Spot Contention and Combining in Multistage Interconnection Networks," Proceedings of the 1985 International Conference on Parallel Processing, August 1985, pp. 790-797.
    [16]
    Rudolph, L., and Z. Segall, "Dynamic Decentralized Cache Schemes for MIMD Parallel Processors," Proceedings of the 11th Annual International Symposium on Computer Architecture, June 1984, pp. 340-347.
    [17]
    Yew, P. C., N. F. Tzeng, and D. H. Lawrie, "Distributing Hot-Spot Addressing in Large-Scale Multiprocessors," IEEE Transactions on Computers, April 1987, pp. 388-395.
    [18]
    Zhu, C. Q., and P. C. Yew, "A Scheme to Enforce Data Dependence on Large Multiprocessor Systems," IEEE Transactions on Software Engineering, June 1987, pp. 726-739.

    Published In

    ACM SIGARCH Computer Architecture News, Volume 17, Issue 2
    Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems
    April 1989
    291 pages
    ISSN:0163-5964
    DOI:10.1145/68182
    • ACM Conferences
      ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
      April 1989
      303 pages
      ISBN:0897913000
      DOI:10.1145/70082

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Cited By

    • (2019) Skeap & Seap. The 31st ACM Symposium on Parallelism in Algorithms and Architectures, pp. 287-296. DOI: 10.1145/3323165.3323193. Online publication date: 17-Jun-2019
    • (2019) A barrier optimization framework for NUMA multi‐core system. Concurrency and Computation: Practice and Experience, 32:5. DOI: 10.1002/cpe.5527. Online publication date: 21-Oct-2019
    • (2017) Lease/Release. ACM Transactions on Parallel Computing, 4:2, pp. 1-25. DOI: 10.1145/3132168. Online publication date: 10-Oct-2017
    • (2016) Lease/release. ACM SIGPLAN Notices, 51:8, pp. 1-12. DOI: 10.1145/3016078.2851155. Online publication date: 27-Feb-2016
    • (2016) Lease/release. Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 1-12. DOI: 10.1145/2851141.2851155. Online publication date: 27-Feb-2016
    • (2012) Energy optimization of representative barrier algorithms. Journal of Central South University, 19:10, pp. 2823-2831. DOI: 10.1007/s11771-012-1348-z. Online publication date: 4-Oct-2012
    • (2012) Efficient Handling of Lock Hand-off in DSM Multiprocessors with Buffering Coherence Controllers. Journal of Computer Science and Technology, 27:1, pp. 75-91. DOI: 10.1007/s11390-012-1207-2. Online publication date: 9-Jan-2012
    • (2012) Efficient fetch-and-increment. Proceedings of the 26th international conference on Distributed Computing, pp. 16-30. DOI: 10.1007/978-3-642-33651-5_2. Online publication date: 16-Oct-2012
    • (2010) Architectural Support for Fair Reader-Writer Locking. Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 275-286. DOI: 10.1109/MICRO.2010.12. Online publication date: 4-Dec-2010
    • (2010) A relaxed synchronization primitive for macroprogramming systems. 2010 Seventh International Conference on Networked Sensing Systems (INSS), pp. 219-226. DOI: 10.1109/INSS.2010.5573561. Online publication date: Jun-2010
