DOI: 10.1145/74925.74970

Adaptive backoff synchronization techniques

Published: 01 April 1989

Abstract

Shared-memory multiprocessors commonly use shared variables for synchronization. Our simulations of real parallel applications show that large-scale cache-coherent multiprocessors suffer significant amounts of invalidation traffic due to synchronization. Large multiprocessors that do not cache synchronization variables are often more severely impacted. If this synchronization traffic is not reduced or managed adequately, synchronization references can cause severe congestion in the network. We propose a class of adaptive backoff methods that do not use any extra hardware and can significantly reduce the memory traffic to synchronization variables. These methods use synchronization state to reduce polling of synchronization variables. Our simulations show that when the number of processors participating in a barrier synchronization is small compared to the time of arrival of the processors, reductions of 20 percent to over 95 percent in synchronization traffic can be achieved at no extra cost. In other situations adaptive backoff techniques result in a tradeoff between reduced network accesses and increased processor idle time.
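The idea of using synchronization state to reduce polling can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `simulate_barrier`, the parameter `backoff_factor`, and the linear rule (wait longer when more processors are still missing from the barrier) are assumptions chosen for illustration. It counts accesses to a shared barrier counter and the idle cycles spent after the barrier has already opened.

```python
def simulate_barrier(arrivals, backoff_factor=0):
    """Simulate processors waiting at a counter-based barrier.

    arrivals: cycle at which each processor reaches the barrier.
    backoff_factor: extra cycles to wait per missing processor before the
        next poll (hypothetical linear rule); 0 means poll every cycle.
    Returns (accesses, idle): accesses to the barrier variable, and cycles
    processors kept waiting after the last arrival.
    """
    n = len(arrivals)
    release = max(arrivals)  # the barrier opens when the last processor arrives

    def count_at(t):
        # Barrier state: how many processors have arrived by cycle t.
        return sum(1 for a in arrivals if a <= t)

    accesses = 0
    idle = 0
    for t_arrive in arrivals:
        t = t_arrive
        accesses += 1  # fetch-and-increment, which also returns the count
        while count_at(t) < n:
            missing = n - count_at(t)
            # Back off longer when many processors are still missing.
            t += 1 + backoff_factor * missing
            accesses += 1  # one poll of the barrier variable
        idle += t - release  # overshoot past the moment the barrier opened
    return accesses, idle


if __name__ == "__main__":
    arrivals = [0, 10, 20, 100]  # made-up arrival times, widely spread
    print("poll every cycle:", simulate_barrier(arrivals))
    print("adaptive backoff:", simulate_barrier(arrivals, backoff_factor=5))
```

With widely spread arrival times, the backoff run makes far fewer accesses but leaves some processors idle for a few cycles after the barrier opens, which is the tradeoff the abstract describes.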

Published In

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture
April 1989
426 pages
ISBN: 0897913191
DOI: 10.1145/74925
  • ACM SIGARCH Computer Architecture News, Volume 17, Issue 3
    Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
    June 1989
    400 pages
    ISSN: 0163-5964
    DOI: 10.1145/74926

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Cited By

  • (2019) Lock–Unlock. ACM Transactions on Computer Systems 36:1 (1-149). DOI: 10.1145/3301501. Online publication date: 14-Mar-2019
  • (2018) Accurate counting algorithm for high‐speed parallel applications. Concurrency and Computation: Practice and Experience 31:13. DOI: 10.1002/cpe.5090. Online publication date: 23-Nov-2018
  • (2017) Contention in Structured Concurrency. ACM SIGPLAN Notices 52:8 (75-88). DOI: 10.1145/3155284.3018762. Online publication date: 26-Jan-2017
  • (2017) Contention in Structured Concurrency. Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (75-88). DOI: 10.1145/3018743.3018762. Online publication date: 26-Jan-2017
  • (2016) Decidability and Complexity for Quiescent Consistency. Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science (116-125). DOI: 10.1145/2933575.2933576. Online publication date: 5-Jul-2016
  • (2015) DeNovoSync. ACM SIGARCH Computer Architecture News 43:1 (545-559). DOI: 10.1145/2786763.2694356. Online publication date: 14-Mar-2015
  • (2015) DeNovoSync. ACM SIGPLAN Notices 50:4 (545-559). DOI: 10.1145/2775054.2694356. Online publication date: 14-Mar-2015
  • (2015) DeNovoSync. Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (545-559). DOI: 10.1145/2694344.2694356. Online publication date: 14-Mar-2015
  • (2015) Integrating Lock-Free and Combining Techniques for a Practical and Scalable FIFO Queue. IEEE Transactions on Parallel and Distributed Systems 26:7 (1910-1922). DOI: 10.1109/TPDS.2014.2333007. Online publication date: 1-Jul-2015
  • (2015) Wartefreie Synchronisation von Echtzeitprozessen mittels abgeschirmter Abschnitte. Betriebssysteme und Echtzeit (59-68). DOI: 10.1007/978-3-662-48611-5_7. Online publication date: 6-Nov-2015
