Adaptive backoff synchronization techniques

Authors:

M. Cherian

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture

Pages 396 - 406

https://doi.org/10.1145/74925.74970

Published: 01 April 1989

Abstract

Shared-memory multiprocessors commonly use shared variables for synchronization. Our simulations of real parallel applications show that large-scale cache-coherent multiprocessors suffer significant amounts of invalidation traffic due to synchronization. Large multiprocessors that do not cache synchronization variables are often more severely impacted. If this synchronization traffic is not reduced or managed adequately, synchronization references can cause severe congestion in the network. We propose a class of adaptive back-off methods that do not use any extra hardware and can significantly reduce the memory traffic to synchronization variables. These methods use synchronization state to reduce polling of synchronization variables. Our simulations show that when the number of processors participating in a barrier synchronization is small compared to the time of arrival of the processors, reductions of 20 percent to over 95 percent in synchronization traffic can be achieved at no extra cost. In other situations adaptive backoff techniques result in a tradeoff between reduced network accesses and increased processor idle time.
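The idea of using synchronization state to reduce polling can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' method or simulator: it assumes each waiting processor can read how many processors have not yet reached the barrier and waits a number of cycles proportional to that count before polling again. The function name `simulate_barrier`, the parameter `step`, and the arrival times are hypothetical, and each poll stands in for one network access.

```python
from bisect import bisect_right


def simulate_barrier(arrivals, step=0):
    """Count polls of a barrier variable for one barrier episode.

    arrivals: arrival time (in cycles) of each processor.
    step: hypothetical backoff unit; 0 means poll every cycle (plain spinning).
    Returns (total_polls, total_overshoot), where overshoot is the idle time
    between the barrier opening and a waiting processor noticing it.
    """
    arrivals = sorted(arrivals)
    n = len(arrivals)
    release = arrivals[-1]  # barrier opens when the last processor arrives
    polls = 0
    overshoot = 0
    for t in arrivals[:-1]:  # the last arriver opens the barrier, never polls
        now = t
        while True:
            # Synchronization state: how many processors are still missing.
            remaining = n - bisect_right(arrivals, now)
            # Back off in proportion to the missing count (at least 1 cycle).
            now += max(1, step * remaining)
            polls += 1
            if now >= release:
                overshoot += now - release
                break
    return polls, overshoot


if __name__ == "__main__":
    times = [0, 10, 20, 100]  # illustrative arrival times, not measured data
    for step in (0, 10, 15):
        print(step, simulate_barrier(times, step))
```

In this toy model, a larger `step` lowers the poll count but can leave processors idle after the barrier has opened, which mirrors the tradeoff between network accesses and processor idle time described in the abstract.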

Published In

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture

April 1989

426 pages

ISBN:0897913191

DOI:10.1145/74925

Chairman:
Jean-Claude Syre

ACM SIGARCH Computer Architecture News Volume 17, Issue 3
Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
June 1989
400 pages
ISSN:0163-5964
DOI:10.1145/74926
Editor:
Jean-Claude Syre

Copyright © 1989 Authors.

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%
