Analysis of cache invalidation patterns in multiprocessors

Authors:

A. Gupta

ACM SIGARCH Computer Architecture News, Volume 17, Issue 2

Pages 243 - 256

https://doi.org/10.1145/68182.68205

Published: 01 April 1989

Abstract

To make shared-memory multiprocessors scalable, researchers are now exploring cache coherence protocols that do not rely on broadcast, but instead send invalidation messages to individual caches that contain stale data. The feasibility of such directory-based protocols is highly sensitive to the cache invalidation patterns that parallel programs exhibit. In this paper, we analyze the cache invalidation patterns caused by several parallel applications and investigate the effect of these patterns on a directory-based protocol. Our results are based on multiprocessor traces with 4, 8 and 16 processors. To gain insight into what the invalidation patterns would look like beyond 16 processors, we propose a classification scheme for data objects found in parallel applications and link the invalidation traffic patterns observed in the traces back to these high-level objects. Our results show that synchronization objects have very different invalidation patterns from those of other data objects. A write reference to a synchronization object usually causes invalidations in many more caches. We point out situations where restructuring the application seems appropriate to reduce the invalidation traffic, and others where hardware support is more appropriate. Our results also show that it should be possible to scale “well-written” parallel programs to a large number of processors without an explosion in invalidation traffic.
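
To make the measured quantity concrete, here is a minimal sketch of how invalidations per write could be counted from a reference trace. It is not the simulator used in the paper. It assumes a simple full-map directory, a write-invalidate policy, and infinite caches, so a copy leaves a cache only when it is invalidated. The function name `invalidation_histogram`, the trace format, and the sample trace are all hypothetical.

```python
from collections import Counter, defaultdict

def invalidation_histogram(trace):
    """Replay (processor, op, block) references against a full-map directory.

    op is "r" (read) or "w" (write). Returns a Counter mapping the number of
    caches invalidated by a write to how many writes caused that many
    invalidations. Assumption: infinite caches, so copies are only removed
    by invalidation, never by replacement.
    """
    sharers = defaultdict(set)   # block -> set of processors holding a copy
    histogram = Counter()
    for proc, op, block in trace:
        if op == "r":
            sharers[block].add(proc)
        elif op == "w":
            # Every other cache holding the block would have a stale copy.
            stale = sharers[block] - {proc}
            histogram[len(stale)] += 1
            sharers[block] = {proc}   # writer now holds the only valid copy
        else:
            raise ValueError(f"unknown op {op!r}")
    return histogram

# Hypothetical 4-processor trace: "lock" is read by every processor before
# it is written, while "x" moves between two processors.
trace = [
    (0, "r", "lock"), (1, "r", "lock"), (2, "r", "lock"), (3, "r", "lock"),
    (0, "w", "lock"),
    (0, "w", "x"), (1, "r", "x"), (1, "w", "x"),
]
hist = invalidation_histogram(trace)
for count in sorted(hist):
    print(f"{hist[count]} write(s) invalidated {count} other cache(s)")
```

In this made-up trace the write to the widely read block invalidates three caches, while writes to the migrating block invalidate at most one. That is the kind of per-object difference the abstract attributes to synchronization objects versus other data, though the sample proves nothing about real programs.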

Published In

ACM SIGARCH Computer Architecture News Volume 17, Issue 2

Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems

April 1989

291 pages

ISSN:0163-5964

DOI:10.1145/68182

Editor:
Joel Emer

ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
April 1989
303 pages
ISBN:0897913000
DOI:10.1145/70082
Chairman:
Joel Emer
General Chair:
John Hennessy
Stanford University

Copyright © 1989 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 1989

Published in SIGARCH Volume 17, Issue 2

Qualifiers

Article

Article Metrics

Total Citations: 146
Total Downloads: 791
Downloads (Last 12 months): 91
Downloads (Last 6 weeks): 21

Reflects downloads up to 15 Oct 2024
Cited By

Chen C, Hsia A, Zhan Y, Liu T (2018). Energy-efficient hybrid coherence protocol for multicore processors. Cluster Computing 21:3 (1521-1541). DOI: 10.5555/3287988.3288003. Online publication date: 1-Sep-2018
https://dl.acm.org/doi/10.5555/3287988.3288003
Kaxiras S, Klaftenegger D, Norgren M, Ros A, Sagonas K, Kielmann T, Hildebrand D, Taufer M (2015). Turning Centralized Coherence and Distributed Critical-Section Execution on their Head. Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (3-14). DOI: 10.1145/2749246.2749250. Online publication date: 15-Jun-2015
https://dl.acm.org/doi/10.1145/2749246.2749250
Chang Y, Bhuyan L (1999). Cache Memory Protocols. Wiley Encyclopedia of Electrical and Electronics Engineering. DOI: 10.1002/047134608X.W1661. Online publication date: 27-Dec-1999
https://doi.org/10.1002/047134608X.W1661
Meslin A, Pacheco A, Aude J (1992). A comparative analysis of cache memory architectures for the MULTIPLUS multiprocessor. Microprocessing and Microprogramming 35:1-5 (555-562). DOI: 10.1016/0165-6074(92)90368-H. Online publication date: Sep-1992
https://doi.org/10.1016/0165-6074(92)90368-H
Lopriore L (1989). Software-controlled cache coherence protocol for multicache systems. Information Processing Letters 33:3 (125-130). DOI: 10.1016/0020-0190(89)90190-7. Online publication date: Nov-1989
https://doi.org/10.1016/0020-0190(89)90190-7
Won Y, Kim S, Yun J, Tuan D, Seo J (2019). DASH. Proceedings of the VLDB Endowment 12:7 (793-806). DOI: 10.14778/3317315.3317321. Online publication date: 1-Mar-2019
https://dl.acm.org/doi/10.14778/3317315.3317321
Chen C, Hsia A, Zhan Y, Liu T (2018). Energy-efficient hybrid coherence protocol for multicore processors. Cluster Computing 21:3 (1521-1541). DOI: 10.1007/s10586-018-1947-z. Online publication date: 16-Feb-2018
https://doi.org/10.1007/s10586-018-1947-z
Shukla S, Chaudhuri M (2017). Tiny Directory: Efficient Shared Memory in Many-Core Systems with Ultra-Low-Overhead Coherence Tracking. 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA) (205-216). DOI: 10.1109/HPCA.2017.24. Online publication date: Feb-2017
https://doi.org/10.1109/HPCA.2017.24
Liu P, Fang L, Huang M, Hu Q, Jiang G (2016). Building Expressive and Area-Efficient Directories with Hybrid Representation and Adaptive Multi-Granular Tracking. IEEE Transactions on Computers 65:3 (847-859). DOI: 10.1109/TC.2015.2435790. Online publication date: 1-Mar-2016
https://dl.acm.org/doi/10.1109/TC.2015.2435790
