Analysis of cache invalidation patterns in multiprocessors

Published: 01 April 1989

Abstract

To make shared-memory multiprocessors scalable, researchers are now exploring cache coherence protocols that do not rely on broadcast, but instead send invalidation messages to individual caches that contain stale data. The feasibility of such directory-based protocols is highly sensitive to the cache invalidation patterns that parallel programs exhibit. In this paper, we analyze the cache invalidation patterns caused by several parallel applications and investigate the effect of these patterns on a directory-based protocol. Our results are based on multiprocessor traces with 4, 8 and 16 processors. To gain insight into what the invalidation patterns would look like beyond 16 processors, we propose a classification scheme for data objects found in parallel applications and link the invalidation traffic patterns observed in the traces back to these high-level objects. Our results show that synchronization objects have very different invalidation patterns from those of other data objects. A write reference to a synchronization object usually causes invalidations in many more caches. We point out situations where restructuring the application seems appropriate to reduce the invalidation traffic, and others where hardware support is more appropriate. Our results also show that it should be possible to scale “well-written” parallel programs to a large number of processors without an explosion in invalidation traffic.
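The directory-based idea described in the abstract can be illustrated with a small sketch. This is not the authors' simulator or protocol; it is a toy full-map directory, written under the assumption that every write invalidates all other cached copies of a block. The names `Directory`, `run_trace`, and the trace format `(op, processor, block)` are hypothetical and chosen only for illustration. The sketch records how many caches each write invalidates, which is the quantity the paper's invalidation patterns describe.

```python
from collections import Counter, defaultdict


class Directory:
    """Toy full-map directory (an assumption, not the paper's protocol)."""

    def __init__(self):
        self.sharers = defaultdict(set)  # block -> set of processor ids with a copy
        self.histogram = Counter()       # invalidations per write -> number of writes

    def read(self, proc, block):
        # A read brings a copy of the block into the reader's cache.
        self.sharers[block].add(proc)

    def write(self, proc, block):
        # Send invalidations only to the other caches that hold a copy.
        victims = self.sharers[block] - {proc}
        self.histogram[len(victims)] += 1
        self.sharers[block] = {proc}
        return len(victims)


def run_trace(trace):
    """Replay (op, proc, block) tuples; op is 'r' for read or 'w' for write."""
    directory = Directory()
    for op, proc, block in trace:
        if op == "r":
            directory.read(proc, block)
        else:
            directory.write(proc, block)
    return directory.histogram


# Hypothetical 4-processor trace: a flag that every processor reads before
# one processor writes it (synchronization-like), and a block "x" that is
# mostly touched by processor 1.
trace = (
    [("r", p, "flag") for p in range(4)]
    + [("w", 0, "flag")]
    + [("r", 1, "x"), ("w", 1, "x"), ("r", 2, "x"), ("w", 1, "x")]
)
hist = run_trace(trace)
for invalidations, writes in sorted(hist.items()):
    print(f"{writes} write(s) invalidated {invalidations} other cache(s)")
```

In this made-up trace, the write to the flag invalidates three caches while the writes to `x` invalidate at most one, which mirrors in miniature the abstract's observation that writes to synchronization objects tend to reach many more caches than writes to other data.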

Cited By

  • (2018) Energy-efficient hybrid coherence protocol for multicore processors. Cluster Computing 21:3 (1521-1541). DOI: 10.5555/3287988.3288003. Online publication date: 1-Sep-2018
  • (2015) Turning Centralized Coherence and Distributed Critical-Section Execution on their Head. Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (3-14). DOI: 10.1145/2749246.2749250. Online publication date: 15-Jun-2015
  • (1999) Cache Memory Protocols. Wiley Encyclopedia of Electrical and Electronics Engineering. DOI: 10.1002/047134608X.W1661. Online publication date: 27-Dec-1999

Published In

ACM SIGARCH Computer Architecture News, Volume 17, Issue 2
Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems
April 1989
291 pages
ISSN: 0163-5964
DOI: 10.1145/68182

  • ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
    April 1989
    303 pages
    ISBN: 0897913000
    DOI: 10.1145/70082
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
