DOI: 10.1145/139669.139674

A performance study of memory consistency models

Published: 01 April 1992

Abstract

Recent advances in technology are such that the speed of processors is increasing faster than memory latency is decreasing. Therefore, the relative cost of a cache miss is becoming more important. However, the full cost of a cache miss need not be paid every time in a multiprocessor. The frequency with which the processor must stall on a cache miss can be reduced by using a relaxed model of memory consistency.
In this paper, we present the results of instruction-level simulation studies on the relative performance benefits of using different models of memory consistency. Our vehicle of study is a shared-memory multiprocessor with processors and associated write-back caches connected to global memory modules via an Omega network. The benefits of the relaxed models, and their increasing hardware complexity, are assessed with varying cache size, line size, and number of processors. We find that substantial benefits can be accrued by using relaxed models, but the magnitudes of the benefits depend on the architecture being modeled, the benchmarks, and how the code is scheduled. We did not find any major difference in levels of improvement among the various relaxed models.
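
The idea that a relaxed model lets the processor avoid some miss stalls can be illustrated with a toy cycle count. The sketch below is not the simulator or the machine model used in the paper; it is a minimal illustration under stated assumptions. The names `Op`, `MISS_LATENCY`, `cycles_sequential`, and `cycles_relaxed` are hypothetical, the 20-cycle latency is arbitrary, and the relaxed variant assumes that write misses can be serviced in the background and overlapped with each other, with the processor waiting for them only at synchronization points.

```python
from dataclasses import dataclass

MISS_LATENCY = 20  # hypothetical cycles to service one cache miss


@dataclass(frozen=True)
class Op:
    kind: str           # "read", "write", or "sync"
    miss: bool = False  # True if the access misses in the cache


def cycles_sequential(trace):
    """Strict model: each op costs 1 cycle and every miss stalls the processor."""
    total = 0
    for op in trace:
        total += 1
        if op.miss:
            total += MISS_LATENCY
    return total


def cycles_relaxed(trace):
    """Toy relaxed model (an assumption, not the paper's definition):
    write misses complete in the background, read misses still stall,
    and a sync waits until all outstanding writes have completed."""
    now = 0
    writes_done_at = 0  # time at which the last outstanding write completes
    for op in trace:
        now += 1
        if op.kind == "write" and op.miss:
            # Assumed: the memory system overlaps outstanding write misses.
            writes_done_at = max(writes_done_at, now + MISS_LATENCY)
        elif op.kind == "read" and op.miss:
            now += MISS_LATENCY
        elif op.kind == "sync":
            now = max(now, writes_done_at)
    # Count any writes still in flight at the end of the trace.
    return max(now, writes_done_at)


if __name__ == "__main__":
    trace = [
        Op("write", miss=True),
        Op("write", miss=True),
        Op("read"),
        Op("read"),
        Op("sync"),
        Op("read", miss=True),
    ]
    print("sequential:", cycles_sequential(trace))
    print("relaxed:   ", cycles_relaxed(trace))
```

In this toy setting the gap between the two counts depends entirely on the mix of write misses, read misses, and synchronization points in the trace, which is consistent with the abstract's observation that the size of the benefit depends on the architecture, the benchmark, and the code schedule.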

Published In

ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture
May 1992
439 pages
ISBN: 0897915097
DOI: 10.1145/139669
  • ACM SIGARCH Computer Architecture News, Volume 20, Issue 2
    Special Issue: Proceedings of the 19th annual international symposium on Computer architecture (ISCA '92)
    May 1992
    429 pages
    ISSN: 0163-5964
    DOI: 10.1145/146628

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISCA92: International Conference on Computer Architecture
May 19 - 21, 1992
Queensland, Australia

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Cited By

  • (2016) The virtues of conflict. ACM SIGPLAN Notices 51:8 (1-12). DOI: 10.1145/3016078.2851165. Online publication date: 27-Feb-2016
  • (2016) The virtues of conflict. Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (1-12). DOI: 10.1145/2851141.2851165. Online publication date: 27-Feb-2016
  • (2005) Modeling relaxed memory consistency protocols. Quantitative Evaluation of Computing and Communication Systems (385-400). DOI: 10.1007/BFb0024329. Online publication date: 9-Jun-2005
  • (2001) Implementing a Caching Service for Distributed CORBA Objects. Middleware 2000 (1-23). DOI: 10.1007/3-540-45559-0_1. Online publication date: 24-Aug-2001
  • (2000) Implementing a caching service for distributed CORBA objects. IFIP/ACM International Conference on Distributed systems platforms (1-23). DOI: 10.5555/338283.338284. Online publication date: 1-Apr-2000
  • (1999) Recent advances in memory consistency models for hardware shared memory systems. Proceedings of the IEEE 87:3 (445-455). DOI: 10.1109/5.747865. Online publication date: Mar-1999
  • (1999) MILLIPEDE: Easy Parallel Programming in Available Distributed Environments. Software: Practice and Experience 27:8 (929-965). DOI: 10.1002/(SICI)1097-024X(199708)27:8<929::AID-SPE113>3.0.CO;2-#. Online publication date: 8-Jan-1999
  • (1998) Schemes for reducing communication latency in regular computations on DSM multiprocessors. Proceedings. 1998 International Conference on Parallel Processing (Cat. No.98EX205) (164-171). DOI: 10.1109/ICPP.1998.708478. Online publication date: 1998
  • (1997) Using speculative retirement and larger instruction windows to narrow the performance gap between memory consistency models. Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures (199-210). DOI: 10.1145/258492.258512. Online publication date: 1-Jun-1997
  • (1997) Load balancing in distributed shared memory systems. 1997 IEEE International Performance, Computing and Communications Conference (152-158). DOI: 10.1109/PCCC.1997.581502. Online publication date: 1997
