DOI: 10.1145/605397.605401
Article

Temporally silent stores

Published: 01 October 2002

Abstract

Recent work has shown that silent stores--stores which write a value matching the one already stored at the memory location--occur quite frequently and can be exploited to reduce memory traffic and improve performance. This paper extends the definition of silent stores to encompass sets of stores that change the value stored at a memory location, but only temporarily, and subsequently return a previous value of interest to the memory location. The stores that cause the value to revert are called temporally silent stores. We redefine multiprocessor sharing to account for temporal silence and show that in the limit, up to 45% of communication misses in scientific and commercial applications can be eliminated by exploiting values that change only temporarily. We describe a practical mechanism that detects temporally silent stores and removes the coherence traffic they cause in conventional multiprocessors. We find that up to 42% of communication misses can be eliminated with a simple extension to the MESI protocol. Further, we examine application and operating system code to provide insight into the temporal silence phenomenon and characterize temporal silence by examining value frequencies and dynamic instruction distances between temporally silent pairs. These studies indicate that the operating system is involved heavily in temporal silence, in both commercial and scientific workloads, and that while detectable synchronization primitives provide substantial contributions, significant opportunity exists outside these references.
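The distinction the abstract draws between silent and temporally silent stores can be illustrated with a minimal sketch. This is not the paper's detection mechanism or its MESI extension. The `Memory` class, the `checkpoint` method, and the idea of a single saved "reference" value per address are all assumptions made for illustration, and the lock-like access pattern is a hypothetical example.

```python
from enum import Enum


class StoreKind(Enum):
    SILENT = "silent"                        # writes the value already present
    TEMPORALLY_SILENT = "temporally silent"  # reverts to an earlier value of interest
    ORDINARY = "ordinary"                    # any other value-changing store


class Memory:
    """Toy memory model (hypothetical): one saved reference value per address."""

    def __init__(self):
        self.current = {}    # address -> value currently stored
        self.reference = {}  # address -> earlier value of interest (assumption)

    def checkpoint(self, addr):
        # Record the current value as the "previous value of interest".
        self.reference[addr] = self.current.get(addr, 0)

    def store(self, addr, value):
        old = self.current.get(addr, 0)
        self.current[addr] = value
        if value == old:
            return StoreKind.SILENT
        if addr in self.reference and value == self.reference[addr]:
            return StoreKind.TEMPORALLY_SILENT
        return StoreKind.ORDINARY


def classify_lock_example():
    # Hypothetical lock word: 0 = free, 1 = held.
    mem = Memory()
    lock = 0x100
    mem.checkpoint(lock)        # reference value is 0 (free)
    return [
        mem.store(lock, 1),     # acquire: changes the value
        mem.store(lock, 1),     # redundant write of the same value
        mem.store(lock, 0),     # release: value reverts to the reference
    ]
```

In this sketch the three stores classify as ordinary, silent, and temporally silent, in that order. The last case is the one the paper targets: the location ends up holding the same value it held at the reference point, even though it changed in between.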

Published In

ASPLOS X: Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
October 2002
318 pages
ISBN: 1581135742
DOI: 10.1145/605397
  • ACM SIGOPS Operating Systems Review, Volume 36, Issue 5
    December 2002
    296 pages
    ISSN: 0163-5980
    DOI: 10.1145/635508
  • ACM SIGARCH Computer Architecture News, Volume 30, Issue 5
    Special Issue: Proceedings of the 10th annual conference on Architectural Support for Programming Languages and Operating Systems
    December 2002
    296 pages
    ISSN: 0163-5964
    DOI: 10.1145/635506
  • ACM SIGPLAN Notices, Volume 37, Issue 10
    October 2002
    296 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/605432
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

ASPLOS02

Acceptance Rates

ASPLOS X Paper Acceptance Rate 24 of 175 submissions, 14%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (Last 12 months)4
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Sep 2024

Cited By

  • (2022) Tech Worker Perspectives on Considering the Interpersonal Implications of Communication Technologies. Proceedings of the ACM on Human-Computer Interaction 7:GROUP (1-22). DOI: 10.1145/3567566. Online publication date: 29-Dec-2022
  • (2022) "It's Just Like doing Meditation". Proceedings of the ACM on Human-Computer Interaction 7:GROUP (1-28). DOI: 10.1145/3567564. Online publication date: 29-Dec-2022
  • (2022) Integrating Real-Time and Non-Real-Time Collaborative Programming. Proceedings of the ACM on Human-Computer Interaction 7:GROUP (1-19). DOI: 10.1145/3567563. Online publication date: 29-Dec-2022
  • (2022) Agency and Amplification. Proceedings of the ACM on Human-Computer Interaction 7:GROUP (1-22). DOI: 10.1145/3567552. Online publication date: 29-Dec-2022
  • (2018) Static Prediction of Silent Stores. ACM Transactions on Architecture and Code Optimization 15:4 (1-26). DOI: 10.1145/3280848. Online publication date: 16-Nov-2018
  • (2017) Detecting and mitigating data-dependent DRAM failures by exploiting current memory content. Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (27-40). DOI: 10.1145/3123939.3123945. Online publication date: 14-Oct-2017
  • (2014) Trash in cache. Proceedings of the workshop on Memory Systems Performance and Correctness (1-9). DOI: 10.1145/2618128.2618133. Online publication date: 13-Jun-2014
  • (2012) Edge chasing delayed consistency. Proceedings of the 2012 ACM workshop on Relaxing synchronization for multicore and manycore scalability (15-24). DOI: 10.1145/2414729.2414733. Online publication date: 21-Oct-2012
  • (2012) XPoint cache. Proceedings of the 21st international conference on Parallel architectures and compilation techniques (75-86). DOI: 10.1145/2370816.2370829. Online publication date: 19-Sep-2012
  • (2012) Supporting Overcommitted Virtual Machines through Hardware Spin Detection. IEEE Transactions on Parallel and Distributed Systems 23:2 (353-366). DOI: 10.1109/TPDS.2011.143. Online publication date: 1-Feb-2012
