Low-power snoop architecture for synchronized producer-consumer embedded multiprocessing

Published: 01 September 2009

Abstract

We introduce a cross-layer customization methodology in which application knowledge about data sharing in producer-consumer relationships is used to aggressively eliminate unnecessary and predictable snoop-induced cache lookups, even for references to shared data, thus achieving significant power reductions with minimal hardware cost. The technique exploits application-specific information about the exact producer-consumer relationships between tasks, as well as the precise timing of synchronized accesses to shared memory buffers by their corresponding producers and/or consumers. Snoop-induced cache lookups for accesses to shared data are eliminated when it is guaranteed that such lookups would yield no additional knowledge about the cache state with respect to the other caches and the memory. Our experiments show average power reductions of more than 80% compared to a general-purpose snoop protocol.
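The filtering idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's architecture: the names `SharedBuffer`, `Phase`, and `snoop_lookups` are hypothetical, and it rests on two simplifying assumptions that the abstract does not state. First, at each synchronization point the buffer is handed over cleanly, so the other party holds no valid copy of it. Second, no third core ever caches the buffer. Under those assumptions, an access by the current owner of the buffer cannot learn anything from snooping, so the lookups in the other caches are skipped; every other access falls back to a full snoop.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    PRODUCING = 1  # producer owns the buffer; consumer holds no copy (assumed)
    CONSUMING = 2  # consumer owns the buffer; producer has written back (assumed)


@dataclass
class SharedBuffer:
    """Hypothetical descriptor of one producer-consumer buffer."""
    base: int
    size: int
    producer: int
    consumer: int
    phase: Phase = Phase.PRODUCING

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size

    def owner(self) -> int:
        return self.producer if self.phase is Phase.PRODUCING else self.consumer


def snoop_lookups(requester, addr, num_cores, buffers, filtered):
    """Count the tag lookups one bus transaction triggers in the other caches."""
    others = num_cores - 1
    if not filtered:
        return others  # general-purpose snooping: every other cache looks up
    for buf in buffers:
        if buf.contains(addr) and requester == buf.owner():
            return 0  # outcome is known in advance, so no lookup is needed
    return others  # unknown or out-of-phase access: fall back to full snoop


def count_lookups(filtered: bool) -> int:
    """Replay a tiny hand-written trace on a 4-core system."""
    buf = SharedBuffer(base=0x1000, size=64, producer=0, consumer=1)
    total = 0
    for core, addr in [(0, 0x1000), (0, 0x1004)]:  # producer fills the buffer
        total += snoop_lookups(core, addr, 4, [buf], filtered)
    buf.phase = Phase.CONSUMING  # synchronization point: buffer handed over
    for core, addr in [(1, 0x1000), (1, 0x1004), (2, 0x8000)]:
        total += snoop_lookups(core, addr, 4, [buf], filtered)
    return total


if __name__ == "__main__":
    print("baseline lookups:", count_lookups(filtered=False))
    print("filtered lookups:", count_lookups(filtered=True))
```

On this five-access trace the baseline performs 15 lookups and the filtered variant performs 3, all of them for the one access outside the shared buffer. These counts only show the mechanism and say nothing about the power figures reported in the abstract.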

    Published In

    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 17, Issue 9
    September 2009
    198 pages

    Publisher

    IEEE Educational Activities Department

    United States

    Publication History

    Published: 01 September 2009
    Revised: 19 March 2008
    Received: 22 October 2007

    Author Tags

    1. Low-power cache coherence
    2. low-power multiprocessor systems-on-a-chip (MPSoC)
    3. producer-consumer communication in MPSoC

    Qualifiers

    • Research-article
