DOI: 10.5555/1898953.1898969

Exploiting locality: a flexible DSM approach

Published: 25 April 2006

    Abstract

    No single coherence strategy suits all applications well. Many promising adaptive protocols and coherence predictors, capable of dynamically modifying the coherence strategy, have been suggested over the years.
    While most dynamic detection schemes rely on plenty of dedicated hardware, the customization technique suggested in this paper requires no extra hardware support for its per-application coherence strategy. Instead, each application is profiled using a low-overhead profiling tool. The appropriate coherence flag setting, suggested by the profiling, is specified when the application is launched.
    We have compared the performance of a hardware DSM (Sun WildFire) to a software DSM built with identical interconnect hardware and coherence strategy. With no support for flexibility, the software DSM runs on average 45 percent slower than the hardware DSM on the 12 studied applications, while the flexibility can get the software DSM to within 11 percent. Our all-software system outperforms the hardware DSM on four applications.


    Published In

    IPDPS'06: Proceedings of the 20th international conference on Parallel and distributed processing
    April 2006
    399 pages
    ISBN: 1424400546

    Sponsors

    • IEEE CS TCPP: IEEE Computer Society Technical Committee on Parallel Processing

    Publisher

    IEEE Computer Society

    United States

    Qualifiers

    • Article
