Hiding communication latency and coherence overhead in software DSMs

Published: 01 September 1996

Abstract

In this paper we propose the use of a PCI-based programmable protocol controller for hiding communication and coherence overheads in software DSMs. Our protocol controller provides three different types of overhead tolerance: a) moving basic communication and coherence tasks away from computation processors; b) prefetching of diffs; and c) generating and applying diffs with hardware assistance. We evaluate the isolated and combined impact of these features on the performance of TreadMarks. We also compare performance against two versions of the Shrimp-based AURC protocol. Using detailed execution-driven simulations of a 16-node network of workstations, we show that the greatest performance benefits provided by our protocol controller come from our hardware-supported diffs. Reducing the burden of communication and coherence transactions on the computation processor is also beneficial, but to a smaller extent. Prefetching is not always profitable. Our results show that our protocol controller can reduce running time by up to 50% for TreadMarks, which means that it can double the TreadMarks speedups. The overlapping implementation of TreadMarks performs as well as or better than AURC for 5 of our 6 applications. We conclude that the simple hardware support we propose allows for the implementation of high-performance software DSMs at low cost. Based on this conclusion, we are building the NCP2 parallel system at COPPE/UFRJ.
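To make the notion of a "diff" concrete, the following is a minimal software sketch of how a page diff can be generated and applied in a twin-based scheme such as the one TreadMarks uses: a pristine copy (the twin) of a page is compared against the modified page, and only the changed runs are recorded. The function names `make_diff` and `apply_diff`, the `(offset, bytes)` encoding, and the byte-granularity comparison are assumptions made for illustration; they are not taken from the paper or from the TreadMarks source, and they say nothing about how the proposed controller performs these steps in hardware.

```python
from typing import List, Tuple

# Hypothetical diff encoding: a list of (offset, changed bytes) runs.
Diff = List[Tuple[int, bytes]]


def make_diff(twin: bytes, page: bytes) -> Diff:
    """Compare a modified page with its unmodified twin and record changed runs."""
    if len(twin) != len(page):
        raise ValueError("twin and page must have the same size")
    diff: Diff = []
    i, n = 0, len(page)
    while i < n:
        if twin[i] == page[i]:
            i += 1
            continue
        # Start of a run of modified bytes; extend it until the contents match again.
        start = i
        while i < n and twin[i] != page[i]:
            i += 1
        diff.append((start, page[start:i]))
    return diff


def apply_diff(page: bytearray, diff: Diff) -> None:
    """Patch a (possibly stale) copy of the page in place with the recorded runs."""
    for offset, data in diff:
        page[offset:offset + len(data)] = data


# Usage: a writer modifies its copy; another node brings a stale copy up to date.
twin = bytes(16)                 # page contents before the first write
local = bytearray(twin)
local[2:4] = b"\x01\x02"
local[10] = 0xFF
diff = make_diff(twin, bytes(local))

remote = bytearray(twin)         # stale copy held by another node
apply_diff(remote, diff)
```

Both routines scan or copy page contents, which is work that competes with the application for the computation processor when done in software; this is the cost the abstract refers to when it proposes generating and applying diffs with hardware assistance.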

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published in SIGPLAN Volume 31, Issue 9
