Dynamic data prefetching in home-based software DSMs

  • Correspondence
Journal of Computer Science and Technology

Abstract

A major overhead in software DSM (Distributed Shared Memory) is the cost of remote memory accesses necessitated by the protocol as well as induced by false sharing. This paper introduces a dynamic prefetching method implemented in the JIAJIA software DSM to reduce system overhead caused by remote accesses. The prefetching method records the interleaving string of INV (invalidation) and GETP (getting a remote page) operations for each cached page and analyzes the periodicity of the string when a page is invalidated on a lock or barrier. A prefetching request is issued after the lock or barrier if the periodicity analysis indicates that GETP will be the next operation in the string. Multiple prefetching requests are merged into the same message if they are to the same host. Performance evaluation with eight well-accepted benchmarks in a cluster of sixteen Power PC workstations shows that the prefetching scheme can significantly reduce the page fault overhead and as a result achieves a performance increase of 15%–20% in three benchmarks and around 8%–10% in another three. The average extra traffic caused by useless prefetches is only 7%–13% in the evaluation.
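The abstract does not spell out how the periodicity analysis works, so the following is only a minimal sketch of the general idea, not the JIAJIA implementation. It assumes each page's history is kept as a string of one-letter codes ("I" for INV, "G" for GETP), and it assumes a simple heuristic: find the shortest period that repeats at least twice at the tail of the string and predict the operation one period back. The names `predict_next`, `PrefetchPlanner`, `home_of`, and `min_repeats` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# One-letter codes for the two recorded operations (an assumption of this sketch).
INV, GETP = "I", "G"


def predict_next(history, min_repeats=2):
    """Guess the next operation from the shortest period that fits the tail.

    Returns INV, GETP, or None when no period repeats at least
    `min_repeats` times at the end of `history`.
    """
    n = len(history)
    for period in range(1, n // min_repeats + 1):
        # Look only at the last `min_repeats` periods of the string.
        window = history[n - period * min_repeats:]
        if all(window[i] == window[i + period]
               for i in range(len(window) - period)):
            # In a periodic string the next symbol equals the one a period back.
            return history[n - period]
    return None


class PrefetchPlanner:
    """Records per-page INV/GETP strings and batches prefetches by home host."""

    def __init__(self, home_of):
        self.home_of = home_of            # page id -> home host (hypothetical map)
        self.history = defaultdict(str)   # page id -> string of "I"/"G" codes

    def record_getp(self, page):
        """Note that the page was fetched from its home after a fault."""
        self.history[page] += GETP

    def invalidate(self, pages):
        """Call at a lock or barrier with the pages being invalidated.

        Returns {host: [pages]}: one merged prefetch message per host,
        containing only pages whose predicted next operation is GETP.
        """
        batches = defaultdict(list)
        for page in pages:
            self.history[page] += INV
            if predict_next(self.history[page]) == GETP:
                batches[self.home_of[page]].append(page)
        return dict(batches)


if __name__ == "__main__":
    planner = PrefetchPlanner({1: "hostA", 2: "hostA", 3: "hostB", 4: "hostB"})
    for _ in range(3):
        requests = planner.invalidate([1, 2, 3, 4])
        for page in (1, 2, 3):            # page 4 is never touched again
            planner.record_getp(page)
    print(requests)
```

In this sketch, pages 1 to 3 build up the history "IGIGI" by the third barrier, so a GETP is predicted and they are prefetched, with pages 1 and 2 sharing one message to their common host. Page 4 only ever sees invalidations ("III"), so no prefetch is issued for it, which is how a scheme of this kind would limit useless traffic.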

Author information

Corresponding author

Correspondence to Hu Weiwu.

Additional information

This work is supported by the National Natural Science Foundation of China (No. 60073018).

HU Weiwu received his B.S. degree from the University of Science and Technology of China in 1991 and his Ph.D. degree from the Institute of Computing Technology, The Chinese Academy of Sciences in 1996, both in computer science. He is currently a professor in the Institute of Computing Technology. His research interests include high performance computer architecture, parallel processing, and SoC design.

ZHANG Fuxin received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture, cluster computing, and Linux.

LIU Haiming received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture and cluster computing.

About this article

Cite this article

Hu, W., Zhang, F. & Liu, H. Dynamic data prefetching in home-based software DSMs. J. Comput. Sci. & Technol. 16, 231–241 (2001). https://doi.org/10.1007/BF02943201

  • DOI: https://doi.org/10.1007/BF02943201
