Abstract
A major source of overhead in software distributed shared memory (DSM) is the cost of remote memory accesses, both those required by the coherence protocol and those induced by false sharing. This paper introduces a dynamic prefetching method, implemented in the JIAJIA software DSM, that reduces the overhead caused by remote accesses. For each cached page, the method records the interleaved string of INV (invalidation) and GETP (getting a remote page) operations and analyzes the periodicity of that string whenever the page is invalidated on a lock or barrier. A prefetching request is issued after the lock or barrier if the periodicity analysis indicates that GETP will be the next operation in the string. Prefetching requests destined for the same host are merged into a single message. Performance evaluation with eight widely used benchmarks on a cluster of sixteen PowerPC workstations shows that the prefetching scheme significantly reduces page fault overhead, yielding a performance increase of 15%–20% in three benchmarks and around 8%–10% in another three. The average extra traffic caused by useless prefetches is only 7%–13% in the evaluation.
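The prediction step described in the abstract can be illustrated with a minimal Python sketch. It is not the JIAJIA implementation: the abstract does not specify how periodicity is analyzed, so the smallest-repeating-period rule, the `window` size, and the names `predict_next`, `plan_prefetches`, and `home_of` are all assumptions made for illustration. Each page's history is a string of `I` (INV) and `G` (GETP) characters.

```python
from collections import defaultdict

INV, GETP = "I", "G"  # assumed one-character encoding of the two operations


def predict_next(history, window=16):
    """Guess the next operation from the smallest period of the recent history.

    Assumed rule (not taken from the paper): a period counts only if the
    recent history repeats it at least twice. Returns INV, GETP, or None.
    """
    recent = history[-window:]
    n = len(recent)
    for period in range(1, n // 2 + 1):
        # The window is periodic if every entry equals the one `period` back.
        if all(recent[i] == recent[i - period] for i in range(period, n)):
            return recent[n - period]
    return None


def plan_prefetches(histories, home_of):
    """Group pages whose predicted next operation is GETP by home host.

    `histories` maps page id -> history string; `home_of` maps page id -> host.
    Each list in the result stands for one merged prefetch message.
    """
    batches = defaultdict(list)
    for page, history in sorted(histories.items()):
        # Only pages that were just invalidated are prefetch candidates.
        if history and history[-1] == INV and predict_next(history) == GETP:
            batches[home_of[page]].append(page)
    return dict(batches)


if __name__ == "__main__":
    histories = {0: "IGIGI", 1: "IIII", 2: "IGIIGII", 3: "IGIGI"}
    home_of = {0: "hostA", 1: "hostA", 2: "hostB", 3: "hostA"}
    print(plan_prefetches(histories, home_of))
```

In this sketch, page 1 is repeatedly invalidated without being fetched again, so it is not prefetched, while pages 0 and 3 share a home host and end up in one batch. A real system would also need a policy for histories with no detectable period; here such pages are simply skipped.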
Additional information
This work is supported by the National Natural Science Foundation of China (No. 60073018).
HU Weiwu received his B.S. degree from the University of Science and Technology of China in 1991 and his Ph.D. degree from the Institute of Computing Technology, The Chinese Academy of Sciences in 1996, both in computer science. He is currently a professor in the Institute of Computing Technology. His research interests include high performance computer architecture, parallel processing, and SoC design.
ZHANG Fuxin received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture, cluster computing, and Linux.
LIU Haiming received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture and cluster computing.
About this article
Cite this article
Hu, W., Zhang, F. & Liu, H. Dynamic data prefetching in home-based software DSMs. J. Comput. Sci. & Technol. 16, 231–241 (2001). https://doi.org/10.1007/BF02943201
DOI: https://doi.org/10.1007/BF02943201