
Towards implementation of a novel scheme for data prefetching on distributed shared memory systems

The Journal of Supercomputing

Abstract

High-speed networks and rapidly improving microprocessor performance make networks of workstations an extremely important tool for parallel computing, used to speed up the execution of scientific applications. Shared memory is an attractive programming model for designing parallel and distributed applications, since the programmer can focus on algorithmic development rather than on data partitioning and communication. Because of this characteristic, systems that provide the shared memory abstraction on physically distributed memory machines have been developed, known as Distributed Shared Memory (DSM). DSM is built using specific software to combine a number of computer hardware resources into one computing environment. Such an environment not only provides an easy way to execute parallel applications, but also combines available computational resources in order to speed up the execution of these applications. DSM systems need to maintain data consistency in memory, which usually leads to communication overhead. A number of strategies exist to reduce this overhead and improve overall performance. Strategies such as prefetching have been shown to perform well in DSM systems, since they can reduce the latency of accessing data on remote nodes. On the other hand, these strategies can also transfer unnecessary pages to remote nodes. In this paper, we focus on the access pattern during the execution of a parallel application, and analyze the data types and behavior of parallel applications. We propose an adaptive data classification scheme that refines the prefetching strategy with the goal of improving overall performance. The scheme classifies data according to the sequence in which pages are accessed, so that the home node uses the past access patterns of remote nodes to decide whether it needs to transfer related pages to them.
Experimental results show that the proposed method increases the accuracy of data access in the effective prefetch strategy, reducing the number of page faults and mis-prefetches. With the proposed classification scheme, the benchmark applications show a performance improvement of about 9–25% over the same applications running on the original JIAJIA DSM system.
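
The idea of a home node consulting past access sequences before sending extra pages can be illustrated with a minimal sketch. This is not the classification scheme of the paper, whose details are not given in the abstract, and it does not use any JIAJIA interface. The class `HomeNode`, the method `request`, and the parameter `min_repeats` are hypothetical names chosen for illustration. The sketch assumes a simple rule: a related page is sent along with the requested one only if the same remote node has followed that page with the same successor at least `min_repeats` times.

```python
from collections import Counter, defaultdict


class HomeNode:
    """Toy home node that prefetches based on per-node access history.

    Hypothetical illustration only: the rule below (repeat count of the
    observed successor page) is an assumption, not the paper's scheme.
    """

    def __init__(self, min_repeats=2):
        # How often a successor must have been observed before it is prefetched.
        self.min_repeats = min_repeats
        # node_id -> last page that node requested
        self.last_page = {}
        # (node_id, page) -> Counter of pages requested right after `page`
        self.successors = defaultdict(Counter)

    def request(self, node_id, page):
        """Record the access and return the list of pages to send."""
        prev = self.last_page.get(node_id)
        if prev is not None:
            # Learn that this node accessed `page` right after `prev`.
            self.successors[(node_id, prev)][page] += 1
        self.last_page[node_id] = page

        # Look up what this node usually asks for after `page`.
        followers = self.successors.get((node_id, page))
        if followers:
            next_page, count = followers.most_common(1)[0]
            if count >= self.min_repeats:
                # Pattern seen often enough: send the related page too.
                return [page, next_page]
        # No reliable pattern yet: send only the requested page.
        return [page]


if __name__ == "__main__":
    home = HomeNode(min_repeats=2)
    for p in [10, 11, 10, 11, 10]:
        print(p, "->", home.request(1, p))
```

Requiring a repeated pattern before prefetching is one simple way to trade fewer unnecessary page transfers against a slower warm-up; the threshold here is arbitrary.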



Author information

Corresponding author

Correspondence to Kuan-Ching Li.

About this article

Cite this article

Wang, HH., Li, KC., Lu, SH. et al. Towards implementation of a novel scheme for data prefetching on distributed shared memory systems. J Supercomput 47, 111–126 (2009). https://doi.org/10.1007/s11227-008-0180-6


  • DOI: https://doi.org/10.1007/s11227-008-0180-6
