How to Evaluate Various Commonly Used Program Classification Methods?

  • Conference paper
Advanced Computer Architecture (ACA 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1256)

Abstract

Understanding the characteristics of scientific computing programs is of great importance because it is closely tied to the design and implementation of program optimization methods. Generally, scientific computing programs can be divided into three categories according to their computing, memory access and communication characteristics: compute-intensive, memory-intensive and communication-intensive. Several program classification methods are in common use, particularly for distinguishing compute-intensive from memory-intensive programs. In most cases these methods produce consistent results, but occasionally they disagree. Where do such inconsistencies occur, how should they be understood, and what causes them? We answer these questions by analyzing four representative program classification methods (IPC, MPKI, MEM/Uop and Roofline) on two platforms. First, we identify three kinds of occasional inconsistency: inconsistency across indicators, inconsistency arising from multi-phase characteristics, and inconsistency across platforms. We then discuss some possible reasons. Second, we explore the impact of threshold settings on classification inconsistencies. Our experimental and analytical results, together with data collected from other references, show that different classification methods give the same results in most cases but occasionally disagree, especially for in-between programs that fall between memory-intensive and compute-intensive. Such inconsistencies can adversely affect some optimization algorithms.
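To make the idea of threshold-based classification concrete, the following is a minimal sketch of how the four indicators named in the abstract could each label a program and how a disagreement might be detected. It is an illustration under stated assumptions, not the authors' procedure: the counter names, the threshold values, and the peak machine figures are all hypothetical placeholders, and only the standard definitions are relied on (IPC as instructions per cycle, MPKI as misses per thousand instructions, MEM/Uop as memory transactions per micro-operation, and the Roofline ridge point as peak FLOP/s divided by peak bandwidth).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Counters:
    """Raw counts for one run; every field name here is hypothetical."""
    instructions: int      # retired instructions
    cycles: int            # unhalted core cycles
    llc_misses: int        # last-level cache misses
    mem_transactions: int  # memory transactions issued
    uops: int              # retired micro-operations
    flops: int             # floating-point operations
    dram_bytes: int        # bytes moved to or from DRAM


# Placeholder thresholds and machine peaks, NOT values from the paper.
IPC_MIN_COMPUTE = 1.0        # IPC at or above this -> compute-intensive
MPKI_MAX_COMPUTE = 10.0      # MPKI at or below this -> compute-intensive
MEM_UOP_MAX_COMPUTE = 0.01   # MEM/Uop at or below this -> compute-intensive
PEAK_FLOPS = 500e9           # assumed peak FLOP/s
PEAK_BANDWIDTH = 100e9       # assumed peak DRAM bytes/s


def classify(c: Counters) -> Dict[str, str]:
    """Label one run separately with each of the four indicators."""
    ipc = c.instructions / c.cycles
    mpki = 1000.0 * c.llc_misses / c.instructions
    mem_per_uop = c.mem_transactions / c.uops
    intensity = c.flops / c.dram_bytes      # FLOPs per byte
    ridge = PEAK_FLOPS / PEAK_BANDWIDTH     # Roofline ridge point

    def label(is_compute: bool) -> str:
        return "compute" if is_compute else "memory"

    return {
        "IPC": label(ipc >= IPC_MIN_COMPUTE),
        "MPKI": label(mpki <= MPKI_MAX_COMPUTE),
        "MEM/Uop": label(mem_per_uop <= MEM_UOP_MAX_COMPUTE),
        "Roofline": label(intensity >= ridge),
    }


def is_consistent(labels: Dict[str, str]) -> bool:
    """True when all four indicators agree on the class."""
    return len(set(labels.values())) == 1
```

With these placeholder thresholds, a made-up run with a moderately high IPC but a noticeable miss rate (for example 10^9 instructions in 8 x 10^8 cycles with 1.5 x 10^7 last-level misses) would be labelled compute-intensive by IPC and memory-intensive by MPKI, which is the kind of in-between case the abstract describes. Moving any single threshold can flip such a label, which is why threshold settings matter for consistency.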

This work is supported in part by the Advanced Research Project of China under grant number 31511010203 and the Research Program of NUDT under grant number ZK18-03-10.

Author information

Corresponding author

Correspondence to Juan Chen.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Qi, X., Yuan, Y., Chen, J., Dong, Y. (2020). How to Evaluate Various Commonly Used Program Classification Methods?. In: Dong, D., Gong, X., Li, C., Li, D., Wu, J. (eds) Advanced Computer Architecture. ACA 2020. Communications in Computer and Information Science, vol 1256. Springer, Singapore. https://doi.org/10.1007/978-981-15-8135-9_17

  • DOI: https://doi.org/10.1007/978-981-15-8135-9_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-8134-2

  • Online ISBN: 978-981-15-8135-9

  • eBook Packages: Computer Science, Computer Science (R0)
