Abstract
The move to multicore offers a steep increase in compute power, but little has been done to improve the performance of the memory system. Current applications typically make poor use of the memory system, and few developers have the insight to fix such problems. The introduction of shared memory-system resources complicates the picture further.
Acumem Virtual Performance Expert (VPE) automatically identifies wasteful memory access behavior in applications and suggests improvements. It identifies about 20 different types of performance issues related to multi-threaded execution and cache usage, and suggests fixes at a level of detail that allows even novice programmers to carry out optimizations that today require a performance expert.
Among other things, Acumem’s technology suggests changes that make cache usage more efficient and lower memory bandwidth requirements. Most of today’s applications use less than half of the data brought into the cache. Optimizing these applications to use memory efficiently would lower the cache miss frequency substantially, and other parts of the application would also benefit from the reduced cache pressure. Based on a small application fingerprint file collected from native execution on one system, the application’s performance on any memory system can be analyzed and improvements to the application suggested.
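To make the notion of cache utilization concrete, the following is a minimal sketch of how the fraction of fetched data that is actually used could be estimated for a simple access pattern. It is not Acumem VPE's analysis; the function `line_utilization`, the 64-byte line size, and the 32-byte record layout are all assumptions chosen for illustration, and the model ignores evictions and prefetching.

```python
def line_utilization(accesses, line_size=64):
    """Fraction of bytes in the touched cache lines that the program reads.

    accesses: iterable of (address, size) byte ranges (hypothetical trace).
    line_size: assumed cache line size in bytes.
    """
    used = set()
    for addr, size in accesses:
        used.update(range(addr, addr + size))
    # Every touched line is fetched in full, whether or not all bytes are used.
    lines = {byte // line_size for byte in used}
    if not lines:
        return 0.0
    return len(used) / (len(lines) * line_size)


RECORD_SIZE = 32  # assumed size of one record in an array of records
FIELD_SIZE = 8    # the loop reads a single 8-byte field per record
COUNT = 1000

# Array of records: the field of interest is separated by unused fields.
array_of_records = [(i * RECORD_SIZE, FIELD_SIZE) for i in range(COUNT)]

# Same field stored contiguously in its own array.
packed_field = [(i * FIELD_SIZE, FIELD_SIZE) for i in range(COUNT)]

aos_util = line_utilization(array_of_records)
soa_util = line_utilization(packed_field)
print(f"array of records: {aos_util:.2f}, packed field: {soa_util:.2f}")
```

Under these assumptions the array-of-records loop uses only a quarter of each fetched line, while the packed layout uses all of it, which is the kind of data-layout change the abstract describes as lowering cache miss frequency and bandwidth demand.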
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hagersten, E., Nilsson, M., Vesterlund, M. (2008). Improving Cache Utilization Using Acumem VPE. In: Resch, M., Keller, R., Himmler, V., Krammer, B., Schulz, A. (eds) Tools for High Performance Computing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68564-7_8
DOI: https://doi.org/10.1007/978-3-540-68564-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68561-6
Online ISBN: 978-3-540-68564-7
eBook Packages: Computer Science, Computer Science (R0)