Cheetah: Detecting false sharing efficiently and effectively

T Liu, X Liu - Proceedings of the 2016 International Symposium on …, 2016 - dl.acm.org
False sharing is a notorious performance problem that can occur in multithreaded programs running on ubiquitous multicore hardware. It can degrade performance by up to an order of magnitude, significantly hurting scalability. Identifying false sharing in complex programs is challenging: existing tools either incur significant performance overhead or do not provide adequate information to guide code optimization. To address these problems, we develop Cheetah, a profiler that detects false sharing both efficiently and effectively. Cheetah uses the lightweight hardware performance monitoring units (PMUs) available in most modern CPU architectures to sample memory accesses. Cheetah introduces the first approach to quantifying the optimization potential of false sharing instances without actually fixing them, based on the latency information collected by PMUs. Cheetah precisely reports false sharing and provides insightful optimization guidance for programmers, while adding less than 7% runtime overhead on average. Cheetah is ready for real deployment.
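
To make the idea of detecting false sharing from sampled memory accesses concrete, here is a minimal sketch. It is not Cheetah's algorithm: the sample format (thread id, address, write flag), the 64-byte cache line, the 8-byte word granularity, and the function name classify_cache_lines are all assumptions made for illustration. It groups samples by cache line and labels a line as false sharing when several threads touch it, at least one access is a write, and no single word is touched by more than one thread.

```python
from collections import defaultdict

CACHE_LINE_SIZE = 64  # bytes; an assumption, typical of current x86-64 CPUs
WORD_SIZE = 8         # bytes; assumed granularity for telling offsets apart


def classify_cache_lines(samples):
    """Classify contended cache lines from sampled accesses.

    samples: iterable of (thread_id, address, is_write) tuples
             (a hypothetical format, not a real PMU record layout).
    Returns a dict mapping cache line index to "false sharing" or
    "true sharing"; private and read-only lines are omitted.
    """
    # cache line index -> word index within the line -> thread ids seen
    threads_by_word = defaultdict(lambda: defaultdict(set))
    written_lines = set()

    for thread_id, address, is_write in samples:
        line = address // CACHE_LINE_SIZE
        word = (address % CACHE_LINE_SIZE) // WORD_SIZE
        threads_by_word[line][word].add(thread_id)
        if is_write:
            written_lines.add(line)

    report = {}
    for line, words in threads_by_word.items():
        all_threads = set().union(*words.values())
        # Lines used by one thread, or never written, cause no invalidations.
        if len(all_threads) < 2 or line not in written_lines:
            continue
        # If any single word is touched by several threads, treat the line
        # as true sharing; otherwise the threads only collide on the line.
        shared_word = any(len(tids) > 1 for tids in words.values())
        report[line] = "true sharing" if shared_word else "false sharing"
    return report


samples = [
    (0, 0x1000, True),   # thread 0 writes word 0 of one line
    (1, 0x1008, True),   # thread 1 writes word 1 of the same line
    (0, 0x2000, True),   # thread 0 writes a word ...
    (1, 0x2000, False),  # ... that thread 1 also reads
    (0, 0x3000, True),   # line used by a single thread only
]
report = classify_cache_lines(samples)
```

A real profiler would additionally have to cope with sparse sampling and, as the abstract describes for Cheetah, use access latency to estimate how much a fix could help; this sketch does neither.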