A comprehensive methodology to determine optimal coherence interfaces for many-accelerator SoCs

K Bhardwaj, M Havasi, Y Yao, DM Brooks… - Proceedings of the …, 2020 - dl.acm.org
Proceedings of the ACM/IEEE International Symposium on Low Power Electronics …, 2020 - dl.acm.org
Modern systems-on-chip (SoCs) include not only general-purpose CPUs but also specialized hardware accelerators. Typically, there are three coherence model choices for integrating an accelerator with the memory hierarchy: no coherence, coherence with the last-level cache (LLC), and private-cache-based full coherence. However, there has been very limited research on finding which coherence models are optimal for the accelerators of a complex many-accelerator SoC. This paper focuses on determining a cost-aware coherence interface for an SoC and its target application, that is, finding the coherence models for the accelerators that optimize their power and performance, considering both workload characteristics and system-level contention. A novel comprehensive methodology is proposed that uses Bayesian optimization to efficiently find the cost-aware coherence interfaces for SoCs that are modeled using the gem5-Aladdin architectural simulator. For a complete analysis, gem5-Aladdin is extended to support LLC coherence in addition to the already-supported no coherence and full coherence. For a heterogeneous SoC targeting applications with varying amounts of accelerator-level parallelism, the proposed framework rapidly finds cost-aware coherence interfaces that show significant performance and power benefits over the other commonly used coherence interfaces.
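To make the search problem concrete, the following minimal sketch enumerates every assignment of the three coherence models to a set of accelerators and picks the one with the lowest weighted cost. It is not the paper's method: it substitutes brute-force enumeration for Bayesian optimization, and the cost function is a toy stand-in for gem5-Aladdin simulation. All names (`TOY_PROFILE`, `toy_cost`, `best_interface`) and every number in it are hypothetical placeholders chosen for illustration only.

```python
from itertools import product

# The three coherence model choices named in the abstract.
MODELS = ("none", "llc", "full")

# HYPOTHETICAL (latency, power) figures per accelerator for each model.
# Placeholder values only; real figures would come from simulation.
TOY_PROFILE = {
    "none": (10.0, 1.0),
    "llc": (7.0, 1.5),
    "full": (5.0, 2.5),
}


def toy_cost(assignment, weight=0.5):
    """Toy cost: weighted sum of latency (plus contention) and power.

    The quadratic contention term is an assumption standing in for
    system-level contention among accelerators that use coherent access.
    """
    latency = sum(TOY_PROFILE[m][0] for m in assignment)
    power = sum(TOY_PROFILE[m][1] for m in assignment)
    coherent = sum(1 for m in assignment if m != "none")
    contention = 0.5 * coherent * coherent
    return weight * (latency + contention) + (1.0 - weight) * power


def enumerate_interfaces(num_accelerators):
    """Yield every coherence interface: one model per accelerator."""
    return product(MODELS, repeat=num_accelerators)


def best_interface(num_accelerators, weight=0.5):
    """Exhaustively pick the assignment with the lowest toy cost."""
    return min(
        enumerate_interfaces(num_accelerators),
        key=lambda assignment: toy_cost(assignment, weight),
    )


if __name__ == "__main__":
    choice = best_interface(4)
    print(choice, toy_cost(choice))
```

The design space holds 3^N candidate interfaces for N accelerators, so exhaustive enumeration with a simulator in the loop quickly becomes impractical as N grows. That growth is the motivation for a sample-efficient search such as the Bayesian optimization the abstract describes.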