Multi-objective Exploration for Practical Optimization Decisions in Binary Translation

Authors:

Scott Mahlke

ACM Transactions on Embedded Computing Systems (TECS), Volume 18, Issue 5s

Article No.: 57, Pages 1 - 19

https://doi.org/10.1145/3358185

Published: 07 October 2019

Abstract

In the design of mobile systems, hardware/software (HW/SW) co-design offers important advantages by enabling specialized hardware for performance or power optimization. Dynamic binary translation (DBT) is a key component of co-design. During translation, a dynamic optimizer in the DBT system applies various software optimizations to improve the quality of the translated code. With dynamic optimization, optimization time is an exposed run-time overhead, and useful analyses are often restricted because of their high cost. Thus, a dynamic optimizer must make smart decisions with limited analysis information, which complicates the design of optimization decision models and often causes hand-crafted heuristics to fail. In mobile systems, this problem is even more challenging because of strict constraints on computing capability and memory size.

To overcome this challenge, we investigate the opportunity to build practical optimization decision models for DBT using machine learning techniques. As a first step, loop unrolling is chosen as the representative optimization. We base our approach on an industrial-strength DBT infrastructure and conduct the evaluation with 17,116 unrollable loops collected from 200 benchmarks and real-life programs across various domains. Using all available features that are potentially important for the loop unrolling decision, we identify the best classification algorithm for our infrastructure, considering both prediction accuracy and cost. A greedy feature selection algorithm is then applied to that classifier to identify its significant features and reduce the feature space. Keeping only the significant features, the best affordable classifier, which satisfies the budgets allocated to the decision process, achieves 74.5% prediction accuracy for the optimal unroll factor and realizes an average 20.9% reduction in dynamic instruction count during steady-state execution of the translated code. For comparison, the best baseline heuristic achieves 46.0% prediction accuracy with an average 13.6% instruction count reduction. Given that the infrastructure is already highly optimized and the ideal upper bound for instruction count reduction is observed to be 23.8%, we believe this result is noteworthy.
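
To illustrate the idea of greedy feature selection under a decision budget, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a forward-selection variant (the abstract does not state which greedy variant is used), a toy lookup-table classifier scored on its own training data, and invented names and values throughout: `LOOPS`, `BEST_UNROLL`, `FEATURE_COST`, `accuracy`, and `greedy_select` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each loop is described by three bucketed features.
FEATURE_INDEX = {"trip_count_bucket": 0, "body_size_bucket": 1, "noise": 2}
LOOPS = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1),
]
# Hypothetical label: the best unroll factor observed for each loop.
BEST_UNROLL = [1, 1, 2, 2, 4, 4, 4, 4]
# Hypothetical cost of extracting each feature at translation time (arbitrary units).
FEATURE_COST = {"trip_count_bucket": 1, "body_size_bucket": 2, "noise": 1}


def accuracy(features):
    """Training accuracy of a lookup-table classifier that predicts the
    majority label among loops sharing the same values of `features`."""
    groups = defaultdict(Counter)
    for loop, label in zip(LOOPS, BEST_UNROLL):
        key = tuple(loop[FEATURE_INDEX[f]] for f in features)
        groups[key][label] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(LOOPS)


def greedy_select(candidates, score, cost, budget):
    """Forward selection: repeatedly add the affordable feature that raises
    the score the most; stop when no affordable feature improves it."""
    selected, spent = [], 0
    best_score = score(selected)
    while True:
        best_feature, best_candidate_score = None, best_score
        for feature in candidates:
            if feature in selected or spent + cost[feature] > budget:
                continue  # already chosen, or would exceed the budget
            candidate_score = score(selected + [feature])
            if candidate_score > best_candidate_score:
                best_feature, best_candidate_score = feature, candidate_score
        if best_feature is None:
            return selected, best_score
        selected.append(best_feature)
        spent += cost[best_feature]
        best_score = best_candidate_score


if __name__ == "__main__":
    for budget in (2, 3):
        chosen, acc = greedy_select(list(FEATURE_INDEX), accuracy, FEATURE_COST, budget)
        print(budget, chosen, acc)
```

In this toy setting a tighter budget leaves the more expensive feature out and lowers accuracy, which mirrors the trade-off between prediction accuracy and decision cost described above. A realistic evaluation would score candidates on held-out loops, not on the training data.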
