Complete and Practical Universal Instruction Selection

Published: 27 September 2017

Abstract

In code generation, instruction selection chooses the processor instructions that implement a program under compilation, and code quality depends crucially on that choice. Using methods from combinatorial optimization, this paper proposes an expressive model that integrates global instruction selection with global code motion. The model introduces (1) handling of memory computations and function calls, (2) a method for inserting additional jump instructions where necessary, (3) a dependency-based technique to ensure correct combinations of instructions, (4) value reuse to improve code quality, and (5) an objective function that reduces compilation time and increases scalability by exploiting bounding techniques. The approach is demonstrated to be complete and practical, competitive with LLVM, and potentially optimal (with respect to the model) for medium-sized functions. The results show that combinatorial optimization for instruction selection is well suited to exploiting the potential of modern processors in embedded systems.

Published In

ACM Transactions on Embedded Computing Systems, Volume 16, Issue 5s
Special Issue ESWEEK 2017, CASES 2017, CODES + ISSS 2017 and EMSOFT 2017
October 2017
1448 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3145508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 September 2017
Accepted: 01 June 2017
Revised: 01 June 2017
Received: 01 April 2017
Published in TECS Volume 16, Issue 5s

Author Tags

  1. code generation
  2. combinatorial optimization
  3. constraint programming
  4. instruction selection

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2021) Development of DSL Compilers for Specialized Processors. Programming and Computing Software 47:7 (541-554). DOI: 10.1134/S0361768821070082. Online publication date: 1-Dec-2021
  • (2020) Certified and efficient instruction scheduling: application to interlocked VLIW processors. Proceedings of the ACM on Programming Languages 4:OOPSLA (1-29). DOI: 10.1145/3428197. Online publication date: 13-Nov-2020
  • (2019) Constraint Programming in Embedded Systems Design: Considered Helpful. Microprocessors and Microsystems. DOI: 10.1016/j.micpro.2019.05.012. Online publication date: May-2019
  • (2018) Compiling for VLIW DSPs. Handbook of Signal Processing Systems (979-1020). DOI: 10.1007/978-3-319-91734-4_27. Online publication date: 14-Oct-2018
