research-article
Open access

A Compile-Time Optimization Method for WCET Reduction in Real-Time Embedded Systems through Block Formation

Published: 04 January 2016

Abstract

Compile-time optimizations play an important role in the efficient design of real-time embedded systems. Usually, compile-time optimizations are designed to reduce average-case execution time (ACET). While ACET is a main concern in high-performance computing systems, in real-time embedded systems, concerns are different and worst-case execution time (WCET) is much more important than ACET. Therefore, WCET reduction is more desirable than ACET reduction in many real-time embedded systems. In this article, we propose a compile-time optimization method aimed at reducing WCET in real-time embedded systems. In the proposed method, based on the predicated execution capability of embedded processors, program code blocks that are in the worst-case paths of the program are merged to increase instruction-level parallelism and opportunity for WCET reduction. The use of predicated execution enables merging code blocks from different worst-case paths that can be very effective in WCET reduction. The experimental results show that the proposed method can reduce WCET by up to 45% as compared to previous compile-time block formation methods. It is noteworthy that compared to previous works, while the proposed method usually achieves more WCET reduction, it has considerably less negative impact on ACET and code size.
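To make the trade-off behind predicated block merging concrete, the following is a minimal toy sketch and not the method proposed in the article. It assumes a simple if/else diamond, operations that are all independent, a fixed issue width, and a fixed branch penalty; the function names (`block_cycles`, `wcet_branching`, `wcet_predicated`, `should_merge`) and the cost model are hypothetical and chosen only for illustration.

```python
from math import ceil


def block_cycles(num_ops: int, issue_width: int) -> int:
    """Cycles to issue num_ops independent operations,
    assuming issue_width operations can start per cycle (toy model)."""
    return ceil(num_ops / issue_width)


def wcet_branching(then_ops: int, else_ops: int,
                   issue_width: int, branch_penalty: int) -> int:
    """Worst case with a real branch: pay the (assumed) branch penalty,
    then execute whichever side is longer."""
    return branch_penalty + max(block_cycles(then_ops, issue_width),
                                block_cycles(else_ops, issue_width))


def wcet_predicated(then_ops: int, else_ops: int, issue_width: int) -> int:
    """Worst case after merging both sides into one predicated block:
    every operation is issued under a predicate, and no branch is taken."""
    return block_cycles(then_ops + else_ops, issue_width)


def should_merge(then_ops: int, else_ops: int,
                 issue_width: int, branch_penalty: int) -> bool:
    """Merge only when the predicated block has a strictly lower worst case."""
    return (wcet_predicated(then_ops, else_ops, issue_width)
            < wcet_branching(then_ops, else_ops, issue_width, branch_penalty))


if __name__ == "__main__":
    # Wide machine, unbalanced sides: merging fills otherwise idle issue slots.
    print(wcet_branching(6, 2, 4, 2), wcet_predicated(6, 2, 4))
    # Single-issue machine, balanced sides: merging executes both sides serially.
    print(wcet_branching(6, 6, 1, 2), wcet_predicated(6, 6, 1))
```

Under these assumptions, merging pays off when the machine has spare issue slots and the branch penalty is significant, and it hurts when both sides are long and the machine is narrow. A real WCET-aware block formation pass would rely on a timing analyzer and on dependence-aware scheduling rather than this operation count.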



      Published In

      ACM Transactions on Architecture and Code Optimization, Volume 12, Issue 4
      January 2016
      848 pages
      ISSN:1544-3566
      EISSN:1544-3973
      DOI:10.1145/2836331
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 January 2016
      Accepted: 01 November 2015
      Revised: 01 October 2015
      Received: 01 April 2014
      Published in TACO Volume 12, Issue 4

      Author Tags

      1. Compile-time optimization
      2. WCET
      3. hyperblock

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • deputation of research and technology of Sharif University of Technology
