Compilation of Dataflow Applications for Multi-Cores using Adaptive Multi-Objective Optimization

Published: 11 March 2019

Abstract

State-of-the-art system synthesis techniques employ meta-heuristic optimization for Design Space Exploration (DSE) to tailor application execution, e.g., defined by a dataflow graph, to a given target platform. Unfortunately, evaluating the performance of each implementation candidate is computationally very expensive, in particular on recent multi-core platforms, because it involves compilation to and extensive evaluation on the target hardware. Applying heuristics for performance evaluation reduces the exploration time, but may impair the convergence of the optimization toward solutions that are performance-optimal on the target platform. To address this problem, we propose DSE strategies that dynamically trade off between (i) approximating heuristics that guide the exploration and (ii) accurate performance evaluation, i.e., compilation of the application and subsequent performance measurement on the target platform. Technically, this is achieved by introducing a set of additional but easily computable guiding objective functions, and by adaptively varying the set of objective functions that are evaluated during the DSE. One major advantage of these guiding objectives is that they are generically applicable to dataflow models, without requiring any configuration techniques to tailor their parameters to the specific use case. We show this for synthetic benchmarks as well as a real-world control application. Moreover, the experimental results demonstrate that our proposed adaptive DSE strategies clearly outperform a state-of-the-art DSE approach from the literature, both in the quality of the implementations obtained and in exploration time.
Among other results, we show a case for a two-core implementation where, after about 3 hours of exploration time, one of our proposed adaptive DSE strategies already obtains a 60% higher performance value than the state-of-the-art approach. Even when the state-of-the-art approach is given a total exploration time of more than 2 weeks to optimize this value, the proposed adaptive DSE strategy achieves a 20% higher performance value after a total exploration time of about 4 days.
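
The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not specify the guiding objectives or the adaptation policy, so everything below is a hypothetical stand-in. The names `cheap_objectives`, `expensive_objective`, `explore`, and `accurate_every` are assumptions, the "measurement" is a toy formula rather than compilation and execution on hardware, and a fixed schedule (accurate evaluation every few generations) replaces the adaptive choice of objective set.

```python
import random


def cheap_objectives(candidate):
    """Hypothetical guiding objectives: fast to compute, only approximate."""
    load = [0, 0]
    for actor, core in enumerate(candidate):
        load[core] += actor + 1  # assumed per-actor cost, for illustration
    cuts = sum(1 for a, b in zip(candidate, candidate[1:]) if a != b)
    return (max(load), cuts)  # both objectives are minimized


def expensive_objective(candidate):
    """Toy stand-in for compiling and measuring on the target platform."""
    bottleneck, cuts = cheap_objectives(candidate)
    return bottleneck + 0.5 * cuts  # assumed "measured" period, minimized


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def explore(num_actors=6, pop_size=8, generations=10, accurate_every=3, seed=0):
    """Evolve actor-to-core mappings, switching between objective sets."""
    rng = random.Random(seed)
    population = [tuple(rng.randrange(2) for _ in range(num_actors))
                  for _ in range(pop_size)]
    measured = {}  # cache of accurate (expensive) evaluations

    for gen in range(generations):
        # Mutation: move one randomly chosen actor to the other core.
        children = []
        for parent in population:
            child = list(parent)
            i = rng.randrange(num_actors)
            child[i] = 1 - child[i]
            children.append(tuple(child))
        pool = list(dict.fromkeys(population + children))  # drop duplicates

        # Fixed schedule (an assumption): accurate evaluation only now and then.
        if gen % accurate_every == accurate_every - 1:
            for c in pool:
                if c not in measured:
                    measured[c] = expensive_objective(c)
            scores = {c: (measured[c],) for c in pool}
        else:
            scores = {c: cheap_objectives(c) for c in pool}

        # Keep non-dominated candidates first, then fill up with the rest.
        front = [c for c in pool
                 if not any(dominates(scores[o], scores[c]) for o in pool)]
        rest = [c for c in pool if c not in front]
        population = (front + rest)[:pop_size]

    # Measure the final population accurately before reporting a result.
    for c in population:
        if c not in measured:
            measured[c] = expensive_objective(c)
    best = min(measured, key=measured.get)
    return best, measured[best], len(measured)


if __name__ == "__main__":
    mapping, period, evaluations = explore()
    print(mapping, period, evaluations)
```

The point of the sketch is only the structure: most generations are ranked with the cheap objectives, and the number of costly evaluations (the third return value) stays well below the number of candidates visited.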


    Published In

    ACM Transactions on Design Automation of Electronic Systems, Volume 24, Issue 3
    May 2019
    266 pages
    ISSN: 1084-4309
    EISSN: 1557-7309
    DOI: 10.1145/3319359
    Editor: Naehyuck Chang

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 March 2019
    Accepted: 01 January 2019
    Revised: 01 January 2019
    Received: 01 August 2018
    Published in TODAES Volume 24, Issue 3

    Author Tags

    1. Dataflow
    2. clustering
    3. design space exploration

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Cited By

    • (2022) On Transferring Application Mapping Knowledge Between Differing MPSoC Architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:11 (4289-4300). DOI: 10.1109/TCAD.2022.3197527. Online publication date: 1-Nov-2022
    • (2022) Latency-driven Optimization of Switching Pipeline Design in Network Chips. 2022 IEEE Real-Time Systems Symposium (RTSS) (344-355). DOI: 10.1109/RTSS55097.2022.00037. Online publication date: Dec-2022
    • (2022) Influence of Dataflow Graph Moldable Parameters on Optimization Criteria. Design and Architecture for Signal and Image Processing (83-95). DOI: 10.1007/978-3-031-12748-9_7. Online publication date: 20-Jun-2022
    • (2020) Exact Design Space Exploration Based on Consistent Approximations. Electronics 9:7 (1057). DOI: 10.3390/electronics9071057. Online publication date: 27-Jun-2020
