research-article
Open access

Symmetry in Software Synthesis

Published: 21 July 2017

Abstract

With the surge of multi- and many-core systems, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems, which is why diverse methods are commonly used to reduce the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do so in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, such as Network-on-Chip interconnects or heterogeneous cores. In this article, we present a formal framework that can determine the inherent local and global symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for today's platforms with a handful of cores and for moderate-sized benchmarks, our approach consistently reduces the overall execution time of the algorithms. We obtain a speedup of more than 10× for one use case and save 10% of the time in another.
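To make the idea of symmetry-based search-space reduction concrete, the following is a minimal sketch in Python. It is not the authors' framework; it is a toy illustration that assumes a mapping is a tuple assigning each task to a core, and that platform symmetry is given as permutations of the core indices. All function names (`generate_group`, `canonical`, `orbit_representatives`) and the two example platforms are hypothetical. Mappings that differ only by a platform symmetry are treated as equivalent, so a search only needs to visit one representative per equivalence class.

```python
from itertools import product


def generate_group(generators, n):
    """Close a set of permutations of range(n) under composition.

    Each permutation is a tuple p where p[i] is the image of core i.
    """
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            # Compose: first apply g, then the generator s.
            h = tuple(s[g[i]] for i in range(n))
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def canonical(mapping, group):
    """Pick the lexicographically smallest mapping in the orbit of `mapping`."""
    return min(tuple(g[core] for core in mapping) for g in group)


def orbit_representatives(num_tasks, num_cores, group):
    """Enumerate one representative per class of symmetric mappings."""
    all_mappings = product(range(num_cores), repeat=num_tasks)
    return sorted({canonical(m, group) for m in all_mappings})


# Assumed example 1: four identical cores connected in a ring.
# Symmetries: rotate the ring, or mirror it (dihedral group, 8 elements).
ring = generate_group([(1, 2, 3, 0), (0, 3, 2, 1)], 4)

# Assumed example 2: four identical cores on a shared bus.
# Any permutation of the cores is a symmetry (24 elements).
bus = generate_group([(1, 0, 2, 3), (1, 2, 3, 0)], 4)

ring_reps = orbit_representatives(3, 4, ring)
bus_reps = orbit_representatives(3, 4, bus)
print(len(ring), len(ring_reps))  # 8 symmetries, 10 classes out of 64 mappings
print(len(bus), len(bus_reps))    # 24 symmetries, 5 classes out of 64 mappings
```

This brute-force enumeration only works for tiny examples, since it still touches every mapping once. It is meant to show why the size of the symmetry group matters: the more symmetric the platform, the fewer genuinely distinct mappings remain to be evaluated.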

Supplementary Material

TACO1402-20 (taco1402-20.pdf)
Slide deck associated with this paper

Information

Published In

ACM Transactions on Architecture and Code Optimization, Volume 14, Issue 2
June 2017
259 pages
ISSN: 1544-3566
EISSN: 1544-3973
DOI: 10.1145/3086564
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 July 2017
Accepted: 01 May 2017
Revised: 01 May 2017
Received: 01 November 2016
Published in TACO Volume 14, Issue 2

Author Tags

  1. Scalability
  2. automation
  3. clusters
  4. design-space exploration
  5. group theory
  6. heterogeneous
  7. inverse-semigroups
  8. mapping
  9. metaheuristics
  10. network-on-chip
  11. symmetry

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Center for Advancing Electronics Dresden (cfaed)
  • Graduiertenkolleg Experimentelle und konstruktive Algebra (GK EukA)

Article Metrics

  • Downloads (Last 12 months): 81
  • Downloads (Last 6 weeks): 13

Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Generative Design of the Architecture Platform in Multiprocessor System Design. Electronics 13:7 (1404). DOI: 10.3390/electronics13071404. Online publication date: 8-Apr-2024
  • (2023) Dataflow Models of Computation for Programming Heterogeneous Multicores. Handbook of Computer Architecture (1-40). DOI: 10.1007/978-981-15-6401-7_45-2. Online publication date: 28-Sep-2023
  • (2022) mpsym: Improving Design-Space Exploration of Clustered Manycores With Arbitrary Topologies. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:6 (1592-1605). DOI: 10.1109/TCAD.2021.3102512. Online publication date: Jun-2022
  • (2022) Dataflow Models of Computation for Programming Heterogeneous Multicores. Handbook of Computer Architecture (1-40). DOI: 10.1007/978-981-15-6401-7_45-1. Online publication date: 28-Jan-2022
  • (2022) Methodologies for Design Space Exploration. Handbook of Computer Architecture (1-31). DOI: 10.1007/978-981-15-6401-7_23-1. Online publication date: 27-Jan-2022
  • (2021) Exploiting Similarity in Evolutionary Product Design for Improved Design Space Exploration. Embedded Computer Systems: Architectures, Modeling, and Simulation (33-49). DOI: 10.1007/978-3-031-04580-6_3. Online publication date: 4-Jul-2021
  • (2021) Software Compilation and Optimization Techniques for Heterogeneous Multi‐core Platforms. Multi‐Processor System‐on‐Chip 2 (203-235). DOI: 10.1002/9781119818410.ch10. Online publication date: 28-Apr-2021
  • (2020) Energy-efficient runtime resource management for adaptable multi-application mapping. Proceedings of the 23rd Conference on Design, Automation and Test in Europe (909-914). DOI: 10.5555/3408352.3408558. Online publication date: 9-Mar-2020
  • (2020) Energy-efficient Runtime Resource Management for Adaptable Multi-application Mapping. 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) (909-914). DOI: 10.23919/DATE48585.2020.9116381. Online publication date: Mar-2020
  • (2019) Comparing Dataflow and OpenMP Programming for Speaker Recognition Applications. Proceedings of the 10th and 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (1-6). DOI: 10.1145/3310411.3310417. Online publication date: 21-Jan-2019
