Throughput-driven synthesis of embedded software for pipelined execution on multicore architectures

Published: 09 February 2009 Publication History

Abstract

We present a methodology for pipelined software synthesis of streaming applications. First, we develop a versatile task assignment algorithm for two cores that can optimize realistically arbitrary cost functions. Unlike existing heuristics, the algorithm is exact (i.e., theoretically optimal). Second, we provide an approximation technique with an adjustable knob that trades solution quality for algorithm runtime and memory. Third, we develop a recursive heuristic for more than two cores. Experiments on an FPGA-based emulation platform validate our theoretical results. The exact algorithm yields a 1.7× throughput improvement. The approximation method offers a range of tradeoff points (e.g., 3× faster with 20× less memory) while degrading throughput by only 1% to 5%.
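
To make the two-core task assignment problem concrete, the following is a minimal sketch that finds an exact assignment by brute-force enumeration on a tiny task graph. It is not the algorithm from the article, which is not described on this page. The cost model (a cut edge charges its transfer cost to both cores), the forward-only pipelining constraint, and the names `stage_time`, `exact_two_core`, `work`, and `edges` are all assumptions made for illustration.

```python
from itertools import product


def stage_time(assign, work, edges):
    """Return the slower core's load, which bounds pipeline throughput."""
    load = [0, 0]
    for task, core in enumerate(assign):
        load[core] += work[task]
    # Assumed cost model: a cut edge charges its transfer cost to both cores.
    for (src, dst), cost in edges.items():
        if assign[src] != assign[dst]:
            load[assign[src]] += cost
            load[assign[dst]] += cost
    return max(load)


def exact_two_core(work, edges):
    """Try every valid assignment and keep the one with the smallest stage time."""
    best_assign, best_time = None, None
    for assign in product((0, 1), repeat=len(work)):
        # Assumed pipelining constraint: data flows from core 0 to core 1 only.
        if any(assign[src] > assign[dst] for src, dst in edges):
            continue
        time = stage_time(assign, work, edges)
        if best_time is None or time < best_time:
            best_assign, best_time = assign, time
    return best_assign, best_time


# Hypothetical four-task chain: per-task work and per-edge transfer cost.
work = [6, 2, 3, 4]
edges = {(0, 1): 2, (1, 2): 1, (2, 3): 1}
assign, time = exact_two_core(work, edges)
print(assign, time)  # (0, 0, 1, 1) 9
```

The enumeration visits all 2^n assignments, so it is only practical for very small graphs. It illustrates what "exact" means for this problem (no valid assignment has a smaller bottleneck under the assumed cost function), not how the article achieves it efficiently.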



      Published In

      ACM Transactions on Embedded Computing Systems  Volume 8, Issue 2
      January 2009
      243 pages
      ISSN:1539-9087
      EISSN:1558-3465
      DOI:10.1145/1457255
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 09 February 2009
      Accepted: 01 August 2008
      Revised: 01 March 2008
      Received: 01 August 2007
      Published in TECS Volume 8, Issue 2

      Author Tags

      1. Embedded software
      2. graph partitioning
      3. multi-core hardware
      4. streaming applications
      5. task assignment

      Qualifiers

      • Research-article
      • Research
      • Refereed
