DOI: 10.1145/1629911.1629979

A design flow for application specific heterogeneous pipelined multiprocessor systems

Published: 26 July 2009

Abstract

This paper describes a rapid design methodology for creating a pipeline of processors to execute streaming applications. The methodology has two separate phases. The first phase uses a heuristic to rapidly search through a large number of processor configurations (which differ in the base processor, the additional instructions, and the cache sizes) to find the near-Pareto front. The second phase uses either the same heuristic or an ILP (Integer Linear Programming) formulation to search a smaller design space for an appropriate final implementation. Running the fast heuristic with differing runtime constraints in the first phase rapidly yields the near-Pareto front; the second phase then provides either an optimal or a near-optimal solution. Both the ILP formulation and the heuristic find a system with the smallest area within a designer-specified runtime constraint. The system has efficiently explored design spaces with over 10^12 design points.
We integrated this design methodology into a commercial design flow and evaluated our approach on three benchmarks (JPEG Encoder, JPEG Decoder, and MP3 Encoder). For each benchmark, the heuristic found the near-Pareto front in a few hours, whereas the ILP took several days. The results show that the average area error of the heuristic is within 2.5% of the optimal design points (obtained using ILP) for all benchmarks.
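
To make the two ideas in the abstract concrete (picking the smallest-area system under a runtime constraint, and sweeping that constraint to trace an area/runtime front), here is a minimal Python sketch. It is not the paper's heuristic or its ILP formulation: it uses plain exhaustive search as a stand-in, and everything in it is an assumption made for illustration, including the stage options in `STAGES`, the function names, and the cost model (areas add up, and runtime is set by the slowest pipeline stage).

```python
from itertools import product

# Hypothetical options per pipeline stage: (area, cycles) for each candidate
# processor configuration. These numbers are made up for illustration.
STAGES = [
    [(10, 900), (14, 600), (22, 400)],
    [(8, 700), (12, 500)],
    [(15, 800), (20, 450), (30, 300)],
]


def system_cost(choice):
    """Assumed cost model: total area is the sum; runtime is the slowest stage."""
    area = sum(a for a, _ in choice)
    runtime = max(c for _, c in choice)
    return area, runtime


def min_area_design(stages, runtime_limit):
    """Exhaustive stand-in for the search: one configuration per stage,
    smallest total area whose runtime does not exceed runtime_limit."""
    best = None
    for choice in product(*stages):
        area, runtime = system_cost(choice)
        if runtime <= runtime_limit and (best is None or area < best[0]):
            best = (area, runtime, choice)
    return best  # None when no combination meets the limit


def pareto_front(points):
    """Keep (area, runtime) points that no other point beats in both dimensions."""
    front = []
    for area, runtime in sorted(set(points)):
        if not front or runtime < front[-1][1]:
            front.append((area, runtime))
    return front


def sweep(stages, limits):
    """Solve once per runtime limit and return the resulting front."""
    found = [min_area_design(stages, limit) for limit in limits]
    return pareto_front([(f[0], f[1]) for f in found if f is not None])


if __name__ == "__main__":
    for area, runtime in sweep(STAGES, [900, 800, 700, 600, 500, 450]):
        print(f"area={area} runtime={runtime}")
```

Exhaustive enumeration is only workable for toy inputs like this one; with design spaces on the order of 10^12 points, as reported above, a faster heuristic or a solver-based formulation is what makes the search practical.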


Published In

DAC '09: Proceedings of the 46th Annual Design Automation Conference
July 2009
994 pages
ISBN: 9781605584973
DOI: 10.1145/1629911
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. MPSoCs
  2. design space exploration
  3. integer linear programming

Qualifiers

  • Research-article

Conference

DAC '09: The 46th Annual Design Automation Conference 2009
July 26 - 31, 2009
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Cited By

  • (2022) An Extensive Survey on Assessment of Multicore Processors for Embedded Systems. Advances in Signal Processing and Communication Engineering, pp. 161-170. DOI: 10.1007/978-981-19-5550-1_16. Online publication date: 2-Dec-2022
  • (2019) Self-Optimizing and Self-Programming Computing Systems: A Combined Compiler, Complex Networks, and Machine Learning Approach. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, pp. 1-12. DOI: 10.1109/TVLSI.2019.2897650. Online publication date: 2019
  • (2019) Mapping and scheduling techniques in NoC: A survey of the state of the art. 2019 International Conference on Networking and Advanced Systems (ICNAS), pp. 1-6. DOI: 10.1109/ICNAS.2019.8807815. Online publication date: Jun-2019
  • (2018) Run-time Mapping Algorithm for Dynamic Workloads on Heterogeneous MPSoCs Platforms. 2018 21st Euromicro Conference on Digital System Design (DSD), pp. 373-380. DOI: 10.1109/DSD.2018.00071. Online publication date: Aug-2018
  • (2017) A Majority-Based Reliability-Aware Task Mapping in High-Performance Homogenous NoC Architectures. ACM Transactions on Embedded Computing Systems 17:1, pp. 1-31. DOI: 10.1145/3131273. Online publication date: 6-Dec-2017
  • (2017) A Survey and Comparative Study of Hard and Soft Real-Time Dynamic Resource Allocation Strategies for Multi-/Many-Core Systems. ACM Computing Surveys 50:2, pp. 1-40. DOI: 10.1145/3057267. Online publication date: 11-Apr-2017
  • (2017) Run-time resource allocation for embedded Multiprocessor System-on-Chip using tree-based design space exploration. 2017 12th International Conference on Design & Technology of Integrated Systems In Nanoscale Era (DTIS), pp. 1-6. DOI: 10.1109/DTIS.2017.7929873. Online publication date: Apr-2017
  • (2017) Design Space Exploration and Run-Time Adaptation for Multi-core Resource Management Under Performance and Power Constraints. Handbook of Hardware/Software Codesign, pp. 1-32. DOI: 10.1007/978-94-017-7358-4_11-1. Online publication date: 8-Apr-2017
  • (2017) Design Space Exploration and Run-Time Adaptation for Multicore Resource Management Under Performance and Power Constraints. Handbook of Hardware/Software Codesign, pp. 301-332. DOI: 10.1007/978-94-017-7267-9_11. Online publication date: 27-Sep-2017
  • (2016) Resource and Throughput Aware Execution Trace Analysis for Efficient Run-Time Mapping on MPSoCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 35:1, pp. 72-85. DOI: 10.1109/TCAD.2015.2446938. Online publication date: Jan-2016