research-article
Open access

Optimization Models for Three On-Chip Network Problems

Published: 17 September 2016

Abstract

We model three on-chip network design problems as mathematical optimization problems: memory controller placement, resource allocation in heterogeneous on-chip networks, and the combination of the two. The first two are formulated as mixed integer linear programs. The third is formulated as a mixed integer nonlinear program, which we then linearize exactly. These formulations let us apply sophisticated optimization algorithms, which obtain solutions much more efficiently. Detailed simulations using synthetic traffic and benchmark applications validate that our designs perform better than previously proposed solutions. Our work provides further evidence for the suitability of optimization models for searching and pruning the architectural design space.
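The abstract does not show how the nonlinear program is linearized, so the following is only an illustrative sketch of what "exact" linearization can mean, using two textbook constructions rather than the paper's own formulation. It assumes the nonlinearity is a product term: either of two binary variables, or of a binary variable `x` and a continuous variable `u` bounded by a hypothetical constant `U`. The function names are invented for this example. The code enumerates every case and checks that the linear constraints admit exactly the product value and nothing else.

```python
from itertools import product


def feasible_z(x, y):
    """Binary values z allowed by the linear constraints that replace z = x * y.

    Constraints: z <= x, z <= y, z >= x + y - 1, with x, y, z in {0, 1}.
    """
    return [z for z in (0, 1) if z <= x and z <= y and z >= x + y - 1]


def linearization_is_exact():
    """True if, for every binary (x, y), the only feasible z equals x * y."""
    return all(feasible_z(x, y) == [x * y]
               for x, y in product((0, 1), repeat=2))


def w_bounds(x, u, U):
    """Interval (lo, hi) of values w allowed when w stands in for x * u.

    Assumes x in {0, 1} and 0 <= u <= U. The constraints are
    w >= 0, w >= u - U * (1 - x), w <= U * x, w <= u.
    """
    lo = max(0, u - U * (1 - x))
    hi = min(U * x, u)
    return lo, hi


if __name__ == "__main__":
    print("binary product exact:", linearization_is_exact())
    for x in (0, 1):
        for u in range(6):
            print(f"x={x} u={u} -> w in {w_bounds(x, u, 5)}")
```

In both constructions the feasible set collapses to the single product value, which is why the linear model has the same optimal solutions as the nonlinear one instead of merely approximating it.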



Published In

ACM Transactions on Architecture and Code Optimization, Volume 13, Issue 3
September 2016
207 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2988523
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 September 2016
Accepted: 01 May 2016
Revised: 01 March 2016
Received: 01 November 2015
Published in TACO Volume 13, Issue 3


Author Tags

  1. On-chip network
  2. optimization models

Qualifiers

  • Research-article
  • Research
  • Refereed


Cited By

  • (2024) NetSmith: An Optimization Framework for Machine-Discovered Network Topologies. Proceedings of the 53rd International Conference on Parallel Processing, 421-432. DOI: 10.1145/3673038.3673060. Online publication date: 12-Aug-2024
  • (2017) Marginal Performance: Formalizing and Quantifying Power Over/Under Provisioning in NoC DVFS. IEEE Transactions on Computers 66:11, 1903-1917. DOI: 10.1109/TC.2017.2715018. Online publication date: 6-Oct-2017
  • (2017) Optimizing the heterogeneous network on-chip design in manycore architectures. 2017 30th IEEE International System-on-Chip Conference (SOCC), 184-189. DOI: 10.1109/SOCC.2017.8226033. Online publication date: Sep-2017
  • (2017) Efficient Reconfigurable Global Network-on-Chip Designs towards Heterogeneous CPU-GPU Systems: An Application-Aware Approach. 2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 439-444. DOI: 10.1109/ISVLSI.2017.83. Online publication date: Jul-2017
