APPROX-NoC: A Data Approximation Framework for Network-On-Chip Architectures

Authors:

Rahul Boyapati,

Pritam Majumder,

Eun Jung Kim

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture

Pages 666 - 677

https://doi.org/10.1145/3079856.3080241

Published: 24 June 2017

Abstract

Unsustainable power consumption and large memory bandwidth demands in massively parallel multicore systems, amplified by the advent of the big data era, have given rise to alternative computation paradigms that use heterogeneity, specialization, processor-in-memory and approximation. Approximate computing is being touted as a viable route to high-performance computation by relaxing the accuracy constraints of applications. This trend has been accentuated by emerging data-intensive applications in domains such as image/video processing, machine learning and big data analytics, which tolerate inaccurate outputs within an acceptable variance. Leveraging relaxed accuracy for high throughput in Networks-on-Chip (NoCs), which have rapidly become the accepted method for connecting a large number of on-chip components, has not yet been explored. We propose APPROX-NoC, a hardware data approximation framework with an online data error control mechanism for high-performance NoCs. APPROX-NoC facilitates approximate matching of data patterns, within a controllable value range, to compress them, thereby reducing the volume of data movement across the chip.

Our evaluation shows that APPROX-NoC achieves on average up to 9% latency reduction and 60% throughput improvement compared with state-of-the-art NoC data compression mechanisms, while maintaining low application error. Additionally, with a data-intensive graph processing application, we achieve a 36.7% latency reduction compared with state-of-the-art compression mechanisms.
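The idea of approximate matching within a controllable value range can be illustrated with a small software sketch. It is not the hardware mechanism from the paper, whose encoding details the abstract does not give. It assumes a hypothetical table of reference values, a relative error bound `max_rel_err`, 32-bit data words, and a 1-bit matched/raw flag. All function names are invented for illustration.

```python
from typing import List, Tuple

WORD_BITS = 32  # assumption: width of one data word
FLAG_BITS = 1   # assumption: one flag bit marks "matched" vs. "raw"

Symbol = Tuple[str, int]  # ("idx", table index) or ("raw", original word)


def approx_match(word: int, table: List[int], max_rel_err: float) -> int:
    """Return the index of the first table entry within the error bound, or -1."""
    for i, ref in enumerate(table):
        # A zero word only matches a zero entry, since the bound scales with |word|.
        if abs(word - ref) <= max_rel_err * abs(word):
            return i
    return -1


def encode(words: List[int], table: List[int], max_rel_err: float) -> List[Symbol]:
    """Replace each word by a short table index when an approximate match exists."""
    out: List[Symbol] = []
    for w in words:
        i = approx_match(w, table, max_rel_err)
        out.append(("idx", i) if i >= 0 else ("raw", w))
    return out


def decode(symbols: List[Symbol], table: List[int]) -> List[int]:
    """Matched words come back as the table value, so they may differ slightly."""
    return [table[v] if kind == "idx" else v for kind, v in symbols]


def encoded_bits(symbols: List[Symbol], table_size: int) -> int:
    """Count the bits needed to send the encoded symbols."""
    idx_bits = max(1, (table_size - 1).bit_length())
    return sum(
        FLAG_BITS + (idx_bits if kind == "idx" else WORD_BITS)
        for kind, _ in symbols
    )


table = [1000, 50000, 0, 7]          # hypothetical reference values
words = [1000, 1040, 49000, 3, 0]    # hypothetical payload
symbols = encode(words, table, max_rel_err=0.05)
print(decode(symbols, table))              # [1000, 1000, 50000, 3, 0]
print(encoded_bits(symbols, len(table)))   # 45, versus 160 bits uncompressed
```

Raising `max_rel_err` lets more words match the table and shrinks the payload further, at the cost of larger value error at the receiver. This is the trade-off that an error control mechanism would have to bound.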

Index Terms

1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Interconnection architectures
      2. Multicore architectures
2. Networks
  1. Network performance evaluation
    1. Network performance analysis

Published In

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture

June 2017

736 pages

ISBN:9781450348928

DOI:10.1145/3079856

ACM SIGARCH Computer Architecture News Volume 45, Issue 2
ISCA'17
May 2017
715 pages
ISSN:0163-5964
DOI:10.1145/3140659
Editor: Babak Falsafi

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

IEEE: IEEE Computer Society Technical Committee on Design Automation
SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Science Foundation

Conference

ISCA '17

Sponsor:

IEEE
SIGARCH

ISCA '17: The 44th Annual International Symposium on Computer Architecture

June 24 - 28, 2017

Toronto, ON, Canada

Acceptance Rates

ISCA '17 Paper Acceptance Rate 54 of 322 submissions, 17%;

Overall Acceptance Rate 543 of 3,203 submissions, 17%
