Research Article · Public Access
DOI: 10.1145/3061639.3062337

Ultra-Efficient Processing In-Memory for Data Intensive Applications

Published: 18 June 2017

Abstract

Recent years have witnessed rapid growth in the Internet of Things (IoT). This network of billions of devices generates and exchanges huge amounts of data. Limited cache capacity and memory bandwidth make transferring and processing such data on traditional CPUs and GPUs highly inefficient, in terms of both energy consumption and delay. However, many IoT applications are statistical at heart and can tolerate some inaccuracy in their computation. This allows designers to reduce processing complexity by approximating results to a desired accuracy. In this paper, we propose an ultra-efficient approximate processing in-memory architecture, called APIM, which exploits the analog characteristics of non-volatile memories to support addition and multiplication inside the crossbar memory while storing the data. The proposed design eliminates the overhead of transferring data to the processor by virtually bringing the processor inside the memory. APIM dynamically configures the precision of computation for each application in order to tune the level of accuracy at runtime. Our experimental evaluation on six general OpenCL applications shows that the proposed design achieves up to 20x performance improvement and a 480x improvement in energy-delay product while ensuring acceptable quality of service. In exact mode, it achieves 28x energy savings and a 4.8x speedup compared to state-of-the-art GPU cores.
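
The following minimal Python sketch illustrates the runtime precision-tuning idea from the abstract in software terms only; it is not APIM's circuit-level method. Dropping low-order operand bits stands in for a configurable-precision in-memory multiplier, and the sample workload, function names, and error metric are illustrative assumptions.

    # Illustrative sketch: models configurable-precision approximate
    # multiplication by zeroing low-order operand bits, loosely analogous
    # to tuning compute accuracy at runtime. Names and metric are assumptions.

    def approx_multiply(a: int, b: int, dropped_bits: int) -> int:
        """Multiply after clearing the dropped_bits least-significant bits
        of each (non-negative) operand: cheaper but less accurate."""
        mask = ~((1 << dropped_bits) - 1)
        return (a & mask) * (b & mask)

    def tune_precision(pairs, max_rel_error: float, word_bits: int = 32) -> int:
        """Return the largest number of dropped bits whose worst-case relative
        error on a sample workload stays within the application's tolerance."""
        best = 0
        for dropped in range(1, word_bits):
            worst = 0.0
            for a, b in pairs:
                exact = a * b
                if exact == 0:
                    continue
                err = abs(exact - approx_multiply(a, b, dropped)) / exact
                worst = max(worst, err)
            if worst <= max_rel_error:
                best = dropped
            else:
                break  # error only grows as more bits are dropped, so stop here
        return best

    if __name__ == "__main__":
        sample = [(1023, 517), (40000, 12345), (255, 255), (70000, 999)]
        print(tune_precision(sample, max_rel_error=0.01))

In the actual architecture the accuracy/efficiency trade-off is realized inside the non-volatile crossbar rather than in software; the sketch only mirrors how an application's quality target can select a precision level.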


Published In

DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
June 2017
533 pages
ISBN: 9781450349277
DOI: 10.1145/3061639
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Emerging computing
  2. Non-volatile memory
  3. Processing in-memory

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DAC '17

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Cited By

  • (2024) Hyperdimensional computing: a framework for stochastic computation and symbolic AI. Journal of Big Data, 11(1). DOI: 10.1186/s40537-024-01010-8. Online publication date: 24-Oct-2024.
  • (2024) In-Memory Wallace Tree Multipliers Based on Majority Gates Within Voltage-Gated SOT-MRAM Crossbar Arrays. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 32(3), 497-504. DOI: 10.1109/TVLSI.2024.3350151. Online publication date: Mar-2024.
  • (2024) An RRAM-Based Computing-in-Memory Architecture and Its Application in Accelerating Transformer Inference. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 32(3), 485-496. DOI: 10.1109/TVLSI.2023.3345651. Online publication date: Mar-2024.
  • (2024) Beyond Von Neumann Architectures: Exploring Algorithmic Opportunities via Octantis. IEEE Access, 12, 120005-120022. DOI: 10.1109/ACCESS.2024.3450105. Online publication date: 2024.
  • (2024) Digital Circuits and CIM Integrated NN Processor. In: High Energy Efficiency Neural Network Processor with Combined Digital and Computing-in-Memory Architecture, 71-93. DOI: 10.1007/978-981-97-3477-1_5. Online publication date: 1-Aug-2024.
  • (2023) Stochastic Computing for Reliable Memristive In-Memory Computation. Proceedings of the Great Lakes Symposium on VLSI 2023, 397-401. DOI: 10.1145/3583781.3590307. Online publication date: 5-Jun-2023.
  • (2023) AritPIM: High-Throughput In-Memory Arithmetic. IEEE Transactions on Emerging Topics in Computing, 11(3), 720-735. DOI: 10.1109/TETC.2023.3268137. Online publication date: 1-Jul-2023.
  • (2023) Automated Synthesis for In-Memory Computing. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-9. DOI: 10.1109/ICCAD57390.2023.10323667. Online publication date: 28-Oct-2023.
  • (2023) A full spectrum of computing-in-memory technologies. Nature Electronics, 6(11), 823-835. DOI: 10.1038/s41928-023-01053-4. Online publication date: 13-Nov-2023.
  • (2023) MAGIC-DHT. Integration, the VLSI Journal, 93(C). DOI: 10.1016/j.vlsi.2023.102060. Online publication date: 1-Nov-2023.
