
IMFlexCom: Energy Efficient In-Memory Flexible Computing Using Dual-Mode SOT-MRAM

Published: 23 October 2018

Abstract

In this article, we propose an In-Memory Flexible Computing platform (IMFlexCom) using a novel Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array architecture that can operate in two modes: memory mode and computing mode. The intrinsic in-memory logic (AND/OR/XOR) can be used to process data within memory, greatly reducing the power-hungry, long-distance, massive data communication of conventional von Neumann computing systems. A comprehensive reliability analysis is performed, confirming sense margins of ∼90 mV and ∼10 mV (worst case) for memory and in-memory logic operation, respectively, under variations in resistance-area product and tunnel magnetoresistance. We further show that the sense margin for in-memory logic computation can be significantly increased by increasing the oxide thickness. Furthermore, we employ bulk bitwise vector operation and a data encryption engine as case studies to investigate the performance of our proposed design. IMFlexCom shows ∼35× energy saving and ∼18× speedup for bulk bitwise in-memory vector AND/OR operation compared to DRAM-based in-memory logic. In addition, IMFlexCom achieves 77.27% and 85.4% lower energy consumption than CMOS-ASIC- and CMOL-based Advanced Encryption Standard (AES) implementations, respectively. Its energy consumption is comparable to that of a recent DW-AES implementation, with 66.7% less area overhead.
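
As a rough illustration of what a bulk bitwise vector operation computes, the following minimal Python sketch models two memory rows as fixed-width bit vectors and applies AND, OR, or XOR across every bit position at once. It is a functional model only, not the SOT-MRAM circuit described in the article: the function name `bulk_op`, the `width` parameter, and the choice to derive XOR from the AND and OR results are assumptions made for this example.

```python
def bulk_op(op: str, row_a: int, row_b: int, width: int) -> int:
    """Apply a bitwise operation to two rows modeled as `width`-bit integers.

    Hypothetical helper for illustration; it models only the logical result,
    not the sensing circuitry or any timing/energy behavior.
    """
    mask = (1 << width) - 1          # keeps the result inside the row width
    and_bits = row_a & row_b
    or_bits = row_a | row_b
    if op == "AND":
        result = and_bits
    elif op == "OR":
        result = or_bits
    elif op == "XOR":
        # Assumption: XOR is formed from the other two results,
        # i.e. bits where OR is 1 but AND is 0.
        result = or_bits & ~and_bits
    else:
        raise ValueError(f"unsupported operation: {op}")
    return result & mask


# Two 4-bit example rows (arbitrary values chosen for the example).
row_a = 0b1100
row_b = 0b1010

and_row = bulk_op("AND", row_a, row_b, 4)
or_row = bulk_op("OR", row_a, row_b, 4)
xor_row = bulk_op("XOR", row_a, row_b, 4)

print(f"{and_row:04b} {or_row:04b} {xor_row:04b}")  # prints "1000 1110 0110"
```

In a conventional system, both rows would first be moved to the processor before the operation is applied; the point of in-memory logic is that the same result is produced where the data already resides.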



Information & Contributors

Information

Published In

ACM Journal on Emerging Technologies in Computing Systems, Volume 14, Issue 3
July 2018
150 pages
ISSN:1550-4832
EISSN:1550-4840
DOI:10.1145/3287773
Editor: Yuan Xie
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 October 2018
Accepted: 01 May 2018
Revised: 01 May 2018
Received: 01 July 2017
Published in JETC Volume 14, Issue 3

Author Tags

  1. In-memory computing
  2. SOT-MRAM
  3. giant spin Hall effect
  4. magnetic tunnel junction
  5. memory architecture

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 170
  • Downloads (Last 6 weeks): 28
Reflects downloads up to 04 Oct 2024

Citations

Cited By

  • (2023) Analysis of resistive defects on a foundry 8T SRAM-based IMC architecture. Microelectronics Reliability 147 (115029). DOI: 10.1016/j.microrel.2023.115029. Online publication date: Aug-2023
  • (2022) MagCiM: A Flexible and Non-Volatile Computing-in-Memory Processor for Energy-Efficient Logic Computation. IEEE Access 10 (35445-35459). DOI: 10.1109/ACCESS.2022.3159967. Online publication date: 2022
  • (2021) Preliminary Defect Analysis of 8T SRAM Cells for In-Memory Computing Architectures. 2021 16th International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS) (1-4). DOI: 10.1109/DTIS53253.2021.9505101. Online publication date: 28-Jun-2021
  • (2020) MRIMA: An MRAM-Based In-Memory Accelerator. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39:5 (1123-1136). DOI: 10.1109/TCAD.2019.2907886. Online publication date: May-2020
  • (2019) Multi-Port 1R1W Transpose Magnetic Random Access Memory by Hierarchical Bit-Line Switching. IEEE Access 7 (110463-110471). DOI: 10.1109/ACCESS.2019.2933902. Online publication date: 2019
