Abstract
The combinatorial optimization problem (COP), which seeks the optimal solution in a discrete space, is fundamental to many fields. Unfortunately, many COPs are NP-complete, and the time required to solve them grows dramatically with the problem scale. Researchers therefore often prefer fast methods even if they are not exact, such as approximation algorithms, heuristic algorithms, and machine learning. Prior works proposed chaotic simulated annealing (CSA) based on the Hopfield neural network and achieved good performance. However, the computation pattern of CSA is unfriendly to current general-purpose processors, and no specialized hardware exists for it. To execute CSA efficiently, we propose a software and hardware co-design. On the software side, we quantize the weights and outputs with appropriate bit widths and modify the computations that are unsuitable for hardware implementation. On the hardware side, we design a specialized memristor-based processing-in-memory architecture named COPPER. COPPER can efficiently run the modified quantized CSA algorithm and supports pipelining for further acceleration. The results show that COPPER performs CSA remarkably well in terms of both speed and energy consumption.
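The transiently chaotic annealing dynamics that the abstract refers to can be sketched as follows. This is a minimal, illustrative implementation in the style of the 1995 Chen–Aihara formulation, not the paper's quantized version or the COPPER hardware mapping; all parameter values (`k`, `alpha`, `beta`, `z0`, `I0`, `eps`) and the toy two-variable problem are assumptions chosen for demonstration.

```python
import numpy as np

def csa_solve(W, I, steps=300, k=0.9, alpha=0.015,
              beta=0.01, z0=0.1, I0=0.65, eps=0.004):
    """Transiently chaotic simulated annealing (Chen-Aihara style sketch).

    Approximately minimizes the Hopfield energy E = -0.5 x^T W x - I^T x by
    iterating internal neuron states y with a self-feedback term of strength
    z that starts out strong (chaotic search) and decays toward zero
    (convergent gradient-like descent).
    """
    n = len(I)
    rng = np.random.default_rng(0)       # fixed seed for reproducibility
    y = rng.uniform(-1.0, 1.0, n) * 0.1  # small random internal states
    z = z0                               # chaotic self-feedback strength
    for _ in range(steps):
        # Logistic output written via tanh to avoid exp() overflow
        x = 0.5 * (1.0 + np.tanh(y / (2.0 * eps)))
        # Damped Hopfield drive plus decaying chaotic self-feedback
        y = k * y + alpha * (W @ x + I) - z * (x - I0)
        z *= (1.0 - beta)                # anneal the chaos away
    x = 0.5 * (1.0 + np.tanh(y / (2.0 * eps)))
    return (x > 0.5).astype(int)         # binarize the final outputs

# Toy 2-variable problem: negative coupling penalizes turning both neurons
# on, so the low-energy states are the one-hot assignments (0,1) and (1,0).
W = np.array([[0.0, -2.0],
              [-2.0, 0.0]])
I = np.array([1.0, 1.0])
print(csa_solve(W, I))
```

The decaying `z` term is what distinguishes CSA from plain Hopfield descent: while `z` is large, the map behaves chaotically and explores the state space; as `z` shrinks, the dynamics settle into a local minimum of the energy.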
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Author information
Contributions
Qiankun WANG led the research and was mainly responsible for implementing the algorithm, designing the hardware, and drafting the paper. Xingchen LI provided design ideas and some data for the hardware part. Bingzhe WU organized the algorithm and pointed out the possibility of combining software and hardware. Ke YANG and Yuchao YANG laid the foundation for this research and provided some parameters for the algorithm. Wei HU provided the stability analysis of ReRAM and the latest research progress on ReRAM PIM macros. Guangyu SUN made many suggestions on the research, and revised and finalized the paper.
Compliance with ethics guidelines
Qiankun WANG, Xingchen LI, Bingzhe WU, Ke YANG, Wei HU, Guangyu SUN, and Yuchao YANG declare that they have no conflict of interest.
Project supported by the National Natural Science Foundation of China (Nos. 61832020, 62032001, 92064006, and 62274036), the Beijing Academy of Artificial Intelligence (BAAI) of China, and the 111 Project of China (No. B18001)
Cite this article
Wang, Q., Li, X., Wu, B. et al. COPPER: a combinatorial optimization problem solver with processing-in-memory architecture. Front Inform Technol Electron Eng 24, 731–741 (2023). https://doi.org/10.1631/FITEE.2200463