Co-mining: a processing-in-memory assisted framework for memory-intensive PoW acceleration

Authors:

Zili Shao

LCTES 2022: Proceedings of the 23rd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems

Pages 1 - 12

https://doi.org/10.1145/3519941.3535064

Published: 14 June 2022

Abstract

Recently, integrated HBM (High Bandwidth Memory) and PIM (Processing in Memory) technology, such as Samsung's function-in-memory DRAM, has opened a new door for memory-intensive PoW acceleration by jointly exploiting the GPU, PIM, and HBM. In this paper, we propose, for the first time, a GPU/PIM Co-Mining framework that accelerates memory-intensive PoW by fully exploiting HBM-PIM's bandwidth and coordinating the scheduling of mining tasks across the GPU and PIM. Specifically, we first design a linear programming model to guide GPU/PIM task scheduling. We then design an extended finite-state-machine model that lets the GPU memory controller switch the PIM working mode (compute or memory mode) accordingly. Finally, considering the speed difference between intra-channel and inter-channel memory accesses, we propose a hybrid memory access method that minimizes inter-channel data movement. We evaluate Co-Mining on Samsung's HBM2-based function-in-memory architecture. The experimental results show that it achieves up to a 38.5% hashrate improvement over a method that directly integrates PIM into GPU-based PoW acceleration.
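To make the scheduling idea concrete, the following is a minimal sketch of how a two-variable linear program could split mining work between a GPU path and a PIM path. It is not the model from the paper: the two constraints (a memory time-share limit reflecting the compute/memory mode switch, and a GPU compute budget that assumes the GPU also handles part of every PIM-path hash), all coefficients, and the names `solve_lp_2d`, `CONSTRAINTS`, `gpu_rate`, and `pim_rate` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def solve_lp_2d(objective, constraints, eps=1e-9):
    """Maximize objective . (x, y) subject to a*x + b*y <= c for each (a, b, c).

    Enumerates the intersections of constraint boundaries and keeps the best
    feasible one. This is adequate for a small, bounded two-variable problem.
    """
    best_value, best_point = None, None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel boundaries do not define a vertex
        # Cramer's rule for the 2x2 system formed by the two boundaries.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + eps for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best_value is None or value > best_value:
                best_value, best_point = value, (x, y)
    return best_value, best_point


# Decision variables (hypothetical units, e.g. MH/s):
#   g = hashrate served by the GPU path (memory in memory mode)
#   p = hashrate served by the PIM path (memory in compute mode)
CONSTRAINTS = [
    # Assumed time share: g/40 + p/60 <= 1, written as 3g + 2p <= 120.
    (3.0, 2.0, 120.0),
    # Assumed GPU compute budget: g + 1.5p <= 60.
    (1.0, 1.5, 60.0),
    # Non-negativity: -g <= 0 and -p <= 0.
    (-1.0, 0.0, 0.0),
    (0.0, -1.0, 0.0),
]

# Objective: maximize the total hashrate g + p.
total, (gpu_rate, pim_rate) = solve_lp_2d((1.0, 1.0), CONSTRAINTS)

if __name__ == "__main__":
    print(f"GPU path: {gpu_rate:.1f}, PIM path: {pim_rate:.1f}, total: {total:.1f}")
```

With these made-up coefficients, the best schedule mixes both paths (24 units each, 48 in total), whereas either path alone tops out at 40. The point is only that a second shared resource can make a split optimal. A real model would need measured bandwidth and compute parameters.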

Index Terms

Computer systems organization → Architectures → Other architectures → Heterogeneous (hybrid) systems

Published In

LCTES 2022: Proceedings of the 23rd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems

June 2022

161 pages

ISBN: 9781450392662

DOI: 10.1145/3519941

General Chair: Tobias Grosser (University of Edinburgh, UK)

Program Chair: Kyoungwoo Lee (Yonsei University, South Korea)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

LCTES '22: 23rd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems

June 14, 2022

San Diego, CA, USA

Acceptance Rates

Overall Acceptance Rate 116 of 438 submissions, 26%
