Toward Optimal Softcore Carry-aware Approximate Multipliers on Xilinx FPGAs

Authors:

Muhammad Awais,

Syed Ayaz Ali Shah,

Pedro Reviriego,

Hazrat Ali

ACM Transactions on Embedded Computing Systems, Volume 22, Issue 4

Article No.: 76, Pages 1 - 19

https://doi.org/10.1145/3564243

Published: 03 August 2023

Abstract

Domain-specific accelerators for signal processing, image processing, and machine learning are increasingly being implemented on SRAM-based field-programmable gate arrays (FPGAs). Owing to the inherent error tolerance of such applications, approximate arithmetic operations, in particular, the design of approximate multipliers, have become an important research problem. Truncation of lower bits is a widely used approximation approach; however, analyzing and limiting the effects of carry-propagation due to this approximation has not been explored in detail yet. In this article, an optimized carry-aware approximate radix-4 Booth multiplier design is presented that leverages the built-in slice look-up tables (LUTs) and carry-chain resources in a novel configuration. The proposed multiplier simplifies the computation of the upper and lower bits and provides significant benefits in terms of FPGA resource usage (LUTs saving 38.5%–42.9%), Power Delay Product (PDP saving 49.4%–53%), performance metric (LUTs × critical path delay (CPD) × PDP saving 68.9%–73.1%) and errors (70% improvement in mean relative error distance) compared to the latest state-of-the-art designs. Therefore, the proposed designs are an attractive choice to implement multiplication on FPGA-based accelerators.
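To make the truncation and error terminology in the abstract concrete, the following is a minimal sketch and not the proposed carry-aware radix-4 Booth design. It assumes unsigned n-bit operands and a plain column-truncated array multiplier that drops every partial-product bit in the k least significant columns, so any carry those columns would have produced is lost. The function names `truncated_multiply` and `mred` are hypothetical, and the mean relative error distance is computed in its usual form, the average of |approximate - exact| / exact over all operand pairs with a nonzero exact product.

```python
from itertools import product


def truncated_multiply(a: int, b: int, n: int = 8, k: int = 4) -> int:
    """Multiply unsigned n-bit a and b, ignoring partial-product bits in columns < k.

    Illustrative assumption: simple column truncation, not the paper's design.
    """
    mask = ~((1 << k) - 1)  # clears the k least significant columns
    total = 0
    for i in range(n):
        if (a >> i) & 1:
            # Row i of the partial-product array is b shifted left by i;
            # masking drops the bits of that row that fall in truncated columns.
            total += (b << i) & mask
    return total


def mred(n: int = 8, k: int = 4) -> float:
    """Mean relative error distance over all nonzero n-bit operand pairs."""
    errors = []
    for a, b in product(range(1, 1 << n), repeat=2):
        exact = a * b
        approx = truncated_multiply(a, b, n, k)
        errors.append(abs(approx - exact) / exact)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # 3 * 3 = 9, but with k = 2 the carry out of column 1 is lost.
    print(truncated_multiply(3, 3, n=8, k=2))
    print(mred(n=6, k=3))
```

In the 3 × 3 case, rounding the exact product 9 down to a multiple of 4 would give 8, whereas column truncation gives 4, because the two bits in column 1 never generate their carry into column 2. This lost carry is the effect that a carry-aware design tries to bound.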

Cited By

Y. Shah, C. Rafferty, A. Khalid, S. Khan, K. Javeed, and M. O’Neill. 2024. Efficient Soft Core Multiplier for Post Quantum Digital Signatures. In 2024 IEEE International Symposium on Circuits and Systems (ISCAS). 1–5. Online publication date: 19-May-2024. https://doi.org/10.1109/ISCAS58744.2024.10558234

F. Spagnolo, P. Corsonello, F. Frustaci, and S. Perri. 2024. Efficient implementation of signed multipliers on FPGAs. Computers and Electrical Engineering 116 (2024), 109217. Online publication date: May-2024. https://doi.org/10.1016/j.compeleceng.2024.109217

Index Terms

1. Computer systems organization
  1. Architectures
2. Hardware
  1. Integrated circuits
    1. Logic circuits
      1. Arithmetic and datapath circuits
    2. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Published In

ACM Transactions on Embedded Computing Systems Volume 22, Issue 4

July 2023

551 pages

ISSN:1539-9087

EISSN:1558-3465

DOI:10.1145/3610418

Editor:
Tulika Mitra
National University of Singapore, Singapore

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 03 August 2023

Online AM: 21 September 2022

Accepted: 09 September 2022

Revised: 12 July 2022

Received: 11 April 2022

Published in TECS Volume 22, Issue 4

Author Tag

Neural Network

Qualifiers

Research-article

Funding Sources

ACHILLES
Go2Edge network
Spanish Agencia Estatal de Investigación (AEI)
