Designing an IEEE-Compliant FPU that Supports Configurable Precision for Soft Processors

Authors:

Chris Keilbart,

Steven J.E. Wilton,

Lesley Shannon

ACM Transactions on Reconfigurable Technology and Systems, Volume 17, Issue 2

Article No.: 33, Pages 1 - 32

https://doi.org/10.1145/3650036

Published: 30 April 2024

Abstract

Field Programmable Gate Arrays (FPGAs) are commonly used to accelerate floating-point (FP) applications. Although researchers have extensively studied FPGA FP implementations, existing work has largely focused on standalone operators and frequency-optimized designs. Such designs are poorly suited to FPGA soft processors, which are more sensitive to latency, impose a lower frequency ceiling, and require compliance with the IEEE FP standard. We present an open-source floating-point unit (FPU) for FPGA RISC-V soft processors that is fully IEEE compliant and supports configurable levels of FP precision. Our design emphasizes runtime performance, achieving 25% lower latency for the most common instructions than previous works while maintaining efficient resource utilization.

Our FPU also allows users to explore various mantissa widths without having to rewrite or recompile their algorithms. We use this capability to investigate the scalability of our reduced-precision FPU across numerous microbenchmark functions as well as more complex case studies. Our experiments show that, with a 16-bit mantissa width, applications like the discrete cosine transform and the Black-Scholes model can realize a speedup of more than 1.35x together with reductions of 43% and 35% in lookup table and flip-flop resources, respectively, while experiencing less than a 0.025% average loss in numerical accuracy.
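
To give a feel for what a narrower mantissa means numerically, the following is a minimal software sketch, not the authors' hardware design. It rounds a value to IEEE binary32 and then keeps only a chosen number of explicit fraction bits, using round-to-nearest, ties-to-even. The function name `round_to_mantissa_bits` is hypothetical, and treating "16-bit mantissa width" as 16 explicit fraction bits is an assumption made only for illustration.

```python
import struct


def round_to_mantissa_bits(x: float, mantissa_bits: int) -> float:
    """Round x to binary32, then to `mantissa_bits` explicit fraction bits.

    Hypothetical helper for illustration only. Rounding is to nearest,
    ties to even, applied directly to the binary32 encoding.
    """
    if not 1 <= mantissa_bits <= 23:
        raise ValueError("mantissa_bits must be between 1 and 23")

    # Reinterpret the binary32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    exponent = (bits >> 23) & 0xFF
    drop = 23 - mantissa_bits
    if exponent == 0xFF or drop == 0:
        # Infinities and NaNs pass through; so does full binary32 precision.
        return struct.unpack("<f", struct.pack("<I", bits))[0]

    # Round to nearest, ties to even, on the integer encoding. A carry out
    # of the fraction field increments the exponent, which is the correct
    # result because the encoding is monotonic in magnitude.
    half = 1 << (drop - 1)
    lsb = (bits >> drop) & 1
    bits = (bits + half - 1 + lsb) & ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def relative_error(x: float, mantissa_bits: int) -> float:
    """Relative error introduced by storing x with a reduced mantissa."""
    return abs(round_to_mantissa_bits(x, mantissa_bits) - x) / abs(x)


if __name__ == "__main__":
    for m in (23, 16, 10, 7):
        print(m, relative_error(3.14159, m))
```

This sketch only models the rounding of a stored value; it says nothing about the latency or resource figures reported above, which depend on the hardware implementation.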

Index Terms

1. Computer systems organization
  1. Architectures
    1. Serial architectures
      1. Reduced instruction set computing
2. Hardware
  1. Integrated circuits
    1. Logic circuits
      1. Arithmetic and datapath circuits
    2. Reconfigurable logic and FPGAs
      1. Reconfigurable logic applications

Published In

ACM Transactions on Reconfigurable Technology and Systems Volume 17, Issue 2

June 2024

464 pages

EISSN:1936-7414

DOI:10.1145/3613550

Editor:
Deming Chen
University of Illinois, Urbana-Champaign, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 April 2024

Online AM: 15 March 2024

Accepted: 15 February 2024

Revised: 14 December 2023

Received: 17 June 2023

Published in TRETS Volume 17, Issue 2

Qualifiers

Research-article

Funding Sources

Natural Sciences and Engineering Research Council of Canada
