research-article

Designing an IEEE-Compliant FPU that Supports Configurable Precision for Soft Processors

Published: 30 April 2024

Abstract

Field Programmable Gate Arrays (FPGAs) are commonly used to accelerate floating-point (FP) applications. Although researchers have extensively studied FPGA FP implementations, existing work has largely focused on standalone operators and frequency-optimized designs. These designs are not well suited to FPGA soft processors, which are more sensitive to latency, impose a lower frequency ceiling, and require IEEE FP standard compliance. We present an open-source floating-point unit (FPU) for FPGA RISC-V soft processors that is fully IEEE compliant and supports configurable levels of FP precision. Our design emphasizes runtime performance, achieving 25% lower latency in the most common instructions than previous works while maintaining efficient resource utilization.
Our FPU also allows users to explore various mantissa widths without having to rewrite or recompile their algorithms. We use this capability to investigate the scalability of our reduced-precision FPU across numerous microbenchmark functions as well as more complex case studies. Our experiments show that, with a 16-bit mantissa width, applications such as the discrete cosine transform and the Black-Scholes model can realize a speedup of more than 1.35x, together with reductions of 43% and 35% in lookup table and flip-flop resources, respectively, while experiencing less than a 0.025% average loss in numerical accuracy.
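
The accuracy side of this trade-off can be illustrated in software. The following is a minimal Python sketch, not the authors' hardware design: it emulates a narrower mantissa by rounding an IEEE binary32 value to a chosen number of explicit fraction bits using round-to-nearest-even. The function names `reduce_mantissa` and `relative_error` are hypothetical, and the sketch assumes that "mantissa width" counts stored fraction bits without the hidden leading one, which the abstract does not specify.

```python
import struct


def reduce_mantissa(x: float, bits: int) -> float:
    """Round x (as IEEE binary32) to `bits` explicit fraction bits.

    Software emulation only; assumes round-to-nearest-even and that
    `bits` excludes the hidden leading one (an assumption of this sketch).
    """
    if not 1 <= bits <= 23:
        raise ValueError("bits must be in the range 1..23")
    # Reinterpret the binary32 encoding of x as an unsigned 32-bit integer.
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    if (u >> 23) & 0xFF == 0xFF:
        return x  # infinities and NaNs pass through unchanged
    drop = 23 - bits  # number of low fraction bits to discard
    if drop > 0:
        half = 1 << (drop - 1)
        lsb = (u >> drop) & 1  # lowest bit that is kept
        # Adding (half - 1 + lsb) and then clearing the dropped bits rounds
        # to nearest, with ties going to the even kept value. A carry out of
        # the fraction correctly increments the exponent field.
        u = (u + half - 1 + lsb) & 0xFFFFFFFF
        u &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", u))[0]


def relative_error(exact: float, approx: float) -> float:
    """Relative error of approx with respect to a nonzero exact value."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    value = 3.141592653589793
    for width in (23, 16, 10, 7):
        approx = reduce_mantissa(value, width)
        print(width, approx, relative_error(value, approx))
```

For a single rounding of a normal value to `bits` fraction bits, the relative error is bounded by 2^-(bits+1). The application-level accuracy figures reported in the abstract depend on how such errors accumulate through a whole computation, which this sketch does not model.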


Information & Contributors

Information

Published In

ACM Transactions on Reconfigurable Technology and Systems  Volume 17, Issue 2
June 2024
464 pages
EISSN:1936-7414
DOI:10.1145/3613550
  • Editor: Deming Chen

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 April 2024
Online AM: 15 March 2024
Accepted: 15 February 2024
Revised: 14 December 2023
Received: 17 June 2023
Published in TRETS Volume 17, Issue 2

Author Tags

  1. FPGA
  2. soft processor
  3. floating point
  4. reduced precision
  5. RISC-V

Qualifiers

  • Research-article

Funding Sources

  • Natural Sciences and Engineering Research Council of Canada

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 169
  • Downloads (Last 12 months): 169
  • Downloads (Last 6 weeks): 16

Reflects downloads up to 22 Sep 2024
