poster

Reconfigurable custom floating-point instructions (abstract only)

Authors: Richard Neil Pittman, Alessandro Forin

FPGA '10: Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays

Page 287

https://doi.org/10.1145/1723112.1723173

Published: 21 February 2010

Abstract

Multimedia and communication algorithms in the embedded-systems domain often make extensive use of floating-point arithmetic. Because floating-point hardware is complex and expensive, these algorithms are usually converted to fixed-point operations or run on software floating-point emulation. This study presents the design and implementation of custom floating-point units that leverage the partial-reconfiguration feature of state-of-the-art FPGAs. The custom units can be dynamically configured, loaded, and executed when software applications need them. The system is binary compatible with the conventional MIPS architecture, complies with the IEEE-754 standard, and supports most floating-point operations and related functionality. We also investigate several customization strategies and construct a set of optimized functional modules to meet different application requirements. Using LINPACK as a floating-point-intensive example, we replace a sequence of 25 instructions with a custom unit and demonstrate an overall 80x application speedup.
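
To make the fixed-point workaround mentioned in the abstract concrete, the following is a minimal sketch of a LINPACK-style `y += alpha * x` update carried out in Q16.16 fixed point. It is not taken from the poster: the Q16.16 format and the names `to_fixed`, `to_float`, `fixed_mul`, and `daxpy_fixed` are assumptions chosen for illustration, and the abstract does not say which 25 instructions the authors replaced.

```python
# Illustrative sketch only; the Q16.16 format and all names are assumptions.
FRAC_BITS = 16          # number of fractional bits (assumed Q16.16)
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16, rounding to the nearest representable value."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q16.16 value back to a float."""
    return q / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values.

    The raw integer product carries 32 fractional bits, so it is shifted
    right by FRAC_BITS to return to Q16.16 (the low bits are discarded).
    """
    return (a * b) >> FRAC_BITS


def daxpy_fixed(alpha: int, xs: list[int], ys: list[int]) -> list[int]:
    """Return y[i] + alpha * x[i] for each element, all in Q16.16."""
    return [y + fixed_mul(alpha, x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    alpha = to_fixed(2.0)
    xs = [to_fixed(1.0), to_fixed(0.5)]
    ys = [to_fixed(1.0), to_fixed(1.0)]
    print([to_float(q) for q in daxpy_fixed(alpha, xs, ys)])
    # A value such as 0.1 is not exactly representable in Q16.16.
    print(to_float(to_fixed(0.1)))
```

The sketch shows the trade-off the abstract alludes to: integer shifts and multiplies replace floating-point hardware, at the cost of a fixed range and a quantization error that the programmer has to manage by hand.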

Index Terms

1. Applied computing
  1. Computers in other domains
    1. Personal computers and PC applications
      1. Microcomputers
2. Hardware
  1. Integrated circuits
    1. Logic circuits
      1. Arithmetic and datapath circuits

Published In

FPGA '10: Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays

February 2010

308 pages

ISBN:9781605589114

DOI:10.1145/1723112

General Chair: Peter Cheung, Imperial College London, UK

Program Chair: John Wawrzynek, UC Berkeley, USA

Copyright © 2010 Copyright held by author(s).

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

FPGA '10

Sponsor:

SIGDA

FPGA '10: ACM/SIGDA International Symposium on Field Programmable Gate Arrays

February 21 - 23, 2010

Monterey, California, USA

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%
