Micro-optimization of floating-point operations

Author: W. J. Dally

ACM SIGARCH Computer Architecture News, Volume 17, Issue 2

Pages 283 - 289

https://doi.org/10.1145/68182.68208

Published: 01 April 1989

Abstract

This paper describes micro-optimization, a technique for reducing the operation count and time required to perform floating-point calculations. Micro-optimization involves breaking floating-point operations into their constituent micro-operations and optimizing the resulting code. Exposing micro-operations to the compiler creates many opportunities for optimization. Redundant normalization operations can be eliminated or combined. Also, scheduling micro-operations separately allows dependent operations to be partially overlapped. A prototype expression compiler has been written to evaluate a number of micro-optimizations. On a set of benchmark expressions, operation count is reduced by 33% and execution time is reduced by 40%.
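The idea of eliminating a redundant normalization can be illustrated with a toy model. The sketch below is not the paper's prototype compiler: the `Num` type, the three micro-operations (`align`, `add_mant`, `normalize`), and the `counts` table are hypothetical names chosen for illustration, and the model uses exact integer mantissas, so rounding and finite mantissa width are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    """Toy floating-point value (an assumption for illustration): mant * 2**exp."""
    mant: int
    exp: int


# Tally of executed micro-operations, so the two strategies can be compared.
counts = {"align": 0, "add": 0, "normalize": 0}


def align(a, b):
    """Micro-op: shift both operands to the smaller of the two exponents."""
    counts["align"] += 1
    e = min(a.exp, b.exp)
    return Num(a.mant << (a.exp - e), e), Num(b.mant << (b.exp - e), e)


def add_mant(a, b):
    """Micro-op: add the mantissas of operands that already share an exponent."""
    counts["add"] += 1
    assert a.exp == b.exp
    return Num(a.mant + b.mant, a.exp)


def normalize(a):
    """Micro-op: canonical form (odd mantissa, or zero with exponent 0)."""
    counts["normalize"] += 1
    if a.mant == 0:
        return Num(0, 0)
    m, e = a.mant, a.exp
    while m % 2 == 0:
        m //= 2
        e += 1
    return Num(m, e)


def fadd(a, b):
    """Conventional add: every operation aligns, adds, and normalizes."""
    return normalize(add_mant(*align(a, b)))


def sum3_naive(a, b, c):
    """a + b + c as two complete floating-point adds (two normalizations)."""
    return fadd(fadd(a, b), c)


def sum3_micro(a, b, c):
    """Same sum with the intermediate normalization removed."""
    t = add_mant(*align(a, b))  # intermediate result left unnormalized
    return normalize(add_mant(*align(t, c)))
```

For three operands, both functions return the same value in this exact model, while `sum3_micro` runs one `normalize` instead of two. On real hardware with a fixed mantissa width, skipping an intermediate normalization also changes where rounding happens, which this sketch does not model.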

Published In

ACM SIGARCH Computer Architecture News Volume 17, Issue 2

Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems

April 1989

291 pages

ISSN:0163-5964

DOI:10.1145/68182

Editor: Joel Emer

ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
April 1989
303 pages
ISBN:0897913000
DOI:10.1145/70082
Chairman: Joel Emer
General Chair: John Hennessy (Stanford University)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

Hockert N, Compton K (2018) Improving Floating-Point Performance in Less Area, Journal of Signal Processing Systems, 10.1007/s11265-010-0561-y, 67:1 (31-46). Online publication date: 27-Dec-2018
https://dl.acm.org/doi/10.1007/s11265-010-0561-y
Thorson M (2011) Internet nuggets, ACM SIGARCH Computer Architecture News, 10.1145/2024716.2024722, 39:2 (36-52). Online publication date: 31-Aug-2011
https://dl.acm.org/doi/10.1145/2024716.2024722
Hockert N, Compton K (2009) FFPU: Fractured floating point unit for FPGA soft processors, 2009 International Conference on Field-Programmable Technology, 10.1109/FPT.2009.5377622 (143-150). Online publication date: Dec-2009
https://doi.org/10.1109/FPT.2009.5377622
Potkonjak M, Rabaey J (1994) Exploring the Algorithmic Design Space using High Level Synthesis, VLSI Design Methodologies for Digital Signal Processing Architectures, 10.1007/978-1-4615-2762-6_4 (131-167). Online publication date: 1994
https://doi.org/10.1007/978-1-4615-2762-6_4
Matthes W (1991) How many operation units are adequate?, ACM SIGARCH Computer Architecture News, 10.1145/122576.122586, 19:4 (94-108). Online publication date: 1-Jul-1991
https://dl.acm.org/doi/10.1145/122576.122586
Davis B, Baird R, Gavin P, Själander M, Finlayson I, Rasapour F, Cook G, Uh G, Whalley D, Tyson G, Iyer R, Garg S (2015) Scheduling instruction effects for a statically pipelined processor, Proceedings of the 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, 10.5555/2830689.2830710 (167-176). Online publication date: 4-Oct-2015
https://dl.acm.org/doi/10.5555/2830689.2830710
Davis B, Baird R, Gavin P, Sjalander M, Finlayson I, Rasapour F, Cook G, Uh G, Whalley D, Tyson G (2015) Scheduling instruction effects for a statically pipelined processor, 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), 10.1109/CASES.2015.7324557 (167-176). Online publication date: Oct-2015
https://doi.org/10.1109/CASES.2015.7324557
Reddy V, Gilani S, Gunadi E, Kim N, Schulte M, Lipasti M, Chou P, Huang R, Xie Y, Karnik T (2013) REEL, Proceedings of the 2013 International Symposium on Low Power Electronics and Design, 10.5555/2648668.2648716 (187-192). Online publication date: 4-Sep-2013
https://dl.acm.org/doi/10.5555/2648668.2648716
Finlayson I, Davis B, Gavin P, Uh G, Whalley D, Själander M, Tyson G (2013) Improving processor efficiency by statically pipelining instructions, ACM SIGPLAN Notices, 10.1145/2499369.2465559, 48:5 (33-44). Online publication date: 20-Jun-2013
https://dl.acm.org/doi/10.1145/2499369.2465559
Finlayson I, Davis B, Gavin P, Uh G, Whalley D, Själander M, Tyson G, Franke B, Xue J (2013) Improving processor efficiency by statically pipelining instructions, Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, 10.1145/2491899.2465559 (33-44). Online publication date: 20-Jun-2013
https://dl.acm.org/doi/10.1145/2491899.2465559