A fast Fourier transform compiler

Published: 01 May 1999

    Abstract

    The FFTW library for computing the discrete Fourier transform (DFT) has gained wide acceptance in both academia and industry because it provides excellent performance on a variety of machines; it is competitive with, or even faster than, equivalent libraries supplied by vendors. In FFTW, most of the performance-critical code was generated automatically by a special-purpose compiler, called genfft, that outputs C code. Written in Objective Caml, genfft can produce DFT programs for any input length, and it can specialize the DFT program for the common case where the input data are real instead of complex. Unexpectedly, genfft "discovered" algorithms that were previously unknown, and it was able to reduce the arithmetic complexity of some existing algorithms. This paper describes the internals of this special-purpose compiler in some detail, and it argues that a specialized compiler is a valuable tool.
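
    To make the idea of a program that writes DFT code concrete, here is a minimal sketch in Python. It is not genfft and does not reproduce its algorithms: it expands the naive O(n^2) DFT definition into straight-line C and drops multiplications by 0 and ±1. The function names (`gen_dft_c`, `_term`, `_sum`), the emitted C signature, and the sign convention ω = e^(-2πi/n) are all assumptions made for illustration.

```python
import cmath

def _term(coef, var):
    """Return (sign, text) for coef*var, or None when the coefficient is zero."""
    if coef == 0:
        return None
    sign = "-" if coef < 0 else "+"
    mag = abs(coef)
    # Multiplication by +/-1 is folded into the sign; other constants are kept.
    return (sign, var if mag == 1 else f"{mag!r} * {var}")

def _sum(terms):
    """Join signed terms into a single C expression."""
    terms = [t for t in terms if t is not None]
    if not terms:
        return "0.0"
    sign, text = terms[0]
    out = ("-" if sign == "-" else "") + text
    for sign, text in terms[1:]:
        out += f" {sign} {text}"
    return out

def gen_dft_c(n):
    """Return C source for an unrolled length-n DFT (hypothetical helper).

    Assumes the convention Y[k] = sum_j X[j] * exp(-2*pi*i*j*k/n), with the
    complex input and output split into separate, non-aliased real and
    imaginary arrays.
    """
    lines = [f"void dft_{n}(const double *xr, const double *xi, "
             f"double *yr, double *yi)", "{"]
    for k in range(n):
        re_terms, im_terms = [], []
        for j in range(n):
            w = cmath.exp(-2j * cmath.pi * ((j * k) % n) / n)
            # Round so that values like 6e-17 become exact zeros.
            c, s = round(w.real, 15), round(w.imag, 15)
            # (xr + i*xi)(c + i*s) = (c*xr - s*xi) + i*(s*xr + c*xi)
            re_terms += [_term(c, f"xr[{j}]"), _term(-s, f"xi[{j}]")]
            im_terms += [_term(s, f"xr[{j}]"), _term(c, f"xi[{j}]")]
        lines.append(f"    yr[{k}] = {_sum(re_terms)};")
        lines.append(f"    yi[{k}] = {_sum(im_terms)};")
    lines.append("}")
    return "\n".join(lines)

print(gen_dft_c(4))
```

    For n = 4 every twiddle factor is 0 or ±1, so the emitted function body contains only additions and subtractions. A real generator would go much further than this sketch, for example by starting from a fast algorithm rather than the definition and by sharing common subexpressions.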

    Published In

    ACM SIGPLAN Notices, Volume 34, Issue 5
    May 1999
    304 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/301631

    PLDI '99: Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
    May 1999
    304 pages
    ISBN: 1581130945
    DOI: 10.1145/301618
    Publisher

    Association for Computing Machinery

    New York, NY, United States
