Abstract
We describe a novel, practical parallel FFT scheme designed for SIMD systems and/or data-parallel programming. A bit-exchange of elements between the processors is avoided by means of the ‘Transpose Algorithm’. Our transposition is based on the assignment of the data field onto a 1-dimensional ring of systolic cells, which is subsequently mapped onto a ring of processors realized as a subset of the system's connectivity. We have implemented and benchmarked a 2-dimensional parallel FFT code on the APE100/Quadrics parallel computer, where, owing to rigid next-neighbour connectivity and the lack of local addressing, efficient FFT implementations could not be realized so far.
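The general idea behind a transpose-based 2-dimensional FFT can be illustrated with a minimal serial sketch: transform every row, transpose, transform every row again, and transpose back, so that each 1-dimensional FFT only ever touches contiguous row data. This is not the authors' APE100/Quadrics implementation. The function names `fft1d`, `transpose`, and `fft2d_transpose` are hypothetical, and the in-memory `transpose` merely stands in for the ring-based systolic transposition, whose details the abstract does not give.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(a):
    """Plain in-memory transpose (assumption: placeholder for the
    ring-based transposition between processors)."""
    return [list(col) for col in zip(*a)]


def fft2d_transpose(a):
    """2-D FFT as: row FFTs, transpose, row FFTs, transpose back."""
    rows = [fft1d(r) for r in a]                # transform along the second index
    cols = [fft1d(r) for r in transpose(rows)]  # transform along the first index
    return transpose(cols)
```

In a parallel setting each processor would hold a block of rows, so the two `fft1d` passes need no communication and all data movement is concentrated in the transposition step.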
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lippert, T., Schilling, K., Toschi, F., Trentmann, S., Tripiccione, R. (1998). Transpose algorithm for FFT on APE/Quadrics. In: Sloot, P., Bubak, M., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1998. Lecture Notes in Computer Science, vol 1401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0037171
DOI: https://doi.org/10.1007/BFb0037171
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64443-9
Online ISBN: 978-3-540-69783-1
eBook Packages: Springer Book Archive