DOI: 10.5555/2388996.2389034
research-article

Efficient backprojection-based synthetic aperture radar computation with many-core processors

Published: 10 November 2012

Abstract

Tackling computationally challenging problems with high efficiency often requires the combination of algorithmic innovation, advanced architecture, and thorough exploitation of parallelism. We demonstrate this synergy through synthetic aperture radar (SAR) via backprojection, an image reconstruction method that can require hundreds of TFLOPS. Computation cost is significantly reduced by our new algorithm of approximate strength reduction; data movement cost is economized by software locality optimizations facilitated by advanced architecture support; parallelism is fully harnessed in various patterns and granularities. We deliver over 35 billion backprojections per second throughput per compute node on an Intel® Xeon® E5-2670-based cluster, equipped with Intel® Xeon Phi™ coprocessors. This corresponds to processing a 3K x 3K image within a second using a single node. Our study can be extended to other settings: backprojection is applicable elsewhere including medical imaging, approximate strength reduction is a general code transformation technique, and many-core processors are emerging as a solution to energy-efficient computing.
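The abstract names strength reduction as a general code transformation. As a rough illustration only, the sketch below applies classical strength reduction to a phase factor whose angle grows linearly, replacing a cosine/sine evaluation per element with one complex multiply. It is not the paper's approximate strength reduction algorithm, which the abstract does not describe in detail. The function names and the parameters `phi0`, `delta`, and `n` are assumptions introduced for this example.

```python
import cmath


def phasors_direct(phi0, delta, n):
    """Evaluate exp(i*(phi0 + k*delta)) with one exponential per element."""
    return [cmath.exp(1j * (phi0 + k * delta)) for k in range(n)]


def phasors_reduced(phi0, delta, n):
    """Produce the same sequence with one complex multiply per element."""
    step = cmath.exp(1j * delta)   # hoisted out of the loop, computed once
    w = cmath.exp(1j * phi0)       # first element of the sequence
    out = []
    for _ in range(n):
        out.append(w)
        w *= step                  # cheap update replaces a cos/sin call
    return out


def max_abs_error(a, b):
    """Largest elementwise difference between two phasor sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    ref = phasors_direct(0.3, 0.01, 1000)
    fast = phasors_reduced(0.3, 0.01, 1000)
    print(max_abs_error(ref, fast))
```

The recurrence accumulates floating-point rounding error as the sequence grows, so a practical implementation would likely restart it from a directly computed value at regular intervals. That trade of exactness for cheaper operations is in the spirit of what the abstract calls approximate strength reduction, though the authors' actual scheme may differ.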


Cited By

  • (2016) A unified Coq framework for verifying C programs with floating-point computations. Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, pp. 15-26. DOI: 10.1145/2854065.2854066. Online publication date: 18-Jan-2016
  • (2014) Versatile and scalable parallel histogram construction. Proceedings of the 23rd international conference on Parallel architectures and compilation, pp. 127-138. DOI: 10.1145/2628071.2628108. Online publication date: 24-Aug-2014
  • (2013) Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/2503210.2503242. Online publication date: 17-Nov-2013
      Published In

      SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
      November 2012
      1161 pages
      ISBN: 9781467308045

      Publisher

      IEEE Computer Society Press

      Washington, DC, United States

      Qualifiers

      • Research-article

      Conference

      SC '12
      Acceptance Rates

      SC '12 Paper Acceptance Rate 100 of 461 submissions, 22%;
      Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 3
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 25 Jan 2025
