GPU Computing Gems Jade Edition
November 2011
Publisher:
  • Morgan Kaufmann Publishers Inc.
  • 340 Pine Street, Sixth Floor
  • San Francisco
  • CA
  • United States
ISBN: 978-0-12-385963-1
Published: 02 November 2011
Pages: 560
Abstract

This is the second volume of Morgan Kaufmann's GPU Computing Gems, offering an all-new set of insights, ideas, and practical hands-on skills from researchers and developers worldwide. Each chapter gives you a window into the work being performed across a variety of application domains, and the opportunity to witness the impact of parallel GPU computing on the efficiency of scientific research.

GPU Computing Gems: Jade Edition showcases the latest research solutions with GPGPU and CUDA, including:

  • Improving memory access patterns for cellular automata using CUDA
  • Large-scale gas turbine simulations on GPU clusters
  • Identifying and mitigating credit risk using large-scale economic capital simulations
  • GPU-powered MATLAB acceleration with Jacket
  • Biologically-inspired machine vision
  • An efficient CUDA algorithm for the maximum network flow problem
  • 30 more chapters of innovative GPU computing ideas, written to be accessible to researchers from any industry

GPU Computing Gems: Jade Edition contains 100% new material covering a variety of application domains: algorithms and data structures, engineering, interactive physics for games, computational finance, and programming tools.

  • This second volume of GPU Computing Gems offers 100% new material of interest across industry, including finance, medicine, imaging, engineering, gaming, environmental science, green computing, and more
  • Covers new tools and frameworks for productive GPU computing application development and offers immediate benefit to researchers developing improved programming environments for GPUs
  • Even more hands-on, proven techniques demonstrating how general purpose GPU computing is changing scientific research
  • Distills the best practices of the community of CUDA programmers; each chapter provides insights and ideas as well as hands-on skills applicable to a variety of fields

Table of Contents

Part 1: Parallel Algorithms and Data Structures - Paulius Micikevicius, NVIDIA
  1. Large-Scale GPU Search
  2. Edge v. Node Parallelism for Graph Centrality Metrics
  3. Optimizing parallel prefix operations for the Fermi architecture
  4. Building an Efficient Hash Table on the GPU
  5. An Efficient CUDA Algorithm for the Maximum Network Flow Problem
  6. On Improved Memory Access Patterns for Cellular Automata Using CUDA
  7. Fast Minimum Spanning Tree Computation on Large Graphs
  8. Fast in-place sorting with CUDA based on bitonic sort

Part 2: Numerical Algorithms - Frank Jargstorff, NVIDIA
  9. Interval Arithmetic in CUDA
  10. Approximating the erfinv Function
  11. A Hybrid Method for Solving Tridiagonal Systems on the GPU
  12. LU Decomposition in CULA
  13. GPU Accelerated Derivative-free Optimization

Part 3: Engineering Simulation - Peng Wang, NVIDIA
  14. Large-scale gas turbine simulations on GPU clusters
  15. GPU acceleration of rarefied gas dynamic simulations
  16. Assembly of Finite Element Methods on Graphics Processors
  17. CUDA implementation of Vertex-Centered, Finite Volume CFD methods on Unstructured Grids with Flow Control Applications
  18. Solving Wave Equations on Unstructured Geometries
  19. Fast electromagnetic integral equation solvers on graphics processing units (GPUs)

Part 4: Interactive Physics for Games and Engineering Simulation - Richard Tonge, NVIDIA
  20. Solving Large Multi-Body Dynamics Problems on the GPU
  21. Implicit FEM Solver in CUDA
  22. Real-time Adaptive GPU multi-agent path planning

Part 5: Computational Finance - Thomas Bradley, NVIDIA
  23. High performance finite difference PDE solvers on GPUs for financial option pricing
  24. Identifying and Mitigating Credit Risk using Large-scale Economic Capital Simulations
  25. Financial Market Value-at-Risk Estimation using the Monte Carlo Method

Part 6: Programming Tools and Techniques - Cliff Wooley, NVIDIA
  26. Thrust: A Productivity-Oriented Library for CUDA
  27. GPU Scripting and Code Generation with PyCUDA
  28. Jacket: GPU Powered MATLAB Acceleration
  29. Accelerating Development and Execution Speed with Just In Time GPU Code Generation
  30. GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot
  31. Abstraction for AoS and SoA Layout in C++
  32. Processing Device Arrays with C++ Metaprogramming
  33. GPU Metaprogramming: A Case Study in Biologically-Inspired Machine Vision
  34. A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs
  35. Dynamic Load Balancing using Work-Stealing
  36. Applying software-managed caching and CPU/GPU task scheduling for accelerating dynamic workloads

Cited By

  1. Lettich F, Lucchese C, Nardini F, Orlando S, Perego R, Tonellotto N and Venturini R (2019). Parallel Traversal of Large Ensembles of Decision Trees, IEEE Transactions on Parallel and Distributed Systems, 30:9, (2075-2089), Online publication date: 1-Sep-2019.
  2. Wynters E (2018). Parallel particle swarm optimization can solve many optimization problems quickly on GPUS, Journal of Computing Sciences in Colleges, 33:6, (114-123), Online publication date: 1-Jun-2018.
  3. Andrzejewski W and Boinski P (2018). Efficient spatial co-location pattern mining on multiple GPUs, Expert Systems with Applications: An International Journal, 93:C, (465-483), Online publication date: 1-Mar-2018.
  4. Small P, Liu K, Tiwari S, Kalia R, Nakano A, Nomura K and Vashishta P. Acceleration of Dynamic n-Tuple Computations in Many-Body Molecular Dynamics, Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, (159-170)
  5. Das R (2017). GPUs in subsurface simulation, Engineering with Computers, 33:4, (919-934), Online publication date: 1-Oct-2017.
  6. Perea J and Cordero J. An improved parallel technique for neighbour search on CUDA, Proceedings of the XXVII Spanish Computer Graphics Conference, (1-10)
  7. George A, Manoj S, Gupte S and Sarkar S. How Effective is Design Abstraction in Thrust? Proceedings of the 2017 Workshop on Software Engineering Methods for Parallel and High Performance Applications, (3-10)
  8. Che S, Orr M and Gallmeier J. Work Stealing in a Shared Virtual-Memory Heterogeneous Environment, Proceedings of the Computing Frontiers Conference, (164-173)
  9. Jodra J, Gurrutxaga I, Muguerza J and Yera A (2017). Solving Poisson's equation using FFT in a GPU cluster, Journal of Parallel and Distributed Computing, 102:C, (28-36), Online publication date: 1-Apr-2017.
  10. Barford L, Bhattacharyya S and Liu Y (2017). Data Flow Algorithms for Processors with Vector Extensions, Journal of Signal Processing Systems, 87:1, (21-31), Online publication date: 1-Apr-2017.
  11. Goli M and González-Vélez H (2018). Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures, International Journal of Parallel Programming, 45:2, (203-224), Online publication date: 1-Apr-2017.
  12. Kukreja N, Louboutin M, Vieira F, Luporini F, Lange M and Gorman G. Devito, Proceedings of the Sixth International Workshop on Domain-Specific Languages and High-Level Frameworks for HPC, (11-19)
  13. Sorensen T and Donaldson A (2016). Exposing errors related to weak memory in GPU applications, ACM SIGPLAN Notices, 51:6, (100-113), Online publication date: 1-Aug-2016.
  14. Sorensen T and Donaldson A. Exposing errors related to weak memory in GPU applications, Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, (100-113)
  15. Sarkar S. Developer Productivity in HPC Application Development, Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications, (29-30)
  16. Che S, Orr M, Rodgers G and Gallmeier J. Betweenness Centrality in an HSA-enabled System, Proceedings of the ACM Workshop on High Performance Graph Processing, (35-38)
  17. Vias M, Fraguela B, Bozkus Z and Andrade D (2015). Improving OpenCL Programmability with the Heterogeneous Programming Library, Procedia Computer Science, 51:C, (110-119), Online publication date: 1-Sep-2015.
  18. Bailey M. Fundamentals seminar, ACM SIGGRAPH 2015 Courses, (1-129)
  19. Alglave J, Batty M, Donaldson A, Gopalakrishnan G, Ketema J, Poetzl D, Sorensen T and Wickerson J (2015). GPU Concurrency, ACM SIGARCH Computer Architecture News, 43:1, (577-591), Online publication date: 29-May-2015.
  20. Alglave J, Batty M, Donaldson A, Gopalakrishnan G, Ketema J, Poetzl D, Sorensen T and Wickerson J (2015). GPU Concurrency, ACM SIGPLAN Notices, 50:4, (577-591), Online publication date: 12-May-2015.
  21. Alglave J, Batty M, Donaldson A, Gopalakrishnan G, Ketema J, Poetzl D, Sorensen T and Wickerson J. GPU Concurrency, Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, (577-591)
  22. Reguly I, Mudalige G, Giles M, Curran D and McIntosh-Smith S. The OPS domain specific abstraction for multi-block structured grid computations, Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, (58-67)
  23. Noack M, Wende F, Steinke T and Cordes F. A unified programming model for intra- and inter-node offloading on Xeon Phi clusters, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (203-214)
  24. Vaidya A, Shayesteh A, Woo D, Saharoy R and Azimi M (2013). SIMD divergence optimization through intra-warp compaction, ACM SIGARCH Computer Architecture News, 41:3, (368-379), Online publication date: 26-Jun-2013.
  25. Vaidya A, Shayesteh A, Woo D, Saharoy R and Azimi M. SIMD divergence optimization through intra-warp compaction, Proceedings of the 40th Annual International Symposium on Computer Architecture, (368-379)
  26. Gonen O, Mahapatra S, Batra J and Liu J. Exploring GPU architectures to accelerate semantic comparison for intention-based search, Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, (137-145)
  27. Edwards H and Sunderland D. Kokkos Array performance-portable manycore programming model, Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores, (1-10)