DOI: 10.1145/3615979.3656056
short-paper

Follow the Leader: Alternating CPU/GPU Computations in PDES

Published: 24 June 2024

Abstract

Despite the success of graphics processing units (GPUs) in accelerating simulations in several research fields, their use is largely restricted to domain-specific workloads that consistently offer the high degree of inherent parallelism and computational intensity at which GPUs excel. When targeting generic discrete-event simulations, whose dynamics can vary wildly over time, a static choice between GPU-based and traditional CPU-based execution is likely to be suboptimal. Here, we explore a parallel discrete-event simulation (PDES) execution scheme for CPU-GPU platforms that aims to approximate an optimal dynamic device choice. A current “leader” device running the simulation is periodically challenged by a brief concurrent run on another device starting from an intermediate model state. Based on the gathered performance measurements, a forecasting scheme determines the leader for the next period. The execution time and power consumption of this scheme hinge on 1) an efficient mechanism for providing the “follower” device with a consistent model state, and 2) robust performance forecasting to justify the device choices. We present these building blocks, their implementation combining the existing CPU and GPU simulators ROOT-Sim and GPUTW, and measurement results demonstrating substantially reduced execution time without increasing energy consumption over a static device choice.
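
The leader selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the forecasting scheme, so double exponential smoothing (level plus trend) is used here purely as a stand-in. The class `HoltForecaster`, the function `choose_leader`, the smoothing factors, the switching margin, and the event-rate numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HoltForecaster:
    """Double exponential smoothing of one device's measured event rate."""
    alpha: float = 0.5             # level smoothing factor (assumed value)
    beta: float = 0.3              # trend smoothing factor (assumed value)
    level: Optional[float] = None  # smoothed rate; None until first sample
    trend: float = 0.0             # smoothed change in rate per period

    def update(self, rate: float) -> None:
        """Fold one new measurement (e.g. committed events per second) in."""
        if self.level is None:
            self.level = rate
            return
        prev_level = self.level
        self.level = self.alpha * rate + (1 - self.alpha) * (prev_level + self.trend)
        self.trend = self.beta * (self.level - prev_level) + (1 - self.beta) * self.trend

    def forecast(self, steps: int = 1) -> float:
        """Predicted rate `steps` periods ahead."""
        if self.level is None:
            raise ValueError("no measurements yet")
        return self.level + steps * self.trend


def choose_leader(forecasters: Dict[str, HoltForecaster],
                  current: str, margin: float = 1.05) -> str:
    """Keep the current leader unless another device's forecast beats it
    by more than `margin`, which damps switching on noisy measurements."""
    best = max(forecasters, key=lambda d: forecasters[d].forecast())
    if best != current and (forecasters[best].forecast()
                            > margin * forecasters[current].forecast()):
        return best
    return current


# Made-up event rates for three periods; in each period the follower's
# rate would come from its brief concurrent challenge run.
rates = {"cpu": [100.0, 110.0, 120.0], "gpu": [90.0, 130.0, 170.0]}
forecasters = {device: HoltForecaster() for device in rates}
leader = "cpu"
for period in range(3):
    for device in rates:
        forecasters[device].update(rates[device][period])
    leader = choose_leader(forecasters, leader)
print(leader)
```

The margin is one possible way to account for the cost of handing the model state over to the other device; a fuller treatment would weigh the forecast gain against the measured transfer time and energy.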


Cited By

  • (2024) Model-Driven Engineering for High-Performance Parallel Discrete Event Simulations on Heterogeneous Architectures. Proceedings of the Winter Simulation Conference, 2202-2213. DOI: 10.5555/3712729.3712913. Online publication date: 15-Dec-2024
  • (2024) Reproducibility Report for the Paper: Follow the Leader: Alternating CPU/GPU Computations in PDES. Proceedings of the 38th ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, 135-140. DOI: 10.1145/3615979.3665109. Online publication date: 24-Jun-2024
  • (2024) Model-Driven Engineering for High-Performance Parallel Discrete Event Simulations on Heterogeneous Architectures. 2024 Winter Simulation Conference (WSC), 2202-2213. DOI: 10.1109/WSC63780.2024.10838978. Online publication date: 15-Dec-2024

Published In

SIGSIM-PADS '24: Proceedings of the 38th ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
June 2024
155 pages
ISBN: 9798400703638
DOI: 10.1145/3615979

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GPU
  2. Parallel simulation
  3. Speculative simulation
  4. Time Warp

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

SIGSIM-PADS '24
Acceptance Rates

Overall Acceptance Rate 398 of 779 submissions, 51%

Article Metrics

  • Downloads (last 12 months): 34
  • Downloads (last 6 weeks): 3
Reflects downloads up to 10 Feb 2025
