research-article

Can PDES scale in environments with heterogeneous delays?

Authors:

Ketan Bahulkar,

Dmitry Ponomarev,

Nael Abu-Ghazaleh

SIGSIM PADS '13: Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation

Pages 35 - 46

https://doi.org/10.1145/2486092.2486098

Published: 19 May 2013

Abstract

The performance and scalability of Parallel Discrete Event Simulation (PDES) are often limited by communication latencies and overheads. The emergence of multi-core processors and their expected evolution into many-cores offers the promise of low-latency communication and tight memory integration between cores; these properties should significantly improve the performance of PDES in such environments. However, on clusters of multi-cores (CMs), the latency and processing overheads incurred when communicating between different machines (nodes) far outweigh those between cores on the same chip, especially when commodity networking fabrics and communication software are used. It is unclear whether the low latency among cores on the same node brings any benefit, given that communication links across nodes are significantly worse. In this study, we examine the performance of a multi-threaded implementation of PDES on CMs. We demonstrate that inter-node communication costs impose a substantial bottleneck on PDES and that, without optimizations addressing these long latencies, multi-threaded PDES does not significantly outperform the multi-process version despite direct communication through shared memory on the individual nodes. We then propose three optimizations: message consolidation and routing, infrequent polling, and latency-sensitive model partitioning. We show that with these optimizations in place, the threaded implementation of PDES significantly outperforms the process-based implementation even on CMs.
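
To make the first of the three optimizations more concrete, the following is a minimal Python sketch of message consolidation: events bound for the same remote node are buffered and shipped as one batch, so the per-message network overhead is paid once per batch rather than once per event. It is not the authors' implementation. The names `Event`, `ConsolidatingSender`, `batch_size`, and the `sent` list (which stands in for the network) are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp: float
    dest_node: int   # hypothetical: index of the machine hosting the receiver
    payload: str


@dataclass
class ConsolidatingSender:
    """Buffers remote events per destination node and ships them in batches."""
    batch_size: int = 4  # assumed threshold, not a value from the paper
    buffers: dict = field(default_factory=lambda: defaultdict(list))
    sent: list = field(default_factory=list)  # stand-in for the network

    def send(self, event: Event) -> None:
        buf = self.buffers[event.dest_node]
        buf.append(event)
        # Ship the batch once enough events for this node have accumulated.
        if len(buf) >= self.batch_size:
            self.flush(event.dest_node)

    def flush(self, node: int) -> None:
        batch = self.buffers.pop(node, [])
        if batch:
            # One "network message" carries the whole batch.
            self.sent.append((node, batch))

    def flush_all(self) -> None:
        # Drain leftovers, e.g. before a synchronization point.
        for node in list(self.buffers):
            self.flush(node)


# Usage: seven events alternating between two remote nodes.
sender = ConsolidatingSender(batch_size=3)
for i in range(7):
    sender.send(Event(timestamp=float(i), dest_node=i % 2, payload=f"e{i}"))
sender.flush_all()
messages = len(sender.sent)
events = sum(len(batch) for _, batch in sender.sent)
```

In this toy run the seven events travel in three batches. A real simulator would also have to bound how long an event may wait in a buffer, since delaying events can affect synchronization; that trade-off is outside the scope of this sketch.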

Index Terms

Can PDES scale in environments with heterogeneous delays?
1. Computing methodologies
  1. Modeling and simulation
    1. Simulation types and techniques
      1. Discrete-event simulation
      2. Massively parallel and high-performance simulations

Published In

SIGSIM PADS '13: Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation

May 2013

426 pages

ISBN: 9781450319201

DOI: 10.1145/2486092

General Chair: Margaret L. Loper (Georgia Institute of Technology, USA)

Program Chair: Gabriel A. Wainer (Carleton University, Canada)

Copyright © 2013 ACM.

Sponsors

SIGSIM: ACM Special Interest Group on Simulation and Modeling

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGSIM-PADS '13

Sponsor: SIGSIM

SIGSIM-PADS '13: SIGSIM Principles of Advanced Discrete Simulation

May 19 - 22, 2013

Montréal, Québec, Canada

Acceptance Rates

SIGSIM PADS '13 Paper Acceptance Rate 29 of 75 submissions, 39%

Overall Acceptance Rate 398 of 779 submissions, 51%
