DOI: 10.1145/1878537.1878637

Using GPU to accelerate a pin-based multi-level cache simulator

Published: 11 April 2010

Abstract

Trace-driven simulation is the most widely used method for evaluating the design of future computer memory architectures. Because it demands large amounts of storage and computation time, there is a growing need for simulation methodologies that can determine the memory system requirements of emerging workloads in a reasonable amount of time. Several techniques have been proposed to reduce the space needed to store memory references and to improve the performance of sequential trace-driven simulation. This paper presents the use of binary instrumentation as the memory reference generator, together with a parallel simulation technique based on a general-purpose graphics processing unit (GPU). One way to achieve fast parallel simulation is to simulate the independent sets of a cache concurrently on different compute resources, but results show that this method is not efficient because of the high correlation of activity between different sets. To put parallelism to effective use, we show that simulating multiple cache configurations in a single pass over the trace achieves a 2.44x performance improvement over the traditional sequential algorithm.
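The multi-configuration, single-pass idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `LRUCache` class, the `simulate_single_pass` function, the LRU replacement policy, and the example trace are all assumptions made for illustration, and the code runs sequentially on the CPU. On a GPU, one would presumably map each configuration in the inner loop to its own thread.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical set-associative cache model with LRU replacement."""

    def __init__(self, num_sets, ways, line_bytes):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)          # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(lines) >= self.ways:
                lines.popitem(last=False)   # evict least recently used line
            lines[tag] = True


def simulate_single_pass(trace, configs):
    """Read the trace once and feed every address to all configurations."""
    caches = [LRUCache(*cfg) for cfg in configs]
    for addr in trace:
        # Configurations are independent of one another, so this inner
        # loop is the part that could run in parallel.
        for cache in caches:
            cache.access(addr)
    return [(c.hits, c.misses) for c in caches]


# Assumed toy trace of byte addresses and (num_sets, ways, line_bytes) tuples.
trace = [0, 64, 0, 128, 0, 64]
configs = [(1, 1, 64), (1, 2, 64), (2, 1, 64)]
results = simulate_single_pass(trace, configs)
```

Each entry of `results` is a `(hits, misses)` pair for the corresponding configuration. The point of the design is that the trace is read only once regardless of how many configurations are evaluated, and the per-configuration cache states never interact.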


Cited By

  • (2016) Parallel Computing Education Through Simulation. Theory, Methodology, Tools and Applications for Modeling and Simulation of Complex Systems, pp. 585-591. DOI: 10.1007/978-981-10-2663-8_60. Online publication date: 22-Sep-2016
  • (2011) Scalable Multi-cache Simulation Using GPUs. Proceedings of the 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, pp. 159-167. DOI: 10.1109/MASCOTS.2011.24. Online publication date: 25-Jul-2011

Published In

SpringSim '10: Proceedings of the 2010 Spring Simulation Multiconference
April 2010
1726 pages
ISBN: 9781450300698

Sponsors

  • SCS: Society for Modeling and Simulation International

Publisher

Society for Computer Simulation International

San Diego, CA, United States

Author Tags

  1. CUDA
  2. GPGPU
  3. cache
  4. parallel simulation
  5. pin

Qualifiers

  • Research-article

Conference

SpringSim '10: 2010 Spring Simulation Conference
Sponsor: SCS
April 11 - 15, 2010
Orlando, Florida
