DOI: 10.1145/378993.378999

MemorIES: a programmable, real-time hardware emulation tool for multiprocessor server design

Published: 12 November 2000

Abstract

Modern system design often requires multiple levels of simulation for design validation and performance debugging. However, while machines have gotten faster and simulators have become more detailed, simulation speeds have not tracked machine speeds. As a result, it is difficult to simulate realistic problem sizes and hardware configurations for a target machine. Instead, researchers have focused on developing scaling methodologies and running smaller problem sizes and configurations that attempt to represent the behavior of the real problem. Given the increasing size of problems today, it is unclear whether such an approach yields accurate results. Moreover, although commercial workloads are prevalent and important in today's marketplace, many simulation tools are unable to adequately profile such applications, let alone at realistic sizes.

In this paper we present a hardware-based emulation tool that can be used to aid memory system designers. Our focus is on the memory system because the ever-widening gap between processor and memory speeds means that optimizing the memory subsystem is critical for performance. We present the design of the Memory Instrumentation and Emulation System (MemorIES). MemorIES is a programmable tool designed using FPGAs and SDRAMs. It plugs into an SMP bus to perform on-line emulation of several cache configurations, structures and protocols while the system is running real-life workloads in real time, without any slowdown in application execution speed. We demonstrate its usefulness in several case studies, and find several important results. First, using traces to perform system evaluation can lead to incorrect results (off by 100% or more in some cases) if the trace size is not sufficiently large. Second, MemorIES is able to detect performance problems by profiling miss behavior over the entire course of a run, rather than relying on a small interval of time. Finally, we observe that previous studies of SPLASH2 applications using scaled application sizes can result in optimistic miss rates relative to real sizes on real machines, providing potentially misleading data when used for design evaluation.
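
The sensitivity of miss rates to trace length can be illustrated with a small software model. The sketch below is not MemorIES and does not reproduce any result from the paper: it is a toy set-associative LRU cache driven by a synthetic looping trace, and every name in it (`SetAssociativeCache`, `looping_trace`, `measure`) and every parameter (256 sets, 4 ways, 64-byte lines, a 512-line working set) is an assumption chosen for illustration. It shows one well-known way a short trace can mislead, namely cold-start misses dominating the measurement; the abstract does not say whether this is the effect behind the errors the authors report.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache with LRU replacement (software model only)."""

    def __init__(self, num_sets, ways, line_bytes=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # One OrderedDict per set; keys are tags, ordered oldest to newest use.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, addr):
        """Look up one byte address; return True on a hit, False on a miss."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        lines = self.sets[index]
        self.accesses += 1
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used tag
        lines[tag] = True
        return False

    def miss_rate(self):
        return self.misses / self.accesses if self.accesses else 0.0


def looping_trace(working_set_lines, passes, line_bytes=64):
    """Synthetic trace (hypothetical): sweep the same working set repeatedly."""
    for _ in range(passes):
        for i in range(working_set_lines):
            yield i * line_bytes


def measure(passes):
    """Miss rate of a 1024-line cache on a 512-line looping working set."""
    cache = SetAssociativeCache(num_sets=256, ways=4)
    for addr in looping_trace(working_set_lines=512, passes=passes):
        cache.access(addr)
    return cache.miss_rate()


short_rate = measure(passes=1)    # trace ends before the cache has warmed up
long_rate = measure(passes=100)   # cold misses amortized over many accesses
print(f"{short_rate:.2f} {long_rate:.2f}")  # prints "1.00 0.01"
```

In this toy setup the working set fits in the cache, so the only misses are the 512 compulsory ones. Measured over a single pass they give a miss rate of 1.0; measured over 100 passes the same cache reports 0.01. A hardware emulator that observes the whole run avoids having to choose a trace window at all, which is the motivation the abstract gives for profiling over the entire course of a run.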



Published In

ASPLOS IX: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
November 2000
271 pages
ISBN: 1581133170
DOI: 10.1145/378993

• ACM SIGARCH Computer Architecture News, Volume 28, Issue 5
  Special Issue: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems (ASPLOS '00)
  Dec. 2000
  269 pages
  ISSN: 0163-5964
  DOI: 10.1145/378995
• ACM SIGOPS Operating Systems Review, Volume 34, Issue 5
  Dec. 2000
  269 pages
  ISSN: 0163-5980
  DOI: 10.1145/384264

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Conference

ASPLOS00: ASPLOS 2000 Conference
Cambridge, Massachusetts, USA

Acceptance Rates

ASPLOS IX paper acceptance rate: 24 of 114 submissions (21%)
Overall acceptance rate: 535 of 2,713 submissions (20%)


