DOI: 10.1145/1531743.1531751
research-article

Core monitors: monitoring performance in multicore processors

Published: 18 May 2009

Abstract

As we reach the limits of single-core computing, we are promised more and more cores in our systems. Modern architectures include many performance counters per core, but few or no inter-core counters. In fact, performance counters were not designed to be exploited by users, as they now are, but simply as aids for hardware debugging and testing during system creation. As such, they tend to be an afterthought in the design, with no standardization across or within platforms. Nonetheless, given access to these counters, researchers are using them to great advantage [17]. Furthermore, evaluating counters for multicore systems has become a complex and resource-consuming task. We propose a Performance Monitoring System consisting of a specialized CPU core designed to allow efficient collection and evaluation of performance data for both static and dynamic optimizations. Our system provides a transparent mechanism to change architectural features dynamically, inform the Operating System of process behaviors, and assist in profiling and debugging. For instance, a piece of hardware watching snoop packets can determine when a write-update cache coherence protocol would be helpful or detrimental to the currently running program. Our system is designed to allow the hardware to feed performance statistics back to software, allowing dynamic architectural adjustments at runtime.
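The snoop-watching example in the abstract can be illustrated with a small software model. The sketch below is not the paper's mechanism. It is a hypothetical heuristic, assuming the monitor sees a stream of (core, operation, address) snoop events, and every name in it (`SnoopMonitor`, `LineStats`, `observe`, `recommend`) is invented for illustration. A write counts as "useful" when another core reads the line before the next write, since a write-update protocol would have saved that core a coherence miss. A write counts as "wasted" when it is overwritten before any other core reads it, since an update broadcast for it would have been pure traffic.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineStats:
    """Per-cache-line bookkeeping (hypothetical layout, for illustration)."""
    last_writer: Optional[int] = None  # core that issued the most recent write
    read_since_write: bool = False     # has another core read since that write?
    useful: int = 0                    # writes later read by a different core
    wasted: int = 0                    # writes overwritten before any such read


class SnoopMonitor:
    """Toy model of a monitor that watches snooped reads and writes."""

    def __init__(self) -> None:
        self.lines = defaultdict(LineStats)

    def observe(self, core: int, op: str, addr: int) -> None:
        """Record one snooped event; op is "R" (read) or "W" (write)."""
        s = self.lines[addr]
        if op == "W":
            # The previous write was never consumed by another core, so
            # pushing it to sharers would have been wasted bandwidth.
            if s.last_writer is not None and not s.read_since_write:
                s.wasted += 1
            s.last_writer = core
            s.read_since_write = False
        elif op == "R":
            # First read by a different core after a write: an update
            # protocol would have turned this miss into a hit.
            if (s.last_writer is not None and core != s.last_writer
                    and not s.read_since_write):
                s.useful += 1
                s.read_since_write = True
        else:
            raise ValueError(f"unknown operation: {op!r}")

    def recommend(self) -> str:
        """Suggest a protocol from the totals seen so far."""
        useful = sum(s.useful for s in self.lines.values())
        wasted = sum(s.wasted for s in self.lines.values())
        return "write-update" if useful > wasted else "write-invalidate"


# Producer/consumer sharing: every write by core 0 is read by core 1.
monitor = SnoopMonitor()
for _ in range(3):
    monitor.observe(0, "W", 0x40)
    monitor.observe(1, "R", 0x40)
print(monitor.recommend())
```

A simple majority vote is used here only to keep the sketch short. A real monitor would more likely weigh the cost of a miss against the cost of an update message and decide per line or per program phase.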

References

[1]
B. Sprunt. Pentium 4 performance-monitoring features. IEEE Micro, 22(4):72--82, Jul/Aug 2002.
[2]
W. Binder. Portable and accurate sampling profiling for Java. Softw. Pract. Exper., 36(6):615--650, 2006.
[3]
N. L. Binkert, R. G. Dreslinski, L. R. Hsu, K. T. Lim, A. G. Saidi, and S. K. Reinhardt. The M5 simulator: Modeling networked systems. IEEE Micro, 26(4):52--60, 2006.
[4]
K. Chow and Y. Wu. Feedback-directed selection and characterization of compiler optimizations. 2nd Workshop on Feedback Directed Optimization, 1999.
[5]
Compaq. Alpha architecture handbook. White paper, October 1998.
[6]
J. Dean, J. Hicks, C. Waldspurger, W. Weihl, and G. Chrysos. ProfileMe: Hardware support for instruction-level profiling on out-of-order processors. In Proc. IEEE/ACM 30th International Symposium on Microarchitecture, pages 292--302, Dec. 1997.
[7]
J. Dean, J. E. Hicks, C. A. Waldspurger, W. E. Weihl, and G. Chrysos. ProfileMe: hardware support for instruction-level profiling on out-of-order processors. In MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, pages 292--302, Washington, DC, USA, 1997. IEEE Computer Society.
[8]
G. Delzanno. Automatic verification of parameterized cache coherence protocols. In Computer Aided Verification, pages 53--68, Dec. 2006.
[9]
B. Fields, R. Bodik, M. Hill, and C. Newburn. Using interaction costs for microarchitectural bottleneck analysis. In Proc. IEEE/ACM 36th International Symposium on Microarchitecture, pages 228--239, Dec. 2003.
[10]
H. Grahn and P. Stenstrom. Evaluation of a competitive-update cache coherence protocol with migratory data detection. J. Parallel Distrib. Comput., 39(2):168--180, 1996.
[11]
T. Heil and J. E. Smith. Relational profiling: Enabling thread-level parallelism in virtual machines. Microarchitecture, IEEE/ACM International Symposium on, 0:281, 2000.
[12]
M. Helms, T. Bochner, R. Fritz, T. Schlipf, and M. Walz. Event monitoring in a system-on-a-chip. In Proc. 12th Annual IEEE International ASIC/SOC Conference, Sept. 1999.
[13]
R. Hockauf, J. Jeitner, W. Karl, R. Lindhof, M. Schulz, V. Gonzales, E. Sanquis, and G. Torralba. Design and implementation aspects for the SMiLE hardware monitor. In G. Horn and W. Karl, editors, Proc. of SCI-Europe 2000, The 3rd International Conference on SCI-Based Technology and Research, pages 47--55. SINTEF Electronics and Cybernetics, Aug. 2000. ISBN: 82-595-9964-3, Also available at http://wwwbode.in.tum.de/events/.
[14]
Intel. Intel Itanium Architecture Software Developer's Manual, 2000.
[15]
Intel. Intel Architecture Software Developer's Manual Volume 3: System Programming Guide, 2002.
[16]
W. Karl, M. Leberecht, and M. Schulz. Optimizing data locality for SCI-based PC-clusters with the SMiLE monitoring approach. In Proc. of International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 169--176, Oct. 1999.
[17]
M. Martonosi, D. W. Clark, and M. Mesarina. The SHRIMP performance monitor: Design and applications. In ACM SIGMETRICS Performance Evaluation Review, pages 61--69, May 1996.
[18]
M. Martonosi, D. Ofelt, and M. Heinrich. Integrating performance monitoring and communication in parallel computers. In Proc. ACM International Conference on Measurement and Modeling of Computer Systems, pages 138--147, May 1996.
[19]
T. Mu, J. Tao, M. Schulz, and S. McKee. Interactive locality optimization on NUMA architectures. In Proc. ACM 2003 Symposium on Software Visualization (SoftVis), pages 133--142,214, July 2003.
[20]
A. Nanda, K. Mak, K. Sugavanam, R. Sahoo, V. Soundararajan, and T. Smith. MemorIES: a programmable, real-time hardware emulation tool for multiprocessor server design. SIGPLAN Not., 35(11):37--48, 2000.
[21]
M. Prvulovic and J. Torrellas. Reenact: Using thread-level speculation mechanisms to debug data races in multithreaded codes. In Proc. 30th IEEE/ACM International Symposium on Computer Architecture, pages 110--121, June 2003.
[22]
V. Salapura. Blue Gene/P performance counters. Personal Communication: Paper in Submission, Nov. 2007.
[23]
V. Salapura, K. Ganesan, A. Gara, M. Gschwind, J. Sexton, and R. Walkup. Next-generation performance counters: Towards monitoring over thousand concurrent events. Performance Analysis of Systems and software, 2008. ISPASS 2008. IEEE International Symposium on, pages 139--146, April 2008.
[24]
S. Sarangi, A. Tiwari, and J. Torrellas. Phoenix: Detecting and recovering from permanent processor design bugs with programmable hardware. In Proc. IEEE/ACM 40th Annual International Symposium on Microarchitecture, pages 26--37, Dec. 2006.
[25]
S. Sastry, R. Bodík, and J. Smith. Rapid profiling via stratified sampling. In Proc. 28th IEEE/ACM International Symposium on Computer Architecture, pages 278--289, July 2001.
[26]
M. Schulz, B. White, S. McKee, H. Lee, and J. Jeitner. Owl: Next generation system monitoring. In Proc. ACM Computing Frontiers Conference, May 2005.
[27]
B. Sprunt. The basics of performance-monitoring hardware. IEEE Micro, pages 64--71, July/August 2002.
[28]
B. Sprunt. Pentium 4 performance-monitoring features. IEEE Micro, pages 72--82, July/August 2002.
[29]
M. Xu, R. Bodik, and M. Hill. A flight data recorder for enabling full-system multiprocessor deterministic replay. In Proc. 30th IEEE/ACM International Symposium on Computer Architecture, pages 122--135, June 2003.
[30]
P. Zhou, V. Pandey, J. Sundaresan, A. Raghuraman, Y. Zhou, and S. Kumar. Dynamic tracking of page miss ratio curve for memory management. In Proc. 11th ACM Symposium on Architectural Support for Programming Languages and Operating Systems, pages 177--188, Oct. 2004.
[31]
P. Zhou, F. Qin, W. Liu, Y. Zhou, and J. Torrellas. iWatcher: Efficient architectural support for software debugging. Computer Architecture, 2004. Proceedings. 31st Annual International Symposium on, pages 224--235, June 2004.
[32]
P. Zhou, F. Qin, W. Liu, Y. Zhou, and J. Torrellas. iWatcher: Efficient architectural support for software debugging. In Proc. 31st IEEE/ACM International Symposium on Computer Architecture, pages 224--237, June 2004.

Cited By

  • (2012) Dual Monitoring Communication for Self-Aware Network-on-Chip. International Journal of Adaptive, Resilient and Autonomic Systems, 3:3 (72-91). DOI: 10.4018/jaras.2012070105. Online publication date: 1-Jul-2012
  • (2012) A Scalable Monitoring Infrastructure for Self-Organizing Many-Core Architectures. Proceedings of the 2012 15th Euromicro Conference on Digital System Design (42-49). DOI: 10.1109/DSD.2012.12. Online publication date: 5-Sep-2012
  • (2011) Adapt or become extinct! Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era (46-51). DOI: 10.1145/2000417.2000422. Online publication date: 5-Jun-2011

      Published In

      CF '09: Proceedings of the 6th ACM conference on Computing frontiers
      May 2009
      238 pages
      ISBN:9781605584133
      DOI:10.1145/1531743
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cache coherency
      2. debugging
      3. multicore
      4. performance monitoring
      5. profiling
      6. realtime
      7. scheduling

      Qualifiers

      • Research-article

      Conference

      CF '09
      Sponsor:
      CF '09: Computing Frontiers Conference
      May 18 - 20, 2009
      Ischia, Italy

      Acceptance Rates

      CF '09 Paper Acceptance Rate 26 of 113 submissions, 23%;
      Overall Acceptance Rate 273 of 785 submissions, 35%
