Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign

Published: 07 October 2004

Abstract

Tracing garbage collectors traverse references from live program variables, transitively tracing out the closure of live objects. Memory accesses incurred during tracing are essentially random: a given object may contain references to any other object. Since application heaps are typically much larger than hardware caches, tracing results in many cache misses. Technology trends will make cache misses more important, so tracing is a prime target for prefetching.

Simulation of Java benchmarks running with the Boehm-Demers-Weiser mark-sweep garbage collector for a projected hardware platform reveals high tracing overhead (up to 65% of elapsed time), and that cache misses are a problem. Applying Boehm's default prefetching strategy yields improvements in execution time (16% on average with incremental/generational collection for GC-intensive benchmarks), but analysis shows that his strategy suffers from significant timing problems: prefetches that occur too early or too late relative to their matching loads. This analysis drives development of a new prefetching strategy that yields up to three times the performance improvement of Boehm's strategy for GC-intensive benchmarks (27% average speedup), and achieves performance close to that of perfect timing (i.e., few misses for tracing accesses) on some benchmarks. Validating these simulation results with live runs on current hardware produces an average speedup of 6% for the new strategy on GC-intensive benchmarks with a GC configuration that tightly controls heap growth. In contrast, Boehm's default prefetching strategy is ineffective on this platform.
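
The abstract does not spell out how either prefetching strategy is implemented. As a rough illustration of the timing issue it describes, the following C sketch (not the authors' or Boehm's code) places a small FIFO between the mark stack and the marking loop: an object is prefetched when it enters the FIFO and is only dereferenced when it leaves, several objects later. The names `Object`, `Marker`, `FIFO_SIZE`, `STACK_CAP`, and `mark_from_roots` are hypothetical, the mark bit is assumed to live in the object header, and `__builtin_prefetch` assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS  4      /* hypothetical fixed fan-out per object */
#define STACK_CAP 1024   /* hypothetical mark-stack capacity */
#define FIFO_SIZE 8      /* hypothetical prefetch distance, in objects */

typedef struct Object {
    bool marked;                      /* mark bit kept in the header (assumption) */
    size_t nrefs;                     /* number of valid entries in refs */
    struct Object *refs[MAX_REFS];
} Object;

typedef struct {
    Object *stack[STACK_CAP];         /* pending references, not yet prefetched */
    size_t top;
    Object *fifo[FIFO_SIZE];          /* prefetched, waiting to be scanned */
    size_t head, count;
} Marker;

static void push_ref(Marker *m, Object *o) {
    assert(m->top < STACK_CAP);       /* a real collector must handle overflow */
    m->stack[m->top++] = o;
}

/* Move references from the mark stack into the FIFO, issuing a prefetch
   for each one as it enters. The object itself is not touched here. */
static void refill(Marker *m) {
    while (m->count < FIFO_SIZE && m->top > 0) {
        Object *o = m->stack[--m->top];
        __builtin_prefetch(o, 0, 3);  /* hint: read access, keep in cache */
        m->fifo[(m->head + m->count) % FIFO_SIZE] = o;
        m->count++;
    }
}

/* Mark everything reachable from roots; returns the number of objects marked. */
size_t mark_from_roots(Object **roots, size_t nroots) {
    Marker m = {0};
    size_t live = 0;

    for (size_t i = 0; i < nroots; i++)
        if (roots[i] != NULL)
            push_ref(&m, roots[i]);
    refill(&m);

    while (m.count > 0) {
        /* Dequeue the oldest prefetched object: its prefetch was issued
           while earlier FIFO entries were still being scanned. */
        Object *o = m.fifo[m.head];
        m.head = (m.head + 1) % FIFO_SIZE;
        m.count--;

        if (!o->marked) {             /* first real load of the object */
            o->marked = true;
            live++;
            /* Push children without dereferencing them, so the only
               access to a child happens after its prefetch. */
            for (size_t j = 0; j < o->nrefs; j++)
                if (o->refs[j] != NULL)
                    push_ref(&m, o->refs[j]);
        }
        refill(&m);
    }
    return live;
}
```

Calling `mark_from_roots(roots, n)` on a small object graph marks each reachable object exactly once, including objects on cycles. The FIFO length sets how far ahead of its matching load each prefetch is issued: too short and the data may not have arrived, too long and it may already have been evicted, which is the kind of timing trade-off the abstract refers to.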


Published In

ACM SIGPLAN Notices  Volume 39, Issue 11
ASPLOS '04
November 2004
283 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1037187

ASPLOS XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
October 2004
296 pages
ISBN:1581138040
DOI:10.1145/1024393

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. breadth-first
  2. buffered prefetch
  3. cache architecture
  4. depth-first
  5. garbage collection
  6. mark-sweep
  7. prefetch-on-grey
  8. prefetching

Qualifiers

  • Article
