research-article

A study of the scalability of stop-the-world garbage collectors on multicores

Authors:

Marc Shapiro

ACM SIGARCH Computer Architecture News, Volume 41, Issue 1

Pages 229 - 240

https://doi.org/10.1145/2490301.2451142

Published: 16 March 2013

Abstract

Large-scale multicore architectures create new challenges for garbage collectors (GCs). In particular, throughput-oriented stop-the-world algorithms demonstrate good performance with a small number of cores, but have been shown to degrade badly beyond approximately 8 cores on a 48-core machine with OpenJDK 7. This negative result raises the question of whether the stop-the-world design has intrinsic limitations that would require a radically different approach. Our study suggests that the answer is no, and that there is no compelling scalability reason to discard the existing highly-optimised throughput-oriented GC code on contemporary hardware. This paper studies the default throughput-oriented garbage collector of OpenJDK 7, called Parallel Scavenge. We identify its bottlenecks and show how to eliminate them using well-established parallel programming techniques. On the SPECjbb2005, SPECjvm2008 and DaCapo 9.12 benchmarks, the improved GC matches the performance of Parallel Scavenge at low core counts, and scales well up to 48 cores.



Index Terms

A study of the scalability of stop-the-world garbage collectors on multicores
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Garbage collection



Published In


ACM SIGARCH Computer Architecture News Volume 41, Issue 1

ASPLOS '13

March 2013

540 pages

ISSN:0163-5964

DOI:10.1145/2490301


ASPLOS '13: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
March 2013
574 pages
ISBN:9781450318709
DOI:10.1145/2451116
General Chair: Vivek Sarkar, Rice University, USA
Program Chair: Rastislav Bodik, University of California, Berkeley, USA

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
