DOI: 10.1145/1254882.1254915

Data layouts for object-oriented programs

Published: 12 June 2007

Abstract

Object-oriented programs rely heavily on objects and pointers, making them vulnerable to slowdowns from cache and TLB misses. Cache and TLB behavior depends on the data layout of objects in memory. There are many possible data layouts with different impacts on performance, but it is not known which perform better. This paper presents a novel framework for evaluating data layouts. The framework both makes it easy to implement many layouts and enables performance measurements of real programs using a product Java virtual machine on stock hardware. This is achieved by sorting objects during copying garbage collection; outside of garbage collection, program performance is solely determined by the data layout that the sort key implements. This paper surveys and evaluates 10 common data layouts with 32 realistic benchmark programs running on 3 different hardware configurations. The results confirm the importance of data layouts for program performance, and show that almost all layouts yield the best performance for some programs and the worst performance for others.
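
The idea that a sort key applied during copying determines the resulting layout can be illustrated with a toy model. The sketch below is not the paper's framework, which runs inside a product JVM's copying collector; it only assumes a tiny in-memory object graph and shows how three illustrative sort keys (allocation order, breadth-first order, and grouping by type) place the same surviving objects in a different order. All names (`LayoutSketch`, `Obj`, `layout`, `demoRoots`, and the key helpers) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LayoutSketch {
    /** Toy heap object: id doubles as allocation order; refs are outgoing pointers. */
    public static final class Obj {
        final int id;
        final String type;
        final List<Obj> refs = new ArrayList<>();
        Obj(int id, String type) { this.id = id; this.type = type; }
    }

    /** Finds objects reachable from the roots and records their breadth-first visit index. */
    static Map<Obj, Integer> bfsOrder(List<Obj> roots) {
        Map<Obj, Integer> index = new HashMap<>();
        Deque<Obj> queue = new ArrayDeque<>();
        for (Obj r : roots) {
            if (index.putIfAbsent(r, index.size()) == null) queue.add(r);
        }
        while (!queue.isEmpty()) {
            Obj o = queue.remove();
            for (Obj child : o.refs) {
                if (index.putIfAbsent(child, index.size()) == null) queue.add(child);
            }
        }
        return index;
    }

    /** Simulated copy phase: survivors are placed in to-space in sort-key order. */
    public static List<Integer> layout(List<Obj> roots, Comparator<Obj> sortKey) {
        List<Obj> survivors = new ArrayList<>(bfsOrder(roots).keySet());
        survivors.sort(sortKey);
        List<Integer> placement = new ArrayList<>();
        for (Obj o : survivors) placement.add(o.id);
        return placement;
    }

    // Three illustrative sort keys (assumptions, not the paper's list of layouts).
    public static Comparator<Obj> byAllocation() {
        return Comparator.comparingInt(o -> o.id);
    }

    public static Comparator<Obj> breadthFirst(List<Obj> roots) {
        Map<Obj, Integer> bfs = bfsOrder(roots);
        return Comparator.comparingInt(bfs::get);
    }

    public static Comparator<Obj> byType() {
        return Comparator.comparing((Obj o) -> o.type).thenComparingInt(o -> o.id);
    }

    /** Small example heap: root 3 -> {2, 0}, 2 -> {1}; object 4 is unreachable garbage. */
    public static List<Obj> demoRoots() {
        Obj o0 = new Obj(0, "Node");
        Obj o1 = new Obj(1, "Leaf");
        Obj o2 = new Obj(2, "Leaf");
        Obj o3 = new Obj(3, "Node");
        Obj garbage = new Obj(4, "Leaf"); // never referenced, so never copied
        o3.refs.add(o2);
        o3.refs.add(o0);
        o2.refs.add(o1);
        return List.of(o3);
    }

    public static void main(String[] args) {
        List<Obj> roots = demoRoots();
        System.out.println("allocation order: " + layout(roots, byAllocation()));
        System.out.println("breadth-first:    " + layout(roots, breadthFirst(roots)));
        System.out.println("grouped by type:  " + layout(roots, byType()));
    }
}
```

In this model only the comparator changes between runs, which mirrors the abstract's point that the sort key alone defines the layout; a real collector would additionally copy the objects and update every pointer to the new addresses.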


Published In

SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2007
398 pages
ISBN: 9781595936394
DOI: 10.1145/1254882
  • ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 1
    SIGMETRICS '07 Conference Proceedings
    June 2007
    382 pages
    ISSN: 0163-5999
    DOI: 10.1145/1269899

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GC
  2. TLB
  3. cache
  4. data layout
  5. data placement
  6. hardware performance counters
  7. memory subsystem
  8. spatial locality

Qualifiers

  • Article

Conference

SIGMETRICS07

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 24
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 15 Oct 2024

Cited By

  • (2022) Compressed Forwarding Tables Reconsidered. Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes, pp. 45-63. DOI: 10.1145/3546918.3546928. Online publication date: 14-Sep-2022
  • (2022) Online Application Guidance for Heterogeneous Memory Systems. ACM Transactions on Architecture and Code Optimization 19:3, pp. 1-27. DOI: 10.1145/3533855. Online publication date: 6-Jul-2022
  • (2022) HAMLET: A Hierarchical Agent-based Machine Learning Platform. ACM Transactions on Autonomous and Adaptive Systems 16:3-4, pp. 1-46. DOI: 10.1145/3530191. Online publication date: 6-Jul-2022
  • (2022) Risk-aware Collection Strategies for Multirobot Foraging in Hazardous Environments. ACM Transactions on Autonomous and Adaptive Systems 16:3-4, pp. 1-38. DOI: 10.1145/3514251. Online publication date: 6-Jul-2022
  • (2022) Assured Mission Adaptation of UAVs. ACM Transactions on Autonomous and Adaptive Systems 16:3-4, pp. 1-27. DOI: 10.1145/3513091. Online publication date: 6-Jul-2022
  • (2022) Efficient High-Level Programming in Plain Java. International Journal of Parallel Programming 51:1, pp. 22-42. DOI: 10.1007/s10766-022-00747-0. Online publication date: 5-Dec-2022
  • (2022) High Performance Computing with Java Streams. Euro-Par 2021: Parallel Processing Workshops, pp. 17-28. DOI: 10.1007/978-3-031-06156-1_2. Online publication date: 9-Jun-2022
  • (2020) Efficient nursery sizing for managed languages on multi-core processors with shared caches. Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization, pp. 1-15. DOI: 10.1145/3368826.3377908. Online publication date: 22-Feb-2020
  • (2019) Evaluating the effectiveness of program data features for guiding memory management. Proceedings of the International Symposium on Memory Systems, pp. 383-395. DOI: 10.1145/3357526.3357537. Online publication date: 30-Sep-2019
  • (2017) You can have it all: abstraction and good cache performance. Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pp. 148-167. DOI: 10.1145/3133850.3133861. Online publication date: 25-Oct-2017
