DOI: 10.1145/1542476.1542520

A study of memory management for web-based applications on multicore processors

Published: 15 June 2009

Abstract

Server workloads are increasingly Web-based. In these workloads, most memory objects are used only during a single transaction. We study how memory management approaches affect the performance of such Web-based applications on two modern multicore processors. In particular, using six PHP applications, we compare a general-purpose allocator (the default allocator of the PHP runtime) with a region-based allocator, which reduces the cost of memory management by not supporting per-object free. On one processor core, the region-based allocator achieves better performance for all workloads because of its smaller memory management cost. On eight cores, however, the region-based allocator suffers from the hidden cost of increased bus traffic, and its performance is lower than that of the default allocator for many workloads, by as much as 27.2%. This is because memory bandwidth tends to become a bottleneck in systems with multicore processors.
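The region-based behavior described above can be illustrated with a toy model. This is a minimal sketch, not the allocator evaluated in the paper: the class `RegionAllocator`, the helper `run_transaction`, the 8-byte alignment, and the use of integer offsets in place of real addresses are all assumptions made for illustration. It shows why allocation is cheap (a pointer bump) and why short-lived objects still enlarge the memory footprint when per-object free is unsupported.

```python
class RegionAllocator:
    """Toy bump-pointer region; integer offsets stand in for addresses."""

    def __init__(self, align=8):
        self.align = align
        self.top = 0  # next unused offset in the region

    def malloc(self, size):
        # Round the request up to the alignment, then bump the pointer.
        rounded = (max(size, 1) + self.align - 1) // self.align * self.align
        addr = self.top
        self.top += rounded
        return addr

    def free(self, addr):
        # No per-object free: the memory stays claimed until reset().
        pass

    def reset(self):
        # End of transaction: discard every object at once.
        self.top = 0

    def footprint(self):
        # Bytes of distinct memory handed out since the last reset.
        return self.top


def run_transaction(alloc, iterations=1000, size=64):
    """Allocate and immediately free one short-lived object per iteration."""
    for _ in range(iterations):
        p = alloc.malloc(size)
        alloc.free(p)
    touched = alloc.footprint()
    alloc.reset()
    return touched
```

In this model, `run_transaction(RegionAllocator())` claims 64,000 bytes even though only one 64-byte object is live at any time. Every allocation lands on fresh memory, which is one plausible way to picture the extra bus traffic the abstract reports.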
We propose a new memory management approach, defrag-dodging, to maximize the performance of Web-based workloads on multicore processors. In our approach, we reduce the memory management cost by avoiding defragmentation overhead in the malloc and free functions during a transaction. We found that the transactions in Web-based applications are short enough that heap fragmentation can be ignored, and hence the costs of the defragmentation activities in existing general-purpose allocators outweigh their benefits. By comparing our approach against the region-based approach, we show that a per-object free capability can reduce bus traffic and achieve higher performance on multicore processors. We demonstrate that our defrag-dodging approach improves the performance of all the evaluated applications on both processors, by up to 11.4% over the default allocator and by up to 51.5% over the region-based allocator.
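The idea of keeping per-object free while skipping defragmentation work can be sketched as follows. The abstract does not specify the allocator's data structures, so everything here is an assumption for illustration: the class name `DefragDodgingSketch`, the per-size free lists, the 8-byte size granularity, and the `reset` call at the end of a transaction. Offsets again stand in for real addresses.

```python
class DefragDodgingSketch:
    """Toy allocator: per-object free, but no coalescing or splitting."""

    def __init__(self, granule=8):
        self.granule = granule
        self.top = 0          # next never-used offset
        self.free_lists = {}  # rounded size -> stack of freed offsets
        self.sizes = {}       # live offset -> rounded size

    def malloc(self, size):
        g = self.granule
        rounded = (max(size, 1) + g - 1) // g * g
        bucket = self.free_lists.get(rounded)
        if bucket:
            # Reuse the most recently freed block of exactly this size.
            addr = bucket.pop()
        else:
            # Otherwise carve fresh memory; never split a larger block.
            addr = self.top
            self.top += rounded
        self.sizes[addr] = rounded
        return addr

    def free(self, addr):
        # Push the block onto its size's free list. Neighbouring free
        # blocks are deliberately NOT merged (no defragmentation).
        rounded = self.sizes.pop(addr)
        self.free_lists.setdefault(rounded, []).append(addr)

    def reset(self):
        # End of transaction: drop all state, fragmentation included.
        self.top = 0
        self.free_lists.clear()
        self.sizes.clear()

    def footprint(self):
        # Bytes of distinct memory handed out since the last reset.
        return self.top
```

Allocating and freeing a 64-byte object 1,000 times in this model keeps the footprint at 64 bytes, because each `malloc` reuses the block just freed. Fragmentation can still accumulate within a transaction, but under the abstract's premise that transactions are short, it is discarded at `reset` before it becomes costly.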

Published In

PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2009
492 pages
ISBN:9781605583921
DOI:10.1145/1542476

ACM SIGPLAN Notices, Volume 44, Issue 6
PLDI '09
June 2009
478 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1543135
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic memory management
  2. region-based memory management
  3. scripting language
  4. web-based applications

Qualifiers

  • Research-article

Conference

PLDI '09
Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%
