research-article

Writeback-aware bandwidth partitioning for multi-core systems with PCM

Authors:

Bruce R. Childers,

Daniel Mosse

PACT '13: Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Pages 113 - 122

Published: 07 October 2013

Abstract

Phase-Change Memory (PCM) has emerged as a promising low-power candidate to replace DRAM in main memory. A hybrid memory architecture composed of a large PCM and a small DRAM is a popular way to mitigate the undesirable characteristics of PCM writes. Because PCM writes are much slower than reads, writebacks from the last-level cache (LLC) consume a large portion of memory bandwidth and thus degrade performance. Effectively utilizing shared resources, such as the last-level cache and the memory bandwidth, is crucial to achieving high performance in multi-core systems. Although existing memory bandwidth allocation schemes improve system performance, no current approach uses writeback information to partition bandwidth for hybrid memory. We use a writeback-aware analytic model to derive an allocation strategy for bandwidth partitioning of phase-change memory. From the derivation of the model, we propose Writeback-aware Bandwidth Partitioning (WBP), a new runtime mechanism that partitions PCM service cycles among applications. WBP uses a partitioning weight to indicate the importance of writebacks (in addition to LLC misses) to bandwidth allocation. A companion Dynamic Weight Adjustment (DWA) scheme dynamically selects the partitioning weight to maximize system performance. Simulation results show that, in an 8-core system, WBP and DWA improve performance by 24.9% (weighted speedup) over bandwidth partitioning schemes that do not take writebacks into consideration.
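
The role of the partitioning weight can be illustrated with a small sketch. The abstract does not state the allocation formula, so the code below assumes a simple proportional rule in which each application's share of PCM service cycles scales with its LLC misses plus `weight` times its writebacks; the strategy derived from the paper's analytic model may differ. The function names, the `score` callback, and the candidate-weight search standing in for DWA are all hypothetical.

```python
from typing import Callable, List, Sequence


def partition_service_cycles(
    misses: Sequence[float],
    writebacks: Sequence[float],
    weight: float,
    total_cycles: float,
) -> List[float]:
    """Split total_cycles among applications.

    Assumption (not from the paper): each application's share is
    proportional to misses + weight * writebacks.
    """
    if len(misses) != len(writebacks):
        raise ValueError("misses and writebacks must have the same length")
    demand = [m + weight * w for m, w in zip(misses, writebacks)]
    total = sum(demand)
    if total == 0:
        # No memory traffic at all: fall back to an equal split.
        return [total_cycles / len(demand)] * len(demand)
    return [total_cycles * d / total for d in demand]


def select_weight(
    candidates: Sequence[float],
    score: Callable[[float], float],
) -> float:
    """Hypothetical stand-in for dynamic weight adjustment.

    Returns the candidate weight with the highest score, where score(w)
    is assumed to be some measured or estimated system performance
    (for example, weighted speedup) observed under weight w.
    """
    return max(candidates, key=score)


# Two applications: the second one is write-intensive.
shares_ignoring_writebacks = partition_service_cycles([100, 300], [0, 100], 0.0, 1000)
shares_with_writebacks = partition_service_cycles([100, 300], [0, 100], 1.0, 1000)
```

With `weight = 0` the split depends on LLC misses alone; raising the weight shifts service cycles toward the write-intensive application.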


Cited By

Zhou M, Du Y, Childers B, Mosse D, Melhem R (2016) Symmetry-Agnostic Coordinated Management of the Memory Hierarchy in Multicore Systems. ACM Transactions on Architecture and Code Optimization 12:4 (1-26). DOI: 10.1145/2847254. Online publication date: 4-Jan-2016
https://dl.acm.org/doi/10.1145/2847254
Gao S, He B, Xu J, Bhuyan L, Chong F, Sarkar V (2015) Real-Time In-Memory Checkpointing for Future Hybrid Memory Systems. Proceedings of the 29th ACM on International Conference on Supercomputing (263-272). DOI: 10.1145/2751205.2751212. Online publication date: 8-Jun-2015
https://dl.acm.org/doi/10.1145/2751205.2751212
Gulur N, Mehendale M, Govindarajan R, John L, Smith C, Sachs K, Lladó C (2015) A Comprehensive Analytical Performance Model of DRAM Caches. Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (157-168). DOI: 10.1145/2668930.2688044. Online publication date: 28-Jan-2015
https://dl.acm.org/doi/10.1145/2668930.2688044

Index Terms

Writeback-aware bandwidth partitioning for multi-core systems with PCM
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory



Published In


PACT '13: Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

October 2013

422 pages

ISBN: 9781479910212

Conference Chair: Christian Fensch (University of Edinburgh, UK)
General Chair: Michael O'Boyle (University of Edinburgh, UK)
Program Chairs: André Seznec (INRIA Rennes, France), François Bodin (IRISA/CAPS Entreprise, France)

Sponsors

IFIP WG 10.3: IFIP WG 10.3
IEEE TCCA: IEEE Computer Society Technical Committee on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE CS TCPP: IEEE Computer Society Technical Committee on Parallel Processing

Publisher

IEEE Press


Qualifiers

Research-article

Acceptance Rates

PACT '13 Paper Acceptance Rate 36 of 208 submissions, 17%;

Overall Acceptance Rate 121 of 471 submissions, 26%



Article Metrics

Total Citations: 5
Total Downloads: 240
Downloads (Last 12 months): 4
Downloads (Last 6 weeks): 0

Reflects downloads up to 10 Aug 2024


Citations

Gulur N, Mehendale M, Manikantan R, Govindarajan R (2014) ANATOMY. ACM SIGMETRICS Performance Evaluation Review 42:1 (505-517). DOI: 10.1145/2637364.2591995. Online publication date: 16-Jun-2014
https://dl.acm.org/doi/10.1145/2637364.2591995
Gulur N, Mehendale M, Manikantan R, Govindarajan R, Sanghavi S, Shakkottai S, Lelarge M, Schroeder B (2014) ANATOMY. The 2014 ACM international conference on Measurement and modeling of computer systems (505-517). DOI: 10.1145/2591971.2591995. Online publication date: 16-Jun-2014
https://dl.acm.org/doi/10.1145/2591971.2591995
