Adaptive Burst-Writes (ABW): Memory Requests Scheduling to Reduce Write-Induced Interference

Published: 02 December 2015

Abstract

Main memory latency has become a major performance bottleneck for chip-multiprocessors (CMPs). Because reads are on the critical path, existing memory controllers prioritize reads over writes. However, writes must eventually be processed once the write queue is full. These writes are serviced in a burst to reduce bus turnaround delay and increase row-buffer locality. Unfortunately, a large number of reads may suffer long queuing delays while the burst-writes are serviced. The long write latency of future nonvolatile memory will further exacerbate these read queuing delays during burst-writes.
In this article, we propose a run-time mechanism, Adaptive Burst-Writes (ABW), to reduce the queuing delay of reads. Based on the row-buffer hit rate of writes and the arrival rate of reads, we dynamically control the number of writes serviced in a burst to trade off write service time against the queuing latency of reads. For prompt adjustment, our history-based mechanism also terminates burst-writes early when the row-buffer hit rate of writes in the previous burst was low. As a result, our policy improves system throughput by up to 28% (average 10%) and 43% (average 14%) in CMPs with DRAM-based and PCM-based main memory, respectively.
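
The trade-off described in the abstract can be illustrated with a small sketch. It is not the article's algorithm: the abstract does not give the decision rule, so the formula, the thresholds, and the names `BurstConfig` and `burst_length` are all hypothetical, chosen only to show how write row-buffer hit rate, read arrival rate, and the previous burst's hit rate could jointly set the burst length.

```python
from dataclasses import dataclass


@dataclass
class BurstConfig:
    # All values are illustrative assumptions, not figures from the article.
    max_burst: int = 32        # writes drained when conditions favor a long burst
    min_burst: int = 4         # floor so the write queue always makes progress
    low_hit_rate: float = 0.3  # history threshold that triggers early termination


def burst_length(cfg, write_hit_rate, read_arrival_rate, prev_burst_hit_rate):
    """Pick how many writes to service in the next burst (hypothetical rule).

    write_hit_rate: fraction of queued writes expected to hit an open row (0..1).
    read_arrival_rate: reads arriving per write service slot (>= 0).
    prev_burst_hit_rate: row-buffer hit rate observed in the previous burst (0..1).
    """
    # History-based early termination: a poor previous burst suggests a long
    # burst would again pay for many row misses while reads wait.
    if prev_burst_hit_rate < cfg.low_hit_rate:
        return cfg.min_burst

    # High write locality makes each extra write cheap (longer burst);
    # a high read arrival rate makes each extra write costly (shorter burst).
    pressure = 1.0 + read_arrival_rate
    length = int(cfg.max_burst * write_hit_rate / pressure)

    # Clamp to the configured range.
    return max(cfg.min_burst, min(cfg.max_burst, length))
```

Under these assumptions, a burst shrinks as reads arrive faster and collapses to `min_burst` whenever the previous burst had poor row-buffer locality; a real controller would derive these inputs from hardware counters rather than take them as arguments.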

Cited By

  • (2021) A Soft Real-time Memory Request Scheduler for Phase Change Memory Systems. 2021 IEEE 27th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 109-118. DOI: 10.1109/RTCSA52859.2021.00021. Online publication date: Aug-2021
  • (2018) ReadPRO: Read Prioritization Scheduling in ORAM for Efficient Obfuscation in Main Memories. 2018 IEEE 36th International Conference on Computer Design (ICCD), 100-107. DOI: 10.1109/ICCD.2018.00024. Online publication date: Oct-2018

    Published In

    ACM Transactions on Design Automation of Electronic Systems  Volume 21, Issue 1
    November 2015
    464 pages
    ISSN:1084-4309
    EISSN:1557-7309
    DOI:10.1145/2852253
    Editor: Naehyuck Chang
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 December 2015
    Accepted: 01 March 2015
    Revised: 01 February 2015
    Received: 01 October 2014
    Published in TODAES Volume 21, Issue 1

    Author Tags

    1. Memory controller
    2. memory request scheduling
    3. memory subsystem
    4. writeback-aware management

    Qualifiers

    • Research-article
    • Research
    • Refereed
