DOI: 10.1145/3357526.3357563
research-article

Transitioning scientific applications to using non-volatile memory for resilience

Published: 30 September 2019

Abstract

Scientific applications often run for long periods and therefore frequently save their internal state to storage media in case of unexpected interruptions (e.g., hardware failures). Emerging non-volatile memory (NVRAM) can write up to 40× faster than traditional mechanical storage devices, making it an attractive medium for this purpose. This paper investigates the implications of transitioning a scientific application, Fluidanimate, to use NVRAM for fault tolerance. In particular, we evaluate the performance implications and ease of use of four fault-tolerance approaches: 1) logging through transactions, 2) multi-versioning through copy-on-write operations, 3) checkpointing through IO operations (e.g., fwrite) on a direct access (DAX) filesystem, and 4) checkpointing with a DRAM cache. Our study yields three key findings. First, additional changes to the application are required to take advantage of the increase in IO speed provided by NVRAM. Second, the performance of the approaches scales poorly when considering a single process. Third, NVRAM can increase reliability in a distributed computing environment by allowing individual nodes to fail and automatically recover before the rest of the system notices.
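
To make the third approach (checkpointing through ordinary IO calls) more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy state vector in place of Fluidanimate's particle data, and the function names (`save_checkpoint`, `load_checkpoint`, `run`), the file layout, and the write-then-rename step are all illustrative assumptions. On a real system the checkpoint path would presumably point into a DAX-mounted filesystem; here it is just a path supplied by the caller.

```python
import os
import struct

# Hypothetical on-disk layout: (step number, value count), little-endian,
# followed by `count` 64-bit floats. This layout is an assumption.
HEADER = struct.Struct("<QQ")


def save_checkpoint(path, step, values):
    """Write the state to a temporary file, force it out, then rename it over `path`."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(HEADER.pack(step, len(values)))
        f.write(struct.pack("<%dd" % len(values), *values))
        f.flush()             # drain Python's user-space buffer
        os.fsync(f.fileno())  # ask the OS to push the data to the storage medium
    # On POSIX the rename is atomic, so a crash leaves either the old
    # checkpoint or the new one, never a half-written file.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, values) from the last complete checkpoint, or None if there is none."""
    try:
        with open(path, "rb") as f:
            step, count = HEADER.unpack(f.read(HEADER.size))
            values = list(struct.unpack("<%dd" % count, f.read(8 * count)))
    except FileNotFoundError:
        return None
    return step, values


def run(path, total_steps, interval):
    """Toy simulation loop that resumes from the last checkpoint, if any."""
    restored = load_checkpoint(path)
    step, state = restored if restored else (0, [0.0, 0.0, 0.0])
    while step < total_steps:
        state = [x + 1.0 for x in state]  # stand-in for one simulation time step
        step += 1
        if step % interval == 0:
            save_checkpoint(path, step, state)
    return step, state
```

If the process is killed and `run` is called again with the same path, it restarts from the most recent complete checkpoint rather than from step 0, which is the recovery behaviour the abstract refers to.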

Cited By

  • (2021) MOSIQS: Persistent Memory Object Storage With Metadata Indexing and Querying for Scientific Computing. IEEE Access 9 (85217-85231). DOI: 10.1109/ACCESS.2021.3087502. Online publication date: 2021
  • (2020) Persistent Memory Object Storage and Indexing for Scientific Computing. 2020 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC) (1-9). DOI: 10.1109/MCHPC51950.2020.00006. Online publication date: Nov-2020

Published In

MEMSYS '19: Proceedings of the International Symposium on Memory Systems
September 2019
517 pages
ISBN:9781450372060
DOI:10.1145/3357526
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fault tolerance
  2. scientific applications
  3. software performance

Qualifiers

  • Research-article

Conference

MEMSYS '19
MEMSYS '19: The International Symposium on Memory Systems
September 30 - October 3, 2019
District of Columbia, Washington, USA
