Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Multi-versioning Performance Opportunities in BGAS System for Resilience

  • Conference paper
  • First Online:
High Performance Computing (ISC High Performance 2016)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9697))

Included in the following conference series:

Abstract

Resilience has become a major concern in high-performance computing (HPC) systems. Addressing the increasing risk of latent errors (or silent data corruption) is one of the biggest challenges. Multi-version checkpointing system, which keeps multi-version of the application states, has been proposed as a solution and has been implemented in Global View Resilience (GVR). The resulting more sophisticated management of data introduces overheads and the resulting impact on performance need to be investigated. In this paper we explore the performance of GVR for an HPC system with integrated non-volatile memories, namely Blue Gene Active Storage (BGAS). Our empirical study shows that the BGAS system provides a significantly more efficient basis for flexible error recovery by using GVR multi-versioning features compared to using a standard external storage system attached to the same Blue Gene/Q installation. Using BGAS especially achieves at least \(10\times \) performance boost for random traversal across multiple versions due to significantly better performance for small random I/O operations.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. IOR benchmark. http://ior-sio.sourceforge.net

  2. Scalable checkpoint/restart (SCR) library. https://github.com/hpc/scr

  3. Summit compute system. https://www.olcf.ornl.gov/summit/

  4. Antypas, K., Wright, N., Cardo, N.P., Andrews, A., Cordery, M.: Cori: a Cray XC pre-exascale system for NERSC. In: Cray User Group Proceedings. Cray (2014)

    Google Scholar 

  5. Bariuso, R., Knies, A.: SHMEM user’s guide for C. Cray Research, Inc. (1994)

    Google Scholar 

  6. Bautista-Gomez, L., Tsuboi, S., Komatitsch, D., Cappello, F., Maruyama, N., Matsuoka, S.: FTI: high performance fault tolerance interface for hybrid systems. In: Proceedings of the 2011 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2011 (2011)

    Google Scholar 

  7. Bent, J., Grider, G., Kettering, B., Manzanares, A., McClelland, M., Torres, A., Torrez, A.: Storage challenges at Los Alamos National Lab. In: IEEE 28th Symposium on Mass Storage Systems and Technologies, pp. 1–5, April 2012

    Google Scholar 

  8. Bergman, K., Borkar, S., Campbell, D., Carlson, W., Dally, W., Denneau, M., Franzon, P., Harrod, W., Hiller, J., Karp, S., Keckler, S., Klein, D., Lucas, R., Richards, M., Scarpelli, A., Scott, S., Snavely, A., Sterling, T., Williams, R.S., Yelick, K.: Exascale computing study: technology challenges in achieving exascale systems. Technical report DARPA IPTO (2008)

    Google Scholar 

  9. Borkar, S., Chien, A.A.: The future of microprocessors. Commun. ACM 54, 67–77 (2011)

    Article  Google Scholar 

  10. Brown, D.L., Messina, P., Keyes, D., Morrison, J., Lucas, R., Shalf, J., Beckman, P., Brightwell, R., Geist, A., Vetter, J., et al.: Scientific grand challenges: crosscutting technologies for computing at the exascale. Office of Science, U.S. Department of Energy, pp. 2–4, February 2010

    Google Scholar 

  11. Cappello, F.: Fault tolerance in petascale/exascale systems: current knowledge, challenges and research opportunities. Int. J. High Perform. Comput. Appl. 23(3), 212–226 (2009)

    Article  Google Scholar 

  12. Cappello, F., Casanova, H., Robert, Y.: Preventive migration vs. preventive checkpointing for extreme scale supercomputers. Parallel Process. Lett. 21(02), 111–132 (2011)

    Article  MathSciNet  Google Scholar 

  13. Carns, P., Latham, R., Ross, R., Iskra, K., Lang, S., Riley, K.: 24/7 characterization of petascale I/O workloads. In: IEEE International Conference on Cluster Computing and Workshops, pp. 1–10, August 2009

    Google Scholar 

  14. Chien, A.A., Balaji, P., Beckman, P., Dun, N., Fang, A., Fujita, H., Iskra, K., Rubenstein, Z., Zheng, Z., Schreiber, R., Hammond, J., Dinan, J., Laguna, I., Dubey, A., Hoemmen, M., Heroux, M., Teranishi, K., Siegel, A.: Versioned distributed arrays for resilience in scientific applications: global view resilience. In: Proceedings of International Conference on Computational Science (2015)

    Google Scholar 

  15. Daly, J.T.: A higher order estimate of the optimum checkpoint interval for restart dumps. Future Gener. Comput. Syst. 22(3) (2006)

    Google Scholar 

  16. Dong, X., Muralimanohar, N., Jouppi, N., Kaufmann, R., Xie, Y.: Leveraging 3D PCRAM technologies to reduce checkpoint overhead for future exascale systems. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC 2009, pp. 57:1–57:12 (2009)

    Google Scholar 

  17. Dun, N., Fujita, H., Tramm, J., Chien, A.A., Siegel, A.R.: Data decomposition in Monte Carlo particle transport simulations using global view arrays. Int. J. High Perform. Comput. Appl. March 2015

    Google Scholar 

  18. Egwutuoha, I.P., Levy, D., Selic, B., Chen, S.: A survey of fault tolerance mechanisms and checkpoint/restart implementations for high performance computing systems. J. Supercomput. 65(3), 1302–1326 (2013)

    Article  Google Scholar 

  19. Fang, A., Chien, A.A.: How much SSD is useful for resilience in supercomputers. In: Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale (2015)

    Google Scholar 

  20. Ferreira, K., Stearley, J., Laros III, J.H., Oldfield, R., Pedretti, K., Brightwell, R., Riesen, R., Bridges, P.G., Arnold, D.: Evaluating the viability of process replication reliability for exascale systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011)

    Google Scholar 

  21. Fiala, D., Mueller, F., Engelmann, C., Riesen, R., Ferreira, K., Brightwell, R.: Detection and correction of silent data corruption for large-scale high-performance computing. In: Proceedings of 2012 International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 78:1–78:12 (2012)

    Google Scholar 

  22. Fitch, B.G.: Exploring the capabilities of a massively scalable, compute-in-storage architecture (2013). http://www.hpdc.org/2013/site/files/HPDC13_Fitch_BlueGeneActiveStorage.pdf

  23. Fitch, B.G., Rayshubskiy, A., Pitman, M.C., Ward, T.J.C., Germain, R.S.: Using the active storage fabrics model to address petascale storage challenges. In: Proceedings of the 4th Annual Workshop on Petascale Data Storage (2009)

    Google Scholar 

  24. Fujita, H., Dun, N., Rubenstein, Z.A., Chien, A.A.: Log-structured global array for efficient multi-version snapshots. In: Proceedings of 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 281–291 (2015)

    Google Scholar 

  25. Fujita, H., Iskra, K., Balaji, P., Chien, A.A.: Empirical comparison of three versioning architectures. In: Proceedings of IEEE Cluster 2015 (2015)

    Google Scholar 

  26. Gao, S., He, B., Xu, J.: Real-time in-memory checkpointing for future hybrid memory systems. In: Proceedings of the 29th ACM on International Conference on Supercomputing, pp. 263–272 (2015)

    Google Scholar 

  27. GVR Team.: Global View Resilience (GVR) API documentation, version 1.0.1. Technical report, University of Chicago, Department of Computer Science, October 2015

    Google Scholar 

  28. Hargrove, P.H., Duell, J.C.: Berkeley lab checkpoint/restart (BLCR) for Linux clusters. J. Phys. Conf. Ser. 46, 494 (2006)

    Article  Google Scholar 

  29. Heger, D., Shah, G.: IBM’s general parallel file system (GPFS) 1.4 for AIX. Technical report, IBM Corporation, November 2001

    Google Scholar 

  30. Huang, K.H., Abraham, J.: Algorithm-based fault tolerance for matrix operations. IEEE Trans. Comput. C–33(6), 518–528 (1984)

    Article  MATH  Google Scholar 

  31. IBM Blue Gene Team: The IBM Blue Gene project. IBM J. Res. Dev. 57 (2013)

    Google Scholar 

  32. Jones, T., Koniges, A., Yates, R.K.: Performance of the IBM general parallel file system. In: Proceedings of 2000 IEEE International Parallel and Distributed Processing Symposium (2000)

    Google Scholar 

  33. Jülich Supercomputing Centre: BGAS user documentation. https://trac.version.fz-juelich.de/EIC/wiki/bgas-user

  34. Jülich Supercomputing Centre: Blue Gene Active Storage boosts I/O performance at JSC. http://www.fz-juelich.de/SharedDocs/Pressemitteilungen/UK/EN/2013/13-11-18bgas.html

  35. Kulkarni, A., Manzanares, A., Ionkov, L., Lang, M., Lumsdaine, A.: The design and implementation of a multi-level content-addressable checkpoint file system. In: 2012 19th International Conference on High Performance Computing, pp. 1–10, December 2012

    Google Scholar 

  36. Li, D., Vetter, J.S., Marin, G., McCurdy, C., Cira, C., Liu, Z., Yu, W.: Identifying opportunities for byte-addressable non-volatile memory in extreme-scale scientific applications. In: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 945–956 (2012)

    Google Scholar 

  37. Liu, N., Cope, J., Carns, P., Carothers, C., Ross, R., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: Proceedings of the 2012 IEEE Conference on Massive Data Storage (2012)

    Google Scholar 

  38. Lu, G., Zheng, Z., Chien, A.A.: When is multi-version checkpointing needed? In: Proceedings of the 3rd Workshop on Fault-Tolerance for HPC at Extreme Scale, pp. 49–56 (2013)

    Google Scholar 

  39. Martino, C.D., Kalbarczyk, Z., Iyer, R.K., Baccanico, F., Fullop, J., Kramer, W.: Lessons learned from the analysis of system failures at petascale: the case of Blue Waters. In: Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp. 610–621 (2014)

    Google Scholar 

  40. Metzler, B., Trivedi, A.: Prototyping byte-addressable NVM access. In: Proceedings of 11th OpenFabrics Developers Workshop (2015)

    Google Scholar 

  41. Moody, A., Bronevetsky, G., Mohror, K., de Supinski, B.R.: Design, modeling, and evaluation of a scalable multi-level checkpointing system. In: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–11 (2010)

    Google Scholar 

  42. Nieplocha, J., Palmer, B., Tipparaju, V., Krishnan, M., Trease, H., Aprà, E.: Advances, applications and performance of the global arrays shared memory programming toolkit. Int. J. High Perform. Comput. Appl. 20(2), 203–231 (2006)

    Article  Google Scholar 

  43. Numrich, R.W., Reid, J.: Co-Array Fortran for parallel programming. SIGPLAN Fortran Forum 17(2) (1998)

    Google Scholar 

  44. Ouyang, X., et al.: Enhancing checkpoint performance with staging I/O and SSD. In: Proceedings of 2010 International Workshop on Storage Network Architecture and Parallel I/Os, May 2010

    Google Scholar 

  45. Romano, P.K., Forget, B.: The OpenMC Monte Carlo particle transport code. Ann. Nucl. Energy 51, 274–281 (2013)

    Article  Google Scholar 

  46. Sato, K., Mohror, K., Moody, A., Gamblin, T., de Supinski, B.R., Maruyama, N., Matsuoka, S.: A user-level InfiniBand-based file system and checkpoint strategy for burst buffers. In: Proceedings of 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (2014)

    Google Scholar 

  47. Schlichting, R.D., Schneider, F.B.: Fail-stop processors: an approach to designing fault-tolerant computing systems. ACM Trans. Comput. Syst. 1(3), 222–238 (1983)

    Article  Google Scholar 

  48. Schroeder, B., Gibson, G.A.: A large-scale study of failures in high-performance computing systems. In: Proceedings of 2006 IEEE/IFIP International Conference on Dependable Systems and Networks (2006)

    Google Scholar 

  49. Shantharam, M., Srinivasmurthy, S., Raghavan, P.: Characterizing the impact of soft errors on iterative methods in scientific computing. In: Proceedings of Supercomputing (2011)

    Google Scholar 

  50. Young, J.W.: A first order approximation to the optimum checkpoint interval. Commun. ACM 17(9) (1974)

    Google Scholar 

  51. Zheng, Z., Yu, L., Tang, W., Lan, Z., Gupta, R., Desai, N., Coghlan, S., Buettner, D.: Co-analysis of RAS log and job log on Blue Gene/P. In: Proceedings of 2011 IEEE International Parallel and Distributed Processing Symposium (2011)

    Google Scholar 

  52. Zhou, M., Du, Y., Childers, B.R., Melhem, R., Mosse, D.: Writeback-aware bandwidth partitioning for multi-core systems with PCM. In: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 113–122 (2013)

    Google Scholar 

Download references

Acknowledgments

This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, the U.S. Department of Energy, under Award DE-SC0008603 and Contract DE-AC02-06CH11357. This work was completed in part with resources provided by Jülich Supercomputing Centre and we would like to thank in particular Michael Stephan for his support. We gracefully acknowledge the collaboration with IBM Research on the BGAS architecture in the context of the Exascale Innovation Center (EIC). In particular, we want to thank Blake Fitch for his continuous support and for many helpful discussions. Part of the work has been done within the Joint Laboratory for Extreme Scale Computing (JLESC).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Dirk Pleiter .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Dun, N., Pleiter, D., Fang, A., Vandenbergen, N., Chien, A.A. (2016). Multi-versioning Performance Opportunities in BGAS System for Resilience. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science(), vol 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_25

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-41321-1_25

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41320-4

  • Online ISBN: 978-3-319-41321-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics