Abstract
The mismatch between compute performance and I/O performance has long been a stumbling block as supercomputers evolve from petaflops to exaflops. Many parallel applications are now I/O intensive, and their overall running times are typically limited by I/O performance. To quantify the I/O performance bottleneck and highlight the significance of achieving scalable performance in peta/exascale supercomputing, we introduce for the first time a formal definition of the ‘storage wall’ from the perspective of parallel application scalability. We quantify the effects of the storage bottleneck by providing a storage-bounded speedup, defining the storage wall quantitatively, presenting existence theorems for the storage wall, and classifying system architectures according to their I/O performance variation. We analyze and extrapolate the existence of the storage wall through experiments on Tianhe-1A and case studies on Jaguar. These results provide insights into how to alleviate the storage wall bottleneck in system design and achieve hardware/software optimizations in peta/exascale supercomputing.
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 61272141 and 61120106005) and the National High-Tech R&D Program (863) of China (No. 2012AA01A301)
ORCID: Wei HU, http://orcid.org/0000-0002-8839-7748
Cite this article
Hu, W., Liu, Gm., Li, Q. et al. Storage wall for exascale supercomputing. Frontiers Inf Technol Electronic Eng 17, 1154–1175 (2016). https://doi.org/10.1631/FITEE.1601336
DOI: https://doi.org/10.1631/FITEE.1601336