DOI: 10.1145/3149457.3149464
HPCAsia Conference Proceedings · Research article

Improving Collective MPI-IO Using Topology-Aware Stepwise Data Aggregation with I/O Throttling

Published: 28 January 2018

    Abstract

    MPI-IO serves as the internal I/O interface layer of HDF5 and PnetCDF, where collective MPI-IO plays a major role in managing large-scale scientific data through parallel I/O. However, the existing collective MPI-IO optimization known as two-phase I/O has not been tuned sufficiently for recent supercomputers built on mesh/torus interconnects and large-scale parallel file systems, because its data transfers lack topology-awareness and it is not optimized for such file systems. In this paper, we propose I/O throttling and topology-aware stepwise data aggregation in the two-phase I/O of ROMIO, a representative MPI-IO library, to improve collective MPI-IO performance even when multiple processes run on each compute node. Throttling the I/O requests issued to a target file system mitigates I/O request contention, which improves I/O performance in the file access phase of two-phase I/O. A topology-aware aggregator layout that accounts for multiple aggregators per compute node alleviates contention in the data aggregation phase, and stepwise data aggregation further improves aggregation performance. HPIO benchmark results on the K computer show that the proposed optimization achieves up to about 73% and 39% improvements in write performance over the original implementation, using 12,288 processes on 3,072 compute nodes and 24,576 processes on 6,144 compute nodes, respectively.
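    The two mechanisms named above can be illustrated with a minimal, hypothetical sketch. This is plain Python with threads standing in for MPI processes, not ROMIO or MPI code, and every name in it (`NUM_AGGREGATORS`, `MAX_INFLIGHT_IO`, `write_domain`, and so on) is invented for illustration: in phase one, a subset of ranks (the aggregators) gathers noncontiguous per-rank data into contiguous file domains; in phase two, a semaphore throttles how many of those aggregators may issue I/O requests concurrently.

    ```python
    # Hypothetical sketch of two-phase I/O with throttling -- NOT ROMIO code.
    # Threads simulate MPI processes; an in-memory buffer simulates the file.
    import io
    import threading

    NUM_PROCS = 8            # simulated MPI processes
    NUM_AGGREGATORS = 2      # ranks that actually touch the file
    MAX_INFLIGHT_IO = 1      # throttle: concurrent I/O requests allowed

    # Phase 1 (data aggregation): each rank owns a small noncontiguous piece;
    # each aggregator collects the pieces falling into its contiguous domain.
    data = {rank: bytes([rank]) * 4 for rank in range(NUM_PROCS)}
    domain = NUM_PROCS // NUM_AGGREGATORS  # ranks per aggregator

    aggregated = {}
    for agg in range(NUM_AGGREGATORS):
        ranks = range(agg * domain, (agg + 1) * domain)
        aggregated[agg] = b"".join(data[r] for r in ranks)

    # Phase 2 (file access): aggregators write their contiguous domains, but
    # a semaphore limits how many I/O requests hit the file system at once.
    outfile = io.BytesIO(bytes(NUM_PROCS * 4))
    io_throttle = threading.Semaphore(MAX_INFLIGHT_IO)
    file_lock = threading.Lock()

    def write_domain(agg: int) -> None:
        with io_throttle:        # wait for an I/O slot (throttling)
            with file_lock:      # serialize access to the shared buffer
                outfile.seek(agg * domain * 4)
                outfile.write(aggregated[agg])

    threads = [threading.Thread(target=write_domain, args=(a,))
               for a in range(NUM_AGGREGATORS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    result = outfile.getvalue()
    ```

    In a real MPI program the same effect is obtained through a collective call such as `MPI_File_write_all`, with aggregator count and placement influenced via MPI Info hints; the paper's contribution is choosing that placement topology-aware and throttling the resulting file-system requests.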




      Published In

      HPCAsia '18: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region
      January 2018, 322 pages
      ISBN: 9781450353724
      DOI: 10.1145/3149457

      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. I/O throttling
      2. MPI-IO
      3. aggregator
      4. topology-aware stepwise data aggregation
      5. two-phase I/O

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      HPC Asia 2018

      Acceptance Rates

      HPCAsia '18 paper acceptance rate: 30 of 67 submissions, 45%
      Overall acceptance rate: 69 of 143 submissions, 48%


      Cited By

      • (2023) I/O Access Patterns in HPC Applications: A 360-Degree Survey. ACM Computing Surveys 56(2), 1-41. DOI: 10.1145/3611007. Online publication date: 15-Sep-2023.
      • (2023) Strategies for Fast I/O Throughput in Large-Scale Climate Modeling Applications. 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), 203-212. DOI: 10.1109/HiPC58850.2023.00038. Online publication date: 18-Dec-2023.
      • (2022) Improving I/O Performance for Exascale Applications Through Online Data Layout Reorganization. IEEE Transactions on Parallel and Distributed Systems 33(4), 878-890. DOI: 10.1109/TPDS.2021.3100784. Online publication date: 1-Apr-2022.
      • (2020) On Overlapping Communication and File I/O in Collective Write Operation. 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1-8. DOI: 10.1109/IPDPSW50202.2020.00175. Online publication date: May-2020.
      • (2020) Characterizing I/O Optimization Effect Through Holistic Log Data Analysis of Parallel File Systems and Interconnects. High Performance Computing, 177-190. DOI: 10.1007/978-3-030-59851-8_11. Online publication date: 22-Jun-2020.
      • (2019) An Unsupervised Learning Approach for I/O Behavior Characterization. 2019 31st International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 33-40. DOI: 10.1109/SBAC-PAD.2019.00019. Online publication date: Oct-2019.
      • (2018) System Software for Many-Core and Multi-core Architecture. Advanced Software Technologies for Post-Peta Scale Computing, 59-75. DOI: 10.1007/978-981-13-1924-2_4. Online publication date: 7-Dec-2018.