research-article
Open access

Dynamic Process Migration Based on Block Access Patterns Occurring in Storage Servers

Published: 14 June 2016 Publication History

Abstract

An emerging trend in developing large and complex applications on today’s high-performance computers is to couple independent components into a comprehensive application. The components may use the global file system to exchange data while the application executes. To reduce the time spent on input/output (I/O) data exchange and data transfer in coupled systems and other applications, this article proposes a dynamic process migration mechanism that uses block access pattern similarity so that data can be exchanged through the local file cache. We first introduce the block access counting diagram, which profiles a process’s access pattern over a time period on the storage server. Next, we propose an algorithm that compares the access patterns of processes running on different computing nodes. Last, processes are migrated so that those with similar access patterns are grouped together. Consequently, the processes on a computing node can exchange their data by accessing the local file cache instead of the global file system.
The experimental results show that the proposed process migration mechanism reduces the application’s execution time by shortening I/O time, and that it yields attractive I/O throughput. In summary, this dynamic process migration technique can work fairly well for distributed applications whose data dependencies rely on distributed file systems.
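
The three steps outlined in the abstract (profiling block accesses, comparing patterns, grouping similar processes) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity metric or the grouping policy, so cosine similarity, the greedy grouping, the `bucket_size` and `threshold` parameters, and all function names below are assumptions chosen purely for illustration.

```python
from collections import Counter
from math import sqrt


def access_counts(block_events, bucket_size=4):
    """Count accesses per block range for one process.

    A simplified stand-in for a block access counting diagram:
    block numbers are bucketed, and each bucket records its hit count.
    """
    return Counter(block // bucket_size for block in block_events)


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count profiles (assumed metric)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_processes(profiles, threshold=0.8):
    """Greedy grouping (assumed policy): a process joins the first group
    whose seed profile is at least `threshold` similar, else starts a new one."""
    groups = []  # list of (seed_profile, [process ids])
    for pid, profile in sorted(profiles.items()):
        for seed, members in groups:
            if cosine_similarity(seed, profile) >= threshold:
                members.append(pid)
                break
        else:
            groups.append((profile, [pid]))
    return [members for _, members in groups]


# Hypothetical block access traces observed on a storage server.
traces = {
    "p0": [0, 1, 2, 3, 8, 9],
    "p1": [1, 2, 3, 8, 9, 10],
    "p2": [100, 101, 102, 103],
}
profiles = {pid: access_counts(events) for pid, events in traces.items()}
groups = group_processes(profiles)
print(groups)
```

In this toy input, `p0` and `p1` touch overlapping block ranges and land in one group, which under the article's scheme would make them candidates for co-location on one computing node; `p2` accesses a disjoint range and stays separate.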


Cited By

  • (2023) I/O Access Patterns in HPC Applications: A 360-Degree Survey. ACM Computing Surveys 56:2 (1-41). DOI: 10.1145/3611007. Online publication date: 15-Sep-2023
  • (2019) Fine Granularity and Adaptive Cache Update Mechanism for Client Caching. IEEE Systems Journal 13:2 (1587-1598). DOI: 10.1109/JSYST.2018.2866905. Online publication date: Jun-2019

    Published In

    ACM Transactions on Architecture and Code Optimization  Volume 13, Issue 2
    June 2016
    200 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/2952301
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 June 2016
    Accepted: 01 March 2016
    Revised: 01 March 2016
    Received: 01 October 2015
    Published in TACO Volume 13, Issue 2

    Author Tags

    1. Distributed file system
    2. I/O performance
    3. access counting diagram
    4. block access events
    5. pattern similarity
    6. process migration

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • NFSC


