Research article. DOI: 10.1145/3152041.3152086

Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages

Published: 12 November 2017

Abstract

With the shift to exascale computer systems, productive programming models for distributed systems are becoming increasingly important. Partitioned Global Address Space (PGAS) programming models aim to reduce the complexity of writing distributed-memory parallel programs by introducing global operations on distributed arrays, distributed task parallelism, directed synchronization, and mutual exclusion. However, a key challenge in applying PGAS programming models is improving their compilers and runtime systems. In particular, one open question is how runtime systems can meet the requirements of exascale systems, where a large number of asynchronous tasks are executed.
While there are various tasking runtimes such as Qthreads, the Open Community Runtime (OCR), and HClib, there is no existing comparative study of tasking/threading runtime systems for PGAS languages. To explore runtime systems for PGAS programming languages, we have implemented OCR-based and HClib-based Chapel runtimes and evaluated them with an initial focus on tasking and synchronization implementations. The results show that our OCR-based and HClib-based implementations can improve the performance of PGAS programs compared to the existing Qthreads backend of Chapel.


Cited By

  • (2022) Supercharging the APGAS Programming Model with Relocatable Distributed Collections. Scientific Programming 2022. DOI: 10.1155/2022/5092422. Online publication date: 1-Jan-2022
  • (2019) Enabling Resilience in Asynchronous Many-Task Programming Models. Euro-Par 2019: Parallel Processing, 346-360. DOI: 10.1007/978-3-030-29400-7_25. Online publication date: 26-Aug-2019
  • (2018) Parallel sparse flow-sensitive points-to analysis. Proceedings of the 27th International Conference on Compiler Construction, 59-70. DOI: 10.1145/3178372.3179517. Online publication date: 24-Feb-2018

Published In

ESPM2'17: Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware
November 2017
61 pages
ISBN: 9781450351331
DOI: 10.1145/3152041

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Chapel
  2. Habanero-C
  3. Open Community Runtime
  4. PGAS languages
  5. Qthreads
  6. Runtime Systems
  7. Task Parallelism

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SC '17

Acceptance Rates

Overall Acceptance Rate 5 of 10 submissions, 50%
