DOI: 10.1145/3605507.3610631
research-article

DEMAC: A Platform for Education in High-performance Computing, Bridging the Gap Between Users and Hardware

Published: 26 June 2024

Abstract

Scientists, engineers, and researchers rely on high-performance computing (HPC) systems to perform complex computations and process large amounts of data. Designing, developing, and operating HPC systems involves a steep learning curve, so training a highly skilled and knowledgeable workforce is crucial to keep up with the rapidly evolving field, drive innovation, and meet the increasing demand for HPC across sectors. Limited access to HPC educational resources is the main deterrent to training HPC talent. This paper addresses two primary causes of that limited access: the high cost of production systems and the lack of realistic full-stack HPC training. Cutting-edge hardware is usually expensive and requires specialized facilities. Moreover, large HPC facilities typically discourage experimentation on their systems, since they run production workloads that must be disturbed as little as possible. Furthermore, HPC training often does not reflect the scale or complexity of production systems. This lack of realistic training support makes education in this area particularly difficult and ineffective. This paper proposes an educational framework for HPC that includes a low-cost, flexible platform design for users in diverse fields. The platform allows study of and experimentation with multiple realistic elements of a production HPC ecosystem. DEMAC, the Delaware Modular Assembly Cluster, is a set of 3D-printable frames designed to house embedded systems and auxiliary systems in a way that emulates HPC platforms. The teaching framework focuses on practical training as an education model in which learners reinforce theoretical knowledge with hands-on experience. If successful, this effort will contribute fundamentally to scientific research, technological advancements, HPC workforce development, and economic growth.

      Published In

      WCAE '23: Proceedings of the Workshop on Computer Architecture Education
      June 2023
      56 pages
      ISBN:9798400702532
      DOI:10.1145/3605507
      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. High-performance computing
      2. cluster
      3. codelet model
      4. dataflow
      5. distributed computing
      6. education
      7. parallel programming
      8. programming execution model

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ISCA '23
      Acceptance Rates

      Overall Acceptance Rate 9 of 10 submissions, 90%
