DOI: 10.1145/3311790.3396656

Frontera: The Evolution of Leadership Computing at the National Science Foundation

Published: 26 July 2020
  • Abstract

    As part of the NSF’s cyberinfrastructure vision for a robust mix of high-capability and high-capacity HPC systems, Frontera represents the most recent evolution of trans-petascale resources available to all open science research projects in the U.S. Debuting as the fifth-largest supercomputer in the world, Frontera is a robust, well-balanced HPC system designed to enable large-scale, productive science on day one of operations. The system provides a primary compute capability of nearly 39 PF, delivered entirely by more than 8,000 dual-socket servers with conventional Intel Xeon Platinum 8280 (“Cascade Lake”) processors. A unique configuration of both desktop GPUs and advanced floating point units from NVIDIA supports both machine learning and scientific workloads, and the system delivers nearly 2 TB/s of total filesystem bandwidth, with 55 PB of usable disk-based Lustre storage and 3 PB of all-flash Lustre storage. A Mellanox InfiniBand (IB) interconnect provides very low latency, with 100 Gbps to each node and 200 Gbps between switches, in a fat-tree topology with minimal oversubscription for efficient communication, even in jobs that use the full system with complex communication patterns. The system hardware is complemented by a robust set of software services, including Application Programming Interfaces (APIs) to support an evolving user base that increasingly demands productive access via science gateways and automated workflows, as well as a first-of-its-kind partnership with the three major cloud service providers to create a bridge between “traditional” HPC and the cloud infrastructure on which research increasingly depends.
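    The headline compute figure can be sanity-checked with a back-of-envelope calculation. The short Python sketch below assumes 8,008 dual-socket nodes (consistent with the abstract’s “more than 8,000”) and the published Intel Xeon Platinum 8280 specifications: 28 cores per socket, a 2.7 GHz base clock, and two AVX-512 FMA units per core. These specifics are assumptions drawn from vendor documentation rather than figures stated in the abstract itself.

        # Back-of-envelope estimate of Frontera's theoretical peak performance.
        # Assumed inputs (not stated in the abstract): 8,008 nodes and the
        # published Xeon Platinum 8280 specs (28 cores/socket, 2.7 GHz base
        # clock, AVX-512 with two FMA units = 32 double-precision FLOPs
        # per core per cycle).

        nodes = 8_008             # assumed; abstract says only "more than 8,000"
        sockets_per_node = 2      # dual-socket servers (per the abstract)
        cores_per_socket = 28     # Xeon Platinum 8280
        dp_flops_per_cycle = 32   # 2 FMA units x 8 DP lanes x 2 ops (mul + add)
        clock_hz = 2.7e9          # base clock; sustained AVX-512 clocks run lower

        peak = (nodes * sockets_per_node * cores_per_socket
                * dp_flops_per_cycle * clock_hz)
        print(f"theoretical peak = {peak / 1e15:.1f} PF")  # -> 38.7 PF

    At the base clock this works out to roughly 38.7 PF, in line with the abstract’s “nearly 39 PF”; because AVX-512 frequencies throttle below the base clock under sustained load, delivered performance is workload-dependent.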

    Supplemental Material

    MP4 file: presentation video

    Published In

    PEARC '20: Practice and Experience in Advanced Research Computing 2020: Catch the Wave
    July 2020
    556 pages
    ISBN:9781450366892
    DOI:10.1145/3311790

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. HPC
    2. cyberinfrastructure
    3. supercomputer
    4. system design

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    PEARC '20

    Acceptance Rates

    Overall acceptance rate: 133 of 202 submissions (66%)

    Cited By

    • Interaction between a Coronal Mass Ejection and Comet 67P/Churyumov–Gerasimenko. The Astrophysical Journal 967:1 (43), 16 May 2024. https://doi.org/10.3847/1538-4357/ad3c42
    • Solar Wind Driven from GONG Magnetograms in the Last Solar Cycle. The Astrophysical Journal 965:1 (1), 1 Apr 2024. https://doi.org/10.3847/1538-4357/ad32ca
    • A Theory for Neutron Star and Black Hole Kicks and Induced Spins. The Astrophysical Journal 963:1 (63), 29 Feb 2024. https://doi.org/10.3847/1538-4357/ad2353
    • On rigorously quantifying uncertainty in shear and compression wave velocity during surface wave site characterization. Japanese Geotechnical Society Special Publication 10:52 (1940-1945), 2024. https://doi.org/10.3208/jgssp.v10.OS-41-01
    • Machine learning-based prediction of site responses at liquefiable sites subjected to bi-directional ground motions. Japanese Geotechnical Society Special Publication 10:12 (311-316), 2024. https://doi.org/10.3208/jgssp.v10.OS-1-02
    • Seismic Tomography 2024. Bulletin of the Seismological Society of America 114:3 (1185-1213), 3 May 2024. https://doi.org/10.1785/0120230229
    • Open-source data pipeline for street-view images: A case study on community mobility during COVID-19 pandemic. PLOS ONE 19:5 (e0303180), 10 May 2024. https://doi.org/10.1371/journal.pone.0303180
    • A comparison of ground motions predicted through one-dimensional site response analyses and three-dimensional wave propagation simulations at regional scales. Earthquake Spectra 40:2 (1215-1234), 28 Feb 2024. https://doi.org/10.1177/87552930241231935
    • Tuning Apex DQN: A Reinforcement Learning based Deep Q-Network Algorithm. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing (1-5), 17 Jul 2024. https://doi.org/10.1145/3626203.3670581
    • Agile-DRAM: Agile Trade-Offs in Memory Capacity, Latency, and Energy for Data Centers. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (1141-1153), 2 Mar 2024. https://doi.org/10.1109/HPCA57654.2024.00089
