"OpenHPC is a collaborative, community effort that initiated from a desire to aggregate a number of common ingredients required to deploy and manage High Performance Computing (HPC) Linux clusters including provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Packages provided by OpenHPC have been pre-built with HPC integration in mind with a goal to provide re-usable building blocks for the HPC community. Over time, the community also plans to identify and develop abstraction interfaces between key components to further enhance modularity and interchangeability. The community includes representation from a variety of sources including software vendors, equipment manufacturers, research institutions, supercomputing sites, and others."
Watch the video: http://wp.me/p3RLHQ-gKz
Learn more: http://openhpc.community/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
1. OpenHPC: A Cohesive and Comprehensive System Software Stack
The Time is Right
IDC HPC User Forum
April 19, 2017
Dr. Robert W. Wisniewski
Chief Software Architect Extreme Scale Computing
Senior Principal Engineer, Intel
4. Trends and Challenges
• Talk is software focused, but
–Hardware: scale, power, reliability, network bandwidth and latency, memory bandwidth and latency
• Complexity of the software stack
–Increase in classical HPC computing
–Richer environments, e.g. Python
–New models: UQ, workflows
–Introduction of big data and analytics
–BDEC
–Multi-tenancy
–Need for new frameworks
–AI: ML/DL
–Cloud
5. Trends and Challenges: Needs
• Talk is software focused, but
–Hardware: value of co-designing and integrating cores, network, and memory
• Complexity drives the need to integrate and provide a coherent and comprehensive system software stack rather than a bag of parts
–More components
–More components lead to a greater potential for incompatibilities
–Co-design applies within system software as well
–Need to test and continuously integrate
–Over time, fewer organizations can assemble the whole stack
–Increasing time goes to just standing up the stack, versus focusing on mission needs
6. Overview
Status
• Governance: technical steering committee active since June 2016; technical submission process published
• Currently there are 30 official members across the Platinum, Silver, Academic, and Technical committees
• Held first post-formation face-to-face meeting in June 2016 at ISC, one at SC 2016, and one planned for ISC 2017
Goals
• Provide a common SW platform to the HPC community that works across multiple segments and on which end-users can collaborate and innovate
• Simplify the complexity of installation, configuration, and ongoing maintenance of a custom software stack
• Receive contributions and feedback from the community to drive innovation
• Enable developers to focus on their differentiation and unique value, rather than having to spend effort developing, testing, and maintaining a core stack
• Deliver integrated hardware and software innovations to ease the path to exascale
Courtesy of OpenHPC*
*Other names and brands may be claimed as the property of others.
7. Background: Motivation for a Community Effort
• Many sites spend considerable effort aggregating a large suite of open-source projects to provide a capable HPC environment for their users:
–necessary to build/deploy HPC-focused packages that are either absent from distro providers or do not keep pace there
–local packaging or customization frequently tries to give users access to multiple software versions (e.g. via modules or a similar mechanism)
–hierarchical packaging is necessary to support multiple compiler/MPI families
• On the developer front, many successful projects must engage in continual triage and debugging of configuration and installation issues on HPC systems
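The hierarchical packaging point above can be illustrated with Lmod, the module system OpenHPC ships: MPI modules only become visible after a compiler family is loaded, and library modules only after both a compiler and an MPI family are loaded. A sketch of a typical session (module names are illustrative and vary by release):

```shell
# Fresh login: only compiler families and standalone tools are visible
module avail            # shows e.g. gnu, papi, valgrind, ...

# Loading a compiler family exposes the MPI families built against it
module load gnu
module avail            # now also shows e.g. mvapich2, openmpi, mpich, ...

# Loading an MPI family exposes the libraries built for that compiler/MPI pair
module load mvapich2
module avail            # now also shows e.g. petsc, fftw, boost, ...
```

This hierarchy is what prevents a user from accidentally linking a library built for one compiler/MPI pairing against a different pairing.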
8. What is OpenHPC?
• OpenHPC is a community effort endeavoring to:
–provide collection(s) of pre-packaged components that can be used to help install and manage flexible HPC systems throughout their lifecycle
–leverage the standard Linux delivery model to retain admin familiarity (i.e. package repos)
–allow and promote multiple system configuration recipes that leverage community reference designs and best practices
–implement integration testing to gain validation confidence
–provide an additional distribution mechanism for groups releasing open-source software
–provide a stable platform for new R&D initiatives
[Figure 1 (from the Install Guide, CentOS 7.1 version, v1.0): overview of a typical physical cluster architecture — a master node (SMS) connected to the data center network on eth0 and, via eth1, to the compute nodes' eth and BMC interfaces over TCP networking, plus a high-speed network and a Lustre* storage system.]
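The "standard Linux delivery model" bullet above means a cluster is bootstrapped with ordinary distro package tooling rather than a custom installer. A minimal sketch for a CentOS master node, assuming yum and the OpenHPC repos (the release-RPM URL is a placeholder pattern; the exact path is in the Install Guide for your release):

```shell
# Enable the OpenHPC repository by installing its release package
# (URL is illustrative, not an exact path)
yum install http://build.openhpc.community/.../ohpc-release.rpm

# Install the base meta-package on the master node
yum -y install ohpc-base

# Provisioning and resource-manager pieces come from the same repos
yum -y install ohpc-warewulf ohpc-slurm-server
```

Because everything arrives as RPMs from a repo, updates and audits use the same yum/rpm workflows admins already know.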
9. OpenHPC: Mission and Vision
• Mission: to provide a reference collection of open-source HPC software components and best practices, lowering barriers to deployment, advancement, and use of modern HPC methods and tools.
• Vision: OpenHPC components and best practices will enable and accelerate innovation and discoveries by broadening access to state-of-the-art, open-source HPC methods and tools in a consistent environment, supported by a collaborative, worldwide community of HPC users, developers, researchers, administrators, and vendors.
10. OpenHPC: Project Members
• OpenHPC is a Linux Foundation project initiated by Intel that gained wide participation right away; the goal is to collaboratively advance the state of the software ecosystem
• 30 members — a mixture of academics, labs, OEMs, and ISVs/OSVs, including Argonne National Laboratory, the Center for Research in Extreme Scale Technologies (Indiana University), and the University of Cambridge
• The governing board is composed of Platinum members (Intel, Dell, HPE, SUSE) plus representatives from the Silver, Academic, and Technical committees
• Interested in project member participation? Contact Kevlin Husser or Jeff ErnstFriedman (jernstfriedman@linuxfoundation.org)
www.openhpc.community
11. Repository server metrics: monthly visitors
[Chart: "Build Server Access: Unique Visitors" — unique visitors per month from Jul 2015 through Mar 2017 (y-axis 0 to 2000), annotated with release markers for v1.0, v1.0.1, v1.1, v1.1.1, v1.2, v1.2.1, and v1.3.]
12. Intel® HPC Orchestrator Modular View
• Intra-stack APIs to allow for customization/differentiation (enabling OEMs)
• Defined external APIs for consistency across versions (for ISVs)
[Diagram: layered stack view. On the hardware sit node-specific OS kernel(s) and Linux* distro runtime libraries, alongside overlay & pub-sub networks and identity services. Above these: user space utilities; SW development toolchain; compiler & programming model runtimes; high performance parallel libraries; scalable debugging & perf analysis tools; optimized I/O libraries; I/O services; data collection and system monitors; workload manager; resource management runtimes; DB schema and scalable DB; system management (config, inventory); provisioning; system diagnostics; and fabric management. An operator interface, ISV applications, and applications (not part of the initial stack) sit on top.]
13. OpenHPC v1.3 - Current S/W components
Functional areas and their components:
–Base OS: CentOS 7.3, SLES12 SP2
–Architecture: x86_64, aarch64 (Tech Preview)
–Administrative Tools: Conman, Ganglia, Lmod, LosF, Nagios, pdsh, prun, EasyBuild, ClusterShell, mrsh, Genders, Shine, Spack, test-suite
–Provisioning: Warewulf
–Resource Mgmt.: SLURM, Munge, PBS Professional
–Runtimes: OpenMP, OCR
–I/O Services: Lustre client (community version)
–Numerical/Scientific Libraries: Boost, GSL, FFTW, Metis, PETSc, Trilinos, Hypre, SuperLU, SuperLU_Dist, Mumps, OpenBLAS, ScaLAPACK
–I/O Libraries: HDF5 (pHDF5), NetCDF (including C++ and Fortran interfaces), Adios
–Compiler Families: GNU (gcc, g++, gfortran)
–MPI Families: MVAPICH2, OpenMPI, MPICH
–Development Tools: Autotools (autoconf, automake, libtool), Valgrind, R, SciPy/NumPy
–Performance Tools: PAPI, IMB, mpiP, pdtoolkit, TAU, Scalasca, Score-P, SIONlib
Notes:
• Additional dependencies that are not provided by the base OS or community repos (e.g. EPEL) are also included
• 3rd-party libraries are built for each compiler/MPI family
• The resulting repositories currently comprise ~300 RPMs
Future additions approved for inclusion: BeeGFS client, hwloc, Singularity, xCAT
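Because third-party libraries are built per compiler/MPI family (per the notes above), a user selects a family and then builds and launches against it. A sketch of the end-user workflow, assuming the GNU + MVAPICH2 pairing and the prun launch wrapper listed in the table (the source file name is illustrative):

```shell
# Select a compiler family, then an MPI family built against it
module load gnu mvapich2

# Build an MPI program with that family's wrapper compiler
mpicc -O2 -o hello hello_mpi.c

# Launch through the resource manager via OpenHPC's prun utility,
# which abstracts over SLURM/PBS launch commands
prun ./hello
```

Swapping the whole toolchain (e.g. to OpenMPI) is then just a different pair of `module load` lines, with no changes to the build commands.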
14. Intel® HPC Orchestrator Framework
[Diagram: the OpenHPC PROJECT pulls RRVs ("RRV" = Relevant and Reliable Version) from upstream source communities — Linux, GNU, a parallel file system, a resource manager, and others — into a continuous integration environment (build environment & source control, bug tracking, user & dev forums, collaboration tools, validation environment). It integrates and tests HPC stacks and makes them available as open source on a 6–12 month cadence as a Base HPC Stack, on which community, university, and OEM derivative stacks build. Contributors include Intel, OEMs, ISVs, labs, and academia. The Intel® HPC Orchestrator PRODUCT is a supported Core HPC Stack layered on this base, adding premium features, advanced integration testing, testing at scale, validated updates, and Level 3 support across the stack, with OEM stacks above it.]
15. Base Stack and Derivatives
[Diagram: all offerings share a Common Core. "TURNKEY" adds components targeting the volume market, providing sufficient performance and scalability, ease of install, and auto-configuration. "ADVANCED" adds components targeting the high-end market, providing performance & scalability, energy efficiency, and ease of use & administration. "CUSTOM" adds components targeting the Top500 & verticals. The shared core gives ease of administration across multiple tiers in the same data center.]
16. Conclusion
• Trends and challenges have led to the need for a cohesive and comprehensive system software stack for HPC
• OpenHPC provides a vehicle that facilitates collaboration, removes duplicated work, and provides a more efficient ecosystem
• Intel® HPC Orchestrator provides a supported version of OpenHPC (analogous to how RHEL relates to CentOS), with three tiers for different computing needs
• OpenHPC is gaining momentum with increased contributions