Singularity Registry HPC
Collaborative Container Modules
Vanessa Sochat PhD
Computer Scientist
Lawrence Livermore National Lab
A formula for collaboration?
Identify an actual need that crosses communities
Singularity Registry HPC
How do we provide containers?
The Problem:
● Public registries (Docker Hub, Quay.io) or self-deployed (Singularity Registry Server, Docker Registry)
● Each user separately pulling means a lot of redundancy
● Pulling “latest” means we aren’t aware of versioning
● It’s not impossible to be reproducible, but it’s harder, and best practices are hard to follow!
How do we provide containers?
Ideally:
● HPC admins can provide a shared collection of containers
● Users could also install a collection, if not provided by an admin
● This is done in a way that makes using the containers reproducible.
A formula for collaboration?
Identify an actual need that crosses communities
Have an idea that fits into a current way of thinking
Singularity and LMOD? 🤔
Hmm 🤔
What is wrong with current solutions?
● Not easy enough to collaborate on
● Not modular enough
● Not enough granularity in defining container versions or interactions
01 Reproducibility: Define exact tags and digests to install and load.
02 Ease of use: Install, load, and use. I don’t want to know it’s a container!
03 Abstraction: I largely don’t want to be aware that I’m interacting with containers.
04 Automation: Find new container tags and digests for me. Let me check for updates and decide.
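To make the reproducibility goal concrete, here is a minimal sketch that classifies a container reference by how tightly it is pinned. The function name `pin_level` and the sample references (including the placeholder digest `abc123`) are assumptions for illustration and are not part of shpc.

```shell
# Hypothetical helper (not part of shpc): report how a container
# reference is pinned. "digest" is immutable, "tag" can still move,
# and "floating" means no tag or the "latest" tag.
pin_level() {
    ref="$1"
    case "$ref" in
        *@sha256:*) echo "digest"; return ;;
    esac
    # Look only at the last path component so that a registry port
    # such as localhost:5000/python is not mistaken for a tag.
    name="${ref##*/}"
    case "$name" in
        *:latest) echo "floating" ;;
        *:*)      echo "tag" ;;
        *)        echo "floating" ;;  # no tag implies "latest"
    esac
}

# Sample references; the digest value is a placeholder, not a real one.
for ref in python python:latest python:3.9.2-slim python@sha256:abc123; do
    printf '%s -> %s\n' "$ref" "$(pin_level "$ref")"
done
```

A check like this could run before a container is added to a shared collection, so that only tagged or digest-pinned references are accepted.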
How does it work?
$ git clone https://github.com/singularityhub/singularity-hpc
$ cd singularity-hpc
$ pip install -e .
$ which shpc
https://singularityhub.github.io/singularity-hpc/
biocontainers
bids
docker
spack
jupyter
nvidia
...
Singularity Registry HPC
I want to run a notebook!
Instead of typing:
$ singularity exec --home ${HOME} --bind ${HOME}/.local:/home/joyvan/.local \
    <container> jupyter notebook --no-browser --port=$(shuf -i 20000-65000 -n 1) --ip 0.0.0.0
use the module alias:
$ run-notebook
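The idea of hiding the long command behind a short alias can be sketched as a small wrapper. This is only an illustration: the function name `notebook_command` and the image path are hypothetical, the aliases shpc generates may look different, and the sketch prints the command instead of running it so that it works without Singularity installed.

```shell
# Hypothetical wrapper illustrating the alias idea; not generated by shpc.
# It builds the long singularity command for a given container image
# and prints it (dry run) instead of executing it.
notebook_command() {
    container="$1"
    # Pick exactly one random port in the range; it is not checked
    # for being free.
    port="$(shuf -i 20000-65000 -n 1)"
    echo "singularity exec --home ${HOME} --bind ${HOME}/.local:/home/joyvan/.local" \
         "${container} jupyter notebook --no-browser --port=${port} --ip 0.0.0.0"
}

# /path/to/jupyter.sif is a placeholder image path.
notebook_command /path/to/jupyter.sif
```

Replacing the `echo` with direct execution would turn the dry run into a working launcher, assuming Singularity and the image are available.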
What does the filesystem registry
look like?
An shpc registry entry
Automated Updates with “Binoc”
@alecbcs
Automated Testing of Containers
A formula for collaboration?
Identify an actual need that crosses communities
Have an idea that fits into a current way of thinking
Create a contributor friendly learning environment
What do I do now?
Singularity Registry HPC
Research Software Engineers
National Lab Lenny
“We have the biggest
supercomputers!”
Academic Alicia
“We have a research
computing team too.”
Industry Ignacio
“We use cloud services
and anywhere we can
find GPUs!”
How do we work together?
🤔
What are the benefits of collaboration?
Why should we work together?
● Collaboration leads to better ideas and approaches.
● Convergence on best practices (behavior and software) for research software engineering
● Learn and grow from one another.
What are the drawbacks of collaboration?
Why shouldn’t we (or can’t we) work together?
● Costs of effort and time.
● Concerns about intellectual property or resulting profits from work
● It requires a security clearance
● I can’t afford it.
How are we doing?
LLNL Open Source
1. Identify an actual need that crosses communities → Assess repository state and define statement of need
2. Have an idea for a way to fix it.
3. Create documentation, branding, and tutorials
4. Tell people about it, get them interested.
How are we doing?
● 765 repos
● 36% have more external than internal contributors
● 61% have more internal than external contributors
● 3% have equal numbers
● Top ten internal/external contributors have an overlap of 1
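A breakdown like the one above could be computed from per-repository contributor counts. The sketch below assumes a hypothetical CSV layout (`name,internal,external`) and uses made-up sample rows. It is not the tooling behind the numbers on this slide.

```shell
# Hypothetical sketch: summarize repositories by whether external or
# internal contributors dominate. Reads "name,internal,external" lines
# from standard input.
summarize() {
    awk -F, '
        $3 + 0 >  $2 + 0 { more_ext++ }
        $2 + 0 >  $3 + 0 { more_int++ }
        $2 + 0 == $3 + 0 { equal++ }
        END {
            if (NR == 0) { print "repos: 0"; exit }
            printf "repos: %d\n", NR
            printf "more external than internal: %.0f%%\n", 100 * more_ext / NR
            printf "more internal than external: %.0f%%\n", 100 * more_int / NR
            printf "equal numbers: %.0f%%\n", 100 * equal / NR
        }
    '
}

# Made-up sample data, for illustration only.
printf '%s\n' \
    'alpha,2,10' \
    'beta,8,1' \
    'gamma,3,3' \
    'delta,1,4' | summarize
```

Because the percentages are rounded independently, they may not always sum to exactly 100.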
How are we doing?
1. We have a few projects that are excelling in attracting external contributors
2. The most “popular” projects at the lab are possibly not contributor friendly.
3. What about the long tail?
spack (package manager)
mfem
zfs
https://vsoch.github.io/llnl-contributors/
Can we do better? 🤔
A Plan of Action!
Contributor CI
A set of tools and procedures to assess contributions
1. BASELINE: Collect metrics to measure contributions
2. CFA: Contributor Friendliness Assessment
3. ACTION: Timestamps of proactive action
A Call to Action!
Why should I care?
- It’s hard to work together across siloed RSE communities.
- Let’s look closely at our institutions and get better at that.
- Identify software developed in your lab or institution.
- Make an assessment of external vs. internal contributions
- Contributor Friendliness Assessment
- Improve the scores.
Why should I care?
- No established best practice for an HPC registry
- Little involvement of RSE/HPC communities with OCI
What do you think?
Thank you!
@vsoch GitHub and Twitter
