Vanessa Sochat, Research Software Engineer, Stanford Research Computing Center, and Gregory Kurtzer, High Performance Computing Services group at Lawrence Berkeley National Lab, present on their work with Singularity and Singularity Hub.
Dear reader, how should you disseminate your software? If you want your recipe to come out just right, we encourage you to put it in a container. One such container, Singularity, is the first of its kind to be securely deployed internationally on more than 40 shared cluster resources. Its registry, Singularity Hub, further supports reproducible science by building and making containers accessible to any user of the software. In this talk, Vanessa will review the primary use cases for both Singularity and Singularity Hub, and how both have been designed to support modern, common workflows. (Greg will participate remotely.) She will discuss current and future challenges for building, capturing metadata for, and organizing the exploding landscape of containers, and present novel work for assessing reproducibility of such containers. Containers are changing scientific computing, and this is something to be excited about.
PEARC17: Reproducibility and Containers: The Perfect Sandwich
15. 1. Our recipe was not reproducible
2. We had missing dependencies
3. The perfect sandwich might never be made again
no ability to easily distribute or validate work
31. WHY NOT DOCKER?
Docker is not designed for, efficient for, or even compatible with traditional HPC architectures
No centers run Docker on their traditional HPC
69. Customizable by the HPC Admin
SINGULARITY.CONF
- bind/mount points
- permissions
- overlayfs
- config file must be root owned
- controls what the user can and cannot do
- allow or disallow different devices
- paths, session dirs all controlled
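As a rough illustration of two points on this slide (the config file must be root owned, and it carries the admin's settings), here is a minimal Python sketch. The helper names `is_root_owned` and `parse_conf` are hypothetical, and the sample directives are illustrative only; check the names against the singularity.conf shipped with your installation.

```python
import os


def is_root_owned(path):
    """Return True when the file at `path` is owned by uid 0 (root)."""
    return os.stat(path).st_uid == 0


def parse_conf(text):
    """Parse 'key = value' lines into a dict of lists.

    Blank lines and '#' comments are skipped. A key may repeat (for example
    several bind paths), so values are collected in lists.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'key = value' line
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


# Illustrative content only (an assumption, not copied from a real deployment).
SAMPLE = """
# example fragment
bind path = /etc/hosts
bind path = /etc/localtime
enable overlay = yes
"""

if __name__ == "__main__":
    print(parse_conf(SAMPLE))
```

An admin-side check might refuse to trust the parsed settings unless `is_root_owned` returns True for the file they came from.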
70. If you want to be root inside the container, you must be root outside the container.
93. 1. Add bootstrap specification file to GitHub repo base
2. “Turn build on” in Singularity Hub
3. Commits are built automatically on Google Cloud
4. Accessible via command line
123. container...
predictions!
Change in the movement of information: put stuff in containers
Change in the representation of containers: reproducibility metrics
138. Levels of Reproducibility
Identical: the exact same image file
Replicate: the same image built at different times
Base: the core OS is estimated to be the same
Runscript: the content of the runscript is the same
Environment: the environments are the same
Labels: the container labels are the same
139. What is a level of reproducibility?
A set of files between containers that are compared via content hash
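The definition above can be sketched in Python as follows. This is a minimal illustration, not the talk's implementation: it assumes each container has been unpacked to a directory, uses SHA-256 as the content hash, and uses a hypothetical `include` predicate to stand in for the set of files that defines a level.

```python
import hashlib
import os


def content_hashes(root, include=None):
    """Map relative file path -> SHA-256 digest for regular files under `root`.

    `include` is an optional predicate on the relative path; it selects the
    files that belong to the level being compared.
    """
    hashes = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if include is not None and not include(rel):
                continue
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files
            with open(full, "rb") as handle:
                hashes[rel] = hashlib.sha256(handle.read()).hexdigest()
    return hashes


def compare(hashes_a, hashes_b):
    """Group paths by whether their content matches across the two containers."""
    shared = set(hashes_a) & set(hashes_b)
    same = {path for path in shared if hashes_a[path] == hashes_b[path]}
    return {
        "same": same,
        "different": shared - same,
        "only_a": set(hashes_a) - shared,
        "only_b": set(hashes_b) - shared,
    }
```

Passing a different `include` predicate for each level (for example, only the runscript, or only the environment files) reuses the same comparison for every level.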
147. Do the levels behave as I would expect?
Compare an image to itself
- At step 1, start with the image compared to its full self
- Subtract one file from the second image, recalculate, until empty
a. Remove more recent files first
Do this across all levels
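The procedure on this slide can be sketched in Python as below. The scoring formula (twice the number of matching files divided by the total number of files in both copies) is an assumption chosen for illustration, and the function names are hypothetical; the slides do not state the exact metric.

```python
def similarity(files_a, files_b):
    """Score two {path: content_hash} maps as 2 * matching / (len(a) + len(b)).

    Assumed formula for illustration: 1.0 for identical maps, 0.0 when no
    file matches.
    """
    total = len(files_a) + len(files_b)
    if total == 0:
        return 1.0
    matching = sum(
        1 for path, digest in files_a.items() if files_b.get(path) == digest
    )
    return 2.0 * matching / total


def self_comparison_curve(files, mtimes):
    """Compare an image to itself while emptying the second copy.

    `files` maps path -> content hash; `mtimes` maps path -> modification time.
    Files are removed most recent first, and the score is recalculated after
    each removal until the second copy is empty.
    """
    second = dict(files)
    scores = [similarity(files, second)]  # step 1: image vs. its full self
    for path in sorted(files, key=lambda p: mtimes[p], reverse=True):
        del second[path]
        scores.append(similarity(files, second))
    return scores
```

To repeat this across levels, restrict `files` to the paths that define each level before calling `self_comparison_curve`.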
153. Reproducibility Metrics: Takeaways
1. “Operating system science” needs to be a thing
2. Definitions of levels are important
3. I learned things about the OS just looking at the graphs
4. A way to derive features for an operating system?
165. Challenges
- Most resources can’t support web download links
- How to share images? manifests?
- Storage (for most) is a file system
- No Docker for orchestration
- Permissions?
- Integration with Singularity Hub?
- Management?
179. Registry:
container collection corresponds to a folder in repository
Individual user:
container collection corresponds to an entire GitHub repo
Both:
build multiple tags for one collection from within same repository
183. 1. If set up to build locally
Launches local build job
2. If set up to only build on Singularity Hub
Pings Singularity Hub
3. Both
Launches local build job
Successful builds ping Singularity Hub
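The three cases above amount to a small decision table. A minimal Python sketch, assuming two hypothetical boolean settings (`build_locally`, `build_on_hub`) and made-up action names:

```python
def plan_build(build_locally, build_on_hub):
    """Return the ordered actions for a build request.

    Both settings and all action names are hypothetical labels for the three
    cases on the slide.
    """
    if build_locally and build_on_hub:
        # Case 3: build locally, then notify Singularity Hub on success.
        return ["launch_local_build", "ping_singularity_hub_on_success"]
    if build_locally:
        # Case 1: local build only.
        return ["launch_local_build"]
    if build_on_hub:
        # Case 2: no local build, only notify Singularity Hub.
        return ["ping_singularity_hub"]
    return []  # nothing configured
```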
184. 1. Run a build command
sudo sregistry build tensorflow
sudo sregistry build tensorflow/tensorflow
sudo sregistry build tensorflow/tensorflow:gpu
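The three commands differ only in how specific the target is. A minimal Python sketch of splitting such a target into parts, assuming a `collection[/name][:tag]` shape; `parse_target` is a hypothetical helper, and how sregistry itself interprets each form is not stated in the slides.

```python
def parse_target(target):
    """Split 'collection[/name][:tag]' into its parts; missing parts are None."""
    rest, colon, tag = target.partition(":")
    collection, slash, name = rest.partition("/")
    return {
        "collection": collection,
        "name": name if slash else None,
        "tag": tag if colon else None,
    }


if __name__ == "__main__":
    for example in ("tensorflow", "tensorflow/tensorflow", "tensorflow/tensorflow:gpu"):
        print(example, parse_target(example))
```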