JupyterHub HPC Meeting - July 2022
Date: 2022-07-06
Time: 8:30 AM PDT
GitHub issue:
Calendar for future meetings: https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings/
Welcome to the Meeting
Hello! If you are joining the team video meeting, sign in below so we know who was here. Roll call:
name / institution / GitHub handle
Rollin Thomas / NERSC / @rcthomas
Michael Milligan / MSI @ UMN / @mbmilligan
Félix-Antoine Fortin / Université Laval / @cmd-ntrf
James Beal / Sanger Institute / @jbeal-work
Rick Wagner / UCSD / @rpwagner
Jens Henrik Göbbert / Jülich Supercomputing Centre / @jhgoebbert
Kevin Paul / NCAR / @kmpaul
Quick updates
60 second updates on things you have been up to, questions you have, or developments you think people should know about. This is also a chance to suggest a future presentation if you’ve got work currently in progress you might want to share. Please add yourself, and if you do not have an update to share, you can pass.
Name: Your update
Félix-Antoine Fortin: JupyterHub team meetings and notes page has moved to https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings/
Reports and celebrations
This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.
Name: Your report or celebration
Agenda items
Let’s collect all potential agenda items here before the start of the meeting. We will then attempt to create a coherent agenda that fits in the 60m meeting slot. If there are similar items try and group them together.
All: Standing project items:
Batchspawner check-in: Issues and PRs
~~CI OK?~~
Current PRs to review
Users overwriting script template
Created issue to not encourage users to do this
Documentation issue, not code issue
Some better/newer config examples for README
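One reading of the template discussion is that sites can usually get what they need from batchspawner's request traits without replacing the whole script template. A minimal sketch of a README-style example along those lines, assuming a Slurm site; the partition, runtime, memory, and module names are placeholders, and the `configure_batchspawner` helper is hypothetical (not part of batchspawner):

```python
def configure_batchspawner(c):
    """Tune the stock Slurm job through request traits, leaving batch_script alone.

    `c` is the config object JupyterHub provides via get_config().
    All values below are site-specific placeholders.
    """
    c.JupyterHub.spawner_class = "batchspawner.SlurmSpawner"
    c.Spawner.http_timeout = 120  # queued jobs can take a while to start

    c.SlurmSpawner.req_partition = "interactive"
    c.SlurmSpawner.req_runtime = "4:00:00"
    c.SlurmSpawner.req_memory = "4G"
    # Extra submission options and setup commands go here instead of into
    # a hand-edited copy of the script template.
    c.SlurmSpawner.req_options = "--nodes=1"
    c.SlurmSpawner.req_prologue = "module load jupyter"


# In jupyterhub_config.py (which should also `import batchspawner`):
#     configure_batchspawner(get_config())
```

Keeping the stock template means template fixes in new batchspawner releases still reach the site.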
Wrapspawner check-in: Issues and PRs
EPA team contribution PR
They’ve been working on responding to review
Workshop/conference season:
ISC 22 Interactive HPC
SC 22 Urgent+Interactive HPC workshop
PEARC, next week, Rick + Mike will be there
JupyterHub for Workshops Proof-of-Concept (Rick)
Want Jupyter for workshops with some code, data
e.g. summer school, training class
BYOD bring your own data
computing environments may be ephemeral
enable instructors/workshop leads to manage
Make it easy to stand up hub for workshops
Components:
Hub; Globus (w/Globus roots); GitHub repos for Dockerfiles, config, etc.; file system
Globus Connect Server w/guest collection, Globus Groups
c.GlobusOAuthenticator.allowed_globus_groups
instructors, students
Data
Accessible via guest collection
Notebook bind mounts the collections
Includes a “welcome.md” page with links to everything (RO) for the training/workshop etc
=> The welcome.md page is more useful than expected
Compute
DockerSpawner w/some pre-made allowed images
Documentation
Custom login template points students to Globus Group to request membership
Instructors approve
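The components above could be wired together roughly as follows. This is a sketch, not Rick's actual config: the group UUIDs, host paths, mount points, and image list are placeholders, the `configure_workshop_hub` helper is hypothetical, and the Globus OAuth client ID, secret, and callback URL are omitted.

```python
def configure_workshop_hub(c, group_ids, shared_dir, welcome_dir):
    """Sketch of a workshop hub: Globus login, group gating, Docker compute.

    `c` is the config object JupyterHub provides via get_config().
    `group_ids` are Globus Group UUIDs (instructors, students).
    `shared_dir` / `welcome_dir` are host paths backing the guest collection.
    """
    # Login via Globus, restricted to members of the workshop groups.
    c.JupyterHub.authenticator_class = "oauthenticator.globus.GlobusOAuthenticator"
    c.GlobusOAuthenticator.allowed_globus_groups = set(group_ids)

    # Compute: one container per user, chosen from pre-made images.
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
    c.DockerSpawner.allowed_images = {
        "Python": "jupyter/scipy-notebook:latest",  # placeholder image
        "R": "jupyter/r-notebook:latest",           # placeholder image
    }

    # Data: bind-mount the collection's directories into each notebook.
    # Container paths are assumptions based on the Jupyter Docker Stacks layout.
    c.DockerSpawner.volumes = {shared_dir: "/home/jovyan/work/shared"}
    c.DockerSpawner.read_only_volumes = {welcome_dir: "/home/jovyan/work/welcome"}


# In jupyterhub_config.py:
#     configure_workshop_hub(get_config(), ["<group-uuid>"], "/data/ws", "/data/ws-welcome")
```

Mounting the welcome material read-only while leaving the shared directory writable matches the notes that instructors wanted to touch shared files live.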
Observations
Usernames via Globus need work (different IdPs have different username patterns) … namespace collisions
DockerSpawner escaping is an issue… username scheme that helps avoid collisions across IdPs?
“Running notebooks in shared folders is as bad as you think”; using FileCheckpoints.checkpoint_dir helps a bit
Customizing templates was frustrating (didn’t want to mess with CSS though)
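For the shared-folder observation, one way to use `FileCheckpoints.checkpoint_dir` is to point it at a directory outside the shared mount, so each user's checkpoints stay in their own space. An untested sketch, assuming the setting goes into the single-user server's config (for example `jupyter_server_config.py` inside the image) and that an absolute path is accepted; the default location and the `configure_checkpoints` helper are hypothetical:

```python
import os


def configure_checkpoints(c, checkpoint_root=None):
    """Send notebook checkpoints to a per-user directory instead of the
    .ipynb_checkpoints folder next to each (shared) notebook.

    `c` is the config object provided via get_config() in the single-user
    server's config file. Returns the directory that was configured.
    """
    if checkpoint_root is None:
        # Hypothetical default: a hidden folder in the user's own home.
        checkpoint_root = os.path.join(os.path.expanduser("~"), ".jupyter-checkpoints")
    c.FileCheckpoints.checkpoint_dir = checkpoint_root
    return checkpoint_root


# In jupyter_server_config.py:
#     configure_checkpoints(get_config())
```

One caveat to check before relying on this: with a single checkpoint directory, notebooks that share a filename in different folders would likely share a checkpoint file too.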
Demo
Concern about collisions, has some gnarly usernames
Terminal drops you into work
Welcome file has links to slides off the server (new tab) etc
Had discussed nbgitpuller but instructors just wanted to upload notebooks and touch shared files live
Link to Globus Connect
They can host their SIF files in their public data dir (totally safe, you can use it, trust Rick!)
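On the username collision concern: the notes do not record what scheme the demo used, so the following is only one possible approach. It keeps the IdP domain in the name and appends a short hash of the full Globus username, so names that look alike after escaping still map to different hub users. It assumes Globus usernames have the form `user@idp-domain` and can be treated as case-insensitive; `hub_username` is a hypothetical helper.

```python
import hashlib
import re


def hub_username(globus_username: str) -> str:
    """Map a Globus username to a lowercase name using only [a-z0-9-].

    The readable part keeps both the local name and the IdP domain; the
    hash suffix keeps distinct inputs distinct even when escaping would
    otherwise make them look the same.
    """
    normalized = globus_username.strip().lower()
    local, _, idp = normalized.partition("@")
    combined = f"{local}-{idp}" if idp else local
    readable = re.sub(r"[^a-z0-9]+", "-", combined).strip("-") or "user"
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:8]
    return f"{readable[:40]}-{digest}"


# Same local name at two (made-up) IdPs gives two different hub users.
print(hub_username("jdoe@example.edu"))
print(hub_username("jdoe@example.org"))
```

Names restricted to this character set should also need less escaping when DockerSpawner turns them into container and volume names.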
Q&A
Can share? Rick would like to do a write-up, wanted some feedback from the group
Change for on-prem, suggests a single VM
Is there a filesystem with a Globus endpoint/guest collection?
Delegating management to the instructors
Or w/k8s you can do z2jh
Hub itself is an OAuth client
How did you learn what you needed to know to write the post-auth hook?
Having worked for Globus helps :)
Had to be a way to trigger an action after a user was authenticated
Hard part about hooks: Not knowing what part of hub is running it
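For readers who have not written one, a post-auth hook is a callable assigned to `c.Authenticator.post_auth_hook`. JupyterHub calls it in the Hub process after a successful login, passing the authenticator, the request handler, and the authentication dict, and it expects that dict back. The sketch below is not the hook from the demo: the logger name and the `workshop_seen` marker are made up for illustration. Logging the process ID and authenticator class is one way to see what is running the hook.

```python
import logging
import os

log = logging.getLogger("workshop.post_auth")  # hypothetical logger name


def workshop_post_auth_hook(authenticator, handler, authentication):
    """Run after a user authenticates; must return the authentication dict."""
    log.info(
        "post_auth_hook: pid=%s authenticator=%s user=%s",
        os.getpid(),
        type(authenticator).__name__,
        authentication.get("name"),
    )
    # Hypothetical action: mark the login in auth_state. auth_state is only
    # persisted when c.Authenticator.enable_auth_state is True.
    auth_state = dict(authentication.get("auth_state") or {})
    auth_state["workshop_seen"] = True
    return {**authentication, "auth_state": auth_state}


# In jupyterhub_config.py:
#     c.Authenticator.post_auth_hook = workshop_post_auth_hook
```

Defining the hook as a named function in a small module, rather than inline in the config, also makes it testable outside the Hub, which may help as configs accumulate more code.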
As things get more complicated, the config includes more and more code
Has documentation kept up so that a new person knows how to “figure out how to figure out” a config?
Meta question: How to capture good stuff like this?
Start building out in a repo that already exists…?
Rollin: Do a survey of existing things that may have been abandoned and see about reviving them
FAQ, institutional mailing list etc
Scaling issues?