JupyterHub HPC Meeting - July 2022#

Welcome to the Meeting#

Hello! If you are joining the team video meeting, sign in below so we know who was here. Roll call:

  • name / institution / GitHub handle

  • Rollin Thomas / NERSC / @rcthomas

  • Michael Milligan / MSI @ UMN / @mbmilligan

  • Félix-Antoine Fortin / Université Laval / @cmd-ntrf

  • James Beal/ Sanger Institute / @jbeal-work

  • Rick Wagner / UCSD / @rpwagner

  • Jens Henrik Göbbert / Jülich Supercomputing Centre / @jhgoebbert

  • Kevin Paul / NCAR / @kmpaul

Quick updates#

60 second updates on things you have been up to, questions you have, or developments you think people should know about. This is also a chance to suggest a future presentation if you’ve got work currently in progress you might want to share. Please add yourself, and if you do not have an update to share, you can pass.

Reports and celebrations#

This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.

  • Name: Your report or celebration

Agenda items#

Let’s collect all potential agenda items here before the start of the meeting. We will then attempt to create a coherent agenda that fits in the 60m meeting slot. If there are similar items try and group them together.

  • All: Standing project items:

    • Batchspawner check-in: Issues and PRs

      • ~~CI OK?~~

      • Current PRs to review

      • Users overwriting script template

        • Created issue to not encourage users to do this

        • Documentation issue, not code issue

        • Some better/newer config examples for README

    • Wrapspawner check-in: Issues and PRs

      • EPA team contribution PR

      • They’ve been working on responding to review

  • Workshop/conference season:

    • ISC 22 Interactive HPC

    • SC 22 Urgent+Interactive HPC workshop

    • PEARC, next week, Rick + Mike will be there

  • JupyterHub for Workshops Proof-of-Concept (Rick)

    • Want Jupyter for workshops with some code, data

      • e.g. summer school, training class

      • BYOD bring your own data

      • computing environments may be ephemeral

      • enable instructors/workshop leads to manage

    • Make it easy to stand up hub for workshops

    • Components:

      • Hub; Globus (w/Globus roots); GitHub repos for Dockerfiles config, etc; file system

      • Globus connect server w/guest collection, globus groups

      • c.GlobusOAuthenticator.allowed_globus_groups instructors, students

    • Data

      • Accessible via guest collection

      • Notebook bind mounts the collections

      • Includes a “welcome.md” page with links to everything (RO) for the training/workshop etc

      • => The welcome.md page is more useful than expected

    • Compute

      • DockerSpawner w/some pre-made allowed images

    • Documentation

      • Custom login template points students to Globus Group to request membership

      • Instructors approve

    • Observations

      • Usernames via Globus needs work (different IdPs have different username patterns) … namespace collisions

      • DockerSpawner escaping is an issue… username scheme that helps avoid collisions across IdPs?

      • “Running notebooks in shared folders is as bad as you think,” use FileCheckpoints.checkpoint_dir helps a bit

      • Customizing templates was frustrating (didn’t want to mess with css though)

    • Demo

      • Concern about collisions, has some gnarly usernames

      • Terminal drops you into work

      • Welcome file has links to slides off the server (new tab) etc

      • Had discussed nbgitpuller but instructors just wanted to upload notebooks and touch shared files live

      • Link to Globus Connect

      • They can host their SIF files in their public data dir (totally safe you can use it trust Rick!

    • Q&A

      • Can share? Rick would like to do a write-up, wanted some feedback from the group

        • Change for on-prem, suggests a single VM

        • Is there filesystem with globus endpoint/guest collection

        • Delegating management to the instructors

        • Or w/k8s you can do z2jh

      • Hub itself is an OAuth client

      • How did you learn what you needed to know to write the post-auth hook?

        • Having worked for Globus helps :)

        • Had to be a way to trigger an action after a user was authenticated

        • Hard part about hooks: Not knowing what part of hub is running it

      • As things get more complicated, the config includes more and more code

        • Has documentation kept up so that a new person knows how to “figure out how to figure out” a config?

      • Meta question: How to capture good stuff like this?

        • Start building out in a repo that already exists…?

        • Rollin: Do a survey of existing things that may have been abandoned and see about reviving them

          • FAQ, institutional mailing list etc

  • Scaling issues?