JupyterHub HPC Meeting - April 2021#

  • Date: 2021-04-07

  • Time: 8:30 AM PST

    • Your timezone: https://arewemeetingyet.com/Los%20Angeles/2021-04-07/08:30/JupyterHub-HPC

  • GitHub issue:

  • Calendar for future meetings: https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings.html

Welcome to the Meeting#

Hello! If you are joining the team video meeting, sign in below so we know who was here. Roll call:

  • name / institution / GitHub handle

  • Rollin / NERSC / @rcthomas

  • Zach Price / ORNL

  • Michael Milligan / UMN MSI

  • Jens Henrik Goebbert / FZJ

Quick updates#

60 second updates on things you have been up to, questions you have, or developments you think people should know about. This is also a chance to suggest a future presentation if you’ve got work currently in progress you might want to share. Please add yourself, and if you do not have an update to share, you can pass.

  • NAME: What you’d like to update on

  • Rollin Jupyter Community Workshop series is coming back to life, “plans for plans” for security-focused workshop will be coming together soon.

Reports and celebrations#

This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.

Agenda items#

Let’s collect all potential agenda items here before the start of the meeting. We will then attempt to create a coherent agenda that fits in the 60m meeting slot. If there are similar items try and group them together.

  • Standing Items:

    • Batchspawner check-in

      • Batchspawner 1.1 released

        • On PyPI, open for business for new PRs!

      • Acceptance/acquiescence on reformatting

        • https://github.com/jupyterhub/batchspawner/pull/208

        • Now outdated needs to be brought up to date

        • Ask Erik to update the PR then MM will merge

      • If there was a push to upstream something:

        • Remote port stuff would really be nice to have upstreamed because multiple projects need it

        • Port range implementation is also a feature people have been asking for

    • Wrapspawner check-in

      • Reformatting PR?

      • Certain issues with traitlets

      • Code needs a pass to catch up to modern

        • ProfileSpawner prevents using internal SSL

      • Release in current state pretty soon

      • But after that, want to deep dive into some issues

  • Kubespawner - cross cluster? (Zach Price):

    • ARM (ORNL group) has access to 2 HPC clusters, one at OLCF, one not

    • JupyterHub that allows launching on HPC or not (2 profiles)

    • OLCF also has a new hub deployment they’d like to use but trying to avoid users having to know the differences between the 2

    • So one hub in one cluster wants to start notebooks in another cluster, is there a way to do that?

    • Single hub at a big instution, main problem is a common identity management system

    • Jens asked how people handle MFA:

      • Basically everyone does their own thing

  • https://github.com/jupyterhub/jupyterhub/pull/2726

    • rkdarst: help editing to get it in?

  • Startup time when launching a new JupyterLab (Jens Henrik)

    • They just switched from home-grown Docker to k8s

    • Performance testing (why 15-20 s to launch?)

    • Partially parallel file system

    • Any nice solutions?

      • Jens: Turn off checking build update! 7 seconds!

      • Richard observed some things with Jupyter Docker stacks with chown’ing, but Jens is doing bare metal GPFS

      • Rollin:

        • Login node setup is over GPFS

        • Compute node setup is over Cray DVS, there the filesystem is mounted read-only w/client-side caching on

      • Where is the time going and what JupyterLab is this?

        • Lots of requests to NFS

      • Median start-up time at MSI looks like 20-40s

      • Richard observed 22s at Aalto during call

      • What kind of overhead does a non-Jupyter pod get? Is this Jupyter specific? Start a pod and run mdtest :)