JupyterHub HPC Meeting - September 2023#
Date: 2023-09-06
Time: 8:30 AM PDT
Where you can find these and past meeting notes: https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings/hpc/
Calendar for future meetings: https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings/
Jupyter Code of Conduct: https://jupyter.org/governance/conduct/code_of_conduct.html
Welcome to the Meeting#
Hello! If you are joining the team video meeting, sign in below so we know who was here. Roll call:
name / institution / GitHub handle
Rollin Thomas / NERSC / @rcthomas
Cory Sherman / UW Madison / @cccsss01
Rick Wagner / SDSC / @rpwagner
Jens Henrik Goebbert / FZJ-JSC / @jhgoebbert
Michael Milligan / MSI @ UMN / @mbmilligan
Félix-Antoine Fortin / Université Laval, The Alliance / @cmd-ntrf
Mridul Seth / Anaconda
Quick updates#
60 second updates on things you have been up to, questions you have, or developments you think people should know about. This is also a chance to suggest a future presentation if you’ve got work currently in progress you might want to share. Please add yourself, and if you do not have an update to share, you can pass.
Name: Your report or celebration
Cory Sherman
Difficulties for iterative improvements with the hub secrets file, still using 2.0.0
Having to rebuild helm chart every time
Mridul, Rick may have some suggestions on secrets
Global image repo setting in 3.0.0?
Maybe check in with JupyterHub team on their call?
Félix-Antoine Fortin:
We had to debug 504 error in Jupyter notebooks running from a Lustre HOME. Turns out we had to fix 4 different things to fix bad user experience.
Planning to write a blog post of sort about that and I am wondering if anyone had similar issues.
Fixes were :
moving IPython history database in memory
moving Jupyter server signature database in memory
moving Jupyter runtime directory to SLURM_TMPDIR (it’s no longer XDG_RUNTIME_DIR since 2019…)
disabling FileManagerMixin use_atomic_writing (https://jupyter-server.readthedocs.io/en/latest/api/jupyter_server.services.contents.html#jupyter_server.services.contents.fileio.FileManagerMixin.use_atomic_writing)
Used strace then went to pyspy for RCA
Discourse post encouraged by all
Rollin: If we have time I can tell you about misconfigured MTUs and bad optical connections that Jupyter found
I now ping every Jupyter notebook running every 30s to catch timeouts and 504 storms
DVS woes
GPFS woes
Reports and celebrations#
This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.
Name: Your report or celebration
Agenda items#
Let’s collect all potential agenda items here before the start of the meeting. We will then attempt to create a coherent agenda that fits in the 60m meeting slot. If there are similar items try and group them together.
All: Standing project items:
Batchspawner check-in, issues and PRs
jupyterhub/batchspawner#273 (bump minimum to python 3.7)
Review after PEARC
Wrapspawner check-in, issues and PRs
Mentioned in discourse thread
But no repo action this month
Discourse:
HPC-related projects topic <- Rollin (done)
Batchspawner, wrapspawner
jupyter-lmod
jupyterhub-announcement
jupyter-slurm-provisioner
Census of deployments, what versions are people running? <- Mike
Felix: Write up the 504 issue w/Lustre on Discourse?
Workshop/conference news:
TrustedCI Summit at LBL, week of October 23
October 24-26
Security subproject + Zeek
Security training
Jupyter + Zeek workshop
KubeCon NA Chicago Nov. 2023
November 6-9
HPC track
SC23
HPPSS at SC23
High Performance Python for Science at Scale
Magic Castle Tutorial
Others?