JupyterHub and BinderHub Team Meeting#
Date: Thursday 15th July 2021
Time: 5PM UTC
This HackMD: https://hackmd.io/@sgibson91/hubs-team-meeting
GitHub issue: https://github.com/jupyterhub/team-compass/issues/423
Calendar for future meetings: https://jupyterhub-team-compass.readthedocs.io/en/latest/meetings.html
By participating in this meeting, you are agreeing to abide by and uphold the Code of Conduct. Please take a second to read through it! :pray:#
Welcome to the Team Meeting#
If you are joining the team video meeting, sign in below so we know who was here. Roll call:
name / institution / GitHub handle
Sarah Gibson / 2i2c / @sgibson91
Min RK / Simula / @minrk
Simon Li / University of Dundee / @manics
Erik Sundell / 2i2c etc / @consideratio
Mridul Seth / GESIS / @MridulS
Samuel Gaist / Idiap / @sgaist
60 second updates on things you have been up to, questions you have, or developments you think people should know about. Please add yourself, and if you do not have an update to share, you can pass.
Sarah: At the end of this month, I will be co-hosting a session on mybinder.org for Digital Humanities researchers with some former Turing colleagues
*Sarah: I have been developing https://github.com/sgibson91/test-this-pr-action which is an action that will allow us to test mybinder.org deployments on staging for PRs from forks. I think it’s ready for a field test and will open a PR soon. Thanks to Min for the reviews and ideas!
Chris: Just want to draw attention to this issue asking if anybody wants to find or use funding for Binder. I think we have a good chance of getting funding because it is such an impactful project, I want to make sure that those on this team who are interested have an opportunity to join. (note I’m not attending this meeting though because I’m w/ family in the mountains!)
Min: DockerSpawner CI has mysteriously started failing. Having a hard time finding time to investigate https://github.com/jupyterhub/dockerspawner/issues/432
Sarah suggested a way to SSH into the github runner doing the job.
Reports and celebrations#
This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.
Let’s collect all potential agenda items here before the start of the meeting. We will then attempt to create a coherent agenda that fits in the 60m meeting slot. If there are similar items try and group them together.
Sarah: (10 mins): Outreachy
The next round of Outreachy is upcoming: https://www.outreachy.org/apply/project-selection/
We have outlined 2 interns in 2 rounds each of Outreachy (4 interns total) in the CZI proposal Chris and I worked on. We don’t have a concrete date of when they’ll be awarded, but we know the committee has been looking through them so we hope it will be announced soon.
Some really useful front-loading work would be to scope out some self-contained, well defined projects for interns to work on: https://github.com/jupyterhub/outreachy
This work is separate to the “who has capacity to mentor” question
Samuel Gaist: (10-15 mins): GitLab like forge support discussion. For references:
The use of BinderHub with private registries / forge system like GitLab is getting more popular however the current implementation does really scale well with this use case. For example, using the authentication from JupyterHub does not allow connecting to multiple different providers.
There’s also a need to get additional information from the user account like access token in order to be able to authenticate and push the images. For this use case, it will remove the current hard coded information in the various Providers.
Simon suggests, rather than add an extra layer of authentication handling to deal with such use cases on a case-by-case basis which could cause security issues, why not do some refactoring to make authentication more pluggable?
Beside these aspects, there was also a consensus that the current way to deploy BinderHub on Kubernetes is not ideal as for example it will clash with DaskHub as each of them would install their own JupyterHub instance.
Mridul formulated the wish to drop the helm chart from BinderHub and have it integrated with JupyterHub as he would no longer need to maintain the Persistent BinderHub chart.
Therefore, the long term plan would be to refactor the helm charts so that the hub versions would be integrated with the one from JupyterLab and then they can be deployed as optional additional systems allowing to simplify the process and maintenance of the charts.
Sarah thinks both of these suggestions, while having benefits, would be a lot of work
What can be done to gain consensus before taking action?
What frameworks or support can be put in place to facilitate such a large amount of work?
How can we help folk like Yuvi who are doing the “smallest PR that is useful on the road to the bigger vision” type of thinking?
Hackathons and “merge parties” to coordinate work and ensure work doesn’t go stale?
Depending on the tasks found, parts of them could potentially be Outreachy projects
Erik: (3 min): Codecoverage warnings when reviewing code is disrupting me.
If we would disable it, what is required?
Is doing that acceptable across the JupyterHub org’s repos?
Yuvi: (3min): Python Popularity contest (https://github.com/yuvipanda/python-popcontest)
Measure actual usage of installed packages in a privacy friendly way
Library (not module) usage over last 6h on a hub
Can be setup on any kinda JupyterHub or other kinda interactive python setup
Min: is there a way to detect ‘level’ of import from user (i.e. transitive dependencies)? Yuvi says yes in theory, not right now.
Yuvi: (3min): Authenticating one JupyterHub with another JupyterHub
One JupyterHub uses another ‘root’ JupyterHub to authenticate
Very helpful when you have a number of hubs that all are the same auth provider
Lets you automate hub deployment completely, since auth provisioning is often manual
Example at https://github.com/minrk/jupyterhub-yo-dawg
data8 berkeley is using this right now with a few hundred users
https://github.com/jupyterhub/jupyterhub/pull/3488 is very helpful from a UX perspective - users then never see the ‘root’ hub’s home page