JupyterHub and BinderHub Team Meeting - April

Date: 2020-04-16 at 8am UTC

Welcome to the Team Meeting

Hello!

If you are joining the team video meeting, sign in below so we know who was here. Roll call:

  • Sarah / Turing Institute / @sgibson91

  • Tim / Binder / @betatim

  • Simon / OME, University of Dundee / @manics

  • Georgiana / CIR / @GeorgianaElena

  • Erik / Sundell Open Source Consulting AB / @consideratio

  • Arnim / GESIS / @arnim

  • Richard Darst / Aalto University, Finland / @rkdarst

  • Kirstie Whitaker / Turing Institute / @KirstieJane

  • Hugo Bowne-Anderson / DataCamp / @hugobowne

Quick updates

60 second updates on things you have been up to, questions you have, or developments you think people should know about. Please add yourself, and if you do not have an update to share, you can pass.

  • admin: who has access to the team calendar and can update today’s meeting (and its future versions) to have the correct start time? Right now it is set to 7am UTC instead of 8am UTC

    • let’s ask Zach to move it and maybe add more people who have access

Reports and celebrations

This is a place to make announcements (without a need for discussion). This is also a great place to give shout-outs to contributors! We’ll read through these at the beginning of the meeting.

  • https://github.com/jupyterhub/zero-to-jupyterhub-k8s/releases/tag/0.9.0 is out!

    • 🎉

Agenda items

This is what we’ll cover in the meeting (we have about 60m in total). We’ll copy the proposed agenda topics above here just before the meeting.

  • Moving support questions to Discourse: team-compass#271 .github#1 (Simon +?, 10 min)

    • Georgiana’s done the work of creating and setting up standard templates and a support bot. Is everyone happy with adding a support label on all new support issues to automatically close them?

      • Example: https://github.com/jupyterhub/binderhub/issues/1087#issuecomment-614437442

    • What should we do with existing issues?

    • [name=Georgiana] Should we enable the bot for all the repos at once or should we first run a trial for a few repos? (At the moment, there should be 3 pending requests on the @jupyterhub account to enable the bot for jupyterhub, binderhub and tljh. Update: the bot is out for these 3 repos :tada:)

      • Tim thinks starting on a small number of repos is good, then we can experiment and maybe at the next call decide to roll it out to all repos?

    • Q: what are your thoughts on a PR template as well? For example https://github.com/alan-turing-institute/the-turing-way/blob/master/.github/PULL_REQUEST_TEMPLATE.md

    • Todo: open a PR on team-compass to document this process to help spread the word and get a consistent response from all maintainers

  • Increasing limits for “special” repositories [Tim]

    • https://github.com/jupyterhub/team-compass/issues/276

    • We increased the limit for the repo mentioned in the issue to 200 for the duration of a live stream

    • Do we need a bigger process around increasing the limit for a repository for a short amount of time? What response time should we aim for?

    • Does it matter who is asking for the increase? Paid vs free vs one off vs “every week”?

    • What is the absolute maximum we can give to people to keep the service available for all as a shared resource?

    • What do we ask for in return from the person who benefits from the increase?

    • Take away: for one off events with up to 200 concurrent users (for around 4h in length) we are happy to raise the limit. We are not happy with sustained usage above the limit of 100.

    • Take away: not so happy if the person teaching it or the conference is a commercial event.

    • What about creating a PR template that helps formalise the process a bit: a few bullet points with the above questions to fill in?

      • useful to organise applicants

      • useful for keeping track of our impact for funding/grants

    • Todo: spread the word on how to contact the team as a whole

      • tweet out a link to the “binder cheatsheet”

      • Sarah will make a bot that replies to tweets with this link

  • Funding/paying for mybinder.org [Tim]

    • in the context of https://github.com/jupyterhub/team-compass/issues/276 a discussion around funding for mybinder.org started

    • There is not a lot of structure to this discussion item. Maybe if we have some spare time at the end of the meeting we can chat about it.

    • [name=Simon] Would direct sponsorship from a conference imply a Service Level Agreement? Alternative could be to promote Sarah’s one-click Azure deployment?

    • [name=Erik] If we take payments, it must be as simple as possible to manage because we don’t have resources to manage this afaik.

    • Q: can people directly buy us credits on GKE/AWS/Azure?

    • Q: why do we need money? A: We are always looking for cloud credits.

    • One way to help us is to use your influence to make it easier for Binder to continue to get credits from those that give them to us (GKE, GESIS management)

    • Accepting money ourselves is a huge amount of work and {possibly,probably} not worth it.

    • Demonstrate the value we deliver!

      • maybe an exploration of our binder analytics archive

      • https://discourse.jupyter.org/t/a-datasette-of-mybinder-org-launches/3719

  • Removing repository images/analytics from mybinder

    • [Side diversion during previous item]

    • This has happened a few times: someone spawns a repo they shouldn’t have, the image gets created, and the repo appears in the gallery.

    • we can’t remove images from OVH/…

  • Helm 2->3 transition z2jh

    • [name=Simon] Helm 3 works well for new deployments of JupyterHub and BinderHub (I’ve been using it). It won’t upgrade from Helm 2 deployments:

      • https://helm.sh/blog/migrate-from-helm-v2-to-helm-v3/

      • https://helm.sh/docs/topics/v2_v3_migration/

    • [name=Erik] General notes on helm2 and helm3 upgrades:

      • Always use --cleanup-on-fail.

      • Don’t use --wait, or if you do, make sure not to abort!

      • Additional notes: https://github.com/consideRatio/consideratio.github.io/issues/6

    • [name=Erik] I’ve successfully upgraded my old Helm2 JupyterHub deployment, and now use Helm3. I used the helm2to3 plugin.

    • [name=Erik] Relevant understanding:

      • Both Helm2 and Helm3 store information about what they have deployed in Kubernetes, but with Helm3 this “release information” moves from configmaps in kube-system to secrets in the namespace where the Helm chart is installed.

      • The helm CLI also saves some information on the local computer; this is not complicated to handle during the upgrade process.
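
A minimal sketch of Erik's two flag recommendations for the Helm item above, written as a small Python helper that assembles a `helm upgrade` command line. The release name, chart reference, namespace and values file below are placeholder assumptions, not values from the meeting, and the helper name `helm_upgrade_command` is hypothetical. The command is only built and printed, not executed.

```python
import shlex


def helm_upgrade_command(release, chart, namespace, values_file, wait=False):
    """Build a `helm upgrade` argument list following the notes above.

    --cleanup-on-fail is always included. --wait is left out unless
    explicitly requested, because aborting a waiting upgrade is what
    the notes warn against.
    """
    cmd = [
        "helm", "upgrade", release, chart,
        "--install",
        "--namespace", namespace,
        "--values", values_file,
        "--cleanup-on-fail",
    ]
    if wait:
        # Only add this if you are prepared to let the upgrade run to completion.
        cmd.append("--wait")
    return cmd


# Placeholder values for illustration only (assumptions, not from the notes).
cmd = helm_upgrade_command(
    release="jhub",
    chart="jupyterhub/jupyterhub",
    namespace="jhub",
    values_file="config.yaml",
)
print(shlex.join(cmd))
```

To actually run the upgrade, the list could be passed to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell quoting problems.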

  • [name=Arnim] Exploratory Survey Paper on Projects using MyBinder idea https://github.com/jupyterhub/team-compass/issues/277

  • [name=Kenan] Work on a Helm chart to install persistent BinderHub (including MiniKube howto) https://github.com/gesiscss/persistent_binderhub … feedback welcome ;)

    • Tim: this is pretty cool!

    • Q: should we advertise this on twitter/discourse already?

    • Q: should we upstream this to the BinderHub chart?

      • let’s investigate what the best way is (one repo or a new repo in the jupyterhub org)

        • Tim: divide and conquer is nice because it splits up the work. I don’t know what the process is for adding the repo to the JupyterHub org but I’d be in favour

        • Erik: I’d like to see that anything in JupyterHub.org is sufficiently maintained and has public adoption

  • [name=Sarah] I’ve asked my colleague at the Turing who specialises in NLP to help me analyse the results of the mybinder.org user survey - particularly identifying themes from the free text answers

    • I’ve already had discussions with Chris Holdgraf around data privacy. The only data we collect besides the questions is the submission timestamp - I will do another sweep to check there are no identifying features before sharing with him (such as someone mentioning an institute). I’m very happy to hear any other comments/opinions/suggestions on the data privacy issues here.

    • https://github.com/jupyterhub/team-compass/issues/256

  • [name=Erik] I’d love additional comments on this discourse topic! What information from a JupyterHub / BinderHub deployment would you find useful? I’m trying to figure out what could be sensible to develop to provide more insights for funders, evangelists, administrators, and end users: https://discourse.jupyter.org/t/towards-jupyterhub-deployment-insights/3836

  • How did we like Google Hangouts for this call?

    • [name=Kirstie] I really don’t like it - I found it super hard to see people. BUT if we’re choosing hangouts because zoom has been terrible on their encryption and data management then I’m fine with using Hangouts instead. Same if there are people in the team who aren’t allowed to use zoom! Morals & inclusion over UX design, usability & thieving data if it comes to that :fist:

    • I miss the “grid view” even though there was a “tiled layout” that allowed me to see 4 people at the same time. Ah perhaps I could actually see everyone with a camera active? ohhhh good point!

      • even if everyone has their camera on you can only see four people

    • I also miss the grid view, sound and video quality was good

    • Worked quite well, except for grid thing. But maybe, like someone said above, we had only four cameras…

    • [name=Simon] Usable, but still not as nice as Zoom (updated: in case it’s not clear I’m happy with any solution that’s usable even if it’s not my first choice)

    • [name=Sarah] Agree with Kirstie, not my favourite but I’d definitely put people’s comfort and inclusion above usability.