Run JupyterHub on a single VM enabling Notebooks persistence (sys-admin nomination required)
Prerequisites
Make sure you are registered with the IAM system for INFN-CLOUD (https://iam.cloud.infn.it), as described in the Getting Started guide. Only registered users can log in to the INFN-CLOUD dashboard (https://my.cloud.infn.it).
Access to the INFN-CLOUD dashboard enables users to instantiate a JupyterHub service on a single VM, providing Notebooks with data persistence.
Important
This solution requires the instantiation of a JupyterHub service on top of a newly created virtual machine (VM). As the service administrator, you will have complete control, including administration rights, over both the service and the VM.
Please read the INFN Cloud AUP in order to understand the responsibilities you have in managing this service.
How to deploy and access the JupyterHub service
Step 1 - Connecting and authenticating to the INFN-CLOUD dashboard
Connect to the INFN-CLOUD dashboard (https://my.cloud.infn.it/) and authenticate with the credentials of your IAM account (https://iam.cloud.infn.it/login).
Step 2 - Select and Configure the JupyterHub service
First of all, select the project, among those you belong to, in which your application should be deployed.
After logging into the dashboard, select the “Jupyter with persistence for Notebooks” card in the service catalog and click on the Configure button.
Next, configure your deployment. The deployment definition window consists of three tabs: “General”, “Authorizations” and “Advanced”. Before continuing, fill in the first mandatory field, the “Deployment description”. You will not be able to submit your deployment without it!
“General” TAB
In this tab, all fields have default values but can be changed if desired. You can fill the following fields:
num_cpus
- Number of virtual CPUs for the VM that will host the Jupyter service. The default value is 2.
mem_size
- Amount of memory for the VM in GB. The default value is 4.
enable_monitoring
- Enables monitoring. It is disabled by default.
jupyter_images
- Default value: “harbor.cloud.infn.it/datacloud-templates/snj-base-lab-persistence”. If you want to build and use your own JupyterHub images, you can follow the dedicated guide.
jupyterlab_collaborative
- Enables the collaborative editing feature, which allows multiple users to collaborate in real time. It is disabled by default. See the JupyterLab documentation for more information.
jupyterlab_collaborative_image
- “harbor.cloud.infn.it/datacloud-templates/snj-base-labc” is the default image for JupyterLab collaborative feature.
ports
- List of additional ports to be opened. You do not need to specify the following TCP ports, which are accessible by default: 22 (ssh to the host VM), 3000 (grafana dashboard), 8888 (JupyterHub), 8889 (Jupyter collaborative).
Note
Please be aware that this solution is only available for the Ubuntu 20.04 operating system.
“Authorization” TAB
You can decide to authorize INFN Cloud user groups by filling:
- iam_groups
- user groups that are allowed to access JupyterHub services.
- iam_admin_groups
- user groups that are allowed to administer JupyterHub.
Note
INFN Cloud (https://iam.cloud.infn.it) is the IAM identity provider.
“Advanced” TAB
Advanced parameters can be configured here:
Configure Scheduling
- Automatic (Default)
- The system will choose the most suitable cloud provider for the deployment
- Manual
- A resource provider can be selected from the list of available cloud sites
The following extra parameters can be set as well:
- Deployment creation timeout (minutes)
- If specified the deployment will fail when the timeout is reached
- Do not delete the deployment in case of failure
- Send a confirmation email when complete
Step 3 - Submitting the deployment
Once all the parameters have been set, you can click on the “Continue” button. After that an overview of the deployment will be shown.
Now you can submit your application and you will be redirected to the list of your deployments from where you can follow the evolution of the new deployment.
Step 4 - Access your application
On successful completion (“CREATE_COMPLETE”),
- an e-mail is sent to notify you of the status of the deployment (completed or failed);
- you can check your deployment outputs by clicking on the “Details” button and then on the “Output values” tab.
Use the reported IP address to connect to the services you deployed.
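As a quick check before opening a browser or an ssh session, the default ports listed in the “General” tab can be probed from your workstation. The following is a minimal sketch, not part of the deployment: the helper names `list_endpoints` and `probe_endpoints` are hypothetical, `192.0.2.10` is a placeholder address, and the script assumes bash and the GNU `timeout` command are available locally.

```shell
#!/usr/bin/env bash
# Sketch: list the default TCP endpoints of a deployment and probe them.
# list_endpoints and probe_endpoints are hypothetical helper names.

# Print "port<TAB>service" for the default ports listed in the "General" tab.
list_endpoints() {
    printf '%s\t%s\n' \
        22   "ssh to the host VM" \
        3000 "grafana dashboard" \
        8888 "JupyterHub" \
        8889 "jupyter collaborative"
}

# Try to open a TCP connection to each default port of the given IP address.
probe_endpoints() {
    local ip="$1" port service
    while IFS=$'\t' read -r port service; do
        # /dev/tcp is a bash feature; timeout limits each attempt to 3 seconds.
        if timeout 3 bash -c "exec 3<>/dev/tcp/${ip}/${port}" 2>/dev/null; then
            echo "${ip}:${port} (${service}): reachable"
        else
            echo "${ip}:${port} (${service}): not reachable"
        fi
    done < <(list_endpoints)
}

# Example, replacing the placeholder with the IP from the "Output values" tab:
# probe_endpoints 192.0.2.10
```

A port can be open in the firewall and still be reported as not reachable when no service is listening on it, so treat the result as a hint rather than a diagnosis.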
How to change the authorized IAM group
If you deployed an instance of JupyterHub with persistence of Notebooks and want to change the IAM group that users must belong to in order to be granted access, update the file /usr/local/share/dodasts/jupyterhub/compose.yaml. Here is an example of its content:
    version: "3.9"
    services:
      jupyterhub:
        depends_on:
          - http_proxy
        [...]
        environment:
          - [...]
          - OAUTH_GROUPS=users/example admins/example
          - ADMIN_OAUTH_GROUPS=admins/example
          - [...]
In the example, the OAUTH_GROUPS environment variable defines the IAM groups whose members are granted user-role access to the JupyterHub instance, while the ADMIN_OAUTH_GROUPS environment variable defines the IAM group of users with admin-role access. Multiple groups can be listed, separated by a space character.
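The edit described above can also be scripted. The following is a minimal sketch, assuming the entries appear in the `- NAME=value` list form shown in the example and that GNU sed is available on the VM. The function name `set_compose_env` and the group name `users/new-group` are hypothetical. The match is anchored on the list dash so that changing `OAUTH_GROUPS` does not also rewrite `ADMIN_OAUTH_GROUPS`, and a backup copy is kept next to the file.

```shell
#!/usr/bin/env bash
# Sketch: replace the value of one "- NAME=value" entry in a compose file.
# set_compose_env is a hypothetical helper, not part of the deployment.
set_compose_env() {
    local file="$1" name="$2" value="$3"
    # Keep a backup copy (<file>.bak) before touching the file.
    cp -p -- "$file" "$file.bak"
    # Match "<indent>- NAME=..." only, so OAUTH_GROUPS never matches ADMIN_OAUTH_GROUPS.
    # "|" is the sed delimiter because group names contain "/".
    sed -i "s|^\([[:space:]]*-[[:space:]]*\)${name}=.*\$|\1${name}=${value}|" "$file"
}

# Example (run as a user with write access to the file):
# set_compose_env /usr/local/share/dodasts/jupyterhub/compose.yaml \
#     OAUTH_GROUPS "users/new-group admins/example"
```

Values containing `|`, `&` or `\` would need escaping before being passed to sed; plain group names like those in the example do not.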
To make the change effective, restart the service:
    cd /usr/local/share/dodasts/jupyterhub/
    docker-compose down || docker compose down
    docker-compose up -d || docker compose up -d
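The `||` fallback above tries the second form whenever the first one fails, for any reason. As an alternative, the sketch below first detects which Compose command is installed and then uses it consistently, ending with a status listing. The helper names `compose_cmd` and `restart_jupyterhub` are hypothetical; only the directory path is taken from this guide.

```shell
#!/usr/bin/env bash
# Sketch: pick whichever Compose command is installed, then restart the stack.
# compose_cmd and restart_jupyterhub are hypothetical helper names.

# Print the available Compose command, or return 1 if none is found.
compose_cmd() {
    if command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
    elif docker compose version >/dev/null 2>&1; then
        echo "docker compose"
    else
        return 1
    fi
}

restart_jupyterhub() {
    local cmd
    cmd="$(compose_cmd)" || { echo "no Compose command found" >&2; return 1; }
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd /usr/local/share/dodasts/jupyterhub/ || exit 1
        # $cmd is intentionally unquoted so "docker compose" splits into two words.
        $cmd down && $cmd up -d && $cmd ps
    )
}

# Example:
# restart_jupyterhub
```

The final `ps` call lists the containers of the stack, which gives a quick indication of whether the services came back up.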