Deploy a Spark cluster + Jupyter notebook (sys-admin nomination required)

Prerequisites

The user has to be registered in the INFN-Cloud IAM system (https://iam.cloud.infn.it/login). Only registered users can log in to the INFN-Cloud dashboard (https://my.cloud.infn.it/login).

User responsibilities

Important

The solution described in this guide consists of the deployment of a Spark cluster on top of a Virtual Machine instantiated on the INFN-Cloud infrastructure. Instantiating a VM comes with the responsibility of maintaining it and all the services it hosts.

Please read the INFN Cloud AUP in order to understand the responsibilities you have in managing this service.

Spark cluster configuration

Note

If you belong to multiple projects, i.e. multiple IAM groups, after logging into the dashboard select, from the upper right corner, the one to be used for the deployment you intend to perform. Not all solutions are available for all projects. The resources used for the deployment will be accounted to the selected project and will impact its available quota. See the figure below.

Figure: project selection.

After selecting the project, choose the “Spark + Jupyter cluster” button from the list of available solutions.

Dashboard

The configuration menu is then shown. Parameters are split into two tabs: “Basic” and “Advanced” configuration.

Basic configuration

The default parameters are ready for the submission of a cluster composed of 1 master and 1 slave, each with 4 CPUs and 8 GB of RAM. By default, the provider where the cluster will be instantiated is automatically selected by the INFN-Cloud Orchestrator service.

The user must specify (see fig. 1):

  • a human-readable name for the deployment (max 50 characters)
  • a password that will be required to access the Kubernetes dashboard and the Grafana monitoring as the admin user
  • the number of slaves, and the RAM and CPU values for both the master and the slaves
  • optionally, an S3 storage endpoint and a list of its buckets to be mounted as persistent storage on the Jupyter notebook (a quick pre-check of the endpoint is sketched after fig. 1)
Figure 1: basic input data configuration.
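
Before submitting the deployment, it may be worth verifying that the S3 endpoint and bucket you intend to mount are reachable with your credentials. Below is a minimal sketch using boto3; the endpoint URL, bucket name, and keys are placeholders, not values specific to INFN-Cloud.

    # Minimal sketch (assumes boto3 is installed; all values are placeholders).
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example.org",      # hypothetical S3 endpoint
        aws_access_key_id="YOUR_ACCESS_KEY",
        aws_secret_access_key="YOUR_SECRET_KEY",
    )

    # List a few objects to confirm the bucket exists and is readable.
    resp = s3.list_objects_v2(Bucket="my-bucket", MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"])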

Advanced configuration

The user can select (see fig. 2):

  • the timeout for the deployment
  • the “no cluster deletion” option, which keeps the cluster in case of failure
  • automatic or manual scheduling, which selects the provider where the cluster will be created
  • whether to send a confirmation email when the deployment completes
Figure 2: advanced configuration tab.

Deployment result

To check the status of the deployment and its details, select the “deployments” button. Here all the user’s deployments are listed with “deployment uuid”, “status”, “creation time” and “provider” (see fig. 3).

Figure 3: list of user deployments.

For each deployment, the “Details” button allows the user:

  • to get the details of the deployment: overview info, input values, and output values such as the Kubernetes dashboard and Jupyter notebook endpoints (see fig. 4a; a quick reachability check is sketched after fig. 4b)
  • to edit the description of the deployment
  • to retrieve the deployment log file, which contains error messages in case of failure
  • to show the TOSCA template of the cluster
  • to request new ports to be opened
  • to retrieve the VM details (see fig. 4b for an example)
  • to delete the cluster
  • to lock the deployment (this hides the Delete action)
Figure 4a: deployment output values.

Figure 4b: VM details screen.
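
The endpoints reported among the output values can be checked quickly from any machine before opening them in a browser. Below is a minimal sketch using Python’s requests library; the URL is a placeholder for your actual jupyter_endpoint output value, and the self-signed-certificate handling is an assumption about the deployed setup.

    # Minimal sketch (assumes the `requests` package is installed; the URL is
    # a placeholder for the jupyter_endpoint reported in the output values).
    import requests

    endpoint = "https://<cluster-address>"  # hypothetical endpoint from fig. 4a

    # A 200 or a redirect to the login page indicates the service is up.
    # verify=False is only needed if the cluster uses a self-signed certificate.
    resp = requests.get(endpoint, verify=False, timeout=10)
    print(resp.status_code)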

Use Spark from Jupyter

Clicking on the jupyter_endpoint link, you will be asked to authenticate with IAM and to choose the size of your personal Jupyter server (see fig. 5).

Figure 5: Jupyter server options.

This will start a Jupyter notebook with your S3 bucket(s) mounted on the file system, as shown in fig. 6.

Figure 6: Jupyter instance dashboard.
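
From a notebook cell you can quickly confirm that the buckets are mounted. The mount path below is an assumption (it depends on the deployment configuration); use the path shown in your Jupyter file browser (fig. 6).

    # Minimal sketch: list the contents of a mounted bucket from a notebook cell.
    # The mount point below is hypothetical; adjust it to your instance.
    from pathlib import Path

    mount = Path("/home/jovyan/my-bucket")  # assumed mount point
    for entry in sorted(mount.iterdir()):
        print(entry.name)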

You can then upload your preferred notebook (or take one previously uploaded to your S3 bucket) and open it in Jupyter. Click on the star button (shown in fig. 7) to connect with the underlying cluster by creating the Spark Context and Session.

Figure 7: Jupyter notebook example.

In the Spark cluster connection box, you can specify the Spark configuration, as shown in fig. 8.

Figure 8: Spark cluster connection configuration box.
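
The options accepted in the box are standard Spark properties. As an illustration only (this is not the dialog’s own syntax), the same kind of settings expressed programmatically with PySpark would look like the sketch below; the values are examples, not recommendations.

    # Illustrative only: standard Spark properties of the kind one might enter
    # in the connection box, expressed via PySpark's SparkConf.
    from pyspark import SparkConf

    conf = (
        SparkConf()
        .set("spark.executor.memory", "2g")
        .set("spark.executor.cores", "2")
        .set("spark.executor.instances", "2")
    )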

After clicking the Connect button and waiting a few seconds, you’ll see the connection details as shown in fig. 9.

Figure 9: Spark connection details. Go back to the notebook and use sc and spark variables to execute Spark operations.
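
As a minimal, generic PySpark example of what can be run once the connection is established (not taken from this guide):

    # `sc` (SparkContext) and `spark` (SparkSession) are created for you by
    # the connection dialog described above.

    # RDD API via the SparkContext.
    rdd = sc.parallelize(range(1, 101))
    print(rdd.sum())  # 5050

    # DataFrame API via the SparkSession.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    df.show()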

Troubleshooting

In both automatic and manual scheduling, successful creation depends on the availability of resources at the provider; otherwise, “no quota” is reported as the failure reason.

Known issues: the Jupyter notebook takes time to start and may occasionally fail due to a timeout. In this case, go back to the control panel and restart the notebook.

Contact for support: cloud-support@infn.it

Resource availability less than requested for a Spark server

A user may request resources for a Spark server that are not available in the Kubernetes cluster. In this case, a warning message will be shown stating that there is insufficient CPU and/or memory. During this period, it is not possible to cancel the deployment from the JupyterHub UI.

Figure 10: resource warning while deploying a Spark server.

Jupyter returns a “Spawn failed” error after 600 seconds. After that, the user can redeploy the server.

Figure 11: Spark spawn failure after 600 seconds.