Phase 3.1: Initial setup

This phase covers a fast initial hub setup that can be fine-tuned later.
Definition of ready

The following information must be available to the engineer before this phase can start:
- Name of the hub
- Will Dask gateway be required?
- Splash image
- URL of the community’s webpage
- Funded by information (name and URL)
- Authentication mechanism
- List of admin users
Outputs

At the end of Phase 3.1, both 2i2c engineers and the other admin users can log in to the hub.

The following file assets should have been generated and included in the PR:
➕ config/clusters/<new-cluster-name>
├── common.values.yaml
├── <new-hub-name>.values.yaml
└── enc-<new-hub-name>.secret.values.yaml
And the following existing file should have been updated if the new hub was the first in a cluster:
~ .github/workflows
└── deploy-hubs.yaml
Tip
When reviewing initial hub setup PRs, make sure the files above are all present.
Initial setup runbook

All of the following steps must be completed before phase 3.1 can be considered done. Some steps reference other smaller, topic-specific runbooks; these are gathered together and listed in the order an engineer should carry them out.
1. Create the relevant values.yaml file(s) under the appropriate cluster directory

export CLUSTER_NAME=cluster-name
export HUB_NAME=hub-name
If you’re adding a hub to an existing cluster with hubs on it, then create only one file, to hold the specific hub configuration:

touch ./config/clusters/$CLUSTER_NAME/$HUB_NAME.values.yaml
If the cluster currently has no hubs but will have more than one (and chances are it will, as it is common 2i2c practice to always deploy a staging hub alongside a production one), then create two values.yaml files under the appropriate cluster directory: one to hold the common hubs configuration and one to hold the specific hub configuration.

Make sure you are in the root of the infrastructure repository and run:

touch ./config/clusters/$CLUSTER_NAME/common.values.yaml
touch ./config/clusters/$CLUSTER_NAME/$HUB_NAME.values.yaml
2. Run the deployer to generate a sample basic hub configuration

The easiest way to add new configuration is to use the deployer to generate an initial sample config. You will be asked to input all the information needed for the command to run successfully. Follow the on-screen instructions and fill in all the fields, using the information provided to you.
If you’re adding a hub to an existing cluster with hubs on it

Run the deployer command below to generate config for the specific hub configuration.

If this will be a regular JupyterHub, then run:

deployer generate hub-asset main-values-file

Then set up the relevant authentication provider with the relevant credentials. See Enable authentication for steps on how to achieve this.
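As a hedged illustration only (the Enable authentication guide is authoritative; the keys below are standard z2jh/OAuthenticator options, the domain is a placeholder, and real secrets belong in the encrypted enc-<hub-name>.secret.values.yaml), GitHub-based authentication in the hub’s values file might look like:

jupyterhub:
  hub:
    config:
      JupyterHub:
        authenticator_class: github   # z2jh shorthand for GitHubOAuthenticator
      GitHubOAuthenticator:
        oauth_callback_url: https://<hub-domain>/hub/oauth_callback   # placeholder domain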
If this will be a binderhub-ui style hub, then run:

deployer generate hub-asset binderhub-ui-values-file

and continue following the guide at Make binderhub-ui hub.
If the cluster currently has no hubs

Determine the address of the storage server that hubs on this cluster should use to connect to it.

AWS: Get the address of the EFS server via terraform and store it, as it will be required in a later step. Make sure you are in the right terraform directory, i.e. terraform/projects/aws, and in the right terraform workspace (check by running terraform workspace show), then run:

terraform output nfs_server_dns

Google Cloud: Get the address of the Google Filestore IP from the UI and store it, as it will be required in a later step.

Azure: Get the address of the Azure file share via terraform and store it, as it will be required in a later step:

terraform output azure_fileshare_url
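For example, on AWS the whole sequence might look like this sketch, run from the root of the infrastructure repository:

cd terraform/projects/aws
terraform workspace show          # confirm the workspace matches $CLUSTER_NAME
terraform output nfs_server_dns   # store this address for a later step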
Run the deployer command below to generate config for the common hubs configuration, passing the admin users one by one:
deployer generate hub-asset common-values-file --admin-users admin1 --admin-users admin2
Warning

If the admin users list is not passed as separate arguments and is instead left to be supplied via the prompt with all the other arguments, then the following error is raised no matter the value passed: Error: Value must be an iterable.
Tip

Each *.values.yaml file is a Helm chart configuration file (basehub or daskhub), and you can also configure their chart dependencies (jupyterhub, dask-gateway, etc.).

You can also look at the entries for similar hubs under the same cluster folder, copy/paste one of them, and make modifications as needed for this specific hub. For example, see the hubs configuration in the 2i2c Google Cloud cluster configuration directory.
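As an illustrative sketch only (the generated template is the source of truth, and the 2i2c-specific key below is an assumption based on typical basehub configuration), a hub-specific values file might override settings such as:

jupyterhub:
  custom:
    2i2c:
      add_staff_user_ids_to_admin_users: true   # assumed 2i2c basehub option
  singleuser:
    memory:
      guarantee: 1G   # standard z2jh resource keys
      limit: 2G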
3. Then reference these files in a new entry under the hubs key in the cluster’s cluster.yaml file

You can use the deployer generate hub-asset subcommand to generate the relevant entry to insert into the cluster.yaml file:

deployer generate hub-asset cluster-entry --cluster-name $CLUSTER_NAME --hub-name $HUB_NAME
Warning

Please pay attention to all the fields that have been auto-generated for you by this command, and change every one that doesn’t match the community’s requirements or was not rendered correctly, before copy-pasting it into the relevant files.
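For orientation, a generated hubs entry has roughly the following shape (the name, display name, domain, and values files below are hypothetical placeholders):

hubs:
  - name: staging
    display_name: "Example Community (staging)"   # placeholder
    domain: staging.example.2i2c.cloud            # placeholder
    helm_chart: basehub
    helm_chart_values_files:
      - common.values.yaml
      - staging.values.yaml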
Important

If you are deploying a binderhub-ui style hub, then make sure that in the cluster.yaml file the hub’s domain is entered instead of the binderhub’s, for testing purposes.

4. Enable dask-gateway
Use the info provided in the Dask gateway section of the new hub GitHub issue. If Dask gateway will be needed, then choose a basehub and follow the guide on how to enable dask-gateway on an existing hub (howto:features:daskhub).

5. Add the new cluster to CI/CD
Important

This step is only applicable if the hub is the first hub being deployed to a cluster.

To ensure the new cluster and its hubs are appropriately handled by our CI/CD system, add an entry for it in the following place:

The deploy-hubs.yaml GitHub workflow has a job named upgrade-support-and-staging that lists the clusters being automatically deployed by our CI/CD system. Add an entry for the new cluster there.
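The workflow’s exact layout may change over time, so copy an existing entry rather than the sketch below; it only illustrates the general idea of appending the new cluster name to that job’s list (structure assumed, not verbatim):

jobs:
  upgrade-support-and-staging:
    strategy:
      matrix:
        cluster_name:
          - existing-cluster   # entries already present
          - new-cluster        # the cluster being added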
6. Create a Pull Request with the new hub entry, and get a team member to review it.
7. Merge the PR once it’s approved

Once you merge the pull request, the GitHub Actions workflow will detect that a new entry has been added to the configuration file. It will then deploy a new JupyterHub with the configuration you’ve specified onto the corresponding cluster.
8. Monitor the action to make sure that it completes

If something goes wrong and the workflow does not finish, try deploying locally (see the sketch after the note below) to access the logs and understand what is going on. It may be necessary to make new changes to the hub’s configuration via a Pull Request, or to revert the original Pull Request if you cannot determine how to resolve the problem.
Attention
In order to protect sensitive tokens, our CI/CD pipeline will not print testing output to its logs. You will need to run the health check locally to inspect these logs.
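A minimal local sketch, assuming the deployer CLI is installed and you are authenticated against the cluster (verify the exact invocations with deployer --help):

deployer deploy $CLUSTER_NAME $HUB_NAME                 # deploy locally to see the full logs
deployer run-hub-health-check $CLUSTER_NAME $HUB_NAME   # run the health check and inspect its output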
9. Log in to the hub, and ensure that the hub works as expected from a user’s perspective.
10. Send a link to the hub’s Community Representative(s), so they can confirm that it works from their perspective as well.