Features available on the hubs

This document is a concise description of various features we can optionally enable on a given JupyterHub. Explicit instructions on how to do so should be provided in a linked how-to document.

GPUs

GPUs are heavily used in machine learning workflows, and we support provisioning GPUs for users on all major platforms.

See the associated how-to guide for more information on enabling this.

Cloud permissions

Users of our hubs often need to be granted specific cloud permissions so they can use features of the cloud provider they are on, without doing cloud-provider-specific setup themselves. This helps keep code as cloud-provider agnostic as possible, while also improving the security posture of our hubs.

GCP

‘Requester Pays’ access

By default, the organization hosting data on Google Cloud pays for both the storage and bandwidth costs of the data. However, Google Cloud also offers a Requester Pays option, where the bandwidth costs are paid by the organization requesting the data. This is very commonly used by organizations that provide big datasets on Google Cloud Storage, to sustainably share the costs of maintaining the data.

Requester Pays is a per-bucket setting, enabled or disabled on each bucket individually.
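As an illustration of what "the requester pays" means at the request level, here is a minimal sketch that builds a Cloud Storage JSON API download URL carrying the `userProject` query parameter, which names the project to be billed. The helper `requester_pays_url`, the bucket `example-public-data`, the object name, and the project `my-hub-project` are all hypothetical. A real request would also need valid credentials, which this sketch does not handle.

```python
from urllib.parse import quote, urlencode


def requester_pays_url(bucket: str, object_name: str, billing_project: str) -> str:
    """Build a JSON API download URL that bills `billing_project` (hypothetical helper)."""
    # The object name is a single path segment, so "/" must be percent-encoded too.
    encoded_object = quote(object_name, safe="")
    # `userProject` tells Cloud Storage which project pays for the request.
    query = urlencode({"alt": "media", "userProject": billing_project})
    return f"https://storage.googleapis.com/storage/v1/b/{bucket}/o/{encoded_object}?{query}"


# Placeholder names, for illustration only.
print(requester_pays_url("example-public-data", "era5/2020/01.nc", "my-hub-project"))
```

Client libraries usually expose the same idea as an option rather than a raw query parameter, so in practice users rarely build this URL by hand.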

Allow access to external Requester Pays buckets

If buckets outside the project have the Requester Pays flag, then we need to:

Enable Requester Pays flag on community buckets

The buckets that we set up for communities inside their projects can also have this flag enabled, which means that people outside the project are charged for their own usage.

‘Scratch’ buckets on object storage

Users often want one or more object storage buckets to store intermediate results, share big files with other users, or store raw data that should be accessible to everyone within the hub. We can create one or more buckets and provide all users on the hub equal access to these buckets, allowing users to create objects in them. A single bucket can also be designated as a scratch bucket, which will set a SCRATCH_BUCKET (and a deprecated PANGEO_SCRATCH) environment variable of the form <s3 or gcs>://<bucket-name>/<user-name>. This can be used by individual users to store objects temporarily for their own use, although there is nothing preventing other users from accessing these objects!
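A minimal sketch of how user code might pick up the scratch location, assuming only that the environment variables described above are set. The helper name `scratch_url` and the example bucket and user names are hypothetical. The optional `env` argument exists only so the function can be tried outside a hub.

```python
import os
from typing import Mapping, Optional


def scratch_url(filename: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the full object URL for `filename` under the user's scratch prefix."""
    env = os.environ if env is None else env
    # Prefer SCRATCH_BUCKET and fall back to the deprecated PANGEO_SCRATCH.
    prefix = env.get("SCRATCH_BUCKET") or env.get("PANGEO_SCRATCH")
    if not prefix:
        raise RuntimeError("No scratch bucket is configured on this hub")
    # Join with exactly one "/" between the prefix and the file name.
    return f"{prefix.rstrip('/')}/{filename.lstrip('/')}"


# Placeholder environment, for illustration only.
example_env = {"SCRATCH_BUCKET": "gcs://example-scratch/jovyan"}
print(scratch_url("results/run1.zarr", example_env))
```

Because the prefix already ends in the user name, code written this way keeps each user's objects apart by convention, even though the bucket itself does not enforce that separation.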

‘Persistent’ buckets on object storage

This is exactly the same as scratch bucket storage, but without a rule deleting contents after a set number of days. This is helpful for storing intermediate computational results that take a while to compute and are used consistently throughout the lifetime of a project. We set the environment variable PERSISTENT_BUCKET to the form <s3 or gcs>://<bucket-name>/<user-name> so users can store objects there.
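Some tools expect the bucket name and the key prefix as separate values rather than a single URL. The sketch below splits a value of the form described above into its parts using only the standard library. The names `BucketLocation` and `parse_bucket_url`, and the example bucket and user, are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class BucketLocation(NamedTuple):
    scheme: str       # storage scheme taken from the URL, e.g. "s3"
    bucket: str       # bucket name
    user_prefix: str  # per-user prefix inside the bucket


def parse_bucket_url(url: str) -> BucketLocation:
    """Split '<scheme>://<bucket-name>/<user-name>' into its components."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"Not a bucket URL: {url!r}")
    # The path carries the user prefix; drop the surrounding slashes.
    return BucketLocation(parts.scheme, parts.netloc, parts.path.strip("/"))


# On a hub, the value would come from os.environ["PERSISTENT_BUCKET"].
location = parse_bucket_url("s3://example-persistent/jovyan")
print(location.bucket, location.user_prefix)
```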