Cluster Toolkit uses a cluster blueprint
to define a deployment. For each blueprint, you can use the deployment_groups
setting to define a set of modules that can be used to customize your
deployment. These modules are used to specify options such as your compute
resources, networking, schedulers, monitoring applications, and file systems.
This document outlines the core supported module options for Cluster Toolkit v1.3.0 or later. For a complete list of supported modules, including the experimental options, see the modules page in the Cluster Toolkit GitHub repository documentation.
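To make the relationship between a blueprint, its deployment groups, and its modules concrete, the following is a minimal sketch of a blueprint. It is illustrative only: the blueprint name, deployment name, project ID, region, and zone are placeholders, and the module source paths are assumed from the layout of the Cluster Toolkit GitHub repository, so verify them against the modules page for your toolkit version.

```yaml
# Illustrative sketch, not a tested blueprint. All names and values are
# placeholders or assumptions; check the modules page before using them.
blueprint_name: example-cluster          # hypothetical name

vars:
  project_id: my-project-id              # placeholder: replace with your project ID
  deployment_name: example-deployment    # hypothetical name
  region: us-central1                    # example region
  zone: us-central1-a                    # example zone

deployment_groups:
- group: primary
  modules:
  # Networking module: creates a new VPC network.
  - id: network
    source: modules/network/vpc          # assumed module path

  # File system module: attaches to the network defined above.
  - id: homefs
    source: modules/file-system/filestore  # assumed module path
    use: [network]
    settings:
      local_mount: /home                 # assumed setting name
```

In this sketch, each entry under modules has an id, a source, and optionally a use list that connects it to modules defined earlier in the blueprint.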
Supported file systems
The following file systems are supported:
You can also specify that Cluster Toolkit integrates a pre-existing file system into your deployment.
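As a rough sketch, integrating a pre-existing NFS share might look like the following blueprint fragment. The module path and the setting names (server_ip, remote_mount, local_mount, fs_type) are assumptions based on the toolkit's repository, and the IP address and share name are placeholders, so confirm them on the modules page.

```yaml
# Illustrative fragment; module path and setting names are assumptions.
deployment_groups:
- group: primary
  modules:
  - id: existing-home                    # hypothetical module ID
    source: modules/file-system/pre-existing-network-storage  # assumed path
    settings:
      server_ip: 10.0.0.2                # placeholder: address of your file server
      remote_mount: nfsshare             # placeholder: exported share name
      local_mount: /home                 # where the share is mounted on the VMs
      fs_type: nfs                       # assumed value for an NFS share
```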
Cluster Toolkit also supports the use of Kubernetes Persistent Volumes and Persistent Volume Claims as a connector to cloud-based file systems.
Supported schedulers
The following schedulers are supported:
Supported Compute Engine resources
The following features are supported for Compute Engine resources:
VM creation: many of the core VM customization options are supported, including but not limited to the following:
- Machine type options: all machine types
- VM instance placement policy options: both compact and spread placement policies
- GPU integration: all GPU types
- Advanced networking: all options, including gVNIC support and higher-bandwidth Tier_1 networking
- Service account setup
- Disabling simultaneous multithreading (SMT)
- Startup scripts
- Spot VMs
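Several of the VM options listed above could be combined in a single module, sketched below. This is a hedged example rather than a reference: the module path and the setting names (name_prefix, instance_count, machine_type, threads_per_core, spot, bandwidth_tier) are assumptions that should be checked against the VM module's documentation for your toolkit version, and the values are examples only.

```yaml
# Illustrative fragment; setting names are assumptions to verify.
deployment_groups:
- group: primary
  modules:
  - id: network
    source: modules/network/vpc          # assumed module path

  - id: compute-vms                      # hypothetical module ID
    source: modules/compute/vm-instance  # assumed module path
    use: [network]
    settings:
      name_prefix: compute               # hypothetical prefix for VM names
      instance_count: 2                  # example value
      machine_type: c2-standard-60       # any machine type can be used
      threads_per_core: 1                # assumed setting: 1 disables SMT
      spot: true                         # assumed setting: request Spot VMs
      bandwidth_tier: gvnic_enabled      # assumed setting and value: enable gVNIC
```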
Supported monitoring options
The following tools are supported for collecting measurements of your service and of the Google Cloud resources that you use.
Supported networking resources
The following features are supported for Google Cloud's VPC resources:
- Create a new VPC network
- Integrate with existing VPC networks
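The second option, integrating with an existing VPC network, might be sketched as follows. The module path and the network_name and subnetwork_name settings are assumptions based on the toolkit's repository, and the network and subnetwork names are placeholders.

```yaml
# Illustrative fragment; module path and setting names are assumptions.
deployment_groups:
- group: primary
  modules:
  - id: network
    source: modules/network/pre-existing-vpc  # assumed module path
    settings:
      network_name: my-existing-network       # placeholder: your VPC network
      subnetwork_name: my-existing-subnet     # placeholder: your subnetwork
```

Other modules can then reference this module through use: [network], in the same way as they would reference a newly created VPC network.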
Supported software installation and system setup
Cluster Toolkit supports the following software installation and system setup use cases:
- Automation of application installations by using Spack
- Ansible installation
- GPU accelerated Chrome Remote Desktop
- Google Cloud Observability Ops Agent installation and setup
- Network File System (NFS) client installation and automatic mounting
- Custom image building automation with Packer
What's next
- Review how to prepare your cluster blueprint.
- To try a quickstart tutorial, see Deploy an HPC cluster with Slurm.