An HPC blueprint is a YAML
file that defines a reusable configuration and
describes the specific HPC environment that you want to deploy using
Cloud HPC Toolkit.
To configure your environment, you can either start with one of the
Example HPC blueprints, which you can modify, or create your
own blueprint. To create your own blueprint, review the
Design an HPC blueprint section for
an overview of the configurations that you need to specify in your blueprint.

Before you deploy a cluster, be sure to review the quota requirements.
Design an HPC blueprint
An HPC blueprint consists of the following three main components:
HPC blueprint name. The name of the blueprint. When naming your HPC
blueprint, use the following conventions:
- If you are updating or modifying an existing configuration, don't change the
blueprint name.
- If you are creating a new configuration, specify a new unique blueprint name.
The blueprint name is added as a label to your cloud resources and is used
for tracking usage and monitoring costs.
The HPC blueprint name is set using the blueprint_name field.
Deployment variables. A set of parameters that are used by all
modules in the blueprint. Use these variables to set values that are
specific to a deployment.
Deployment variables are set using the vars field in the blueprint, but
you can override or set deployment variables at deployment time by
specifying the --vars flag with the ghpc command, for example:
ghpc create blueprint.yaml --vars project_id=my-project.
The most common deployment variables are as follows:
- deployment_name: the name of the deployment. The deployment_name is a
  required variable for a deployment. This variable must be set to a unique
  value any time you deploy a new copy of an existing blueprint. The
  deployment name is added as a label to cloud resources and is used for
  tracking usage and monitoring costs. Because a single HPC blueprint can be
  used for multiple deployments, you can use the blueprint_name to identify
  the type of HPC environment (for example, slurm-high-performance-cluster),
  while the deployment_name can identify the targeted use of that cluster
  (for example, research-dept-prod).
- project_id: the ID for the project where you want to deploy the cluster.
  The project_id is a required variable for a deployment.
- zone: the zone where you want to deploy the cluster.
- region: the region where you want to deploy the cluster.
Other variables that you might want to specify here include a custom image
family, or a Shared VPC network or subnetwork that you want all modules
to use.
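For example, here is a minimal sketch of the top of a blueprint that sets
these common variables; the project ID and deployment name shown are
hypothetical:

blueprint_name: slurm-high-performance-cluster
vars:
  project_id: my-project-id        # hypothetical project ID
  deployment_name: research-dept-prod
  region: us-central1
  zone: us-central1-a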
Deployment groups. Defines a distinct set of modules that are to be
deployed together. A deployment group can only contain modules of a single
type; for example, a deployment group can't mix Packer and Terraform modules.
Deployment groups are set using the deployment_groups field. Each deployment
group requires the following parameters:
- group: the name of the group.
- modules: the descriptors for each module. Each module descriptor includes
  the following fields:
  - id: a unique identifier for the module.
  - source: the directory path or URL where the module is located. For
    more information, see Module fields.
  - kind: the type of module. Valid values are packer and terraform.
    This parameter is optional and defaults to terraform if omitted.
  - use: a list of module IDs whose outputs can be linked to the module's
    settings. This parameter is optional.
  - outputs: if you are using Terraform modules, use this parameter to
    specify a list of Terraform output values that you want to make available
    at the deployment group level. During deployment, these output values are
    printed to the screen after you run the terraform apply command. After
    deployment, you can access these outputs by running the terraform output
    command. This parameter is optional.
  - settings: any module variables that you want to set. This parameter is
    optional.
For a list of supported modules, see Supported modules.
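For example, here is a minimal sketch of a deployment group that creates a
VPC network and a Filestore instance attached to it, using embedded Toolkit
modules; the module IDs are illustrative, and the outputs entry assumes the
filestore module exposes a network_storage output:

deployment_groups:
  - group: primary
    modules:
      - id: network1
        source: modules/network/vpc
      - id: homefs
        source: modules/file-system/filestore
        kind: terraform    # optional; terraform is the default
        use: [network1]    # links this module to the network's outputs
        settings:
          local_mount: /home
        outputs: [network_storage]   # assumed output name; exposed at the group level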
Terraform remote state configuration (optional). Most blueprints use
Terraform modules to provision cloud infrastructure. We recommend using
Terraform remote state backed by a Cloud Storage bucket that has
object versioning enabled.
All configuration settings of the Cloud Storage backend are supported. The
prefix setting determines the path within the bucket where state is stored.
If prefix is left unset, Cloud HPC Toolkit automatically generates a unique
value based on the blueprint_name, deployment_name, and deployment group
name. The following configuration enables remote state for all deployment
groups in a blueprint:
terraform_backend_defaults:
  type: gcs
  configuration:
    bucket: BUCKET_NAME
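If you want to control the state path yourself, you can also set prefix.
The following is a minimal sketch with hypothetical bucket and prefix values:

terraform_backend_defaults:
  type: gcs
  configuration:
    bucket: my-tf-state-bucket            # hypothetical bucket name
    prefix: hpc-slurm/research-dept-prod  # hypothetical path within the bucket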
For more information about advanced Terraform remote state configuration,
see the Cloud HPC Toolkit GitHub repository.
Example HPC blueprints
To get started, you can use one of the following example HPC blueprints.
- Example 1: Deploys a basic HPC cluster with Slurm
- Example 2: Deploys an HPC cluster with Slurm and a tiered filesystem
For a full list of example HPC blueprints, see the
Cloud HPC Toolkit GitHub repository.
Example 1
Deploys a basic autoscaling cluster with Slurm that uses default
settings. The blueprint also creates a new VPC network and a
Filestore instance mounted to /home.
Example 2
Deploys a cluster with Slurm that has tiered file systems for higher
performance. It connects to the default Virtual Private Cloud network of the
project and creates seven partitions and a login node.
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
---
blueprint_name: hpc-enterprise-slurm
vars:
project_id: ## Set GCP Project ID Here ##
deployment_name: hpc01
region: us-central1
zone: us-central1-a
gpu_zones: [us-central1-a, us-central1-b, us-central1-c, us-central1-f]
slurm_image:
# Visit https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/docs/images.md#published-image-family
# for a list of valid family options with Slurm
family: slurm-gcp-5-10-hpc-centos-7
project: schedmd-slurm-public
# If image above is changed to use custom image, then setting below must be set to true
instance_image_custom: false
# Set to true for active cluster reconfiguration.
# Note that setting this option requires additional dependencies to be installed locally.
# https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/modules/scheduler/schedmd-slurm-gcp-v5-controller#description
enable_reconfigure: true
# When set, active compute nodes will be cleaned up on destroy.
# Note that setting this option requires additional dependencies to be installed locally.
enable_cleanup_compute: true
# Recommended to use GCS backend for Terraform state
# See https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/examples#optional-setting-up-a-remote-terraform-state
#
# terraform_backend_defaults:
# type: gcs
# configuration:
# bucket: <<BUCKET_NAME>>
# Documentation for each of the modules used below can be found at
# https://github.com/GoogleCloudPlatform/hpc-toolkit/blob/main/modules/README.md
deployment_groups:
- group: primary
modules:
# Source is an embedded module, denoted by "modules/*" without ./, ../, /
# as a prefix. To refer to a local or community module, prefix with ./, ../ or /
# Example - ./modules/network/vpc
- id: network1
source: modules/network/pre-existing-vpc
- id: controller_sa
source: community/modules/project/service-account
settings:
name: controller
project_roles:
- compute.instanceAdmin.v1
- iam.serviceAccountUser
- logging.logWriter
- monitoring.metricWriter
- pubsub.admin
- storage.objectViewer
- id: login_sa
source: community/modules/project/service-account
settings:
name: login
project_roles:
- logging.logWriter
- monitoring.metricWriter
- storage.objectViewer
- id: compute_sa
source: community/modules/project/service-account
settings:
name: compute
project_roles:
- logging.logWriter
- monitoring.metricWriter
- storage.objectCreator
- id: homefs
source: modules/file-system/filestore
use: [network1]
settings:
local_mount: /home
- id: projectsfs
source: modules/file-system/filestore
use: [network1]
settings:
local_mount: /projects
# This file system has an associated license cost.
# https://console.developers.google.com/marketplace/product/ddnstorage/exascaler-cloud
- id: scratchfs
source: community/modules/file-system/DDN-EXAScaler
use: [network1]
settings:
local_mount: /scratch
- id: n2_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 4
machine_type: n2-standard-2
instance_image: $(vars.slurm_image)
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
- id: n2_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [n2_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: n2
exclusive: false # allows nodes to stay up after jobs are done
enable_placement: false # the default is: true
is_default: true
partition_conf:
SuspendTime: 300 # time (in secs) the nodes in this partition stay active after their tasks have completed
- id: c2_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 20
machine_type: c2-standard-60 # this is the default
instance_image: $(vars.slurm_image)
bandwidth_tier: tier_1_enabled
disk_type: pd-ssd
disk_size_gb: 100
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
# use `-p c2` to submit jobs to this partition:
# ex: `srun -p c2 -N 1 hostname`
- id: c2_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [c2_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: c2
# the following two are true by default
exclusive: true # this must be true if enable_placement is true
enable_placement: true
- id: c2d_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 20
machine_type: c2d-standard-112
instance_image: $(vars.slurm_image)
bandwidth_tier: tier_1_enabled
disk_type: pd-ssd
disk_size_gb: 100
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
- id: c2d_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [c2d_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: c2d
- id: c3_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 20
machine_type: c3-highcpu-176
instance_image: $(vars.slurm_image)
bandwidth_tier: tier_1_enabled
disk_type: pd-ssd
disk_size_gb: 100
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
- id: c3_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [c3_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: c3
- id: a2_8_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 16
machine_type: a2-ultragpu-8g
bandwidth_tier: gvnic_enabled
instance_image: $(vars.slurm_image)
disk_type: pd-ssd
disk_size_gb: 100
node_conf:
Sockets: 2
CoresPerSocket: 24
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
# use `-p a208` to submit jobs to this partition:
# ex: `srun -p a208 --gpus-per-node=8 -N 1 nvidia-smi`
- id: a2_8_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [a2_8_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: a208
# This makes this partition look for machines in any of the following zones
# https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/develop/community/modules/compute/schedmd-slurm-gcp-v5-partition#compute-vm-zone-policies
zones: $(vars.gpu_zones)
# The following allows users to use more host memory without specifying cpus on a job
partition_conf:
DefMemPerGPU: 160000
DefMemPerCPU: null
- id: a2_16_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 16
machine_type: a2-megagpu-16g
bandwidth_tier: gvnic_enabled
instance_image: $(vars.slurm_image)
disk_type: pd-ssd
disk_size_gb: 100
node_conf:
Sockets: 2
CoresPerSocket: 24
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
# use `-p a216` to submit jobs to this partition:
# ex: `srun -p a216 --gpus-per-node=16 -N 1 nvidia-smi`
- id: a2_16_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [a2_16_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: a216
# This makes this partition look for machines in any of the following zones
# https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/develop/community/modules/compute/schedmd-slurm-gcp-v5-partition#compute-vm-zone-policies
zones: $(vars.gpu_zones)
# The following allows users to use more host memory without specifying cpus on a job
partition_conf:
DefMemPerGPU: 160000
DefMemPerCPU: null
- id: h3_node_group
source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
settings:
node_count_dynamic_max: 16
machine_type: h3-standard-88
bandwidth_tier: gvnic_enabled # https://cloud.google.com/compute/docs/compute-optimized-machines#h3_network
instance_image: $(vars.slurm_image)
service_account:
email: $(compute_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
# H3 does not support pd-ssd and pd-standard
# https://cloud.google.com/compute/docs/compute-optimized-machines#h3_disks
disk_type: pd-balanced
disk_size_gb: 100
# use `-p h3` to submit jobs to this partition:
# ex: `srun -p h3 -N 1 hostname`
- id: h3_partition
source: community/modules/compute/schedmd-slurm-gcp-v5-partition
use: [h3_node_group, network1, homefs, projectsfs, scratchfs]
settings:
partition_name: h3
- id: slurm_controller
source: community/modules/scheduler/schedmd-slurm-gcp-v5-controller
use: [network1, homefs, projectsfs, scratchfs, n2_partition,
c2_partition, c2d_partition, c3_partition, a2_8_partition, a2_16_partition,
h3_partition]
settings:
instance_image: $(vars.slurm_image)
# the following allow for longer boot time
# which is useful for large GPU nodes
cloud_parameters:
no_comma_params: false
resume_rate: 0
resume_timeout: 600
suspend_rate: 0
suspend_timeout: 600
# we recommend disabling public IPs if possible
# but that requires your network to have a NAT or
# private access configured
disable_controller_public_ips: false
service_account:
email: $(controller_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
- id: slurm_login
source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
use:
- network1
- slurm_controller
settings:
instance_image: $(vars.slurm_image)
machine_type: n2-standard-4
disable_login_public_ips: false
service_account:
email: $(login_sa.service_account_email)
scopes:
- https://www.googleapis.com/auth/cloud-platform
- id: hpc_dashboard
source: modules/monitoring/dashboard
outputs: [instructions]
Request additional quotas
You might need to request additional quota to be able to deploy and use your
HPC cluster.
For example, by default the schedmd-slurm-gcp-v5-node-group module uses
c2-standard-60 VMs for compute nodes. The default quota for C2 vCPUs might be
as low as 8, which would prevent even a single c2-standard-60 node
(60 vCPUs) from being started.
The required quotas are based on your custom HPC configuration. Minimum
quotas for the provided example blueprints are documented on GitHub.
To view and increase quotas, see
Managing your quota using the Google Cloud CLI.
What's next