Cluster deployment overview

This document provides an overview of how to use Cloud HPC Toolkit to deploy an HPC cluster on Google Cloud.

Before you begin

  1. If you are using a Linux or macOS workstation to deploy your cluster, install dependencies.
  2. From either your workstation or Cloud Shell, configure your environment.
  3. Ensure that you have created an HPC blueprint or selected one from the Cloud HPC Toolkit examples. See Prepare an HPC blueprint.

Overview

To deploy a cluster, complete the following steps:

  1. Create the HPC deployment folder from the HPC blueprint file using the ghpc create command.
  2. Deploy the cluster from the HPC deployment folder using the ghpc deploy command.
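
The two steps above can be sketched as a small wrapper script. This is a minimal illustration, not part of the toolkit: the names run_step, deploy_cluster, and DRY_RUN are hypothetical, and PATH_TO_BLUEPRINT and DEPLOYMENT_FOLDER are placeholders. The wrapper assumes you already know the deployment folder name, which the ghpc create output reports. By default it only prints the commands.

```shell
# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: run the create step, then the deploy step.
# The deploy step runs only if the create step succeeds.
deploy_cluster() {
  blueprint="$1"
  folder="$2"
  run_step ./ghpc create "$blueprint" &&
    run_step ./ghpc deploy "$folder"
}

# Placeholders: substitute your blueprint path and deployment folder.
deploy_cluster PATH_TO_BLUEPRINT DEPLOYMENT_FOLDER
```

Setting DRY_RUN=0 before calling deploy_cluster would run the real commands, in which case ghpc deploy still prompts before applying changes.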

Create the HPC deployment folder

To create an HPC deployment folder, use the ghpc create command. Replace PATH_TO_BLUEPRINT with the location of your HPC blueprint file.

./ghpc create PATH_TO_BLUEPRINT

Set deployment variables at the command line

The example HPC blueprints included with the Cloud HPC Toolkit do not set the project ID. You must supply a valid project ID by using the --vars flag with ghpc create.

For example, if you are in the hpc-toolkit/ directory and you want to use the example hpc-slurm.yaml blueprint that is located in the hpc-toolkit/examples/ directory, run the following command:

./ghpc create examples/hpc-slurm.yaml \
    --vars project_id=PROJECT_ID
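
A format check on the project ID before calling ghpc create can catch typos early. The following is a minimal sketch, not part of the toolkit: the function names is_valid_project_id and print_create_cmd are hypothetical, and the pattern assumes the usual Google Cloud project ID rules (6 to 30 characters; lowercase letters, digits, and hyphens; starts with a letter; does not end with a hyphen). It does not verify that the project exists, and it prints the command rather than running it.

```shell
# Hypothetical helper (not part of ghpc): check the project ID format only.
is_valid_project_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Hypothetical helper: print the ghpc create command instead of running it.
print_create_cmd() {
  blueprint="$1"
  project_id="$2"
  if ! is_valid_project_id "$project_id"; then
    echo "invalid project ID: $project_id" >&2
    return 1
  fi
  printf './ghpc create %s --vars project_id=%s\n' "$blueprint" "$project_id"
}

# my-project-123 is a made-up project ID used for illustration.
print_create_cmd examples/hpc-slurm.yaml my-project-123
```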

Get help at the command line

For a full list of available flags, run ghpc or any of its subcommands with the --help flag.

./ghpc --help
./ghpc create --help

Deploy the cluster

To deploy the cluster, run the ghpc deploy command as shown in the output of the ./ghpc create command. For example:

  1. Run the ghpc deploy command to begin automatic deployment of your cluster:

    ./ghpc deploy hpc-slurm

  2. ghpc reports the proposed changes for your cluster. Optionally, review the proposed changes by typing d and pressing Enter. To deploy the cluster, accept the proposed changes by typing a and pressing Enter.

    Summary of proposed changes: Plan: 37 to add, 0 to change, 0 to destroy.
    (D)isplay full proposed changes,
    (A)pply proposed changes,
    (S)top and exit,
    (C)ontinue without applying
    Please select an option [d,a,s,c]:
    
  3. After you accept the changes, ghpc runs terraform apply automatically and displays progress as it runs. The run takes approximately 5 minutes. If the run is successful, the output is similar to the following:

    Apply complete! Resources: 37 added, 0 changed, 0 destroyed.
    

You are now ready to submit jobs to your HPC cluster.
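
If you capture the deployment output in a script, the success line shown in step 3 can be checked programmatically. The following is a minimal sketch under that assumption: parse_apply_summary is a hypothetical helper, and it relies on the summary line having exactly the wording shown above, which may differ between Terraform versions.

```shell
# Hypothetical helper: read deployment output on stdin and print the
# "added changed destroyed" counts from the first success line.
# Returns non-zero if no success line is found.
parse_apply_summary() {
  summary=$(sed -n 's/^.*Apply complete! Resources: \([0-9][0-9]*\) added, \([0-9][0-9]*\) changed, \([0-9][0-9]*\) destroyed\..*$/\1 \2 \3/p' | head -n 1)
  [ -n "$summary" ] && printf '%s\n' "$summary"
}

# Sample line copied from the expected output in step 3.
sample='Apply complete! Resources: 37 added, 0 changed, 0 destroyed.'
if counts=$(printf '%s\n' "$sample" | parse_apply_summary); then
  echo "deployment succeeded (added changed destroyed): $counts"
else
  echo "no success line found" >&2
fi
```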