This document provides best practices for tuning Google Cloud resources for optimal performance of high performance computing (HPC) workloads.
Use the compute-optimized machine family
Use a compute-optimized machine family: H3, C2, or C2D. Virtual machine (VM) instances created with these machine types have a fixed virtual-to-physical core mapping and expose the NUMA cell architecture to the guest OS. Both features are critical for the performance of tightly-coupled HPC applications.
To reduce communication overhead between VM nodes, consolidate onto a smaller number of c2-standard-60 or c2d-standard-112 VMs (with the same total core count) instead of launching a larger number of smaller C2 or C2D VMs. Inter-node communication is often the largest bottleneck in MPI workloads, and larger VM shapes minimize this communication.
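To compare the available shapes and their core counts before consolidating, you can list the compute-optimized machine types with the gcloud CLI. A minimal sketch; the zone and filter below are placeholders:

    # List C2D machine types in one zone to compare vCPU counts
    gcloud compute machine-types list \
        --zones=us-central1-a \
        --filter="name~c2d-standard"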
Use compact placement policies
To reduce internode latency, use VM instance placement policies, which let you control the placement of VMs in Google Cloud data centers. We recommend compact placement policies because they place VMs physically close together within a single zone, which provides lower-latency communication.
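As a sketch, you can create a compact placement policy and attach it when you create a VM. The policy name, region, zone, and VM name below are placeholders; compact placement also requires the host maintenance policy to be set to TERMINATE:

    # Create a compact placement policy
    gcloud compute resource-policies create group-placement hpc-placement \
        --collocation=collocated \
        --region=us-central1

    # Apply the policy at VM creation time
    gcloud compute instances create my-hpc-vm \
        --zone=us-central1-a \
        --machine-type=c2-standard-60 \
        --resource-policies=hpc-placement \
        --maintenance-policy=TERMINATE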
Use the HPC VM image
Use the HPC VM image, which incorporates best practices for running HPC applications on Google Cloud. The image is based on Rocky Linux 8 and is available at no additional cost on Google Cloud.
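For example, you can confirm the most recent image in the HPC image family before creating VMs; this is a routine gcloud call shown here as a sketch, not an extra required step:

    # Show the latest image in the HPC VM image family
    gcloud compute images describe-from-family hpc-rocky-linux-8 \
        --project=cloud-hpc-image-public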
Disable automatic updates
Automatic updates can significantly and unpredictably degrade performance. To disable automatic updates, set the google_disable_automatic_updates metadata flag on VMs that use HPC VM image version v20240712 or later. Any VM image that has an HPC VM image as its base, such as a Slurm image, can also use this feature.
For example, this setting affects dnf automatic package updates on the following image families:

- HPC images, such as hpc-rocky-linux-8 (project cloud-hpc-image-public)
- Slurm images, such as slurm-gcp-6-6-hpc-rocky-linux-8 (project schedmd-slurm-public)
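As a sketch of setting the flag directly with the gcloud CLI (the value TRUE is an assumption; check the HPC VM image release notes for the accepted values):

    # Create a VM with automatic updates disabled via instance metadata
    gcloud compute instances create my-hpc-vm \
        --image-family=hpc-rocky-linux-8 \
        --image-project=cloud-hpc-image-public \
        --metadata=google_disable_automatic_updates=TRUE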
Cluster Toolkit provides a convenient setting on relevant modules to set this metadata flag for you: allow_automatic_updates: false. Here is an example using the vm-instance module:

    - id: workstation-rocky
      source: modules/compute/vm-instance
      use: [network]
      settings:
        allow_automatic_updates: false
Here is an example for a Slurm nodeset:

    - id: dynamic_nodeset
      source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset
      use: [network]
      settings:
        node_count_static: 1
        node_count_dynamic_max: 4
        allow_automatic_updates: false
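After adding these settings to a blueprint, you deploy it with the Cluster Toolkit CLI as usual. A sketch only; the binary name (gcluster in current releases, ghpc in older ones) and the blueprint and deployment names are placeholders that depend on your Toolkit version and setup:

    # Create a deployment folder from the blueprint, then deploy it
    ./gcluster create my-blueprint.yaml
    ./gcluster deploy my-deployment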
Adjust HPC VM image tunings
To get the best performance on Google Cloud, use the following image tunings.
You can use the following sample command to manually configure a VM to run HPC workloads. However, Cluster Toolkit automatically handles all of this tuning when you use a cluster blueprint.
To create the VM manually, use the Google Cloud CLI and provide the following settings.
    gcloud compute instances create VM_NAME \
        --image-family=hpc-rocky-linux-8 \
        --image-project=cloud-hpc-image-public \
        --machine-type=MACHINE_TYPE \
        --network-interface=nic-type=GVNIC \
        --metadata=google_mpi_tuning=--hpcthroughput \
        --threads-per-core=1
The preceding sample command applies the following tunings:

- Sets the Google Virtual NIC (gVNIC) network interface, which enables better communication performance and higher throughput: --network-interface=nic-type=GVNIC.
- Sets the network HPC throughput profile: --metadata=google_mpi_tuning=--hpcthroughput. If the VM already exists, run sudo google_mpi_tuning --hpcthroughput to update the network HPC throughput profile setting.
- Disables simultaneous multithreading (SMT) in the guest OS: --threads-per-core=1. If the VM already exists, run sudo google_mpi_tuning --nosmt to disable SMT.
- Turns off Meltdown and Spectre mitigations. The HPC VM image enables this setting by default. If the VM already exists, run sudo google_mpi_tuning --nomitigation to turn off the mitigations.
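As a quick sanity check (a sketch, not part of the documented procedure), you can verify from inside the guest that SMT is disabled:

    # Prints "Thread(s) per core: 1" when SMT is disabled
    lscpu | grep -i "thread(s) per core"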
Configure file system tuning
Each storage choice for tightly-coupled applications has its own cost, performance profile, APIs, and consistency semantics. The primary choices include the following:
Network File System (NFS) solutions, such as Filestore and Google Cloud NetApp Volumes. These solutions let you deploy shared storage options. Both Filestore and NetApp Volumes are fully managed by Google Cloud. Use them when your application does not have extreme I/O requirements to a single dataset. For performance limits, see the Filestore and NetApp Volumes documentation.
Google Cloud Managed Lustre is a fully managed POSIX-based parallel file system. This solution is commonly used by MPI applications.
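For example, a Filestore share mounts over standard NFS. A minimal sketch, assuming a Rocky Linux client; the server IP address, share name, and mount point below are placeholders:

    # Install the NFS client, then mount the Filestore share
    sudo dnf install -y nfs-utils
    sudo mkdir -p /mnt/shared
    sudo mount 10.0.0.2:/share /mnt/shared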
Use Intel MPI
For best performance, use Intel MPI.
For Ansys Fluent, use Intel MPI 2018.4.274. Set the Intel MPI version in Ansys Fluent by using the following command. Replace MPI_DIRECTORY with the path to the directory that contains your Intel MPI library.

    export INTELMPI_ROOT="MPI_DIRECTORY/compilers_and_libraries_2018.5.274/linux/mpi/intel64/"
Intel MPI collective algorithms can be tuned for optimal performance. The recommended collective algorithm settings for Ansys Fluent are -genv I_MPI_ADJUST_BCAST 8 -genv I_MPI_ADJUST_ALLREDUCE 10. For Simcenter STAR-CCM+, we also recommend that you use the TCP fabric providers by setting the following environment variables: I_MPI_FABRICS=shm:ofi and FI_PROVIDER=tcp.
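As an illustrative sketch only, the following shows how these settings might be passed to a generic Intel MPI launch. The application binary, hostfile, and rank count are hypothetical placeholders; real solvers such as Fluent or STAR-CCM+ apply these settings through their own launchers:

    # STAR-CCM+-style fabric settings, exported for the launch
    export I_MPI_FABRICS=shm:ofi
    export FI_PROVIDER=tcp
    # Fluent-recommended collective algorithm settings passed via -genv
    mpirun -n 120 -f hostfile \
        -genv I_MPI_ADJUST_BCAST 8 \
        -genv I_MPI_ADJUST_ALLREDUCE 10 \
        ./my_mpi_app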
Summary of best practices
The following is a summary of the recommended best practices for running HPC workloads on Google Cloud.
| Resource | Recommendation |
|---|---|
| Machine family | Use a compute-optimized machine family: H3, C2, or C2D |
| OS image | Use the HPC VM image (hpc-rocky-linux-8) |
| File system | Use one of the following: an NFS solution such as Filestore or Google Cloud NetApp Volumes; Google Cloud Managed Lustre |
| MPI | Use Intel MPI |