Introduction
Tightly coupled high performance computing (HPC) workloads often use the Message Passing Interface (MPI) to communicate between processes and virtual machine (VM) instances. But building your own VM image that is tuned for optimal MPI performance requires systems expertise, Google Cloud knowledge, and extra time for maintenance. To quickly create VM instances for your HPC workloads, you can use the HPC VM image.
The HPC VM image is a CentOS 7.9 based VM image that is optimized for tightly coupled HPC workloads. It includes pre-configured kernel and network tuning parameters required to create VM instances that achieve optimal MPI performance on Google Cloud.
You can create an HPC-ready VM by using the following options:
- Google Cloud CLI
- Google Cloud console. In the console, the image is available through Cloud Marketplace.
- SchedMD's Slurm workload manager, which uses the HPC VM image by default. For more information, see Creating Intel Select Solution verified clusters.
- Omnibond CloudyCluster, which uses the HPC VM image by default.
Benefits
The HPC VM image provides the following benefits:
- VMs ready for HPC out-of-the-box. There is no need to manually tune performance, manage VM reboots, or stay up to date with the latest Google Cloud updates for tightly coupled HPC workloads.
- Networking optimizations for tightly coupled workloads. Optimizations that reduce latency for small messages are included, which benefits applications that depend heavily on point-to-point and collective communications.
- Compute optimizations for HPC workloads. Optimizations that reduce system jitter are included, which makes single-node high performance more predictable.
- Consistent, reproducible performance. VM image standardization gives you consistent, reproducible application-level performance.
- Improved application compatibility. Alignment with the node-level requirements of the Intel HPC platform specification enables a high degree of interoperability between systems.
Features
Intel MPI collective tunings
The HPC VM image includes Intel MPI collective tunings performed on c2-standard-60 and c2d-standard-112 instances using compact placement policies.
Pre-installed RPMs
The HPC VM image comes with the following RPM packages pre-installed:
Lmod, dkms, htop, hwloc, hwloc-devel, kernel-devel, ltrace, libXt, nfs-utils, numactl, numactl-devel, papi, pciutils, pdsh, perf, redhat-lsb-core, redhat-lsb-cxx, rsh, screen, strace, wget, zsh, and the "Development Tools" package group.
Quickstarts
Before you begin
- To use the Google Cloud CLI for this quickstart, first install and initialize it.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Create an HPC VM instance
Create the VM
Console
In the Google Cloud console, go to the HPC VM Cloud Marketplace page.
Click Launch.
On the HPC VM deployment page, enter a Deployment name. This name becomes the root of your VM name. Compute Engine appends -vm to this name when naming your instance.
Choose a Zone and Machine type. For this quickstart, you can leave all settings as they are or change them. We strongly recommend choosing a compute-optimized machine type, such as C2 or C2D. To learn why, read Use compute-optimized instances.
Leave the Boot disk type, Boot disk size, and Network interface at their default settings.
Click Deploy.
After the VM instance creation completes, the Cloud Deployment Manager opens, where you can manage your HPC VM and other deployments.
gcloud
Create an HPC VM by using the instances create command.
We strongly recommend that you create HPC VMs using compact placement policies to achieve low network latency.
gcloud compute instances create VM_NAME \
    --zone=ZONE \
    --image-family=hpc-centos-7 \
    --image-project=cloud-hpc-image-public \
    --maintenance-policy=TERMINATE \
    --machine-type=MACHINE_TYPE
Replace the following:
- VM_NAME: name of the HPC VM to create.
- ZONE: zone in which to create the VM.
- MACHINE_TYPE: machine type for the new VM. We strongly recommend choosing a C2 or a C2D machine type, such as c2-standard-60 or c2d-standard-112. To learn why, read Use compute-optimized instances.
After some time, the VM instance creation completes. To verify the VM and to see its status, run the following command:
gcloud compute instances describe VM_NAME
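If you script VM creation, you can automate the verification step. The following sketch polls the describe command until the VM reports RUNNING. It assumes the gcloud CLI is installed and authenticated. The function name wait_for_running and its default attempt count and delay are illustrative choices, not part of the gcloud CLI.

```shell
#!/bin/sh
# Poll a VM until it reports RUNNING, or give up after a number of attempts.
# Arguments: VM name, zone, optional max attempts (default 30),
# optional delay in seconds between attempts (default 10).
# The defaults are arbitrary values chosen for this sketch.
wait_for_running() {
    vm_name=$1
    zone=$2
    max_attempts=${3:-30}
    delay=${4:-10}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # value(status) prints only the status field, for example RUNNING.
        status=$(gcloud compute instances describe "$vm_name" \
            --zone="$zone" --format='value(status)')
        if [ "$status" = "RUNNING" ]; then
            echo "$vm_name is RUNNING"
            return 0
        fi
        echo "attempt $attempt: status is ${status:-unknown}" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "$vm_name did not reach RUNNING after $max_attempts attempts" >&2
    return 1
}
```

After defining the function, call it as wait_for_running VM_NAME ZONE.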
Access the VM
Console
After you create your HPC VM instance, it starts automatically. To access it, do the following:
In the Google Cloud console, go to the VM instances page.
Click the name of your VM instance.
In the Remote Access section, click the first drop-down list and choose how you want to access the instance.
Compute Engine propagates your SSH keys and creates your user. For more information, see Connecting to Linux VMs.
gcloud
After you create your HPC VM instance, it starts automatically. To access it using SSH, use the compute ssh command:
gcloud compute ssh VM_NAME
Compute Engine propagates your SSH keys and creates your user. For more information, see Connecting to instances.
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this quickstart, delete the HPC VM instance that you created.
Console
In the Google Cloud console, go to the Deployments page.
Select the checkbox next to the HPC VM deployment.
Click Delete.
gcloud
Use the instances delete command:
gcloud compute instances delete VM_NAME
Create HPC VMs with compact placement policies
You can reduce the latency between VMs by creating a compact placement policy. A compact placement policy ensures that VMs in the same zone are located close to each other.
Create a compact placement policy for the desired number of VMs by using the resource-policies create group-placement command:
gcloud compute resource-policies create group-placement \
    PLACEMENT_POLICY_NAME --collocation=COLLOCATED \
    --vm-count=NUMBER_OF_VMS
Replace the following:
- PLACEMENT_POLICY_NAME: name for the placement policy.
- NUMBER_OF_VMS: number of VMs to create under the compact placement policy.
Create HPC VMs and specify the placement policy by using the instances create command.
gcloud compute instances create VM_1_NAME \
    VM_2_NAME \
    --zone=ZONE \
    --resource-policies=PLACEMENT_POLICY_NAME \
    --maintenance-policy=TERMINATE --no-restart-on-failure
Replace the following:
- VM_1_NAME, VM_2_NAME: names for your VMs. You must list exactly NUMBER_OF_VMS names to create the VMs under the same placement policy.
- ZONE: zone of the VM.
- PLACEMENT_POLICY_NAME: name for the placement policy.
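Because the number of VM names must match the policy's VM count exactly, it can help to generate both from one number. The following sketch prints the two commands without running them, so you can review them first. The function names and the PREFIX-N naming scheme are hypothetical; the flags are the ones used in the preceding commands.

```shell
#!/bin/sh
# Print COUNT VM names of the form PREFIX-1 ... PREFIX-COUNT on one line.
# The naming scheme is an assumption made for this sketch.
vm_names() {
    prefix=$1
    count=$2
    names=""
    i=1
    while [ "$i" -le "$count" ]; do
        # Add a separating space only when names is already non-empty.
        names="${names}${names:+ }${prefix}-${i}"
        i=$((i + 1))
    done
    echo "$names"
}

# Print (without running) the gcloud commands for a policy and its VMs.
# Arguments: policy name, zone, VM name prefix, number of VMs.
print_placement_commands() {
    policy=$1
    zone=$2
    prefix=$3
    count=$4
    echo "gcloud compute resource-policies create group-placement $policy --collocation=COLLOCATED --vm-count=$count"
    echo "gcloud compute instances create $(vm_names "$prefix" "$count") --zone=$zone --resource-policies=$policy --maintenance-policy=TERMINATE --no-restart-on-failure"
}
```

For example, print_placement_commands my-policy ZONE hpc 4 prints one policy command with --vm-count=4 and one create command that lists hpc-1 through hpc-4.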
In some cases, you might not have control over the VM creation process. In such cases, you can create the placement policy and then apply it to an existing instance.
Configure your HPC VM according to best practices
To get better and more predictable performance for your HPC VM, we recommend that you use the following best practices.
Disable simultaneous multithreading
The HPC VM image enables simultaneous multithreading (SMT), also known as Hyper-Threading on Intel processors, by default. Disabling SMT can make your performance more predictable and can decrease job times. For more information, see the best practice on disabling SMT.
You can use the following methods to disable SMT:
- To disable SMT while creating a new HPC VM, follow the steps to create an HPC VM and include the flag --threads-per-core=1.
- To disable SMT on an existing HPC VM, connect to the VM and run the following command from the VM:
sudo google_mpi_tuning --nosmt
For more information, see Configuring SMT.
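To confirm the result from inside the VM, you can inspect the CPU topology. The following sketch parses the "Thread(s) per core" line that lscpu prints. It assumes that lscpu is available on the VM and that it reports one thread per core after SMT is disabled. The function names are illustrative.

```shell
#!/bin/sh
# Read lscpu output on stdin and print the "Thread(s) per core" value.
threads_per_core() {
    awk -F: '/^Thread\(s\) per core/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Report whether SMT looks enabled, based on lscpu output on stdin.
smt_state() {
    tpc=$(threads_per_core)
    case "$tpc" in
        1) echo "SMT disabled (1 thread per core)" ;;
        "") echo "could not find Thread(s) per core in input" ;;
        *) echo "SMT enabled ($tpc threads per core)" ;;
    esac
}
```

On the VM, run lscpu | smt_state after defining the functions.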
Use gVNIC as the virtual network interface
The HPC VM image supports both Virtio-net and Google Virtual NIC (gVNIC) as virtual network interfaces. Using gVNIC instead of Virtio-net can improve the scalability of MPI applications by providing better communication performance and higher throughput. Additionally, gVNIC is a prerequisite for advanced networking, which provides higher bandwidth and allows for higher throughput.
When you create a new VM, Virtio-net is used as the virtual network interface by default. To use gVNIC, follow the steps to create an HPC VM and include the --network-interface=nic-type=GVNIC flag. The HPC VM image includes the gVNIC driver as a Dynamic Kernel Module Support (DKMS) module. For more information, see Using Google Virtual NIC.
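To check which interface type a running VM uses, you can look at the kernel driver bound to its network interface. The following sketch parses the output of ethtool -i. It assumes that ethtool is installed, that the interface is named eth0, and that the gVNIC and Virtio-net drivers report themselves as gve and virtio_net. Verify these assumptions on your VM.

```shell
#!/bin/sh
# Read `ethtool -i` output on stdin and print the kernel driver name.
nic_driver() {
    awk -F': *' '$1 == "driver" { print $2 }'
}

# Map a driver name to the virtual NIC type it implies.
# The driver names are assumptions stated in the text above.
nic_type() {
    case "$1" in
        gve) echo "gVNIC" ;;
        virtio_net) echo "Virtio-net" ;;
        *) echo "unknown" ;;
    esac
}
```

On the VM, nic_type "$(ethtool -i eth0 | nic_driver)" prints the detected interface type.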
Turn off Meltdown and Spectre mitigations
The HPC VM image enables the Meltdown and Spectre mitigations by default. In some cases, these mitigations might result in workload-specific performance degradation. To disable these mitigations and incur the associated security risks, do the following:
Run the following command on your HPC VM:
sudo google_mpi_tuning --nomitigation
Reboot the VM.
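After the reboot, you can check what the kernel reports for each vulnerability. The following sketch lists the entries under /sys/devices/system/cpu/vulnerabilities, assuming that the image's kernel exposes this sysfs directory. The function name is illustrative.

```shell
#!/bin/sh
# Print "<vulnerability>: <kernel-reported status>" for each sysfs entry.
# The directory defaults to the usual sysfs location; pass another
# directory as the first argument to inspect a saved copy.
mitigation_status() {
    dir=${1:-/sys/devices/system/cpu/vulnerabilities}
    for f in "$dir"/*; do
        # Skip anything that is not a readable file (including an unmatched glob).
        [ -f "$f" ] || continue
        echo "$(basename "$f"): $(cat "$f")"
    done
}
```

Running mitigation_status before and after the change lets you compare the reported states.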
Improve network performance
To improve the network performance of your VM, set up one or more of the following configurations:
- Configure a higher bandwidth. To configure Tier 1 bandwidth, use the gcloud beta compute instances create command to create the VM and specify the --network-performance-configs flag. For more information, see Creating a VM with high-bandwidth configuration.
- Increase the TCP memory limits. Higher bandwidth requires larger TCP memory. Follow the steps to increase tcp_*mem settings.
- Use the network-latency profile. Evaluate your application's latency and enable busy polling that reduces latency in the network receive path. For more information, see Use the network-latency profile.
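Before you change the TCP memory limits, it can be useful to record the current values. The following sketch reads the maximum receive and send buffer sizes from /proc/sys/net/ipv4, where tcp_rmem and tcp_wmem each hold a "min default max" triple in bytes. The function names are illustrative, and the sketch does not suggest target values.

```shell
#!/bin/sh
# Print the third field (max) of a "min default max" triple read from stdin.
tcp_mem_max() {
    awk '{ print $3 }'
}

# Print the current maximum TCP receive and send buffer sizes in bytes.
# The directory argument is optional and defaults to /proc/sys/net/ipv4.
show_tcp_buffer_max() {
    proc_dir=${1:-/proc/sys/net/ipv4}
    echo "tcp_rmem max: $(tcp_mem_max < "$proc_dir/tcp_rmem")"
    echo "tcp_wmem max: $(tcp_mem_max < "$proc_dir/tcp_wmem")"
}
```

Run show_tcp_buffer_max before and after applying new settings to confirm that the change took effect.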
Use Intel MPI 2018 and MPI collective tunings
Google recommends that you use the Intel MPI 2018 library for running MPI jobs on Google Cloud. The HPC VM image includes a convenient way to install this library and apply MPI collective tunings that have been validated on Google Cloud.
Install Intel MPI 2018
To install the library while creating a new HPC VM, follow the steps to create an HPC VM and include the --metadata=google_install_mpi="--intel_mpi" flag.
To install the library on an existing HPC VM, run the following command on that VM:
sudo google_install_mpi --intel_mpi
For additional use cases, such as running MPI applications built with Intel Parallel Studio XE, use the full Intel Parallel Studio XE (PSXE) Runtime by replacing intel_mpi with intel_psxe_runtime in the preceding commands. The PSXE runtime includes several libraries that are important for running MPI applications, such as the Intel Math Kernel Library (MKL).
By default, the Intel MPI library is installed to the /opt/intel directory. To install the libraries to a different location, use the --prefix flag.
Use MPI collective tunings
MPI implementations such as Intel MPI and OpenMPI have many internal configuration parameters that can affect communication performance. These parameters are especially relevant for MPI collective communication, which lets you specify algorithms and configuration parameters that can perform very differently in the Google Cloud environment.
We strongly recommend that you tune configuration parameters based on the characteristics of your applications. We also strongly recommend that you enable VM placement policies when generating and using the tuning configuration files.
The HPC VM image includes output configurations from Intel MPI collective tunings performed on c2-standard-60 and c2d-standard-112 instances with compact placement policies. These tuning files are available in the following directory:
/usr/share/google-hpc-compute/mpitune-configs/intelmpi-2018
The HPC image contains several tuning configurations to support the following scenarios:
- Number of VMs: 1 to 22
- Number of MPI ranks (processes) per VM on c2-standard-60: 1, 2, 6, 10, and 30
- Number of MPI ranks (processes) per VM on c2d-standard-112: 1, 2, 4, 7, 8, 14, 16, 28, 32, and 56
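When you plan a job layout, you can check it against the scenarios listed above before relying on the shipped tunings. The following sketch encodes that list as a lookup. It checks only whether a combination falls within the listed values, not whether a matching file exists on disk. The function name is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) if the combination falls within the scenarios
# listed for the shipped tunings; fail (exit status 1) otherwise.
# Arguments: machine type, MPI ranks per VM, number of VMs.
has_shipped_tuning() {
    machine_type=$1
    ranks_per_vm=$2
    vm_count=$3

    # Listed VM counts are 1 through 22.
    if [ "$vm_count" -lt 1 ] || [ "$vm_count" -gt 22 ]; then
        return 1
    fi

    # Listed ranks-per-VM values for each machine type.
    case "$machine_type" in
        c2-standard-60)   supported="1 2 6 10 30" ;;
        c2d-standard-112) supported="1 2 4 7 8 14 16 28 32 56" ;;
        *) return 1 ;;
    esac

    for n in $supported; do
        [ "$n" = "$ranks_per_vm" ] && return 0
    done
    return 1
}
```

For example, has_shipped_tuning c2-standard-60 30 22 succeeds, while has_shipped_tuning c2-standard-60 4 2 fails because 4 ranks per VM is not in the list for that machine type.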
To use these tuning configurations, do the following:
Install the Intel MPI library 2018.
Source the following bash script to set up the environment:
source MPI_INSTALL_DIR/mpivars.sh
Replace MPI_INSTALL_DIR with the path to the directory where you installed the Intel MPI library.
Install the tunings included in the HPC VM image with the following command. Use the --sudo option when you need root access to the directory:
google_install_mpitune
Generate custom tuning configurations using mpitune
You can use mpitune to manually specify the algorithms and configuration parameters for MPI collective communication and to generate configuration files.
For example, to tune for 22 VMs and 30 processes per VM, source the mpivars.sh script to set up the proper environment, then run the following command. You must have write access to the directory or run the command as root.
mpitune -hf hostfile -fl 'shm:tcp' -pr 30:30 -hr 22:22
This generates a configuration file in the Intel MPI directory, which you can use to run applications. To use the tuning configuration for an application, add the -tune option to your mpirun command, as in the following example:
mpirun -tune -hostfile hostfile -genv I_MPI_FABRICS 'shm:tcp' -np 660 -ppn 30 ./app
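In this example, the -np value is the number of VMs multiplied by the processes per VM: 22 × 30 = 660. The following sketch derives that value and prints the matching mpirun command line without running it. The function names are illustrative; the flags are those shown in the preceding command.

```shell
#!/bin/sh
# Total MPI ranks = number of VMs x ranks per VM.
total_ranks() {
    echo $(( $1 * $2 ))
}

# Print the mpirun command line for a tuned run (does not execute it).
# Arguments: number of VMs, ranks per VM, hostfile path, application path.
print_mpirun_cmd() {
    np=$(total_ranks "$1" "$2")
    echo "mpirun -tune -hostfile $3 -genv I_MPI_FABRICS 'shm:tcp' -np $np -ppn $2 $4"
}
```

Calling print_mpirun_cmd 22 30 hostfile ./app reproduces the preceding command. Deriving -np from the other two values keeps the three numbers consistent when you change the layout.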
Create a custom image using the HPC VM image
Create a custom image using the boot disk of your HPC VM as the source disk. You can do so using the Google Cloud console or the Google Cloud CLI.
Console
In the Google Cloud console, go to the Images page.
Click Create image.
Specify a Name for your image.
Under Source disk, select the name of the boot disk on your HPC VM.
Choose other remaining properties for your image.
Click Create.
gcloud
Create the custom image by using the images create command.
gcloud compute images create IMAGE_NAME \
    --source-disk=VM_NAME \
    --source-disk-zone=VM_ZONE \
    --family=IMAGE_FAMILY \
    --storage-location=LOCATION
Replace the following:
- IMAGE_NAME: name for the custom image.
- VM_NAME: name of your HPC VM.
- VM_ZONE: zone where your HPC VM is located.
- IMAGE_FAMILY: optional. The image family this image belongs to.
- LOCATION: optional. Region in which to store the custom image. The default location is the multi-region closest to the location of the source disk.
Pricing
The HPC VM image is available at no additional cost. Because the HPC VM image runs on Compute Engine, you might incur charges for Compute Engine resources such as C2 vCPUs and memory. To learn more, see Compute Engine pricing.
Limitations
The benefits of tuning vary from application to application. In some cases, a particular tuning might have a negative effect on performance. Consider benchmarking your applications to find the most efficient or cost-effective configuration.
What's next
- Learn more about high performance computing on Google Cloud.
- Learn how to use the bulk instance API.
- If you have feedback or require support, email hpc-image-feedback@google.com.