Problem
When trying to create a GPU node pool, the operation fails with the following output:
Machines with GPUs have certain limitations which may affect your workflow. Learn more at https://cloud.google.com/kubernetes-engine/docs/how-to/gpus
WARNING: Starting with version 1.19, newly created node-pools will have COS_CONTAINERD as the default node image when no image type is specified.
ERROR: (gcloud.container.node-pools.create) ResponseError: code=400, message=Accelerator type "nvidia-tesla-t4" does not exist in zone {zone-name}.
Environment
- Google Kubernetes Engine Cluster
Solution
- The requested GPU type has to be available in every zone that the node pool targets. If only some zones offer that GPU type, adding or removing node zones so that the node pool runs only in zones that offer it is a workaround.
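The workaround can be sketched as a small helper that assembles a `gcloud container node-pools create` command restricted to chosen zones. This is a minimal illustration, not a definitive procedure: the function name `build_node_pool_command`, the pool, cluster, region, and zone names, and the `n1-standard-4` machine type are all assumptions made for the example, and the zone names say nothing about where the GPU is actually offered. It relies on the `--node-locations` flag to limit the node pool to specific zones.

```python
import shlex


def build_node_pool_command(pool, cluster, region, zones,
                            gpu_type="nvidia-tesla-t4", gpu_count=1,
                            machine_type="n1-standard-4"):
    """Build (but do not run) a gcloud command for a zone-restricted GPU node pool.

    All argument values are supplied by the caller; the defaults are
    illustrative assumptions, not values taken from the article.
    """
    if not zones:
        raise ValueError("at least one zone that offers the GPU is required")
    return [
        "gcloud", "container", "node-pools", "create", pool,
        f"--cluster={cluster}",
        f"--region={region}",
        # Restrict the node pool to zones known to offer the GPU type.
        f"--node-locations={','.join(zones)}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={gpu_type},count={gpu_count}",
    ]


# Hypothetical names; replace them with real cluster, region, and zone values.
command = build_node_pool_command(
    "gpu-pool", "example-cluster", "example-region",
    ["example-region-a", "example-region-b"],
)
print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the resulting line can be reviewed and then pasted into a shell where `gcloud` is configured.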
Cause
This error is raised when the requested GPU type is not available in every zone that the Google Kubernetes Engine node pool targets.
By default, the nodes of a regional cluster are spread evenly across three zones in the region, but not every Google Cloud region or zone offers GPUs, and a given GPU type may be offered in only some zones of a region. GPU availability by region and zone is listed on the "GPU regions and zones availability" page of the Google Cloud documentation.
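The check that fails can be illustrated with a short sketch that compares the zones a node pool targets against the zones offering the GPU type. The helper name `split_zones_by_gpu` and the zone names are hypothetical, and the availability list here is made-up sample data. In practice it could be taken from the availability page or, as an assumption about one convenient source, from the output of `gcloud compute accelerator-types list` filtered by accelerator name.

```python
def split_zones_by_gpu(target_zones, zones_with_gpu):
    """Split target zones into those that offer the GPU type and those that do not."""
    offered = set(zones_with_gpu)
    usable = [zone for zone in target_zones if zone in offered]
    missing = [zone for zone in target_zones if zone not in offered]
    return usable, missing


# Hypothetical sample data: three target zones, two of which offer the GPU.
target_zones = ["example-region-a", "example-region-b", "example-region-c"]
zones_with_gpu = ["example-region-a", "example-region-b", "other-region-a"]

usable, missing = split_zones_by_gpu(target_zones, zones_with_gpu)
print("usable:", usable)
print("missing:", missing)
```

Any zone that ends up in `missing` would trigger the "does not exist in zone" error if it stays among the node pool's target zones.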