Troubleshooting the sample deployment

This guide describes troubleshooting steps that you might find helpful if you run into problems deploying the Anthos Sample Deployment.

For details about how to deploy and explore the sample, see the Explore GKE Enterprise tutorial.

If an issue occurs during deployment, the relevant error information appears in the Deployment Manager view. If you see a "failed to deploy" message, click View Details, and then click the triangle icon to list all the resources and see which one failed.

Prerequisite check errors

The Anthos Sample Deployment has a prerequisite script that you can run before deployment to check that your project meets the criteria for a successful deployment. The jump server also checks some of these prerequisites and causes the deployment to fail if any are not met. Errors related to these prerequisite checks appear in the Deployment Manager view on the corresponding prerequisite-check resource. The sections that follow describe each prerequisite that is checked during the deployment and how to fix the related error.

The following screenshot shows an example of an Anthos Sample Deployment failing due to an unmet prerequisite.

Screenshot of prerequisite check failure

Service Management API is not enabled (code 403)

Suggested fix: Enable the Service Management API for your chosen project and try deploying again.

Invalid project environment

A Qwiklabs project is detected. The Anthos Sample Deployment is not designed to run in a Qwiklabs environment.

Suggested fix: Create a new project in your organization with a different project ID.

Invalid project ID

If your project is scoped to your domain, the project ID includes the name of the domain followed by a colon (:). Domain-scoped project IDs are a legacy feature.

Suggested fix: Create a new project that is not domain-scoped and try deploying again.
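
As a quick local check before deploying, the following sketch flags a domain-scoped project ID by looking for the colon separator described above. The function name is_domain_scoped and the sample project IDs are illustrative assumptions, not part of the Anthos Sample Deployment tooling.

```python
def is_domain_scoped(project_id: str) -> bool:
    """Return True if the project ID has the legacy 'domain:project' form."""
    return ":" in project_id


# Hypothetical project IDs, used only for illustration.
for pid in ("example.com:my-project", "my-project"):
    if is_domain_scoped(pid):
        print(f"{pid}: domain-scoped, create a new project instead")
    else:
        print(f"{pid}: not domain-scoped")
```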

Not enough permissions to create service account

Suggested fix

Ensure that you have the resourcemanager.projects.setIamPolicy permission; typically project Owners or Service Account Admins have this. If you do not have one of these Identity and Access Management (IAM) roles, do one of the following:

  • Create a new project (you will automatically be the Owner).
  • Ask the project Owner or Service Account Admin to grant you one of the following roles:
    • roles/owner
    • roles/iam.serviceAccountAdmin
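
To confirm whether your account already holds one of these roles, you can export the project's IAM policy as JSON (for example, with gcloud projects get-iam-policy PROJECT_ID --format=json) and inspect its bindings. The following is a minimal sketch of that inspection. The helper names and the sample policy are assumptions made for illustration, and the check only sees roles bound directly at the project level, so it can miss roles granted through a group or inherited from a folder or organization.

```python
from typing import Dict, Set

# Roles named in the suggested fix above.
SUGGESTED_ROLES = {"roles/owner", "roles/iam.serviceAccountAdmin"}


def roles_for_member(policy: Dict, member: str) -> Set[str]:
    """Collect the roles bound directly to `member` in a project IAM policy."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


def has_suggested_role(policy: Dict, member: str) -> bool:
    """Return True if `member` holds at least one of the suggested roles."""
    return bool(roles_for_member(policy, member) & SUGGESTED_ROLES)


# Hypothetical policy and members, for illustration only.
sample_policy = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:alice@example.com"]},
        {"role": "roles/viewer", "members": ["user:bob@example.com"]},
    ]
}
print(has_suggested_role(sample_policy, "user:alice@example.com"))
print(has_suggested_role(sample_policy, "user:bob@example.com"))
```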

Insufficient quota to satisfy the request

Suggested fix: The deployment requires 7 vCPUs, 24.6 GiB of memory, and 310 GiB of disk space in your chosen zone or region, plus one VPC, two firewall rules, and one Cloud NAT in your chosen project. Ensure that your project has enough resource quota for the deployment. You can check your quota and, if necessary, request an increase.
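
The comparison against these requirements can be sketched as follows. The required amounts are the ones quoted above. The dictionary keys (such as vcpus and cloud_nats) are hypothetical labels chosen for this example, not Compute Engine quota metric names, and the available figures are made-up values that you would replace with the remaining headroom shown on your project's quota page.

```python
# Requirements quoted from the suggested fix above.
# The key names are hypothetical labels, not official quota metric names.
REQUIRED = {
    "vcpus": 7,
    "memory_gib": 24.6,
    "disk_gib": 310,
    "vpc_networks": 1,
    "firewall_rules": 2,
    "cloud_nats": 1,
}


def quota_shortfalls(available: dict) -> dict:
    """Return {resource: amount still needed} for each resource that falls short."""
    shortfalls = {}
    for name, needed in REQUIRED.items():
        have = available.get(name, 0)  # treat a missing entry as no headroom
        if have < needed:
            shortfalls[name] = round(needed - have, 1)
    return shortfalls


# Made-up headroom figures, for illustration only.
available = {
    "vcpus": 8,
    "memory_gib": 16,
    "disk_gib": 500,
    "vpc_networks": 5,
    "firewall_rules": 1,
    "cloud_nats": 1,
}
for resource, amount in quota_shortfalls(available).items():
    print(f"{resource}: short by {amount}")
```

An empty result means every listed requirement is covered by the figures you supplied.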

Project does not allow VPC peering

This error occurs if the compute.restrictVpcPeering organization policy constraint is enforced. This constraint prevents the creation of a GKE private cluster when there is no existing VPC Network Peering connection to the GKE control plane's VPC network.

Suggested fix

Do one of the following:

  • Create a new project in your current organization. Ask your Organization Policy Administrator to edit the constraints/compute.restrictVpcPeering policy to allow creating VPC peering for this project.

  • Use another project in a different organization or folder.

Project does not allow click-to-deploy images

This error occurs if the compute.trustedImageProjects organization policy constraint is enforced. The Anthos Sample Deployment creates a Compute Engine instance with an image that is used to automate tasks for the Anthos setup and application deployment.

Suggested fix

Do one of the following:

  • Create a new project in your current organization. Ask your Organization Policy Administrator to edit the constraints/compute.trustedImageProjects policy for this project to include projects/click-to-deploy-images.

  • Use another project in a different organization or folder.

Project does not allow IP Forwarding

This error occurs if the compute.vmCanIpForward organization policy constraint is enforced. IP Forwarding is used as part of the sample application.

Suggested fix

Do one of the following:

  • Create a new project in your current organization. Ask your Organization Policy Administrator to edit the constraints/compute.vmCanIpForward policy for this project to allow VM IP Forwarding.

  • Use another project in a different organization or folder.

Project requires OS Login

This error occurs if the compute.requireOsLogin organization policy constraint is enforced. OS Login is not currently supported in Google Kubernetes Engine (GKE).

Suggested fix

Do one of the following:

  • Create a new project in your current organization. Ask your Organization Policy Administrator to edit the constraints/compute.requireOsLogin policy for this project to not require OS Login.

  • Use another project in a different organization or folder.
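
The four organization policy errors above each map to one constraint. As a convenience, the following sketch collects that mapping so that you can predict which prerequisite errors to expect from a list of constraints you already know are enforced on your project. The function name likely_errors and the sample input are assumptions for illustration. The sketch does not query the Organization Policy service.

```python
# Constraint-to-error mapping taken from the sections above.
CONSTRAINT_ERRORS = {
    "constraints/compute.restrictVpcPeering": "Project does not allow VPC peering",
    "constraints/compute.trustedImageProjects": "Project does not allow click-to-deploy images",
    "constraints/compute.vmCanIpForward": "Project does not allow IP Forwarding",
    "constraints/compute.requireOsLogin": "Project requires OS Login",
}


def likely_errors(enforced_constraints):
    """Return the prerequisite errors expected for the given enforced constraints."""
    return [
        CONSTRAINT_ERRORS[name]
        for name in enforced_constraints
        if name in CONSTRAINT_ERRORS
    ]


# Hypothetical list of constraints enforced on a project.
enforced = [
    "constraints/compute.requireOsLogin",
    "constraints/compute.disableSerialPortAccess",  # unrelated to this deployment
]
for error in likely_errors(enforced):
    print(error)
```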

GKE clusters fail to deploy

GKE clusters might fail to deploy for various reasons.

Suggested fix

Do one of the following:

  • When you first encounter an error, try to deploy one more time.

  • (Recommended) Delete the project and start again with a new project.

  • Delete the deployment by following the instructions in Deleting the deployment. Do not try to deploy again in the same project without cleanup.

If the error persists, contact your support team, or leave feedback in our survey.

GKE Enterprise components fail to deploy

Deployment Manager might fail at any of the following steps during the process:

  • cluster-registration-with-hub
  • anthos-service-mesh-setup
  • anthos-config-management-setup
  • application-deployment

Suggested fix

Do one of the following:

  • When you first encounter an error, try to deploy one more time.

  • (Recommended) Delete the project and start again with a new project.

  • Delete the deployment by following the instructions in Deleting the deployment. Do not try to deploy again in the same project without cleanup.

If the error persists, contact your support team, or leave feedback in our survey.

Existing deployment of Anthos Sample Deployment

Previous deployments of Anthos Sample Deployment must be deleted before trying a new deployment. Delete the deployment by following the instructions in Deleting the deployment. Do not try to deploy again in the same project without cleanup.

Anthos Sample Deployment service account permissions

The Anthos Sample Deployment service account requires a number of IAM roles. The deployment first verifies that the service account has the permissions granted by roles/runtimeconfig.admin, which it needs in order to report status for the subsequent prerequisite checks and installation steps.

Below is the full list of roles required by the Anthos Sample Deployment service account:

  • roles/cloudtrace.agent
  • roles/container.admin
  • roles/deploymentmanager.editor
  • roles/gkehub.admin
  • roles/iam.serviceAccountAdmin
  • roles/iam.workloadIdentityUser
  • roles/logging.configWriter
  • roles/logging.logWriter
  • roles/meshconfig.admin
  • roles/meshtelemetry.reporter
  • roles/monitoring.metricWriter
  • roles/resourcemanager.projectIamAdmin
  • roles/runtimeconfig.admin
  • roles/serviceusage.serviceUsageAdmin
  • roles/source.admin
  • roles/viewer
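
To see which of these roles a service account is missing, you can compare the list against the project's IAM policy exported as JSON. The following is a minimal sketch under stated assumptions: the service account email and the sample policy are hypothetical, the helper name missing_roles is made up for this example, and only roles bound directly at the project level are considered.

```python
# Roles listed above for the Anthos Sample Deployment service account.
REQUIRED_ROLES = [
    "roles/cloudtrace.agent",
    "roles/container.admin",
    "roles/deploymentmanager.editor",
    "roles/gkehub.admin",
    "roles/iam.serviceAccountAdmin",
    "roles/iam.workloadIdentityUser",
    "roles/logging.configWriter",
    "roles/logging.logWriter",
    "roles/meshconfig.admin",
    "roles/meshtelemetry.reporter",
    "roles/monitoring.metricWriter",
    "roles/resourcemanager.projectIamAdmin",
    "roles/runtimeconfig.admin",
    "roles/serviceusage.serviceUsageAdmin",
    "roles/source.admin",
    "roles/viewer",
]


def missing_roles(policy: dict, member: str) -> list:
    """Return the required roles that `member` lacks in the given IAM policy."""
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return [role for role in REQUIRED_ROLES if role not in granted]


# Hypothetical service account and policy, for illustration only.
sa_member = "serviceAccount:sample-sa@my-project.iam.gserviceaccount.com"
example_policy = {
    "bindings": [
        {"role": "roles/runtimeconfig.admin", "members": [sa_member]},
        {"role": "roles/viewer", "members": [sa_member, "user:alice@example.com"]},
    ]
}
for role in missing_roles(example_policy, sa_member):
    print(f"missing: {role}")
```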

Deleting the Anthos Sample Deployment service account and recreating it with the same name can result in unexpected behavior.