Version 1.11

Resolving managed Anthos Service Mesh issues

This section explains common Anthos Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.

Pod is injected with istiod.istio-system

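This can occur if you did not replace the istio-injection: enabled label on the namespace. While that label is present, Pods in the namespace are injected by the in-cluster istiod rather than by the managed control plane.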
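
As an illustration of replacing the label, the following sketch prints, for review, the relabeling command for one namespace. It rests on assumptions that this page does not state: that the replacement is the istio.io/rev revision label, and that the revision is asm-managed-rapid (the value shown in the managed data plane status output on this page). The helper name relabel_cmd and the namespace demo-namespace are hypothetical. The helper only prints the command; it does not run it.

```shell
# Hypothetical helper: print (not run) the command that swaps the
# istio-injection label for a revision label on one namespace.
# Assumption: the revision label key is istio.io/rev.
relabel_cmd() {
  ns=$1
  rev=${2:-asm-managed-rapid}   # assumed default revision name
  printf 'kubectl label namespace %s istio-injection- istio.io/rev=%s --overwrite\n' "$ns" "$rev"
}

# Example with a hypothetical namespace name.
relabel_cmd demo-namespace
```

To find namespaces that still carry the old label, kubectl get namespace -l istio-injection=enabled lists them. Pods that are already running keep their existing sidecar until they are restarted.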

In addition, verify the mutating webhook configurations by using the following command:

kubectl get mutatingwebhookconfiguration

...
istiod-asm-managed
…
# may include istio-sidecar-injector

kubectl get mutatingwebhookconfiguration istio-sidecar-injector -o yaml

# Run debug commands
export T=$(echo '{"kind":"TokenRequest","apiVersion":"authentication.k8s.io/v1","spec":{"audiences":["istio-ca"], "expirationSeconds":2592000}}' | kubectl create --raw /api/v1/namespaces/default/serviceaccounts/default/token -f - | jq -j '.status.token')

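# Read the injection URL from the webhook, then strip everything from "inject" onward.
# The jq filter and sed expression are quoted so that the shell does not glob-expand them.
export INJECT_URL=$(kubectl get mutatingwebhookconfiguration istiod-asmca -o json | jq -r '.webhooks[0].clientConfig.url')
export ISTIOD_ADDR=$(echo "$INJECT_URL" | sed 's/inject.*//')

# Query the control plane debug endpoint with the token requested above.
curl -v -H "Authorization: Bearer $T" "$ISTIOD_ADDR/debug/configz"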
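
The address derivation in the commands above can be tried in isolation. The sketch below is a variant, not the exact command from this page: it also removes the slash before inject, so that appending /debug/configz produces a single slash. The function name istiod_addr and the sample URL are hypothetical, and the sketch assumes that the real URL has the form BASE/inject/...

```shell
# Hypothetical sample URL; the real value comes from the webhook configuration.
inject_url='https://istiod.example.invalid:443/inject/cluster/demo'

# Strip everything from "/inject" onward to get the base address.
istiod_addr() {
  printf '%s\n' "$1" | sed 's|/inject.*||'
}

istiod_addr "$inject_url"
# prints https://istiod.example.invalid:443
```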

The install tool generates HTTP 400 errors

The installation tool might generate HTTP 400 errors like the following:

HealthCheckContainerError, message: Cloud Run error: Container failed to start.
Failed to start and then listen on the port defined by the PORT environment
variable. Logs for this revision might contain more information.

The error can occur if you did not enable Workload Identity on your Kubernetes cluster, which you can do by using the following command:

export CLUSTER_NAME=...
export PROJECT_ID=...
export LOCATION=...
gcloud container clusters update "$CLUSTER_NAME" --zone "$LOCATION" \
    --workload-pool="$PROJECT_ID.svc.id.goog"
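
Before or after running the update, you can check whether the cluster already reports the expected workload pool. The following sketch shows only the comparison logic. The helper names are hypothetical, and the gcloud describe call in the comment assumes that the pool is exposed as workloadIdentityConfig.workloadPool and that the cluster is zonal.

```shell
# Build the workload pool name that the update command above sets.
expected_workload_pool() {
  printf '%s.svc.id.goog\n' "$1"
}

# Succeed when the pool reported by the cluster ($1) matches the
# pool expected for the project ($2).
workload_pool_matches() {
  [ "$1" = "$(expected_workload_pool "$2")" ]
}

# Against a real cluster (requires gcloud and the variables exported above):
#   actual=$(gcloud container clusters describe "$CLUSTER_NAME" --zone "$LOCATION" \
#     --format="value(workloadIdentityConfig.workloadPool)")
#   workload_pool_matches "$actual" "$PROJECT_ID" && echo "Workload Identity is enabled"

# Example with a hypothetical project ID.
expected_workload_pool my-project
# prints my-project.svc.id.goog
```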

Managed data plane status

The following command displays the status of the managed data plane:

kubectl -n istio-system get dataplanecontrol -o custom-columns=REVISION:.spec.revision,STATE:.status.state

You should see the following approximately ten minutes after deploying the managed data plane:

REVISION            STATE
asm-managed-rapid   Ready
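
The possible results of this command (Ready, only the column headers, or a <nil> state) can be told apart mechanically. The following sketch classifies the command output read from standard input. The function name classify_dataplane_status and the one-word labels it prints are hypothetical, chosen here for illustration.

```shell
# Classify the text printed by the dataplanecontrol status command.
# Reads the command output on stdin and prints one word.
classify_dataplane_status() {
  awk '
    NR == 1 || NF == 0 { next }        # skip the header row and blank lines
    { rows++ }
    $2 == "<nil>" { nil_rows++ }
    $2 == "Ready" { ready_rows++ }
    END {
      if (rows == 0)               print "empty"
      else if (nil_rows > 0)       print "nil"
      else if (ready_rows == rows) print "ready"
      else                         print "not-ready"
    }
  '
}

# Sample input copied from the expected output above.
printf 'REVISION            STATE\nasm-managed-rapid   Ready\n' | classify_dataplane_status
# prints ready
```

To use it against a cluster, pipe the kubectl command above into classify_dataplane_status. A result of empty or nil corresponds to the Empty status and <nil> status sections that follow.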

Empty status

If the output contains only the column headers REVISION and STATE, the data plane controller wasn't deployed to the cluster. To troubleshoot this, run the following command to check whether the cluster is registered to the fleet:

gcloud container hub memberships list --project=PROJECT_ID

  • If the output is empty or doesn't include the name of your cluster, the cluster isn't registered with the fleet. To fix this, rerun the installation tool and make sure that you include --enable-registration on the command line. (You also need to include --option cni-managed when you rerun the tool.)

  • If the output includes the name of your cluster, run the following command to enable Anthos Service Mesh in the fleet:

    gcloud alpha container hub mesh enable --project=PROJECT_ID
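
The membership check described above can be scripted. The following sketch tests whether a name appears in a list of memberships read from standard input. It assumes that the membership name is the same as the cluster name, which depends on how the cluster was registered. The function name is_registered and the sample names are hypothetical.

```shell
# Hypothetical helper: succeed if the name in $1 matches a membership name
# read from stdin (one per line). Any leading resource path is stripped
# first, so both short names and full resource paths are accepted.
is_registered() {
  sed 's|.*/||' | grep -Fxq -- "$1"
}

# Sample data standing in for the membership list; both names are made up.
if printf 'cluster-a\ncluster-b\n' | is_registered cluster-b; then
  echo "registered"
else
  echo "not registered"
fi
# prints registered
```

Against a real project, the input could come from the list command with --format="value(name)" added, so that only the names are printed.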
    

<nil> status

A STATE of <nil> indicates an issue with the CNI DaemonSet. Run the following command to check whether the CNI DaemonSet is healthy and running:

kubectl get pods -n kube-system -l k8s-app=istio-cni-node

If the CNI DaemonSet is healthy and running, the output is similar to the following:

NAME                   READY   STATUS    RESTARTS   AGE
istio-cni-node-8w88v   3/3     Running   0          12h
istio-cni-node-c69mn   3/3     Running   0          12h
istio-cni-node-n9pnr   3/3     Running   0          12h

  • If no CNI Pods appear in the output, rerun the installation tool and make sure that you include --option cni-managed. (Also include --enable-registration when you rerun the tool, so that the cluster is registered to the fleet.)

  • If the CNI Pod isn't healthy, get details on the Pod. In the following command, replace CNI_POD with the name of the unhealthy Pod:

    kubectl -n kube-system describe pod CNI_POD
    

    Contact Cloud Support and provide the details about the unhealthy Pod.
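
To pick out unhealthy CNI Pods without reading the table by eye, the output of the kubectl get pods command can be filtered. The following sketch treats a Pod as unhealthy when its READY counts differ or its STATUS is not Running. That definition, the function name unhealthy_pods, and the second sample row are assumptions made for illustration.

```shell
# Print the names of Pods that are not fully ready or not Running.
# Expects "kubectl get pods" table output on stdin.
unhealthy_pods() {
  awk 'NR > 1 && NF >= 3 {
    split($2, ready, "/")             # READY column, for example 3/3
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Sample input: one healthy row from the output above plus one made-up unhealthy row.
printf '%s\n' \
  'NAME                   READY   STATUS             RESTARTS   AGE' \
  'istio-cni-node-8w88v   3/3     Running            0          12h' \
  'istio-cni-node-xxxxx   2/3     CrashLoopBackOff   5          12h' |
  unhealthy_pods
# prints istio-cni-node-xxxxx
```

Against a cluster, pipe kubectl get pods -n kube-system -l k8s-app=istio-cni-node into unhealthy_pods, then run the describe command for each name that it prints.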

Cluster membership error (No identity provider specified)

The installation tool might fail with cluster membership errors like the following:

install_asm: [ERROR]: Cluster has memberships.hub.gke.io CRD but no identity
provider specified. Please ensure that an identity provider is available for the
registered cluster.

The error can occur if GKE Workload Identity wasn't enabled before you registered the cluster. You can re-register the cluster from the command line with the following command:

gcloud container hub memberships register --enable-workload-identity