This document explains common Anthos Service Mesh problems and how to resolve them, such as when a Pod is injected with istiod.istio-system, when the installation tool generates errors such as HTTP 400 status codes, and when it reports cluster membership errors.
If you need additional assistance troubleshooting Anthos Service Mesh, see Getting support.
Pod is injected with istiod.istio-system
This can occur if you did not replace the istio-injection: enabled label.
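As a minimal sketch of how to spot a namespace that still carries the old label, the helper below checks a comma-separated label list for istio-injection=enabled. The function name is hypothetical, and the commented kubectl lines use NAMESPACE and REVISION as placeholders; the istio.io/rev revision label is an assumption about your installation, not something this document specifies.

```shell
# Hypothetical helper: succeeds when a comma-separated label list
# (for example, the LABELS column printed by --show-labels) still
# contains the legacy istio-injection=enabled label.
has_legacy_injection_label() {
  case ",$1," in
    *,istio-injection=enabled,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage against a live cluster (placeholders, not run here):
#   labels=$(kubectl get namespace NAMESPACE --show-labels --no-headers | awk '{print $NF}')
#   if has_legacy_injection_label "$labels"; then
#     # Assumption: your mesh uses a revision label; REVISION is a placeholder.
#     kubectl label namespace NAMESPACE istio-injection- istio.io/rev=REVISION --overwrite
#   fi
```

Pods that were already injected keep their existing sidecar until they are restarted, so a relabel only affects Pods created afterward.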
In addition, verify the mutating webhooks configuration by using the following command:
kubectl get mutatingwebhookconfiguration
...
istiod-asm-managed
…
# may include istio-sidecar-injector
kubectl get mutatingwebhookconfiguration istio-sidecar-injector -o yaml
# Run debug commands
export T=$(echo '{"kind":"TokenRequest","apiVersion":"authentication.k8s.io/v1","spec":{"audiences":["istio-ca"], "expirationSeconds":2592000}}' | kubectl create --raw /api/v1/namespaces/default/serviceaccounts/default/token -f - | jq -j '.status.token')
export INJECT_URL=$(kubectl get mutatingwebhookconfiguration istiod-asmca -o json | jq -r '.webhooks[0].clientConfig.url')
export ISTIOD_ADDR=$(echo "$INJECT_URL" | sed 's/inject.*//')
curl -v -H "Authorization: Bearer $T" "$ISTIOD_ADDR/debug/configz"
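The address derivation in the debug commands can be easier to follow in isolation. The sketch below applies the same sed expression to a made-up URL and adds a guard for the case where the webhook has no clientConfig.url (jq -r prints null for a missing field). The function name and the example URL are hypothetical.

```shell
# Hypothetical helper: strip everything from the first "inject" onward,
# leaving the istiod base address. Fails if the URL is empty or "null".
istiod_addr_from_inject_url() {
  case "$1" in
    ''|null)
      echo "webhook has no clientConfig.url" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$1" | sed 's/inject.*//'
}

# Illustration only; this URL is invented for the example.
istiod_addr_from_inject_url 'https://istiod.example.invalid/inject/cluster/demo'
# prints https://istiod.example.invalid/
```

If the guard fails, the webhook probably points at an in-cluster service rather than a URL, and the curl command above would not have a usable address.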
The install tool generates HTTP 400 errors
The installation tool might generate HTTP 400 errors like the following:
HealthCheckContainerError, message: Cloud Run error: Container failed to start.
Failed to start and then listen on the port defined by the PORT environment
variable. Logs for this revision might contain more information.
The error can occur if you did not enable Workload Identity on your Kubernetes cluster, which you can do by using the following command:
export CLUSTER_NAME=...
export PROJECT_ID=...
export LOCATION=...
gcloud container clusters update $CLUSTER_NAME --zone $LOCATION \
--workload-pool=$PROJECT_ID.svc.id.goog
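Before rerunning the installation tool, it can help to confirm that the update took effect. The sketch below compares a cluster's workload pool with the expected PROJECT_ID.svc.id.goog value. The function name is hypothetical, and the commented gcloud line assumes that the pool is exposed as workloadIdentityConfig.workloadPool in the cluster description.

```shell
# Hypothetical helper: succeeds when the reported workload pool ($1)
# matches the pool expected for the project ($2).
workload_identity_enabled() {
  [ -n "$1" ] && [ "$1" = "$2.svc.id.goog" ]
}

# Possible usage against a live cluster (not run here; field path is an assumption):
#   pool=$(gcloud container clusters describe "$CLUSTER_NAME" --zone "$LOCATION" \
#     --format='value(workloadIdentityConfig.workloadPool)')
#   workload_identity_enabled "$pool" "$PROJECT_ID" || echo "Workload Identity is not enabled"
```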
Managed data plane status
The following command displays the status of the managed data plane:
kubectl -n istio-system get dataplanecontrol -o custom-columns=REVISION:.spec.revision,STATE:status.state
You should see the following approximately ten minutes after deploying the managed data plane:
REVISION STATE
Regular Ready
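The output of the status command falls into a few cases: a Ready row, only the column headers, or a `<nil>` state. As a rough sketch, the function below reads that output on standard input and prints which case applies. The function name and the result labels are hypothetical, and the sketch assumes the two-column REVISION/STATE layout shown above.

```shell
# Hypothetical helper: classify the dataplanecontrol status output
# read from standard input as ready, empty, nil, or not-ready.
classify_dataplane_status() {
  awk '
    NR == 1 { next }   # skip the REVISION/STATE header row
    NF == 0 { next }   # ignore blank lines
    {
      rows++
      if ($2 == "<nil>") nil++
      else if ($2 != "Ready") other++
    }
    END {
      if (rows == 0) print "empty"
      else if (nil > 0) print "nil"
      else if (other > 0) print "not-ready"
      else print "ready"
    }'
}

# Possible usage against a live cluster (not run here):
#   kubectl -n istio-system get dataplanecontrol \
#     -o custom-columns=REVISION:.spec.revision,STATE:status.state | classify_dataplane_status
```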
Empty status
If you don't see any output except for the column headers REVISION and STATE, this indicates that the data plane controller wasn't deployed to the cluster. To troubleshoot this, run the following command to see if the cluster is registered to the fleet:
gcloud container hub memberships list --project=PROJECT_ID
If the output from the gcloud command is empty, or if the output doesn't have the name of your cluster, this means that the cluster isn't registered with the fleet. To fix this, rerun the installation tool and make sure that you include --enable-registration on the command line. (Note that you also need to include --option cni-managed when you rerun the tool.)
If the output includes the name of your cluster, run the following command to enable Anthos Service Mesh in the fleet:
gcloud alpha container hub mesh enable --project=PROJECT_ID
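The membership check described in this section can be scripted. The sketch below reads the memberships list on standard input and succeeds only if a given name appears in the first column. It assumes that the output starts with a header row and that the membership name matches your cluster name, which might not hold for every registration; the function name is hypothetical.

```shell
# Hypothetical helper: succeeds when the name in $1 appears in the first
# column of the memberships list read from standard input.
cluster_is_registered() {
  awk -v name="$1" '
    NR > 1 && $1 == name { found = 1 }   # skip the header row
    END { exit (found ? 0 : 1) }'
}

# Possible usage (not run here; PROJECT_ID and CLUSTER_NAME are placeholders):
#   gcloud container hub memberships list --project=PROJECT_ID | cluster_is_registered CLUSTER_NAME
```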
<nil> status
This indicates an issue with the CNI DaemonSet. Run the following command to check if the CNI DaemonSet is healthy and running:
kubectl get pods -n kube-system -l k8s-app=istio-cni-node
If the CNI DaemonSet is healthy and running, the output is similar to the following:
NAME READY STATUS RESTARTS AGE
istio-cni-node-8w88v 3/3 Running 0 12h
istio-cni-node-c69mn 3/3 Running 0 12h
istio-cni-node-n9pnr 3/3 Running 0 12h
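On clusters with many nodes, scanning this output by eye can be tedious. As a sketch, the function below reads the Pod listing on standard input and succeeds only if at least one Pod is listed and every Pod is Running with all of its containers ready. The function name is hypothetical, and the sketch assumes the default column layout shown above.

```shell
# Hypothetical helper: succeeds when every listed CNI Pod has all
# containers ready (for example, 3/3) and a STATUS of Running.
cni_pods_healthy() {
  awk '
    NR == 1 { next }   # skip the NAME/READY/STATUS header row
    {
      rows++
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") bad++
    }
    END { exit ((rows > 0 && bad == 0) ? 0 : 1) }'
}

# Possible usage against a live cluster (not run here):
#   kubectl get pods -n kube-system -l k8s-app=istio-cni-node | cni_pods_healthy \
#     || echo "CNI DaemonSet needs attention"
```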
If the CNI Pod isn't in the output, rerun the installation tool and make sure that you include --option cni-managed. (Note that you should also include --enable-registration to make sure that the cluster is registered to the fleet when you rerun the tool.)
If the CNI Pod isn't healthy, get details on the Pod. In the following command, replace CNI_POD with the name of the unhealthy Pod:
kubectl -n kube-system describe pod CNI_POD
Contact Cloud Support and provide them with the details about the unhealthy Pod.
Cluster membership error (No identity provider specified)
The installation tool might fail with cluster membership errors like the following:
asmcli: [ERROR]: Cluster has memberships.hub.gke.io CRD but no identity
provider specified. Please ensure that an identity provider is available for the
registered cluster.
The error can occur if you don't have GKE Workload Identity enabled before registering the cluster. You can re-register the cluster on the command line with the following command:
gcloud container hub memberships register --enable-workload-identity