Resolving traffic management issues in Anthos Service Mesh

This section explains common Anthos Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.

API server connection errors in Istiod logs

If you see errors similar to the following, Istiod cannot contact the API server:

error Failed to watch *crd.IstioSomeCustomResource…dial tcp connect: connection refused

You can use the regular expression string /error.*cannot list resource/ to find this error in the logs.
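As a minimal sketch, the same regular expression can be applied to log text with Python. The function name find_api_server_errors is hypothetical and chosen only for illustration; the pattern is the one quoted above with the surrounding slashes removed.

```python
import re

# The pattern quoted above, without the enclosing slashes.
API_SERVER_ERROR = re.compile(r"error.*cannot list resource")


def find_api_server_errors(lines):
    """Return the log lines that match the pattern, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if API_SERVER_ERROR.search(line)]
```

For example, the text returned by kubectl logs for the Istiod pod could be split into lines and passed to this function to list only the matching entries.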

This error is usually transient, so if you reached the proxy logs using kubectl, the issue might already be resolved. It is typically caused by events that make the API server temporarily unavailable, such as when an API server that is not in a high-availability configuration reboots for an upgrade or an autoscaling change.

The istio-init container crashes

This problem can occur when the pod iptables rules are not applied to the pod network namespace. Possible causes are:

  • An overly restrictive Pod Security Policy (PSP)
  • An incomplete istio-cni installation
  • Insufficient workload pod permissions (missing CAP_NET_ADMIN permission)

If you use a Pod Security Policy that restricts the CAP_NET_ADMIN permission, switch to the Istio CNI plugin instead.

If you use the Istio CNI plugin, verify that you followed the installation instructions completely. Verify that the istio-cni-node container is ready, and check its logs. If the problem persists, establish a secure shell (SSH) connection to the host node, search the node logs for nsenter commands, and check whether they report any errors.
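The readiness check can be sketched in Python against a Pod object that has already been parsed from JSON, for example the output of kubectl get pod POD_NAME -n NAMESPACE -o json. The function name unready_containers is hypothetical, and the sketch deliberately makes no assumption about container names: it reports every container in the pod whose status is not ready.

```python
def unready_containers(pod):
    """Return the names of containers whose status does not report ready.

    `pod` is a Kubernetes Pod object parsed from JSON into a dict.
    """
    statuses = pod.get("status", {}).get("containerStatuses") or []
    return [s.get("name") for s in statuses if not s.get("ready", False)]
```

An empty result means every container listed in the pod status reports ready. It does not replace checking the logs.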

If you don't use the Istio CNI plugin or a Pod Security Policy, verify that the workload pod has the CAP_NET_ADMIN permission, which is automatically set by the sidecar injector.
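One way to check this is to inspect the pod spec for the capability. The sketch below assumes the Pod object was parsed from JSON (for example, from kubectl get pod POD_NAME -o json), that the init container is named istio-init as in the heading above, and that the capability appears in the spec as NET_ADMIN, the Kubernetes spelling without the CAP_ prefix. The function name has_net_admin is hypothetical.

```python
def has_net_admin(pod, container_name="istio-init"):
    """Return True if the named init container adds the NET_ADMIN capability.

    `pod` is a Kubernetes Pod object parsed from JSON into a dict.
    """
    init_containers = pod.get("spec", {}).get("initContainers") or []
    for container in init_containers:
        if container.get("name") != container_name:
            continue
        security = container.get("securityContext") or {}
        capabilities = security.get("capabilities") or {}
        return "NET_ADMIN" in (capabilities.get("add") or [])
    # No init container with that name was found in the spec.
    return False
```

A False result means either that the init container is absent or that it does not add the capability, so inspect the spec to see which case applies.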

Connection refused after pod starts

When a Pod starts and receives a connection refused error while trying to connect to an endpoint, the problem might be that the application container started before the istio-proxy container. In this case, the application container sends the request to istio-proxy, but the connection is refused because istio-proxy isn't listening on the port yet.

In this case, you can:

  • Modify your application's startup code to make continuous requests to the istio-proxy health endpoint until the application receives a 200 code. The istio-proxy health endpoint is:

  • Add a retry request mechanism to your application workload.
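Both options can be sketched as a small polling loop in Python. The text above does not state the health endpoint, so the URL below is an assumption based on the default Istio sidecar status port; confirm it for your installation. The names probe and wait_for_proxy are hypothetical, and the loop is bounded by an attempt count rather than running forever.

```python
import time
import urllib.request

# Assumption: default Istio sidecar readiness URL; not stated in the text above.
HEALTH_URL = "http://localhost:15020/healthz/ready"


def probe(url, timeout_s=1.0):
    """Return True if a GET request to `url` is answered with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        # (urllib.error.URLError is a subclass of OSError).
        return False


def wait_for_proxy(url=HEALTH_URL, attempts=30, delay_s=1.0,
                   check=probe, sleep=time.sleep):
    """Poll `check(url)` until it succeeds; return False if attempts run out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next attempt
    return False
```

An application could call wait_for_proxy() once at startup and begin sending outbound requests only after it returns True. The same loop shape works as a generic retry wrapper if check is replaced with the application's own request function.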