Version 1.9

Resolving security issues in Anthos Service Mesh

This section explains common Anthos Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.

In Anthos Service Mesh, Mesh CA or Istiod issues certificates to workloads across all clusters in the mesh. Authentication policies (for example, mTLS) and authorization policies (for example, allow or deny) are pushed to each cluster. These policies determine which workloads can communicate and how.

TLS Issues

The following sections explain how to resolve TLS-related problems in Anthos Service Mesh.

The examples in this section use the variable ${CTX}, which is the context name in the default Kubernetes configuration file that you use to access the cluster. Set the ${CTX} variable like the following example:
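The exact value depends on your cluster. The following is a minimal sketch, assuming a GKE cluster whose kubeconfig entry was created by gcloud, which names such entries gke_PROJECT_LOCATION_CLUSTER. The helper function and the project, location, and cluster names are hypothetical:

```shell
# Build the context name that gcloud writes to kubeconfig for a GKE cluster.
# Arguments: project ID, cluster location, cluster name (hypothetical values below).
gke_context_name() {
  printf 'gke_%s_%s_%s\n' "$1" "$2" "$3"
}

export CTX="$(gke_context_name my-project us-central1-c my-cluster)"

# To see the context names that actually exist in your kubeconfig, run:
#   kubectl config get-contexts -o name
```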


Verify TLS enforcement

When a service requires TLS connections, verify that plain-text requests to it are rejected:

kubectl exec SOURCE_POD -n SOURCE_NAMESPACE -c \
SOURCE_CONTAINER -- curl -v DESTINATION_URL

Assuming the service requires TLS connections, the above plain-text request should fail, resulting in output similar to the following:

curl: (56) Recv failure: Connection reset by peer
command terminated with exit code 56
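This check can be scripted by inspecting the exit code, because kubectl exec passes through the exit code of the command it ran. The following is a rough sketch, assuming curl is available in the source container. The function name is hypothetical, and its arguments stand in for the placeholders used above:

```shell
# Classify the result of a plain-text request sent from the source pod.
# Usage: plaintext_check SOURCE_POD SOURCE_NAMESPACE SOURCE_CONTAINER DESTINATION_URL
plaintext_check() {
  rc=0
  kubectl exec --context="${CTX}" "$1" -n "$2" -c "$3" -- \
    curl -s -o /dev/null "$4" || rc=$?
  case "$rc" in
    0)  echo "accepted" ;;   # the plain-text request went through
    56) echo "reset" ;;      # connection reset, as expected when TLS is required
    *)  echo "other:$rc" ;;  # some other failure, such as a DNS or pod lookup error
  esac
}
```

A result of reset matches the expected output shown above. Any other non-zero code points to a different problem and needs separate investigation.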

Check mTLS certificates

When mTLS is enabled, check the workload's mTLS certificate by viewing the X-Forwarded-Client-Cert header. To do this, use the following steps:

  1. Deploy the httpbin sample service, which can display the headers that it receives.

  2. Use curl to view the X-Forwarded-Client-Cert header:

    kubectl exec --context=${CTX} SOURCE_POD -n SOURCE_NAMESPACE -c \
    SOURCE_CONTAINER -- curl http://httpbin.sample:8000/headers -s | \
    grep X-Forwarded-Client-Cert

    The X-Forwarded-Client-Cert header shows the mTLS certificate information, like the following example:

    "X-Forwarded-Client-Cert": "By=spiffe://;Hash=0781d68adfdab85b08b6758ed502f352464e81166f065cc6acde9433337b4494;Subject=\"OU=istio_v1_cloud_workload,O=Google LLC,L=Mountain View,ST=California,C=US\";URI=spiffe://

  3. Alternatively, use openssl on the sidecar to view the entire certificate chain:

    kubectl exec --context=${CTX} SOURCE_POD -n SOURCE_NAMESPACE -c istio-proxy -- \
    openssl s_client -alpn istio -showcerts -connect httpbin.sample:8000

    The output displays the certificate chain. If you are using Mesh CA, verify that the root certificate CN contains istio_v1_cloud_workload_root-signer-.... If you are using Istiod as the certificate authority, verify that the root certificate is set with O = YOUR_TRUST_DOMAIN.
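When comparing identities, it can help to pull individual fields out of the X-Forwarded-Client-Cert value. The following is a small sketch, assuming the input is the header value only (without the surrounding JSON quoting) and that no field value contains a semicolon. The function name and the sample SPIFFE IDs in the usage comment are hypothetical:

```shell
# Print the value of one semicolon-separated field (for example Hash or URI)
# from an X-Forwarded-Client-Cert header value read on standard input.
xfcc_field() {
  tr ';' '\n' | sed -n "s/^$1=//p"
}

# Example with a made-up header value:
#   printf '%s\n' 'By=spiffe://td.example/ns/a/sa/b;Hash=0781d68a;URI=spiffe://td.example/ns/c/sa/d' \
#     | xfcc_field URI
```

The URI field identifies the client workload, so it is the value to compare against the service account that you expect the caller to run as.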

TLS bad certificate errors in the Istiod logs

If you see TLS handshake bad certificate errors in the logs, it might indicate that Istiod is failing to establish a TLS connection to a service.

You can use the regular expression string TLS handshake error.*bad certificate to find these errors in the logs.

These errors are usually informational and transient. However, if they persist, they might indicate a problem in your system.
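To judge whether the errors are transient or persistent, you can count matching lines over a period of time. The following is a minimal sketch that applies the regular expression above to log text on standard input. The function name is hypothetical, and the kubectl command in the comment assumes that Istiod runs in the istio-system namespace with the label app=istiod:

```shell
# Count log lines on standard input that match the bad-certificate pattern.
# grep -c exits with status 1 when nothing matches, so "|| true" keeps the
# function's status at 0 while still printing the count.
count_bad_cert_errors() {
  grep -cE 'TLS handshake error.*bad certificate' || true
}

# Example (namespace and label are assumptions about your installation):
#   kubectl logs --context="${CTX}" -n istio-system -l app=istiod --tail=-1 \
#     | count_bad_cert_errors
```

Running the count a few times and comparing the results shows whether new errors are still being logged.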

  1. Verify that your istio-sidecar-injector MutatingWebhookConfiguration has a CA bundle.

    The sidecar injector webhook (which is used for automatic sidecar injection) requires a CA bundle to establish secure connections with the API server and Istiod. This CA bundle is patched into the configuration by istiod, but can sometimes be overwritten (for example, if you reapply the webhook configuration).

  2. Verify the presence of the CA bundle:

    kubectl get mutatingwebhookconfigurations istio-sidecar-injector -o=jsonpath='{.webhooks[0].clientConfig.caBundle}'

    If the output is not empty, the CA bundle is configured. If the CA bundle is missing, restart istiod to cause it to rescan the webhook and reinstall the CA bundle.
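The check and the restart can be combined in a short script. The following is a sketch only: the function name is hypothetical, and the webhook name istio-sidecar-injector, the deployment name istiod, and the namespace istio-system are assumptions that might differ in your installation (for example, revisioned installs add a suffix to these names):

```shell
# Print "configured" if the sidecar injector webhook has a non-empty CA bundle,
# and "missing" otherwise.
ca_bundle_status() {
  bundle=$(kubectl get mutatingwebhookconfigurations istio-sidecar-injector \
    --context="${CTX}" -o=jsonpath='{.webhooks[0].clientConfig.caBundle}')
  if [ -n "$bundle" ]; then
    echo "configured"
  else
    echo "missing"
  fi
}

# If the status is "missing", restart istiod so that it patches the bundle again
# (deployment name and namespace are assumptions):
#   kubectl rollout restart deployment/istiod -n istio-system --context="${CTX}"
```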

Authorization policy denial logging

The authorization policy denies a request if it is not allowed by the policy. For HTTP (including gRPC) protocols, the request will be denied with status code 403. For non-HTTP protocols, the connection will be terminated directly. For more information about authorization policies, see Istio authorization.

When a request is denied by an authorization policy, the access log in Google Cloud's operations suite records details about the denial. For example, the log shows how many requests the authorization policy denied and which policy rule caused each denial, which helps you distinguish policy denials from denials returned by the backend application.

The access log includes the following labels for an authorization denial:

  • response_details: will be set to AuthzDenied if the denial is caused by the authorization policy.
  • policy_name: will include the namespace and name of the authorization DENY policy causing the denial. The value is in the format of <Namespace>.<Name>, for example, foo.deny-method-get means an authorization policy deny-method-get in the foo namespace.
  • policy_rule: will include the index of the rule inside the authorization policy causing the denial, for example, 0 means the first rule inside the policy.

For more information about how to get the access log, see Accessing logs in Cloud Logging.
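After you have the policy_name and policy_rule labels from a log entry, you can look up the rule that caused the denial. The following is a minimal sketch that splits the <Namespace>.<Name> value at the first dot. The function names are hypothetical:

```shell
# Split a policy_name label of the form <Namespace>.<Name> at the first dot.
# Kubernetes namespace names cannot contain dots, so everything after the
# first dot belongs to the policy name.
policy_namespace() { printf '%s\n' "${1%%.*}"; }
policy_resource()  { printf '%s\n' "${1#*.}"; }

# Example: show rule 0 (the policy_rule value) of foo.deny-method-get:
#   kubectl get authorizationpolicy "$(policy_resource foo.deny-method-get)" \
#     -n "$(policy_namespace foo.deny-method-get)" --context="${CTX}" \
#     -o jsonpath='{.spec.rules[0]}'
```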

Authorization policies are not enforced

If you observe symptoms of authorization policies not being enforced, use the following command to verify them:

kubectl exec --context=${CTX} -it SOURCE_POD -n SOURCE_NAMESPACE \
-- curl DESTINATION_URL

In the output, access denied messages indicate that authorization policies are properly enforced, like the following:

RBAC: access denied
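For HTTP traffic, a denial by an authorization policy shows up as status code 403 together with the RBAC: access denied body, whereas a 403 with a different body more likely comes from the backend application. The following is a small sketch of that distinction. The function name and the output strings are hypothetical:

```shell
# Classify an HTTP response given its status code and body.
# Usage: classify_denial HTTP_CODE BODY
classify_denial() {
  if [ "$1" = "403" ] && printf '%s' "$2" | grep -q 'RBAC: access denied'; then
    echo "denied-by-policy"
  else
    echo "other"
  fi
}

# One way to obtain the status code from inside the source pod:
#   curl -s -o /dev/null -w '%{http_code}' DESTINATION_URL
```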

If you confirm that authorization policies are not enforced, deny access to the namespace. The following example denies access to the namespace named authz-ns:

kubectl apply --context=${CTX} -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-authz-ns
  namespace: authz-ns
spec: {}
EOF

' is forbidden' error in Istiod logs

You might see errors similar to the following:

error failed to list CRDs: is forbidden: User "system:serviceaccount:istio-system:istiod-service-account" cannot list resource "customresourcedefinitions" in API group "" at the cluster scope

You can use the regular expression string /error.*cannot list resource/ to find these errors in the logs.

This error can occur when your Istiod deployment lacks the correct IAM binding or has insufficient RBAC permissions to read a custom resource.

  1. Check if you are missing an IAM binding in your account. First, ensure you have correctly set credentials and permissions. Then, make sure that the IAM binding is present by running the following command, which adds the binding if it is missing. In this example, PROJECT_ID is the output of gcloud config get-value project and PROJECT_NUMBER is the output of gcloud projects list --filter="project_id=${PROJECT_ID}" --format="value(project_number)":

    gcloud projects add-iam-policy-binding ${PROJECT_ID} --member "serviceAccount:service-${PROJECT_NUMBER}" --role "roles/meshdataplane.serviceAgent"

  2. Check that your RBAC rules are installed correctly.

  3. If the RBAC rules are missing, rerun istioctl install (or the installation method you used to install Anthos Service Mesh) to recreate them.

  4. If the RBAC rules are present and the errors persist, check that the ClusterRoleBindings and RoleBindings attach the RBAC rules to the correct Kubernetes service account. Also, verify that your istiod deployment is using the specified service account.
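One way to test the RBAC side of these steps is to ask the API server directly whether the Istiod service account can list custom resource definitions. The following is a sketch, assuming the service account named in the error message above (istiod-service-account in istio-system) and that your own account is allowed to impersonate it. The function names are hypothetical:

```shell
# Build the user name that Kubernetes assigns to a service account.
# Usage: service_account_user NAMESPACE SERVICE_ACCOUNT
service_account_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the Istiod service account may list CRDs.
# Prints "yes" or "no"; kubectl exits non-zero on "no", hence "|| true".
istiod_can_list_crds() {
  kubectl auth can-i list customresourcedefinitions.apiextensions.k8s.io \
    --context="${CTX}" \
    --as="$(service_account_user istio-system istiod-service-account)" || true
}

# To review the existing bindings and the IAM members that hold the role:
#   kubectl get clusterrolebindings -o wide --context="${CTX}" | grep istiod
#   gcloud projects get-iam-policy "${PROJECT_ID}" \
#     --flatten="bindings[].members" \
#     --filter="bindings.role:roles/meshdataplane.serviceAgent" \
#     --format="value(bindings.members)"
```

An answer of no suggests that the ClusterRole or its binding is missing or attached to a different service account.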

serverca process errors in Istiod logs

You might see errors similar to the following:

Authentication failed: Authenticator ClientCertAuthenticator at index 0 got error

You can use the regular expression string /serverca.*Authentication failed:.*JWT/ to find these errors in the logs.

This error can occur when the JWT issuer is misconfigured, a client is using an expired token, or some other security issue is preventing a connection from authenticating to istiod correctly.
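To check the issuer and expiry of a token, you can decode its payload and inspect the iss and exp claims. The following is a minimal sketch that decodes, but does not verify, a JWT. The function name is hypothetical, and the token path in the usage comment is an assumption about where the sidecar mounts its token:

```shell
# Print the decoded payload (claims) of a JWT passed as the first argument.
# This only decodes the token; it does not check the signature.
jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so that base64 -d accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Example (the token path is an assumption):
#   jwt_payload "$(kubectl exec --context="${CTX}" SOURCE_POD -n SOURCE_NAMESPACE \
#     -c istio-proxy -- cat /var/run/secrets/tokens/istio-token)"
# Compare the exp claim with the current time from: date +%s
```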