Version 1.9

Resolving VM support issues in Anthos Service Mesh

The following steps and logs are useful for troubleshooting problems with Anthos Service Mesh VM support.

Debug the VM

If you see that VM instances are running but not reachable from the mesh, perform the following steps on the VM instance.

Verify the agent

  1. Check the Envoy proxy health:

    curl localhost:15000/ready -v
  2. Check the Envoy error log:

    less /var/log/envoy/envoy.err.log
  3. Check for service-proxy-agent errors:

    journalctl -u service-proxy-agent
  4. Check the syslog, either in the Google Cloud's operations suite logs for the instance or on the VM: /var/log/syslog on Debian, or /var/log/messages on CentOS.
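The readiness check in step 1 can be scripted. The following is a minimal sketch, not part of the product tooling: the function names ready_verdict and check_envoy_ready are hypothetical, and the sketch assumes the Envoy admin endpoint answers 200 when the proxy is live and 503 otherwise, and that curl reports 000 when it cannot connect.

```shell
# Hypothetical helper: map an HTTP status code from the Envoy admin
# /ready endpoint to a short verdict.
ready_verdict() {
  case "$1" in
    200) echo "ready" ;;
    503) echo "not ready" ;;
    000) echo "unreachable" ;;   # curl prints 000 when no response was received
    *)   echo "unexpected status $1" ;;
  esac
}

# Hypothetical helper: query the admin port (15000 unless another port
# is passed as the first argument) and print the verdict.
check_envoy_ready() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:${1:-15000}/ready")
  ready_verdict "$code"
}
```

Run check_envoy_ready on the VM instance. A result other than "ready" suggests continuing with the log checks in steps 2 through 4.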

Verify proxy health

  1. To debug the configuration of the proxy, run the following command on the VM:

    curl localhost:15000/config_dump > config.out
  2. Copy that file and run the following command:

    istioctl proxy-config [cluster|route|listener] --file config.out

Invalid token errors

You might see an error similar to the following in the envoy error log:

E0217 17:59:17.206995798    2411]  Call to http server ended with error 500 [{
  "error": "invalid_target",
  "error_description": "federated token response does not have access token. {\"error\":\"invalid_grant\",\"error_description\":\"JWT expired.\"}",
  "error_uri": ""

In that case, check whether the token in /var/run/secrets/tokens/istio-token on the VM has expired by confirming that the exp value (epoch seconds) has not elapsed:

cat /var/run/secrets/tokens/istio-token | cut -d '.' -f2 | base64 -d | python -m json.tool
{
    "azp": "...",
    "email": "",
    "email_verified": true,
    "exp": 1613995395,
    "google": {
        "compute_engine": {
            "instance_creation_timestamp": 1613775765,
            "instance_id": "5678",
            "instance_name": "vm-instance-03-0mqh",
            "project_id": "...",
            "project_number": 1234,
            "zone": "us-central1-c"
        }
    },
    "iat": ...,
    "iss": "",
    "sub": "..."
}

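Comparing the exp claim against the current time by hand is error-prone, so the check can be scripted. The following is a sketch under stated assumptions: the function names jwt_exp and token_status are hypothetical, the token is assumed to be a standard three-part JWT, and the exp claim is assumed to be a plain integer. The helper restores base64 padding because JWT payloads are base64url-encoded without it, which base64 -d can reject.

```shell
# Hypothetical helper: print the "exp" claim (epoch seconds) of the JWT
# stored in the file given as the first argument.
jwt_exp() {
  # Take the payload (second dot-separated part) and map the base64url
  # alphabet back to standard base64.
  payload=$(cut -d '.' -f2 < "$1" | tr '_-' '/+')
  # Restore the padding that base64url omits.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  # Decode, drop whitespace, and extract the integer after "exp":
  printf '%s' "$payload" | base64 -d | tr -d ' \n' |
    sed -n 's/.*"exp":\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: print "expired", "valid", or "unknown" for a token file.
token_status() {
  exp=$(jwt_exp "$1")
  if [ -z "$exp" ]; then
    echo "unknown"
  elif [ "$(date +%s)" -ge "$exp" ]; then
    echo "expired"
  else
    echo "valid"
  fi
}
```

For example, token_status /var/run/secrets/tokens/istio-token prints "expired" when the exp value has already passed.
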
Unsupported OS distribution warning

While verifying the agent, you might see a warning message similar to the following in the service-proxy-agent log:

E0217 17:59:17.206995798    2021-04-09T21:21:29.6091Z service-proxy-agent Warn
Detected image is unsupported: [Ubuntu|Fedora|Suse]. Envoy may not work correctly.

This means your Linux distribution is not supported, which might cause the proxy to behave unexpectedly.

Debug the cluster

Use the following steps to troubleshoot problems with your cluster.

Verify auto-registration is working

  1. Check the WorkloadEntry that istiod auto-generates:

    kubectl get workloadentry -n WORKLOAD_NAMESPACE

    In addition, you can check the Kubernetes Object Browser for its existence.

  2. If it doesn't exist, check for errors in the istiod logs, which should be available to you in Google Cloud's operations suite. Alternatively, you can retrieve them directly:

    kubectl -n istio-system get pods -l app=istiod

    The expected output is similar to:

    NAME                                       READY   STATUS    RESTARTS   AGE
    istiod-asm-190-1-7f6699cfb-5mzxw           1/1     Running   0          5d13h
    istiod-asm-190-1-7f6699cfb-pgvpf           1/1     Running   0          5d13h
  3. Set the pod environment variable and use it to export the logs:

    export ISTIO_POD=istiod-asm-190-1-7f6699cfb-5mzxw
    kubectl logs -n istio-system ${ISTIO_POD} | grep -i 'auto-register\|WorkloadEntry'
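Steps 2 and 3 can be combined so that the pod name does not have to be copied by hand. This is a minimal sketch, assuming the same istio-system namespace and app=istiod label used above; the function names first_istiod_pod and filter_registration_logs are hypothetical. The filter uses two -e patterns rather than the \| alternation, which behaves the same but does not depend on a GNU grep extension.

```shell
# Hypothetical helper: print the name of the first istiod pod
# returned by the label selector used in step 2.
first_istiod_pod() {
  kubectl -n istio-system get pods -l app=istiod \
    -o jsonpath='{.items[0].metadata.name}'
}

# Hypothetical helper: keep only log lines on stdin that mention
# auto-registration or WorkloadEntry (case-insensitive).
filter_registration_logs() {
  grep -i -e 'auto-register' -e 'WorkloadEntry'
}
```

A possible invocation is kubectl logs -n istio-system "$(first_istiod_pod)" | filter_registration_logs. Because several istiod replicas can be running, the relevant messages might be in a different pod than the first one.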

Check the connected proxies

You can use the proxy-status command to list all connected proxies, including those for VMs:

istioctl proxy-status

The output should show connected proxies similar to:

NAME                                                    CDS        LDS        EDS        RDS          ISTIOD                               VERSION
details-v1-5f449bdbb9-bhl8d.default                     SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0-asm.1
httpbin-779c54bf49-647vd.default                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-eastwestgateway-5b6d4ddd9d-5rzx2.istio-system     SYNCED     SYNCED     SYNCED     NOT SENT     istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-ingressgateway-66b6ddd7cb-ctb6b.istio-system      SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-ingressgateway-66b6ddd7cb-vk4bb.istio-system      SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0-asm.1
vm-instance-03-39b3.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0
vm-instance-03-nh5k.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0
vm-instance-03-s4nl.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0

For more information about the command options, see istioctl proxy-config.
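With many proxies, scanning the table by eye is tedious. The following sketch lists only the proxies that report a STALE status in any column. It assumes the column layout shown above, with a single header line and the proxy name in the first column; the function name stale_proxies is hypothetical.

```shell
# Hypothetical helper: read `istioctl proxy-status` output on stdin and
# print the name (first column) of every proxy with a STALE entry.
stale_proxies() {
  # NR > 1 skips the header line.
  awk 'NR > 1 && /STALE/ { print $1 }'
}
```

Use it as istioctl proxy-status | stale_proxies. Empty output means no proxy reported STALE. It does not flag NOT SENT, which also appears for a gateway in the sample output above.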

Check the workload identity configuration

Verify the identity provider is set up correctly

Check the IdentityProvider resource fields:

 kubectl describe identityprovider

Ensure that the fields meet these requirements:

  • The ServiceAccount field is set to [email]
  • The IssuerURI field is set to the Google issuer URI (google is currently the only supported issuerURI)
  • The provider must be set to google, which is the only currently-supported provider.

    A valid IdentityProvider CR example:

    kind: IdentityProvider
      name: google

Verify the WorkloadGroup is set up correctly

Check the WorkloadGroup:

 kubectl get workloadgroup -n WORKLOAD_NAMESPACE

Ensure that the results meet these requirements:

  • The ServiceAccount field is set correctly: the account must be the same as the service account used by the VM instance
  • The identity provider under the annotation field is set, for example google
  • The workload group references a valid IdentityProvider, which you can verify by checking the existing identity provider:

    kubectl get identityprovider

    The output should be a list of existing providers like this:

     NAME     AGE
     google   39m

    Check that the provider named in the WorkloadGroup exists in the list of existing providers.

    A valid WorkloadGroup CR example:

    kind: WorkloadGroup
    name: wg-a
    namespace: foo
      annotations: google
        app: wg-a
        grpc: 3550
        http: 8080
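The provider lookup described above can be automated against the NAME/AGE listing. This is a sketch only: the function name provider_exists is hypothetical, and it assumes the listing format shown above (one header line, provider name in the first column), for example as printed by kubectl get identityprovider.

```shell
# Hypothetical helper: read a NAME/AGE provider listing on stdin and
# succeed (exit 0) only if the provider named in $1 is listed.
provider_exists() {
  # NR > 1 skips the header; exit status is 0 when found, 1 otherwise.
  awk -v p="$1" 'NR > 1 && $1 == p { found = 1 } END { exit !found }'
}
```

For example, kubectl get identityprovider | provider_exists google && echo "provider found" prints the message only when a provider named google exists.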

Internal Error Found

If you receive the message Internal Error Found, see Getting support.

Istio VM troubleshooting guide

For additional troubleshooting steps, see Debugging Virtual Machines.