Version 1.9

Resolving VM support issues in Anthos Service Mesh

Use the following steps and logs to troubleshoot problems with Anthos Service Mesh VM support.

Debug the VM

If VM instances are running but unreachable from the mesh, perform the following steps on the VM instance.

Verify the agent

  1. Check the Envoy proxy health:

    curl localhost:15000/ready -v
    
  2. Check the Envoy error log:

    less /var/log/envoy/envoy.err.log
    
  3. Check for service-proxy-agent errors:

    journalctl -u service-proxy-agent
    
  4. Check the syslog, either in the Google Cloud's operations suite logs for the instance or on the VM at /var/log/syslog.
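
The health check in step 1 can also be scripted. The following is a minimal sketch, assuming the Envoy admin endpoint answers /ready with HTTP 200 when the proxy is ready. The function name envoy_ready is hypothetical and is not part of Anthos Service Mesh.

```shell
# Hypothetical helper: succeed only if the local Envoy admin endpoint
# reports ready. Assumes /ready returns HTTP 200 when the proxy is healthy.
envoy_ready() {
  local port="${1:-15000}" code
  # -s: quiet, -o /dev/null: discard the body, -w: print only the status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:${port}/ready")
  [ "$code" = "200" ]
}
```

For example, running `envoy_ready || journalctl -u service-proxy-agent --no-pager | tail -n 50` prints recent agent log entries only when the proxy is not ready.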

Verify proxy health

  1. To debug the configuration of the proxy, run the following command on the VM:

    curl localhost:15000/config_dump > config.out
    
  2. Copy config.out to a machine where istioctl is installed, and run the following command with one of cluster, route, or listener:

    istioctl proxy-config [cluster|route|listener] --file config.out
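
To inspect all three configuration types in one pass, you can wrap the command in a loop. This is a sketch only: the function name dump_proxy_config is hypothetical, and it assumes istioctl is on the PATH of the machine that holds the copied file.

```shell
# Hypothetical helper: print the cluster, route, and listener views
# of a saved Envoy config dump, one after another.
dump_proxy_config() {
  local file="$1" kind
  for kind in cluster route listener; do
    echo "== ${kind} =="
    istioctl proxy-config "$kind" --file "$file"
  done
}
```

Call it as `dump_proxy_config config.out`.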
    

Invalid token errors

You might see an error similar to the following in the Envoy error log:

E0217 17:59:17.206995798    2411 oauth2_credentials.cc:152]  Call to http server ended with error 500 [{
  "error": "invalid_target",
  "error_description": "federated token response does not have access token. {\"error\":\"invalid_grant\",\"error_description\":\"JWT expired.\"}",
  "error_uri": ""
}].

In that case, check whether the token in /var/run/secrets/tokens/istio-token on the VM has expired by decoding it and comparing its exp value (in epoch seconds) with the current time:

cat /var/run/secrets/tokens/istio-token | cut -d '.' -f2 | base64 -d | python -m json.tool

The output is similar to:

{
    ...
    "azp": "...",
    "email": "example-service-account@developer.gserviceaccount.com",
    "email_verified": true,
    "exp": 1613995395,
    "google": {
        "compute_engine": {
            "instance_creation_timestamp": 1613775765,
            "instance_id": "5678",
            "instance_name": "vm-instance-03-0mqh",
            "project_id": "...",
            "project_number": 1234,
            "zone": "us-central1-c"
        }
    },
    "iat": ...,
    "iss": "https://accounts.google.com",
    "sub": "..."
}
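
The expiry comparison can be automated. The following is a minimal sketch, not an official tool: the function names jwt_payload, jwt_exp, and jwt_is_expired are hypothetical. It assumes the token is a standard JWT whose payload is unpadded base64url, which is why the alphabet is translated and padding is restored before decoding.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
# JWT segments are unpadded base64url, so map the URL-safe alphabet
# back to standard base64 and restore the '=' padding first.
jwt_payload() {
  local seg
  seg=$(printf '%s' "$1" | cut -d '.' -f2 | tr '_-' '/+')
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 -d
}

# Print the numeric "exp" claim (epoch seconds) of a JWT.
jwt_exp() {
  jwt_payload "$1" \
    | grep -o '"exp"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | grep -o '[0-9][0-9]*$'
}

# Return 0 if the token has expired at the given time (default: now),
# 1 if it is still valid, and 2 if no exp claim could be read.
jwt_is_expired() {
  local exp now
  exp=$(jwt_exp "$1")
  now="${2:-$(date +%s)}"
  [ -n "$exp" ] || return 2
  [ "$now" -ge "$exp" ]
}
```

On the VM, you might call it as `jwt_is_expired "$(cat /var/run/secrets/tokens/istio-token)" && echo "token expired"`.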

Debug the cluster

Use the following steps to troubleshoot problems with your cluster.

Verify auto-registration is working

  1. Check the WorkloadEntry that istiod auto-generates:

    kubectl get workloadentry -n WORKLOAD_NAMESPACE
    

    You can also confirm that it exists in the Kubernetes Object Browser.

  2. If it doesn't exist, check for errors in the istiod logs, which should be available in Google Cloud's operations suite. Alternatively, you can retrieve them directly. First, list the istiod pods:

    kubectl -n istio-system get pods -l app=istiod
    

    The expected output is similar to:

    NAME                                       READY   STATUS    RESTARTS   AGE
    istiod-asm-190-1-7f6699cfb-5mzxw           1/1     Running   0          5d13h
    istiod-asm-190-1-7f6699cfb-pgvpf           1/1     Running   0          5d13h
    
  3. Set the ISTIO_POD environment variable to one of the pod names and use it to filter the logs for auto-registration messages:

    export ISTIO_POD=istiod-asm-190-1-7f6699cfb-5mzxw
    kubectl logs -n istio-system ${ISTIO_POD} | grep -i 'auto-register\|WorkloadEntry'
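
Because istiod usually runs more than one replica, the relevant messages can be in any of the pods. The following sketch repeats the log search for every istiod pod. The function names are hypothetical, and it assumes the pods carry the app=istiod label in the istio-system namespace, as in the steps above.

```shell
# Keep only log lines that mention auto-registration or WorkloadEntry.
filter_auto_registration() {
  grep -i -E 'auto-register|WorkloadEntry'
}

# Hypothetical helper: search the logs of every istiod pod.
istiod_auto_registration_logs() {
  local pod
  # "-o name" prints entries such as pod/istiod-..., which kubectl logs accepts.
  for pod in $(kubectl -n istio-system get pods -l app=istiod -o name); do
    echo "== ${pod} =="
    kubectl -n istio-system logs "$pod" | filter_auto_registration
  done
}
```

A pod header followed by no lines means that pod logged nothing that matched.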
    

Check the connected proxies

You can use the proxy-status command to list all connected proxies, including those for VMs:

istioctl proxy-status

The output should show connected proxies similar to:

NAME                                                    CDS        LDS        EDS        RDS          ISTIOD                               VERSION
details-v1-5f449bdbb9-bhl8d.default                     SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0-asm.1
httpbin-779c54bf49-647vd.default                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-eastwestgateway-5b6d4ddd9d-5rzx2.istio-system     SYNCED     SYNCED     SYNCED     NOT SENT     istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-ingressgateway-66b6ddd7cb-ctb6b.istio-system      SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0-asm.1
istio-ingressgateway-66b6ddd7cb-vk4bb.istio-system      SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0-asm.1
vm-instance-03-39b3.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0
vm-instance-03-nh5k.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-pgvpf     1.9.0
vm-instance-03-s4nl.496270428946                        SYNCED     SYNCED     SYNCED     SYNCED       istiod-asm-190-1-7f6699cfb-5mzxw     1.9.0

For more information about the command options, see istioctl proxy-config.
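
In a large mesh, it can help to reduce the proxy-status output to the rows that need attention. The following is a minimal sketch with a hypothetical function name. It assumes that a proxy which has not acknowledged its latest configuration is shown as STALE, and it treats NOT SENT as normal because istiod might have nothing to send.

```shell
# Hypothetical helper: read `istioctl proxy-status` output on stdin and
# print the header row plus every row that contains a STALE status.
stale_proxies() {
  awk 'NR == 1 || /STALE/'
}
```

Call it as `istioctl proxy-status | stale_proxies`. If only the header row is printed, no proxy is reported as STALE.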

Istio VM troubleshooting guide

For additional troubleshooting steps, see Debugging Virtual Machines.