Version 1.9. This version is supported as outlined in the Anthos version support policy, offering the latest patches and updates for security vulnerabilities, exposures, and issues impacting Anthos clusters on VMware (GKE on-prem). Refer to the release notes for more details. This is the most recent version.

Troubleshooting identity and authorization

This document gives troubleshooting guidance for identity and authorization issues.

Keeping gcloud anthos auth up-to-date

You can avoid many common issues by verifying that the components of your gcloud anthos auth installation are up-to-date.

Verify two pieces, because the gcloud anthos auth command has logic in both the gcloud core component and a separately packaged anthos-auth component.

Update gcloud:

gcloud components update

Update anthos-auth:

gcloud components install anthos-auth

Invalid provider configuration

If your identity provider configuration is invalid, you will see an error screen from your identity provider after you click LOGIN. Follow the provider-specific instructions to correctly configure the provider or your cluster.

Invalid permissions

If you complete the authentication flow, but still don't see the details of the cluster, make sure you granted the correct RBAC permissions to the account that you used with OIDC. Note that this might be a different account from the one you use to access Cloud Console.

Missing refresh token

The following issue occurs when the authorization server prompts for consent, but the required authentication parameter wasn't provided.

Error: missing 'RefreshToken' field in 'OAuth2Token' in credentials struct

To resolve this issue, in your cluster configuration file, add prompt=consent to the authentication.oidc.extraParams field. Then regenerate the client authentication file.

Refresh token expired

This issue occurs when the refresh token in the kubeconfig file has expired:

Unable to connect to the server: Get {DISCOVERY_ENDPOINT}: x509:
    certificate signed by unknown authority

To resolve this issue, run the login command again.

gkectl create-login-config fails to get clientconfig

This issue occurs when the kubeconfig file passed to gkectl create-login-config is not for a user cluster, or when the ClientConfig custom resource did not come up during cluster creation.

Error getting clientconfig using KUBECONFIG

To resolve this issue, make sure you have the correct kubeconfig file for your user cluster. Then check to see whether the ClientConfig object is in the cluster:

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get clientconfig default -n kube-public

gkectl create-login-config fails because of duplicate cluster name

This issue occurs if you attempt to write login configuration data that contains a cluster name that already exists in the destination file. Each login configuration file must contain unique cluster names.

error merging with file MERGE_FILE because MERGE_FILE contains a
  cluster with the same name as the one read from KUBECONFIG. Please write to
  a new output file

To resolve this issue, use the --output flag to specify a new destination file.

If you do not provide --output, this login configuration data is written to a file named kubectl-anthos-config.yaml in the current directory.

gcloud anthos auth login fails with proxyconnect tcp

This issue occurs when there is an error in the https_proxy or HTTPS_PROXY environment variable configuration. If the value of either variable starts with https://, the Go HTTP client libraries might fail if the proxy is configured to handle HTTPS connections using another protocol, such as SOCKS5.

Possible error message:

proxyconnect tcp: tls: first record does not look like a TLS handshake

To resolve this issue, modify the https_proxy and HTTPS_PROXY environment variables to omit the https:// prefix. On Windows, modify the system environment variables. For example, change the value of the https_proxy environment variable from https://webproxy.example.com:8000 to webproxy.example.com:8000.
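On Linux or macOS, the same change can be sketched in a POSIX shell. This is a minimal illustration, not an official procedure: the helper name strip_https_scheme is hypothetical, and the snippet only affects the current shell session, so a persistent fix still requires editing wherever the variables are defined.

```shell
# Hypothetical helper: print a proxy value without a leading https:// prefix.
# Values that do not start with https:// are printed unchanged.
strip_https_scheme() {
  printf '%s\n' "${1#https://}"
}

# Rewrite whichever of the two variables is set in the current session.
if [ -n "${https_proxy:-}" ]; then
  export https_proxy="$(strip_https_scheme "$https_proxy")"
fi
if [ -n "${HTTPS_PROXY:-}" ]; then
  export HTTPS_PROXY="$(strip_https_scheme "$HTTPS_PROXY")"
fi
```

After running the snippet, retry gcloud anthos auth login from the same shell session.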

Cluster access fails when using kubeconfig generated by gcloud anthos auth login

This issue occurs when the Kubernetes API server is unable to authorize the user. This can happen if the appropriate RBACs are missing or incorrect, or there is an error in the OIDC configuration for the cluster.

Unauthorized

To resolve this issue:

  1. In the kubeconfig file generated by gcloud anthos auth login, copy the value of id-token.

    kind: Config
    …
    users:
    - name: …
      user:
        auth-provider:
          config:
            id-token: xxxxyyyy
    
  2. Install jwt-cli and run the following command to decode the token, replacing ID_TOKEN with the value that you copied:

    jwt ID_TOKEN
    
  3. Verify OIDC configuration.

    The authentication.oidc section of the user cluster configuration file has the group and username fields, which are used to set the --oidc-group-claim and --oidc-username-claim flags in the Kubernetes API server. When the API server is presented with the token, it forwards the token to Anthos Identity Service, which returns the extracted group-claim and username-claim back to the API server. The API server uses the response to verify that the corresponding group or user has the correct permissions.

    Verify that the claims set for group and user in the authentication.oidc section of cluster configuration file are present in the ID token.

  4. Check RBACs that were applied.

    Verify that there is an RBAC with the correct permissions for either the user specified by username-claim or one of the groups listed in group-claim from the previous step. The name of the user or group in the RBAC should be prefixed with the usernameprefix or groupprefix that was specified in the user cluster configuration file.

    Note that if usernameprefix is blank, and username is a value other than email, the prefix defaults to issuerurl#. To disable username prefixes, set usernameprefix to -.

    For more information about user and group prefixes, see Authenticating with OIDC.

    Note that the Kubernetes API server treats a backslash as an escape character. If the name of the user or group contains \\, the API server reads it as a single \ when parsing the ID token. The RBAC role binding applied for this user or group should therefore contain only a single backslash; otherwise, you might see an Unauthorized error.

    Cluster configuration file:

    oidc:
      ...
      username: "unique_name"
      usernameprefix: "-"
      group: "group"
      groupprefix: "oidc:"
    

    ID token:

    {
      ...
      "email": "cluster-developer@example.com",
      "unique_name": "EXAMPLE\\cluster-developer",
      "group": [
        "Domain Users",
        "EXAMPLE\\developers"
      ],
    ...
    }
    

    The following RBAC bindings grant this group and user the pod-reader cluster role. Note the single backslash in the name field instead of a double backslash:

    Group ClusterRoleBinding:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: example-binding
    subjects:
    - kind: Group
      # Single quotes keep the backslash literal in YAML.
      name: 'oidc:EXAMPLE\developers'
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: pod-reader
      apiGroup: rbac.authorization.k8s.io
    

    User ClusterRoleBinding:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: example-binding
    subjects:
    - kind: User
      # Single quotes keep the backslash literal in YAML.
      name: 'EXAMPLE\cluster-developer'
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: pod-reader
      apiGroup: rbac.authorization.k8s.io
    
  5. Check the Kubernetes API server logs.

    If the OIDC plugin configured in the Kubernetes API server does not start up correctly, the API server returns an Unauthorized error when presented with the ID token. To see if there were any issues with the OIDC plugin in the API server, run:

    kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG logs statefulset/kube-apiserver -n USER_CLUSTER_NAME
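
The token inspection and name matching in steps 2 through 4 can also be approximated with standard shell tools. The following is a minimal sketch under stated assumptions, not part of the product tooling: the helper names jwt_payload and rbac_user_name are hypothetical, base64 -d is assumed to be available (as in GNU coreutils), and the prefix rules are taken only from the notes in step 4. The first helper decodes the token but does not verify its signature. The decoded payload is JSON, so a single backslash in a claim value appears there as \\.

```shell
# Hypothetical helper: print the JSON payload of an ID token.
# This only decodes the token; it does not verify the signature.
jwt_payload() (
  # The payload is the second dot-separated segment, base64url-encoded.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits the trailing '=' padding, so restore it.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
)

# Hypothetical helper: build the user name that an RBAC subject must match.
# Arguments: usernameprefix, username claim name, claim value, issuer URL.
rbac_user_name() (
  prefix=$1; claim=$2; value=$3; issuer=$4
  if [ "$prefix" = "-" ]; then
    prefix=""                 # "-" disables the username prefix
  elif [ -z "$prefix" ] && [ "$claim" != "email" ]; then
    prefix="${issuer}#"       # blank prefix defaults to issuerurl#
  fi
  printf '%s%s\n' "$prefix" "$value"
)

# Example usage, with values from the configuration shown in step 4:
#   jwt_payload "ID_TOKEN"
#   rbac_user_name "-" unique_name 'EXAMPLE\cluster-developer' ISSUER_URL
#   prints EXAMPLE\cluster-developer
```

For groups, the sketch is simpler: the expected subject name is the groupprefix followed by the group value, for example oidc:EXAMPLE\developers in the configuration above.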