Troubleshoot workload to workload authentication


This document explains how to troubleshoot common errors that you might encounter when you authenticate workloads to other workloads over mTLS.

Before you begin

  • If you haven't already, set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine as follows:
    1. Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init
    2. Set a default region and zone.

The generated credentials directory doesn't exist

If you get an error that the /var/run/secrets/workload-spiffe-credentials directory doesn't exist, do the following:

  1. Ensure that your VM supports workload to workload authentication by running the following command from inside the VM:

    curl "http://metadata.google.internal/computeMetadata/v1/instance/gce-workload-certificates/config-status" -H "Metadata-Flavor: Google"
    
    1. If the response is an HTTP 404 error code with the following error message, then this VM doesn't support this feature.

      The requested URL /computeMetadata/v1/instance/gce-workload-certificates/config-status
      was not found on this server.  That's all we know.
      

      To resolve this issue, create a new VM that supports workload to workload authentication.

    2. If the response is an HTTP 404 error code with the error message workload certificate feature not enabled, then the VM supports managed workload identities, but the feature isn't enabled. To enable the feature on the VM, see Enable managed workload identities on existing VMs.

  2. Ensure the VM is running a guest OS with Compute Engine guest agent version 20231103.01 or newer. Use the gcloud CLI to view the serial-port output to determine the current Compute Engine guest agent version:

    gcloud compute instances get-serial-port-output VM_NAME | grep "GCE Agent Started"
    

    Replace VM_NAME with the name of the VM.

    To update the Compute Engine guest agent, see Updating the guest environment.

  3. Check the service logs to verify that the gce-workload-cert-refresh.timer was able to successfully fetch the workload credentials and the trust bundle.

    # View timer logs to see when the gce-workload-cert-refresh.timer last ran
    journalctl -u gce-workload-cert-refresh.timer
    
    # View service logs from gce-workload-cert-refresh.service
    journalctl -u gce-workload-cert-refresh.service
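
The two 404 cases in step 1 can be told apart in a script. The following is a minimal sketch, not an official tool: the function names classify_config_status and check_vm_support and the labels they print are hypothetical, and the 5-second timeout is an arbitrary choice. It assumes only the endpoint and the error messages quoted in step 1.

```shell
#!/bin/sh
# Hypothetical helper: map the HTTP status code and response body from the
# config-status endpoint to one of the cases described in step 1.
classify_config_status() {
  local http_code="$1" body="$2"
  if [ "$http_code" = "404" ]; then
    case "$body" in
      *"workload certificate feature not enabled"*)
        echo "feature-not-enabled" ;;   # VM supports the feature, but it is off
      *)
        echo "not-supported" ;;         # endpoint is unknown to this VM
    esac
  else
    echo "other:$http_code"             # any non-404 response; inspect it manually
  fi
}

# Hypothetical wrapper: query the endpoint from inside the VM.
# curl's -w option appends the status code as the last line of the output.
check_vm_support() {
  local url response http_code body
  url="http://metadata.google.internal/computeMetadata/v1/instance/gce-workload-certificates/config-status"
  response="$(curl -s --max-time 5 -w '\n%{http_code}' -H 'Metadata-Flavor: Google' "$url")"
  http_code="$(printf '%s\n' "$response" | tail -n 1)"
  body="$(printf '%s\n' "$response" | sed '$d')"
  classify_config_status "$http_code" "$body"
}
```

Run check_vm_support from inside the VM. A result of not-supported points to creating a new VM, and feature-not-enabled points to enabling managed workload identities on the existing VM.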
    

The generated credentials directory contains only the config_status file

The generated credentials directory, /var/run/secrets/workload-spiffe-credentials, might contain only the config_status for a variety of reasons. Use the following steps to troubleshoot this issue.

  1. Check the contents of the config_status file to ensure that the managed workload identities feature is enabled. If the feature is not enabled using the appropriate VM metadata, the file contains the error message workload certificate feature not enabled.

    To resolve this issue, create a new VM that supports workload to workload authentication.

  2. Check the contents of the config_status file to ensure that there are no errors due to missing attribute values or invalid configuration for the certificate issuance or the trust config. If such errors exist, update the configuration values by following the steps in Update certificate issuance and trust config.

  3. Ensure that the correct permissions were granted to the managed workload identities in the workload identity pool for accessing the subordinate CA pools. Use the following command:

    gcloud privateca pools get-iam-policy SUBORDINATE_CA_POOL_ID \
       --location=SUBORDINATE_CA_POOL_REGION
    

    Replace the following:

    • SUBORDINATE_CA_POOL_ID: the ID for the subordinate CA pool.
    • SUBORDINATE_CA_POOL_REGION: the region of the subordinate CA pool.

    The output of this command should contain the following:

    bindings:
    - members:
      - principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/*
      role: roles/privateca.poolReader
    - members:
      - principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/*
      role: roles/privateca.workloadCertificateRequester
    

    In the previous example:

    • PROJECT_NUMBER is the project number of your project.
    • POOL_ID is the ID of the workload identity pool.

    If you don't see output similar to the preceding example, grant the required permissions as described in Authorize managed workload identities to request certificates from the CA pool.

  4. If the config_status file contains no error messages, then check the value of iam.googleapis.com/workload-identity within the file. The value should match the following:

    spiffe://POOL_ID.global.PROJECT_NUMBER.workload.id.goog/ns/NAMESPACE_ID/sa/MANAGED_IDENTITY_ID
    

    In the previous example:

    • PROJECT_NUMBER is the project number for the project that contains the managed workload identity pool.
    • POOL_ID is the ID of the workload identity pool.
    • NAMESPACE_ID is the ID of the namespace in the workload identity pool.
    • MANAGED_IDENTITY_ID is the ID of the managed workload identity.

    If the value for iam.googleapis.com/workload-identity is incorrect, then you must create a new VM with the correct value because the managed identity value can only be updated during VM creation.

  5. If the config_status file contains no error messages, then ensure that the trust config contains a valid entry for the SPIFFE trust domain POOL_ID.global.PROJECT_NUMBER.workload.id.goog, which corresponds to the SPIFFE trust domain on the managed identity assigned to the VM. For more information, see Define the trust config.

  6. If the config_status file contains any error messages with the error code INTERNAL_ERROR, reach out to Cloud Customer Care or your Google Cloud contact with the error message.
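
The comparison in step 4 is easy to get wrong by eye, because the pool ID comes before the project number in the SPIFFE ID. The following minimal sketch builds the expected value from its parts and compares it with the value you observed. The function names and the sample values in the usage note are hypothetical. Only the ID format comes from step 4.

```shell
#!/bin/sh
# Hypothetical helper: build the expected SPIFFE ID.
# Arguments: POOL_ID PROJECT_NUMBER NAMESPACE_ID MANAGED_IDENTITY_ID
expected_spiffe_id() {
  printf 'spiffe://%s.global.%s.workload.id.goog/ns/%s/sa/%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: succeed (exit status 0) only if the observed value
# equals the expected SPIFFE ID built from the remaining four arguments.
spiffe_id_matches() {
  local observed="$1"
  shift
  [ "$observed" = "$(expected_spiffe_id "$@")" ]
}
```

For example, with made-up values, spiffe_id_matches "$OBSERVED" my-pool 123456789012 my-namespace my-identity succeeds only when OBSERVED holds exactly the value that the VM was created with. If it fails, create a new VM with the correct value, as described in step 4.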

Querying metadata server endpoints returns a 404 error

If you get a 404 response when querying the workload-identities or trust-anchors endpoint, then ensure that the VM supports the managed workload identities by running the following command from inside the VM:

    curl "http://metadata.google.internal/computeMetadata/v1/instance/gce-workload-certificates/config-status" -H "Metadata-Flavor: Google"

  • If the response is an HTTP 404 error code with the following error message:

      The requested URL /computeMetadata/v1/instance/gce-workload-certificates/config-status
      was not found on this server.  That's all we know.
    

    The VM doesn't support managed workload identities. To resolve the issue, create a new VM that supports managed workload identities.

  • If the response is an HTTP 404 error code with the error message workload certificate feature not enabled, then this VM supports managed workload identities, but the feature is not enabled. Create a new VM with the feature enabled, or create a new instance template and managed instance group.

  • Ensure that the correct permissions were granted to the workload identity pool for accessing the subordinate CA pools by running the following command:

    gcloud privateca pools get-iam-policy SUBORDINATE_CA_POOL_ID \
      --location=SUBORDINATE_CA_POOL_REGION
    

    Replace the following:

    • SUBORDINATE_CA_POOL_ID: the ID for the subordinate CA pool.
    • SUBORDINATE_CA_POOL_REGION: the region of the subordinate CA pool.

    The output of this command should contain the following, where PROJECT_NUMBER is the project number of your project and POOL_ID is the ID of the workload identity pool:

    bindings:
    - members:
      - principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/*
      role: roles/privateca.poolReader
    - members:
      - principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/*
      role: roles/privateca.workloadCertificateRequester
    

    If your output doesn't contain these values, grant the correct permissions, as described in Authorize managed workload identities to request certificates from the CA pool.

  • Ensure that the iam.googleapis.com/workload-identity value is correct and matches the following:

    spiffe://POOL_ID.global.PROJECT_NUMBER.workload.id.goog/ns/NAMESPACE_ID/sa/MANAGED_IDENTITY_ID
    

    If the value doesn't match, then you must create a new VM because the managed identity value can't be updated after creating the VM.

  • Ensure that the trust config contains a valid entry for the SPIFFE trust domain POOL_ID.global.PROJECT_NUMBER.workload.id.goog, which corresponds to the SPIFFE trust domain on the managed identity assigned to the VM.
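
The IAM policy check above can be partly automated. The following minimal sketch reads role names, one per line, and reports which of the two required roles are missing. The function name missing_required_roles is hypothetical. The sketch checks only the role names, so confirm separately that the roles are bound to the workload identity pool principal set.

```shell
#!/bin/sh
# Hypothetical helper: read role names (one per line) on stdin and print a
# "missing: ROLE" line for each required role that is absent.
# Returns 0 if both required roles are present, 1 otherwise.
missing_required_roles() {
  local roles required missing=""
  roles="$(cat)"
  for required in roles/privateca.poolReader roles/privateca.workloadCertificateRequester; do
    # -x matches whole lines only, -F treats the role name as a fixed string
    if ! printf '%s\n' "$roles" | grep -qxF "$required"; then
      missing="$missing $required"
      echo "missing: $required"
    fi
  done
  [ -z "$missing" ]
}
```

One way to produce the input is to pipe the output of the get-iam-policy command into the function, using the general gcloud CLI options --flatten, --filter, and --format. The filter expression shown here is an untested assumption, so adjust it as needed:

    gcloud privateca pools get-iam-policy SUBORDINATE_CA_POOL_ID \
      --location=SUBORDINATE_CA_POOL_REGION \
      --flatten="bindings[].members" \
      --filter='bindings.members:"workloadIdentityPools/POOL_ID"' \
      --format="value(bindings.role)" | missing_required_roles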