Troubleshoot deploying privileged Autopilot workloads


This page shows you how to resolve issues with privileged workloads that you deploy in Google Kubernetes Engine (GKE) Autopilot clusters.

Allowlist synchronization issues

When you deploy an AllowlistSynchronizer, GKE attempts to install and synchronize the allowlist files that you specify. If this synchronization fails, the status field of the AllowlistSynchronizer reports the error.

Get the status of the AllowlistSynchronizer object:

kubectl get allowlistsynchronizer ALLOWLIST_SYNCHRONIZER_NAME -o yaml

The output is similar to the following:

...
status:
  conditions:
  - type: Ready
    status: "False"
    reason: "SyncError"
    message: "some allowlists failed to sync: example-allowlist-1.yaml"
    lastTransitionTime: "2024-10-12T10:00:00Z"
    observedGeneration: 2
  managedAllowlistStatus:
    - filePath: "gs://path/to/allowlist1.yaml"
      generation: 1
      phase: Installed
      lastSuccessfulSync: "2024-10-10T10:00:00Z"
    - filePath: "gs://path/to/allowlist2.yaml"
      phase: Failed
      lastError: "Initial install failed: invalid contents"
      lastSuccessfulSync: "2024-10-08T10:00:00Z"

The conditions.message field and the managedAllowlistStatus.lastError field provide detailed information about the error. Use this information to resolve the issue.
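
When a synchronizer manages many allowlist files, it can help to filter the status down to the entries that failed. The following is a minimal sketch, assuming that you requested the same object with `-o json` instead of `-o yaml` and that the status uses the field names shown in the preceding output. The function name `failed_allowlists` and the inline sample are illustrative only.

```python
import json


def failed_allowlists(synchronizer: dict) -> list[tuple[str, str]]:
    """Return (filePath, lastError) for every allowlist whose phase is Failed."""
    entries = synchronizer.get("status", {}).get("managedAllowlistStatus", [])
    return [
        (entry.get("filePath", ""), entry.get("lastError", ""))
        for entry in entries
        if entry.get("phase") == "Failed"
    ]


# Sample trimmed from the example output above. In practice, load the JSON
# printed by:
#   kubectl get allowlistsynchronizer ALLOWLIST_SYNCHRONIZER_NAME -o json
sample = json.loads("""
{"status": {"managedAllowlistStatus": [
  {"filePath": "gs://path/to/allowlist1.yaml", "phase": "Installed"},
  {"filePath": "gs://path/to/allowlist2.yaml", "phase": "Failed",
   "lastError": "Initial install failed: invalid contents"}
]}}
""")

for path, error in failed_allowlists(sample):
    print(f"{path}: {error}")
```

Entries that are missing the `phase` field are ignored rather than treated as failures, so an empty result means only that no entry reports `Failed`.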

Privileged workload deployment issues

After successfully installing an allowlist, you deploy the corresponding privileged workload in your cluster. In some cases, GKE might reject the workload.

Try the following resolution options:

  • Ensure that the GKE version of your cluster meets the version requirement of the workload.
  • Ensure that the workload that you're deploying is the workload to which the allowlist file applies.
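
The version check in the first option can be scripted. The following sketch assumes that both versions use the MAJOR.MINOR.PATCH-gke.BUILD form and that you already have the cluster version as a string, for example from the `serverVersion.gitVersion` field of `kubectl version -o json`. The function names and the version strings are hypothetical; take the real requirement from the workload provider's documentation.

```python
import re

# Matches strings such as "1.32.1-gke.1729000", with an optional leading "v".
GKE_VERSION = re.compile(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)")


def parse_gke_version(version: str) -> tuple[int, ...]:
    """Split a GKE version string into a tuple of integers for comparison."""
    match = GKE_VERSION.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"not a GKE version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def meets_requirement(cluster_version: str, required_version: str) -> bool:
    """Return True if the cluster version is at least the required version."""
    return parse_gke_version(cluster_version) >= parse_gke_version(required_version)


# Both pairs of version strings below are made-up examples.
print(meets_requirement("v1.32.2-gke.1182000", "1.32.1-gke.1729000"))
print(meets_requirement("1.31.6-gke.1064000", "1.32.1-gke.1729000"))
```

Comparing integer tuples avoids the ordering mistakes that plain string comparison makes, such as ranking 1.9 above 1.10.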

To see why a privileged workload was rejected, request detailed information from GKE about allowlist violations:

  1. Get a list of the installed allowlists in the cluster:

    kubectl get workloadallowlist
    

    Find the name of the allowlist that should apply to the privileged workload.

  2. Open the YAML manifest of the privileged workload in a text editor. If you can't access the YAML manifest, for example because the workload is deployed by other tooling, open an issue with the workload provider and skip the remaining steps.

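  3. Add the following label to the metadata.labels section of the privileged workload's Pod specification. If a controller such as a Deployment or DaemonSet manages the Pod, add the label to the spec.template.metadata.labels section of the controller manifest instead: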

    labels:
      cloud.google.com/matching-allowlist: ALLOWLIST_NAME
    

    Replace ALLOWLIST_NAME with the name of the allowlist that you obtained in step 1. Use the name from the output of the kubectl get workloadallowlist command, not the path to the allowlist file.

  4. Save the manifest and apply the workload to the cluster:

    kubectl apply -f WORKLOAD_MANIFEST_FILE
    

    Replace WORKLOAD_MANIFEST_FILE with the path to the manifest file.

    If GKE rejects the workload, the output lists the fields in the workload that don't match the specified allowlist, as in the following example:

    Error from server (GKE Warden constraints violations): error when creating "STDIN": admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request:
    
    ===========================================================================
    Workload Mismatches Found for Allowlist (example-allowlist-1):
    ===========================================================================
    HostNetwork Mismatch: Workload=true, Allowlist=false
    HostPID Mismatch: Workload=true, Allowlist=false
    Volume[0]: data
             - data not found in allowlist. Verify volume with matching name exists in allowlist.
    Container[0]:
    - Envs Mismatch:
            - env[0]: 'ENV_VAR1' has no matching string or regex pattern in allowlist.
            - env[1]: 'ENV_VAR2' has no matching string or regex pattern in allowlist.
    - Image Mismatch: Workload=k8s.gcr.io/diff/image, Allowlist=k8s.gcr.io/pause2. Verify that image string or regex match.
    - SecurityContext:
            - Capabilities.Add Mismatch: the following added capabilities are not permitted by the allowlist: [SYS_ADMIN SYS_PTRACE]
    - VolumeMount[0]: data
            - data not found in allowlist. Verify volumeMount with matching name exists in allowlist.
    

    In this example, the following violations occur:

    • The workload specifies hostNetwork: true, but the allowlist doesn't specify hostNetwork: true.
    • The workload specifies hostPID: true, but the allowlist doesn't specify hostPID: true.
    • The workload specifies a volume named data, but the allowlist doesn't specify a volume named data.
    • The container specifies environment variables named ENV_VAR1 and ENV_VAR2, but the allowlist doesn't specify these environment variables.
    • The container specifies the image k8s.gcr.io/diff/image, but the allowlist specifies k8s.gcr.io/pause2.
    • The container adds the SYS_ADMIN and SYS_PTRACE capabilities, but the allowlist doesn't allow adding these capabilities.
    • The container specifies a volume mount named data, but the allowlist doesn't specify a volume mount named data.
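
If you edit manifests programmatically, step 3 can be sketched as follows. The label belongs on the Pod's metadata, so this sketch assumes that the manifest is already loaded as a dictionary and describes either a Pod or a controller that keeps its Pod template in `spec.template`, such as a Deployment or DaemonSet. Resources that nest the template deeper, such as CronJobs, are not handled. The function name and the sample manifest are hypothetical.

```python
import copy

LABEL_KEY = "cloud.google.com/matching-allowlist"


def add_matching_allowlist_label(manifest: dict, allowlist_name: str) -> dict:
    """Return a copy of the manifest with the label set on the Pod metadata."""
    labeled = copy.deepcopy(manifest)  # Leave the caller's manifest untouched.
    if labeled.get("kind") == "Pod":
        metadata = labeled.setdefault("metadata", {})
    else:
        # Assumption: the controller wraps the Pod in spec.template.
        template = labeled.setdefault("spec", {}).setdefault("template", {})
        metadata = template.setdefault("metadata", {})
    metadata.setdefault("labels", {})[LABEL_KEY] = allowlist_name
    return labeled


# Made-up DaemonSet manifest, reduced to the fields that matter here.
daemonset = {
    "kind": "DaemonSet",
    "metadata": {"name": "example-agent"},
    "spec": {"template": {"metadata": {"labels": {"app": "example-agent"}}}},
}

labeled = add_matching_allowlist_label(daemonset, "example-allowlist-1")
print(labeled["spec"]["template"]["metadata"]["labels"])
```

The function returns a copy so that the original manifest stays available for comparison after the diagnostic label is no longer needed.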

If a third party provides the workload that you're deploying, open an issue with that provider to resolve the violations. Include the output of the kubectl apply command from step 4 in the issue.
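
To keep an issue report short, you can trim the rejection message down to the mismatch report. The following sketch assumes that you saved the error output of the kubectl apply command as text and that it contains the "Workload Mismatches Found for Allowlist" header shown in the preceding example; real output might differ. The function name is hypothetical.

```python
def extract_mismatch_report(apply_output: str) -> str:
    """Return the mismatch lines that start at the Warden report header."""
    lines = [line.rstrip() for line in apply_output.splitlines()]
    for index, line in enumerate(lines):
        if "Workload Mismatches Found for Allowlist" in line:
            # Keep the header and details; drop blank and "=====" separator lines.
            kept = [
                text for text in lines[index:]
                if text.strip() and set(text.strip()) != {"="}
            ]
            return "\n".join(kept)
    return ""  # The output contains no mismatch report.


# Shortened copy of the example rejection message above.
sample_output = """\
Error from server (GKE Warden constraints violations): error when creating "STDIN": admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request:

===========================================================================
Workload Mismatches Found for Allowlist (example-allowlist-1):
===========================================================================
HostNetwork Mismatch: Workload=true, Allowlist=false
HostPID Mismatch: Workload=true, Allowlist=false
"""

print(extract_mismatch_report(sample_output))
```

Indentation in the report is preserved, so nested details such as per-container mismatches stay readable when you paste the result into an issue.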

Bugs and feature requests for privileged workloads and allowlists

Partners are responsible for creating, developing, and maintaining their privileged workloads and allowlists. If you encounter a bug or have a feature request for a privileged workload or allowlist, contact the corresponding partner.