This page explains how to enable permissive mode on a backup plan.
During backup execution, if Backup for GKE detects conditions that are likely to cause a restore to fail, the backup itself fails. The reason for the failure is provided in the backup's state_reason field. In the Google Cloud console, this field is labeled Status reason.
About permissive mode
When backup failures aren't acceptable and it's not possible to address the underlying issues, you can enable permissive mode. Permissive mode ensures that backups complete successfully, even if GKE resources that could potentially cause restore failures are detected during the backup process. Details about the issues are provided in the backup's Status reason field.
We recommend using this option only if you understand the issues and can implement workarounds during the restoration process. For a list of potential error messages in the backup's Status reason field with recommended actions, see Troubleshoot backup failures.
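The status reason can also be read from the command line. The following is a minimal sketch, not an official procedure: the helper name backup_state_reason is hypothetical, and the field names state and stateReason in the --format expression are assumptions based on the state_reason field described above.

```shell
# Hypothetical helper: print the state and state reason of one backup.
# Arguments: BACKUP BACKUP_PLAN LOCATION PROJECT_ID
# Assumption: the describe output exposes the fields as `state` and `stateReason`.
backup_state_reason() {
  gcloud beta container backup-restore backups describe "$1" \
    --project="$4" \
    --location="$3" \
    --backup-plan="$2" \
    --format="value(state,stateReason)"
}
```

For example, backup_state_reason BACKUP BACKUP_PLAN us-central1 PROJECT_ID prints the two fields for the named backup, if the field names match your gcloud version.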
Enable permissive mode
Use the following instructions to enable permissive mode:
gcloud
To enable permissive mode, run the gcloud beta container backup-restore backup-plans update command:
gcloud beta container backup-restore backup-plans update BACKUP_PLAN \
    --project=PROJECT_ID \
    --location=LOCATION \
    --permissive-mode
Replace the following:
BACKUP_PLAN: the name of the backup plan that you want to update.
PROJECT_ID: the ID of your Google Cloud project.
LOCATION: the compute region for the resource, for example us-central1. See About resource locations.
For a full list of options, refer to the gcloud beta container backup-restore backup-plans update documentation.
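After updating the plan, you might want to confirm that the setting took effect. The following sketch assumes that the describe output exposes the setting as backupConfig.permissiveMode; the helper name plan_permissive_mode is hypothetical.

```shell
# Hypothetical helper: print the permissive mode setting of a backup plan.
# Arguments: BACKUP_PLAN LOCATION PROJECT_ID
# Assumption: the field path is `backupConfig.permissiveMode`.
plan_permissive_mode() {
  gcloud beta container backup-restore backup-plans describe "$1" \
    --project="$3" \
    --location="$2" \
    --format="value(backupConfig.permissiveMode)"
}
```

An empty result may mean that the field is unset or that the field path differs in your gcloud version, so treat it as inconclusive rather than as proof that permissive mode is off.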
Console
Use the following instructions to enable permissive mode in the Google Cloud console:
In the Google Cloud console, go to the Google Kubernetes Engine page.
In the navigation menu, click Backup for GKE.
Click the Backup plans tab.
Expand the cluster and click the plan name.
Click the Details tab to edit the plan details.
Click Edit to edit the section with Backup mode.
Click the Permissive mode checkbox and click Save changes.
Terraform
Update the existing google_gke_backup_backup_plan resource.
resource "google_gke_backup_backup_plan" "NAME" {
  ...
  backup_config {
    permissive_mode = true
    ...
  }
}
Replace the following:
NAME: the name of the google_gke_backup_backup_plan resource that you want to update.
For more information, see gke_backup_backup_plan.
Troubleshoot backup failures
The following table provides explanations and recommended actions for various backup failure messages displayed in the backup's Status reason field.
Backup failure message | Message description and failure reason | Recommended action |
---|---|---|
| Description: A Custom Resource Definition (CRD) in the cluster was originally applied as apiextensions.k8s.io/v1beta1 and lacks a structural schema required in apiextensions.k8s.io/v1. Reason: Backup for GKE cannot automatically define the structural schema. Restoring the CRD in Kubernetes v1.22+ clusters, where apiextensions.k8s.io/v1beta1 is not available, causes the restore to fail. This failure happens when restoring custom resources defined by the CRD. | We recommend that you use the following options: When permissive mode is enabled, the CRD without a structural schema won't be backed up in a Kubernetes v1.22+ cluster. To successfully restore such a backup, you need to exclude the resources served by the CRD from restore or create the CRD in the target cluster before starting the restore. |
| Description: An API service in the cluster is misconfigured. This causes requests to the API path to return "Failed to query API resources." The underlying service may not exist or may not be ready yet. Reason: Backup for GKE is unable to back up any resources served by the unavailable API. | Check the underlying service in the API service's spec.service to make sure it is ready. When permissive mode is enabled, resources from the API groups that failed to load won't be backed up. |
| Description: In Kubernetes v1.23 and earlier, service accounts automatically generate a token backed by a secret. However, in later versions, Kubernetes removed this auto-generated token feature. A Pod in the cluster might have mounted the secret volume to its containers' file system. Reason: If Backup for GKE attempts to restore a service account along with its auto-generated secret and a Pod that mounts the secret volume, the restore appears to be successful. However, Kubernetes removes the secret, which causes the Pod to get stuck in container creation and fail to start. | Define the spec.serviceAccountName field in the Pod. This action ensures that the token is automatically mounted on /var/run/secrets/kubernetes.io/serviceaccount in the containers. For more information, refer to the Configure Service Accounts for Pods documentation. When permissive mode is enabled, the secret is backed up but can't be mounted in Pods in Kubernetes v1.24+ clusters. |
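For the misconfigured API service case in the table above, one way to find the affected API services is to list those that the cluster reports as unavailable. This is a sketch only: the helper name unavailable_apiservices is hypothetical, and it assumes the default kubectl column layout (NAME, SERVICE, AVAILABLE, AGE).

```shell
# Hypothetical helper: list API services whose AVAILABLE column is not "True".
# Assumption: `kubectl get apiservices` prints NAME SERVICE AVAILABLE AGE,
# where AVAILABLE is "True" or "False (<reason>)".
unavailable_apiservices() {
  kubectl get apiservices --no-headers | awk '$3 != "True" { print $1 }'
}
```

Running kubectl get apiservice on a name printed by the helper, with -o yaml, shows its spec.service, which identifies the underlying service to check.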
Common Custom Resource Definitions (CRDs) with issues and recommended actions
Here are some common CRDs that have backup issues and the actions we recommend to address the issues:
capacityrequests.internal.autoscaling.k8s.io: This CRD was used temporarily in v1.21 clusters. Run kubectl delete crd capacityrequests.internal.autoscaling.k8s.io to remove the CRD.
scalingpolicies.scalingpolicy.kope.io: This CRD was used to control fluentd resources, but GKE has migrated to using fluentbit. Run kubectl delete crd scalingpolicies.scalingpolicy.kope.io to remove the CRD.
memberships.hub.gke.io: Run kubectl delete crd memberships.hub.gke.io to remove the CRD if there are no membership resources. Enable permissive mode if there are membership resources.
applications.app.k8s.io: Enable permissive mode with an understanding of restore behavior.
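Before deleting a CRD such as memberships.hub.gke.io, it can help to count the resources that it still serves. The following is a minimal sketch under stated assumptions: the helper name crd_resource_count is hypothetical, and it relies on kubectl accepting the full CRD name as a resource type.

```shell
# Hypothetical helper: count the resources served by a CRD.
# Argument: the full CRD name, for example memberships.hub.gke.io.
# Returns non-zero if the query itself fails, so that a failed lookup
# is not mistaken for "no resources".
crd_resource_count() {
  out=$(kubectl get "$1" --all-namespaces --no-headers 2>/dev/null) || return 1
  if [ -z "$out" ]; then
    echo 0
  else
    printf '%s\n' "$out" | wc -l | tr -d ' '
  fi
}

# Example use (commented out so that nothing is deleted by accident):
# [ "$(crd_resource_count memberships.hub.gke.io)" = "0" ] &&
#   kubectl delete crd memberships.hub.gke.io
```

If the count is greater than zero, the guidance above applies: keep the CRD and enable permissive mode instead.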