Config Sync known issues

This page lists known issues for supported versions of Config Sync.

Many of the issues listed here have been fixed. The Fixed version column indicates the version in which the fix was introduced. To receive this fix, upgrade to the listed version or later.

Category | Identified version | Fixed version | Issue and workaround
Category: Component health | Identified version: 1.15.0 | Fixed version: 1.17.0

Fixed: Reconciler container OOMKilled on Autopilot

On Autopilot clusters, Config Sync component containers have resource limits set for CPU and memory. Under load, these containers can be killed by the kubelet or kernel for using too much memory.

Workaround:

If you can't upgrade to version 1.17.0 or later, specify a higher memory limit using resource overrides.

In version 1.17.0, the default CPU and memory limits were adjusted to help avoid out of memory errors for most use cases.
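As a rough illustration of the workaround, a resource override on a RootSync might look like the following sketch. The object name root-sync and the 1Gi value are assumptions chosen for illustration, not recommendations; other spec fields are omitted, and the limit should be based on the memory usage you observe.

```yaml
# Sketch only: raises the memory limit of the reconciler container.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync                      # assumed object name
  namespace: config-management-system
spec:
  override:
    resources:
    - containerName: reconciler
      memoryLimit: 1Gi                 # illustrative value, tune for your workload
```

The same override block can be placed on a RepoSync object in its own namespace.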

Category: Component health | Identified version: 1.15.0

Reconciler unschedulable

Config Sync reconcilers require varying amounts of resources, depending on the configuration of the RootSync or RepoSync. Certain configurations require more resources than others.

If a reconciler is unschedulable, it might be due to requesting more resources than are available on your nodes.

If you're using GKE Standard clusters, the reconciler resource requests are set very low. These low values allow scheduling on small clusters and small nodes, at the cost of possible throttling and slow performance. However, on GKE Autopilot clusters, the reconciler requests are set higher, to more realistically represent usage while syncing.

Workaround:

GKE Autopilot or GKE Standard with node auto-provisioning enabled should be able to see how many resources are requested and create appropriately sized nodes to allow scheduling. However, if you're manually configuring the nodes or node instance sizes, you might need to adjust those settings to accommodate the reconciler Pod resource requirements.

Category: KNV errors | Identified version: 1.15.0 | Fixed version: Kubernetes version 1.27

Fixed: KNV1067 error even though config applied successfully

Due to an issue with OpenAPI v2, you might see a KNV1067 error even if your config was applied successfully.

Workaround:

If your cluster is running a Kubernetes version earlier than 1.27, ensure the protocol field is explicitly set under spec: containers: ports:, even if you're using the default value, TCP.
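For example, a container port with the protocol set explicitly might be declared as in this minimal sketch. The Pod name, container name, image, and port number are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # hypothetical name
spec:
  containers:
  - name: web              # hypothetical container
    image: nginx
    ports:
    - containerPort: 80
      protocol: TCP        # set explicitly, even though TCP is the default
```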

Category: KNV errors | Identified version: 1.15.0 | Fixed version: 1.16.0, Kubernetes version 1.28

Fixed: Config Sync fails to reconcile with KNV2002 error

If Config Sync is unable to reconcile and reports a KNV2002 error, the cause might be a known client-go issue. The issue results in an empty list of resources in the external.metrics.k8s.io/v1beta1 API group, and the following error message appears on the RootSync or RepoSync object or in the reconciler logs:

KNV2002: API discovery failed: APIServer error: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: received empty response for: external.metrics.k8s.io/v1beta1

Category: Metrics | Identified version: 1.15.0 | Fixed version: 1.17.2

Fixed: Exporting failed: Unrecognized metric labels

In version 1.15.0, Config Sync added type and commit labels to many metrics. These labels increased metric cardinality, which increased the number of metrics being exported. Attribute processing was also added to filter these labels when exporting to Cloud Monarch, but this filtering was misconfigured, causing transformation errors in the otel-collector logs.

Category: Metrics | Identified version: 1.15.0 | Fixed version: 1.16.1

Fixed: High metrics cardinality and transformation errors

In version 1.15.0, Config Sync added type and commit labels to many metrics. These labels increased metric cardinality, which increased the number of metrics being exported. Attribute processing was also added to filter these labels when exporting to Cloud Monarch, but this filtering was misconfigured, causing transformation errors in the otel-collector logs.

In version 1.16.1, the type field was removed, the filtering was fixed, and the commit field was additionally filtered from Cloud Monitoring. This fixed the errors and reduced the cardinality of the metrics.

Category: Metrics | Identified version: 1.15.0

Exporting failed. Permission denied

By default, when the reconciler-manager detects Application Default Credentials, the otel-collector is configured to export metrics to Prometheus, Cloud Monitoring, and Monarch.

Workaround:

The otel-collector logs errors unless you have either configured Cloud Monitoring or disabled Cloud Monitoring and Cloud Monarch. To stop the errors, do one of the two.

Category: Metrics | Identified version: 1.15.0

otel-collector crashing with custom config

If you modify or delete one of the default ConfigMaps, otel-collector or otel-collector-google-cloud, the otel-collector might log errors or crash because it can't load the required ConfigMap.

Workaround:

To customize the metrics export configuration, create a ConfigMap named otel-collector-custom in the config-management-monitoring namespace.
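A custom ConfigMap might be shaped like the following sketch. Only the ConfigMap name and namespace come from the text above; the data key otel-collector-config.yaml, the labels, and the collector pipeline shown here are assumptions that you should check against the default otel-collector ConfigMap installed in your cluster before applying.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-custom
  namespace: config-management-monitoring
  labels:                        # assumed labels, copy them from the default ConfigMap
    app: opentelemetry
    component: otel-collector
data:
  # Assumed key name; the body is a standard OpenTelemetry Collector config.
  otel-collector-config.yaml: |
    receivers:
      opencensus:
    exporters:
      prometheus:
        endpoint: :8675          # assumed port
        namespace: config_sync
    service:
      pipelines:
        metrics:
          receivers: [opencensus]
          exporters: [prometheus]
```

Leaving the default ConfigMaps untouched and placing all changes in otel-collector-custom avoids the crash described above.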

Category: nomos cli | Identified version: 1.15.0 | Fixed version: 1.17.2

Fixed: nomos status and nomos bugreport don't work in a Pod

Before nomos version 1.17.2, nomos bugreport and nomos status could only connect to the local cluster when run inside a Kubernetes Pod. In nomos version 1.17.2, the authorization method was changed to work more like kubectl. Because of this change, the local cluster is targeted by default. You can override the config by specifying the KUBECONFIG environment variable.

Category: Remediation

Config Sync fighting with itself

Config Sync might appear to be in a controller fight with itself. This issue occurs if you set the default value for an optional field of a resource in the Git repository. For example, setting apiGroup: "" for the subject of a RoleBinding triggers this because the apiGroup field is optional and an empty string is the default value. The default values of string, boolean, and integer fields are "", false, and 0, respectively.

Workaround:

Remove the field from the resource declaration.
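To illustrate the RoleBinding case, the following sketch shows a declaration with the defaulted field left out. All object names are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-binding          # hypothetical name
  namespace: example-namespace   # hypothetical namespace
subjects:
- kind: ServiceAccount
  name: example-sa               # hypothetical ServiceAccount
  namespace: example-namespace
  # No 'apiGroup: ""' line here: an empty string is already the default
  # for this subject, and declaring it triggers the self-fight.
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: example-role             # hypothetical Role
```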

Category: Remediation

Config Sync fighting with Config Connector resources

Config Sync might appear to be fighting Config Connector over a resource, for example a StorageBucket. This issue occurs if you don't set the value of an optional field of the resource, such as spec.lifecycleRule.condition.withState, in the source of truth.

Workaround:

You can avoid this issue by adding the withState=ANY field on the resource declaration. Alternatively, you can abandon and then reacquire the resource with the cnrm.cloud.google.com/state-into-spec: absent annotation.
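As an illustration of the first option, a StorageBucket declaration with the field set explicitly might look like this sketch. The bucket name, the Delete action, and the age value are hypothetical; only the withState: ANY setting reflects the workaround.

```yaml
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  name: example-bucket     # hypothetical name
spec:
  lifecycleRule:
  - action:
      type: Delete         # hypothetical action
    condition:
      age: 7               # hypothetical age in days
      withState: ANY       # declared explicitly so both controllers agree
```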

Category: Source of truth | Identified version: 1.17.3 | Fixed version: 1.18.3

Fixed: Git SSH Authentication Failure with GitHub

git-sync v4.2.1 has a bug that removes the username from the repository URL when using SSH, causing authentication to fail when connecting to GitHub, which requires the user to be git.

The error message from git is: git-sync@github.com: Permission denied (publickey).\r\nfatal: Could not read from remote repository.

Workaround:

Use a different authentication method.

Category: Source of truth | Identified version: 1.16.1 | Fixed version: 1.16.2

Fixed: Periodically unable to evaluate the source link

When the reconciler starts, Config Sync can be periodically unable to evaluate the source link. This issue happens because git-sync has not yet cloned the source repository.

In versions 1.16.2 and later, this is a transient error, so it is logged but not reported as an error.

Category: Source of truth | Identified version: 1.15.0 | Fixed version: 1.18.0

Fixed: Periodically invalid authentication credentials for Cloud Source Repositories

Config Sync can periodically report errors when the authentication token for Cloud Source Repositories expires. This issue occurs because the token is refreshed only after it has expired.

In version 1.18.0 and later, the token is refreshed on the first request within five minutes of token expiration. This prevents the invalid authentication credentials error unless the credentials are actually invalid.

Category: Source of truth | Identified version: 1.15.0 | Fixed version: 1.17.0

Fixed: Error syncing repository: context deadline exceeded

In versions earlier than 1.17.0, Config Sync checked out the full Git repository history by default. This could lead to the fetch request timing out on large repositories with many commits.

In version 1.17.0 and later, the Git fetch is performed with --depth=1, which only fetches the latest commit. This speeds up source fetching, avoids most timeouts, and reduces the Git server load.

If you're still experiencing this issue after upgrading, it's likely that your source of truth has many files, your Git server is responding slowly, or there is some other networking problem.

Category: Syncing | Identified version: 1.15.0

High number of ineffective PATCH requests in the audit logs

The Config Sync remediator uses dry-run requests to detect drift. This can cause PATCH requests to show up in the audit log, even when the PATCH isn't persisted, because the audit log doesn't distinguish between dry-run and normal requests.

Workaround:

Because the audit log cannot distinguish between dry-run and non-dry-run requests, you can ignore the PATCH requests.

Category: Syncing | Identified version: 1.17.0 | Fixed version: 1.17.3

Fixed: Config Sync fails to pull the latest commit from a branch

In Config Sync versions 1.17.0, 1.17.1, and 1.17.2, you might encounter a problem where Config Sync fails to pull the latest commit from the HEAD of a specific branch when the same branch is referenced in multiple remotes and they are out of sync. For example, the main branch of a remote repository origin might be ahead of the same branch in the remote repository upstream, but Config Sync only fetches the commit SHA from the last line, which might not be the latest commit.

The following example shows what this issue might look like:

git ls-remote -q [GIT_REPOSITORY_URL] main  main^{}
244999b795d4a7890f237ef3c8035d68ad56515d    refs/heads/main               # the latest commit
be2c0aec052e300028d9c6d919787624290505b6    refs/remotes/upstream/main    # the commit Config Sync pulls from

In version 1.17.3 and later, the git-sync dependency was updated with a different fetch mechanism.

If you can't upgrade, you can set your Git revision (spec.git.revision) to the latest commit SHA regardless of the value set for the Git branch (spec.git.branch). For more information about the Git configs, see Configuration for the Git repository.
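A RootSync pinned to a commit in this way might look like the following sketch. The object name, repository URL, and auth setting are placeholders; the SHA is the "latest commit" value from the example output above.

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync                      # assumed object name
  namespace: config-management-system
spec:
  git:
    repo: https://example.com/your-org/your-repo   # placeholder URL
    branch: main
    # Pin to the commit you want synced, regardless of the branch value.
    revision: 244999b795d4a7890f237ef3c8035d68ad56515d
    auth: none                         # placeholder, use your real auth type
```

With this approach the revision must be updated by hand for each new commit, so it is best treated as a stopgap until you can upgrade.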

Category: Private registry | Identified version: 1.19.0

Config Sync doesn't use private registry for reconciler Deployments

Config Sync should replace the images for all Deployments when a private registry is configured. However, Config Sync does not replace the image registry for images in the reconciler Deployments.

Workaround:

Configure the image registry mirror in containerd.

Category: Syncing | Identified version: 1.17.0 | Fixed version: 1.18.3

Fixed: Config Sync reconciler is crashlooping

In Config Sync versions 1.17.0 or later, you might encounter a problem where the reconciler fails to create a rest config in some Kubernetes providers.

The following example shows what this issue might look like in the reconciler logs:

Error creating rest config: failed to build rest config: reading local kubeconfig: loading REST config from "/.kube/config": stat /.kube/config: no such file or directory

Category: Terraform | Identified version: Terraform version 5.41.0

Config Sync can't be installed or upgraded using Terraform

Terraform version 5.41.0 introduced a new field to the google_gke_hub_feature_membership resource: config_sync.enabled. Because the default value of this field is false, Config Sync installations fail after Terraform is upgraded to version 5.41.0.

Workaround:

  • If you use the google_gke_hub_feature_membership resource, manually set config_sync.enabled to true.
  • If you use the acm submodule, it's recommended to switch to an alternative way to install Config Sync. If you're unable to switch, upgrade to v33.0.0.

What's next