Config Sync known issues

This page lists known issues for supported versions of Config Sync. Each entry lists the problem category, the version in which the issue was identified, the version in which it was fixed (if applicable), and a description of the issue with its workaround.

Category: Component health | Identified in: 1.15.0 | Fixed in: 1.17.0

Reconciler container OOMKilled on Autopilot

On Autopilot clusters, Config Sync component containers have resource limits set for CPU and memory. Under load, these containers can be killed by the kubelet or kernel for using too much memory.

Workaround:

Upgrade to version 1.17.0 or later. In Config Sync version 1.17.0, the default CPU and memory limits were adjusted to help avoid out of memory errors for most use cases.

If you can't upgrade, specify a higher memory limit using resource overrides.
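For example, the following RootSync sketch raises the reconciler container's memory limit through the override API. The RootSync name and the 1Gi value are illustrative, so size them for your workload:

apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  override:
    resources:
    # Raise the memory limit for the reconciler container.
    # The 1Gi value is an example; adjust it to your workload.
    - containerName: reconciler
      memoryLimit: 1Gi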

Category: Component health | Identified in: 1.15.0

Reconciler unschedulable

Config Sync reconcilers require varying amounts of resources, depending on the configuration of the RootSync or RepoSync. Certain configurations require more resources than others.

If a reconciler is unschedulable, it might be due to requesting more resources than are available on your nodes.

If you're using GKE Standard clusters, the reconciler resource requests are set very low. These values were chosen so that Config Sync can be scheduled on small clusters and small nodes, at the cost of possible throttling and slow performance. On GKE Autopilot clusters, however, the reconciler requests are set higher to more realistically represent usage while syncing.

Workaround:

GKE Autopilot clusters, and GKE Standard clusters with node auto-provisioning enabled, can see how many resources are requested and create appropriately sized nodes to allow scheduling. However, if you manually configure the nodes or node instance sizes, you might need to adjust those settings to accommodate the reconciler Pod resource requirements.

Category: KNV errors | Identified in: 1.15.0 | Fixed in: Kubernetes version 1.27

KNV1067 error even though config applied successfully

Due to an issue with OpenAPI v2, you might see a KNV1067 error even if your config was applied successfully.

Workaround:

If your cluster is running a Kubernetes version earlier than 1.27, ensure the protocol field is explicitly set under spec: containers: ports: even if you are using the default TCP.
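For example, a Deployment in your source of truth would declare the protocol explicitly, as in the following sketch. The resource names and image are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: app
        image: example-image
        ports:
        - containerPort: 8080
          # Declare the protocol explicitly, even though TCP is the default,
          # to avoid the spurious KNV1067 error.
          protocol: TCP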

Category: KNV errors | Identified in: 1.15.0 | Fixed in: 1.16.0

Config Sync fails to reconcile with KNV2002 error

If Config Sync fails to reconcile with a KNV2002 error, it might be caused by a known client-go issue. The issue results in an empty list of resources in the external.metrics.k8s.io/v1beta1 API group, with the following error message in the RootSync or RepoSync object or in the reconciler logs:

KNV2002: API discovery failed: APIServer error: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: received empty response for:
external.metrics.k8s.io/v1beta1

Workaround:

To resolve the issue, upgrade your GKE cluster to version 1.28 or later, or upgrade Config Sync to version 1.16.0 or later. Both of these versions contain fixes for the client-go issue.

Category: Metrics | Identified in: 1.15.0 | Fixed in: 1.17.2

Exporting failed: Unrecognized metric labels

In version 1.15.0, Config Sync added type and commit labels to many metrics. These labels increased metric cardinality, which increased the number of metrics being exported. Attribute processing was also added to filter these labels when exporting to Cloud Monarch, but this filtering was misconfigured, causing transformation errors in the otel-collector logs.

Workaround:

Upgrade to version 1.17.2 or later.

Category: Metrics | Identified in: 1.15.0 | Fixed in: 1.16.1

High metrics cardinality and transformation errors

In version 1.15.0, Config Sync added type and commit labels to many metrics. These labels increased metric cardinality, which increased the number of metrics being exported. Attribute processing was also added to filter these labels when exporting to Cloud Monarch, but this filtering was misconfigured, causing transformation errors in the otel-collector logs.

Workaround:

Upgrade to version 1.16.1 or later. In version 1.16.1, the type field was removed, the filtering was fixed, and the commit field was additionally filtered from Cloud Monitoring. This fixed the errors and reduced the cardinality of the metrics.

Category: Metrics | Identified in: 1.15.0

Exporting failed: Permission denied

By default, when the reconciler-manager detects Application Default Credentials, the otel-collector is configured to export metrics to Prometheus, Cloud Monitoring, and Monarch.

Workaround:

The otel-collector logs permission errors if you haven't either configured Cloud Monitoring or disabled exporting to Cloud Monitoring and Cloud Monarch. To stop the errors, configure Cloud Monitoring, or disable the Cloud Monitoring and Cloud Monarch exports.

Category: Metrics | Identified in: 1.15.0

otel-collector crashing with custom config

If you modify or delete one of the default ConfigMaps, otel-collector or otel-collector-google-cloud, the otel-collector might fail or crash because it can't load the required ConfigMap.

Workaround:

To customize the metrics export configuration, create a ConfigMap named otel-collector-custom in the config-management-monitoring namespace.
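The following sketch shows the general shape of such a ConfigMap. The data key name and the collector pipeline are assumptions here, so base your custom configuration on the default otel-collector ConfigMap for your version rather than copying this verbatim:

apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-custom
  namespace: config-management-monitoring
data:
  # The key name and the pipeline below are illustrative assumptions;
  # start from the contents of the default ConfigMap for your version.
  otel-collector-config.yaml: |
    receivers:
      opencensus: {}
    exporters:
      prometheus:
        endpoint: :8675
    service:
      pipelines:
        metrics:
          receivers: [opencensus]
          exporters: [prometheus]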

Category: Metrics | Identified in: 1.14.0

Missing metrics totals

In Config Sync version 1.14.0, the following metrics were removed: resource_count_total, ready_resource_count_total, and kcc_resource_count_total.

Workaround:

To track total values, use the Sum aggregation type in Cloud Monitoring.

Category: Metrics | Identified in: 1.14.1

Missing Pod metrics

In Config Sync version 1.14.1, most Config Sync metrics were changed to use the k8s_container resource type instead of the k8s_pod type. This change makes it possible to identify which container a metric came from, which is especially useful for reconciler Pods because they have many containers. However, dashboards and alerts that tracked these metrics under the old type might have stopped working.

Workaround:

Update your dashboards and alerts to track the metrics under the k8s_container resource type.

Category: nomos CLI | Identified in: 1.15.0 | Fixed in: 1.17.2

nomos status and nomos bugreport don't work in a Pod

Before nomos version 1.17.2, nomos bugreport and nomos status could not connect to the local cluster when run inside a Kubernetes Pod. In nomos version 1.17.2, the authorization method was changed to work more like kubectl. Because of this change, the local cluster is targeted by default, and you can override the configuration by setting the KUBECONFIG environment variable.

Workaround:

Upgrade to nomos version 1.17.2 or later.
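If you need nomos running in a Pod to target a different cluster, the following sketch mounts a kubeconfig from a Secret and points the KUBECONFIG environment variable at it. The image, Secret name, and file paths are hypothetical placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: nomos-client
spec:
  containers:
  - name: nomos
    # Hypothetical placeholder; use the nomos image you normally run.
    image: NOMOS_IMAGE
    command: ["sleep", "infinity"]
    env:
    # Point nomos at the mounted kubeconfig instead of the local cluster.
    - name: KUBECONFIG
      value: /etc/kubeconfig/config
    volumeMounts:
    - name: kubeconfig
      mountPath: /etc/kubeconfig
      readOnly: true
  volumes:
  - name: kubeconfig
    secret:
      # Hypothetical Secret containing a kubeconfig file named "config".
      secretName: nomos-kubeconfig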

Category: Remediation

Config Sync fighting with itself

Config Sync might appear to be in a controller fight with itself. This issue occurs if you set the default value for an optional field of a resource in the Git repository. For example, setting apiGroup: "" for the subject of a RoleBinding triggers this behavior because the apiGroup field is optional and an empty string is its default value. The default values for string, boolean, and integer fields are "", false, and 0, respectively.

Workaround:

Remove the field from the resource declaration.
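For example, a RoleBinding declared like the following sketch triggers the fight; deleting the apiGroup line from the subject resolves it. The names are illustrative:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-rolebinding
  namespace: example-namespace
subjects:
- kind: ServiceAccount
  name: example-sa
  namespace: example-namespace
  # "" is the default apiGroup for ServiceAccount subjects;
  # declaring it explicitly triggers the fight. Remove this line.
  apiGroup: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: example-role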

Category: Remediation

Config Sync fighting with Config Connector resources

Config Sync might appear to be fighting with Config Connector over a resource, for example a StorageBucket. This issue occurs if you don't set the value of an optional field of a resource, such as spec.lifecycleRule.condition.withState on a StorageBucket, in the source of truth.

Workaround:

You can avoid this issue by setting withState: ANY in the resource declaration. Alternatively, you can abandon and then reacquire the resource with the cnrm.cloud.google.com/state-into-spec: absent annotation.
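For example, a StorageBucket declaration that sets the field explicitly might look like the following sketch. The bucket name and lifecycle rule are illustrative and assume the storage.cnrm.cloud.google.com/v1beta1 API:

apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  name: example-bucket
spec:
  lifecycleRule:
  - action:
      type: Delete
    condition:
      age: 30
      # Set withState explicitly so Config Sync and Config Connector
      # agree on the desired state and stop fighting.
      withState: ANY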

Category: Source of truth | Identified in: 1.16.1 | Fixed in: 1.16.2

Periodically unable to evaluate the source link

When the reconciler starts, Config Sync might periodically be unable to evaluate the source link because git-sync has not yet cloned the source repository.

Workaround:

Upgrade Config Sync to version 1.16.2 or later. In these versions, the condition is treated as transient, so it is logged but not reported as a sync error.

Category: Source of truth | Identified in: 1.15.0 | Fixed in: 1.17.0

Error syncing repository: context deadline exceeded

In versions earlier than 1.17.0, Config Sync checked out the full Git repository history by default. This could lead to the fetch request timing out on large repositories with many commits.

Workaround:

Upgrade to version 1.17.0 or later. In version 1.17.0 and later, the Git fetch is performed with --depth=1, which only fetches the latest commit. This speeds up source fetching, avoids most timeouts, and reduces the Git server load.

If you're still experiencing this issue after upgrading, it's likely that your source of truth has many files, your Git server is responding slowly, or there's some other networking problem.

Category: Syncing | Identified in: 1.15.0

High number of ineffective PATCH requests in the audit logs

The Config Sync remediator uses dry-run requests to detect drift. These can appear as PATCH requests in the audit log, even though the PATCH isn't persisted, because the audit log doesn't distinguish between dry-run and normal requests.

Workaround:

Because the audit log cannot distinguish between dry-run and non-dry-run requests, you can ignore the PATCH requests.

Category: Syncing | Identified in: 1.17.0

Config Sync fails to pull the latest commit from a branch

In Config Sync versions 1.17.0 and later, you might encounter a problem where Config Sync fails to pull the latest commit from the HEAD of a specific branch when the same branch is referenced in multiple remotes and the remotes are out of sync. For example, the main branch of the remote repository origin might be ahead of the same branch in the remote repository upstream, but Config Sync only fetches the commit SHA from the last line of the git ls-remote output, which might not be the latest commit.

The following example shows what this issue might look like:

git ls-remote -q [GIT_REPOSITORY_URL] main  main^{}
244999b795d4a7890f237ef3c8035d68ad56515d    refs/heads/main               # the latest commit
be2c0aec052e300028d9c6d919787624290505b6    refs/remotes/upstream/main    # the commit Config Sync pulls from

Workaround:

To mitigate this issue, you can set your Git revision (spec.git.revision) to the latest commit SHA regardless of the value set for the Git branch (spec.git.branch). For more information about the Git configs, see Configuration for the Git repository.
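For example, given the git ls-remote output shown earlier, a RootSync that pins the revision to the latest commit might look like the following sketch. The repository URL and auth setting are placeholders:

apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: GIT_REPOSITORY_URL   # placeholder for your repository URL
    branch: main
    # Pin to the latest commit SHA from refs/heads/main so Config Sync
    # doesn't resolve the stale refs/remotes/upstream/main entry.
    revision: 244999b795d4a7890f237ef3c8035d68ad56515d
    auth: none                 # placeholder; use your configured auth type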

