Error reference
Config Sync error messages consist of an error ID in the format KNV1234, where 1234 is a unique number, followed by a description of the problem and a suggestion for how to fix it. The K prefix is inherited from Kubernetes conventions, and rules with the prefix N are specific to nomos. Codes for errors detectable in the initial state of the repository and the cluster are of the form KNV1XXX. Codes for errors that can only be detected at runtime are of the form KNV2XXX.
This page documents each of those error messages and steps to resolve the error. You can also view the introduction to troubleshooting Config Sync and known issues.
KNV1000: InternalError
The ID of InternalError changed to KNV9998 with Config Sync 1.6.1.
KNV1001: ReservedDirectoryNameError
Deprecated in Config Sync 1.3.
KNV1002: DuplicateDirectoryNameError
Deprecated in Config Sync 1.3.
KNV1003: IllegalNamespaceSubdirectoryError
When using a hierarchical repo structure, a directory that contains a namespace config must not contain any subdirectories.
A directory without a namespace config is an abstract namespace directory and has directories inheriting from it, and consequently must have subdirectories. A directory containing a namespace config is a namespace directory and cannot be inherited from, so it must not have any subdirectories.
To fix, either remove the namespace config from the parent directory, or move the subdirectory somewhere else.
For example, this error occurs when a directory containing a namespace config has a subdirectory:
namespaces/
└── prod/
    ├── namespace.yaml
    └── us_west_1/
# namespaces/prod/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: prod
That directory structure and the contents of namespace.yaml produce this error:
KNV1003: A Namespace directory MUST NOT have subdirectories. Remove the
Namespace policy from "prod", or move "us_west_1" to an Abstract
Namespace:
path: namespaces/prod/us_west_1
name: us_west_1
KNV1004: IllegalSelectorAnnotationError
A cluster-scoped object must not declare the annotation configmanagement.gke.io/namespace-selector. NamespaceSelectors can only be declared for namespace-scoped objects.
To fix the error, remove configmanagement.gke.io/namespace-selector from the metadata.annotations field.
The following ClusterRole config produces this error:
# cluster/namespace-reader-clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-reader
  annotations:
    configmanagement.gke.io/namespace-selector: shipping-dev
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "watch"]
If you attempt to include this ClusterRole in your cluster, nomos vet returns the following error:
KNV1004: Cluster-scoped objects may not be namespace-selected, and so MUST NOT declare the annotation 'configmanagement.gke.io/namespace-selector'. To fix, remove `metadata.annotations.configmanagement.gke.io/namespace-selector` from:
source: cluster/namespace-reader-clusterrole.yaml
metadata.name: namespace-reader
group: rbac.authorization.k8s.io
version: v1
kind: ClusterRole
A Cluster object must not declare the annotation configmanagement.gke.io/cluster-selector. To fix the error, remove configmanagement.gke.io/cluster-selector from metadata.annotations.
If a Cluster object declares configmanagement.gke.io/cluster-selector, nomos vet returns the following error:
KNV1004: Clusters may not be cluster-selected, and so MUST NOT declare the annotation 'configmanagement.gke.io/cluster-selector'. To fix, remove `metadata.annotations.configmanagement.gke.io/cluster-selector` from:
source: clusterregistry/cluster.yaml
metadata.name: default-name
group: clusterregistry.k8s.io
version: v1alpha1
kind: Cluster
KNV1005: IllegalManagementAnnotationError
The only valid setting for the management annotation is configmanagement.gke.io/managed=disabled. This setting is used to explicitly unmanage a resource in the Git repository while leaving the config checked in. The annotation configmanagement.gke.io/managed=enabled is not necessary.
For more information, see Managing objects.
Setting a different annotation value results in an error like the following:
KNV1005: Config has invalid management annotation configmanagement.gke.io/managed=invalid. If set, the value must be "disabled".
source: namespaces/foo/role.yaml
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
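For example, a Role declaring any other value for the annotation, like the following sketch (names are illustrative), triggers the error:

```yaml
# namespaces/foo/role.yaml (illustrative example)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  annotations:
    configmanagement.gke.io/managed: invalid   # only "disabled" is valid
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
```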
KNV1006: ObjectParseError
This error occurs when an object declared in the repository could not be parsed. To fix, validate your YAML format with a tool such as kubectl --validate.
Example:
KNV1006: The following config could not be parsed as a rbac.authorization.k8s.io/v1, Kind=Role:
source: namespaces/foo/role.yaml
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
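As an illustration, a config like the following fails to parse because of inconsistent indentation (hypothetical example):

```yaml
# namespaces/foo/role.yaml (hypothetical broken example)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
   resources: ["pods"]   # parse error: key is not aligned with apiGroups
```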
KNV1007: IllegalAbstractNamespaceObjectKindError
When using an unstructured repo, configs must not be declared in an abstract namespace directory. For more information about using unstructured repos, see Using an unstructured repo.
KNV1007: Config "default-name" illegally declared in an abstract namespace directory. Move this config to a namespace directory:
source: namespaces/foo/bar/role.yaml
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
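For example, assuming a layout where neither namespaces/foo/ nor namespaces/foo/bar/ contains a namespace config, bar/ is an abstract namespace directory, and declaring the following config in it triggers the error (paths and names are illustrative):

```yaml
# namespaces/foo/bar/role.yaml (illustrative; bar/ has no namespace config)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: default-name
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
```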
KNV1009: IllegalMetadataNamespaceDeclarationError
When using the hierarchical repo structure, configs must either declare a metadata.namespace that matches the namespace directory containing them, or omit the field.
The following is an example of a Role config that triggers the error:
# namespaces/shipping-prod/pod-reader-role.yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
  namespace: shipping-dev
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
If you declare a config with such a namespace, this error occurs:
KNV1009: A config MUST either declare a `namespace` field exactly matching the directory containing the config, "shipping-prod", or leave the field blank:
source: namespaces/shipping-prod/pod-reader-role.yaml
namespace: shipping-dev
metadata.name: pod-reader
group: rbac.authorization.k8s.io
version: v1
kind: Role
For more information about the hierarchical repo structure, see Structure of the hierarchical repo.
KNV1010: IllegalAnnotationDefinitionError
Configs must not declare unsupported annotations starting with configmanagement.gke.io.
Supported annotations are:
- configmanagement.gke.io/managed: For more information about use, see Managing objects.
- configmanagement.gke.io/namespace-selector: For more information about use, see Namespace-scoped objects.
- configmanagement.gke.io/cluster-selector: For more information about use, see ClusterSelectors.
Example error:
KNV1010: Configs MUST NOT declare unsupported annotations starting with
"configmanagement.gke.io/". The config has invalid annotations:
"configmanagement.gke.io/invalid", "configmanagement.gke.io/sync-token"
source: namespaces/foo/role.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
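For example, a Role carrying an unsupported annotation key under the reserved prefix, as in this sketch (names are illustrative), triggers the error:

```yaml
# namespaces/foo/role.yaml (illustrative example)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: role
  annotations:
    configmanagement.gke.io/invalid: "true"   # unsupported annotation key
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
```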
KNV1011: IllegalLabelDefinition
Configs must not have labels with keys that begin with configmanagement.gke.io/. This label key prefix is reserved for use by Config Sync.
The following is an example of a ConfigMap that triggers this error:
# namespaces/prod/mymap.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: my-map
  labels:
    configmanagement.gke.io/bad-label: label-value
data:
  mydata: moredata
If you declare a config with such a label, this error occurs:
KNV1011: Configs MUST NOT declare labels starting with "configmanagement.gke.io/". The config has disallowed labels: "configmanagement.gke.io/bad-label"
source: namespaces/prod/mymap.yaml
metadata.name: my-map
group:
version: v1
kind: ConfigMap
KNV1012: NamespaceSelectorMayNotHaveAnnotation
Deprecated in Config Sync 1.3.
KNV1013: ObjectHasUnknownSelector
The config refers to a ClusterSelector or NamespaceSelector that does not exist. Before you can use a selector in an annotation for a config, the selector must exist.
If the selector is removed, remove any configs that refer to it as well. In this example, assume that there is no unknown-cluster-selector ClusterSelector in the clusterregistry/ directory of the repo.
# namespaces/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: foo
  annotations:
    configmanagement.gke.io/cluster-selector: unknown-cluster-selector
That causes this error:
KNV1013: Config "foo" MUST refer to an existing ClusterSelector, but has
annotation
"configmanagement.gke.io/cluster-selector=unknown-cluster-selector",
which maps to no declared ClusterSelector
NamespaceSelector annotations have the additional requirement that the referenced NamespaceSelector is defined in either the same directory or a parent directory of the config reference. Failure to do so results in this error:
KNV1013: Config "default-name" MUST refer to a NamespaceSelector in its directory or a parent directory. Either remove the annotation "configmanagement.gke.io/namespace-selector=default-ns-selector" from "default-name" or move NamespaceSelector "default-ns-selector" to a parent directory of "default-name".
source: namespaces/bar/selector.yaml
metadata.name: default-ns-selector
group: configmanagement.gke.io
version: v1
kind: NamespaceSelector
source: namespaces/foo/role.yaml
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1014: InvalidSelectorError
ClusterSelector and NamespaceSelector configs must use correct syntax; this error indicates a syntax error was found. To fix, ensure that you specify the config according to the appropriate data schema.
For example, this invalid ClusterSelector:
kind: ClusterSelector
apiVersion: configmanagement.gke.io/v1
metadata:
  name: selector-1
spec:
  selector:
    someUnknownField: # This field is not defined for a LabelSelector
      foo: bar
Causes the following error:
KNV1014: ClusterSelector has validation errors that must be corrected: invalid field "someUnknownField"
source: clusterregistry/cs.yaml
metadata.name: selector-1
group: configmanagement.gke.io
version: v1
kind: ClusterSelector
In particular, ClusterSelector and NamespaceSelector definitions must define the spec.selector field. Failure to do so causes the following error:
KNV1014: NamespaceSelectors MUST define `spec.selector`
source: namespaces/ns.yaml
metadata.name: ns-selector-1
group: configmanagement.gke.io
version: v1
kind: NamespaceSelector
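A valid selector defines spec.selector with standard LabelSelector fields such as matchLabels. A corrected version of the ClusterSelector above might look like this sketch (the environment: prod label is an assumption):

```yaml
kind: ClusterSelector
apiVersion: configmanagement.gke.io/v1
metadata:
  name: selector-1
spec:
  selector:
    matchLabels:
      environment: prod   # assumed label; match the labels on your Cluster objects
```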
KNV1016: PolicyManagementNotInstalledError
Deprecated in Config Sync 1.3.2.
KNV1017: MissingRepoError
When using the hierarchical repo structure, a Repo config must exist in the system/ directory of the repo and must include required information such as the repo's semantic version.
If a Repo config doesn't exist, the following error occurs:
KNV1017: The system/ directory must declare a Repo Resource.
path: system/
To fix, define at least a minimal Repo config.
# system/repo.yaml
kind: Repo
apiVersion: configmanagement.gke.io/v1
metadata:
  name: repo
spec:
  version: "0.1.0"
For more information about the hierarchical repo structure, see Structure of the hierarchical repo.
KNV1018: IllegalSubdirectoryError
Deprecated in Config Sync 1.3.
KNV1019: IllegalTopLevelNamespaceError
When using the hierarchical repo structure, Namespaces must not be declared directly in namespaces/.
The following is a config that triggers the error:
# namespaces/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: namespaces
KNV1019: Namespaces MUST be declared in subdirectories of 'namespaces/'. Create a subdirectory for the following Namespace configs:
source: namespaces/namespace.yaml
metadata.name: namespaces
group:
version: v1
kind: Namespace
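To fix, move the Namespace config into a subdirectory named after the Namespace, for example (names are illustrative):

```yaml
# namespaces/my-namespace/namespace.yaml (illustrative fix)
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
```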
For more information about the hierarchical repo structure, see Structure of the hierarchical repo.
KNV1020: InvalidNamespaceNameError
When using the hierarchical repo structure, a namespace config declares metadata.name, and its value must match the name of the namespace's directory. To fix, correct the namespace's metadata.name or its directory.
The following is a config that triggers the error:
# namespaces/prod/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
KNV1020: A Namespace MUST declare `metadata.name` that matches the name of its
directory.
expected `metadata.name`: prod
source: namespaces/prod/namespace.yaml
metadata.name: dev
group:
version: v1
kind: Namespace
For more information about the hierarchical repo structure, see Structure of the hierarchical repo.
KNV1021: UnknownObjectError
KNV1021: No CustomResourceDefinition is defined for the resource in the cluster.
Resource types that are not native Kubernetes objects must have a
CustomResourceDefinition.
source: namespaces/foo/role.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1024: IllegalKindInSystemError
KNV1024: Configs of this Kind may not be declared in the `system/` directory of
the repo:
source: namespaces/foo/role.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1027: UnsupportedRepoSpecVersion
The spec.version field in the Repo config represents the semantic version of the repo. This error indicates that you are using an unsupported version. If your repo's format is compatible with the supported version, update the spec.version field. If you need to upgrade, follow the instructions in the release notes.
# system/repo.yaml
kind: Repo
apiVersion: configmanagement.gke.io/v1
metadata:
  name: repo
spec:
  version: "0.0.0"
That produces this error:
KNV1027: Unsupported Repo spec.version: "0.0.0". Must use version "main"
source: system/repo.yaml
name: repo
group: configmanagement.gke.io
version: v1
kind: Repo
KNV1028: InvalidDirectoryNameError
KNV1028: Directory names have fewer than 64 characters, consist of lower case
alphanumeric characters or '-', and must start and end with an
alphanumeric character. Rename or remove directory:
path: namespaces/a.b`c
name: a.b`c
KNV1029: MetadataNameCollisionError
KNV1029: Configs of the same Kind MUST have unique names in the same Namespace
and their parent abstract namespaces:
source: namespaces/foo/r1.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
source: namespaces/foo/r2.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1030: MultipleSingletonsError
KNV1030: Multiple Namespace resources cannot exist in the same directory. To fix, remove the duplicate config(s) such that no more than 1 remains:
source: namespaces/foo/namespace.yaml
metadata.name: foo
group:
version: v1
kind: Namespace
source: namespaces/foo/namespace.yaml
metadata.name: foo
group:
version: v1
kind: Namespace
KNV1031: MissingObjectNameError
All configs must declare metadata.name. To fix, add the metadata.name field to the problematic configs.
KNV1031: A config must declare metadata.name:
source: namespaces/foo/role.yaml
metadata.name:
group: rbac.authorization.k8s.io
version: v1
kind: Role
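For example, adding the missing name resolves the error (the name pod-reader is illustrative):

```yaml
# namespaces/foo/role.yaml (illustrative fix)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader   # previously missing
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
```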
KNV1032: IllegalHierarchicalKindErrorCode
KNV1032: The type Repo.configmanagement.gke.io is not allowed if `sourceFormat` is set to `unstructured`. To fix, remove the problematic config, or convert your repo to use `sourceFormat: hierarchy`.
source: system/repo.yaml
metadata.name: repo
group: configmanagement.gke.io
version: v1
kind: Repo
KNV1033: IllegalSystemResourcePlacementError
Some Kinds can only be declared inside the system/ directory. The following Kinds can exist exclusively in the system/ directory:
- HierarchyConfig
- Repo
KNV1033: A config of the below Kind MUST NOT be declared outside system/:
source: namespaces/foo/repo.yaml
metadata.name: repo
group: configmanagement.gke.io
version: v1
kind: Repo
KNV1034: IllegalNamespaceError
It is forbidden to declare the config-management-system namespace, or resources within it. To fix, remove the config-management-system namespace and any configs in that namespace.
KNV1034: Configs must not be declared in the "config-management-system" namespace
source: namespaces/config-management-system/role.yaml
namespace: namespaces/config-management-system
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1034: The "config-management-system" namespace must not be declared
source: namespaces/config-management-system/namespace.yaml
metadata.name: config-management-system
group:
version: v1
kind: Namespace
Starting from Config Sync version 1.17.0, the namespaces resource-group-system and config-management-monitoring can't be declared in a source of truth. It's also not recommended to declare any resources under the resource-group-system and config-management-monitoring namespaces.
If your source of truth contains these two namespaces, Config Sync reports the following error:
KNV1034: The "config-management-system" namespace must not be declared
source: namespaces/config-management-monitoring/namespace.yaml
metadata.name: config-management-monitoring
group:
version: v1
kind: Namespace
To resolve this issue and safely unmanage the controller namespace:
1. Update Config Sync to stop managing the namespace and any resource declared underneath.
2. Wait for a sync and then confirm that the corresponding resources are still available on the cluster, but not in nomos status.
3. Remove the controller namespace YAML file from the source.
4. Let Config Sync resume managing the resources.
If you were previously syncing to a hierarchical repository and had to declare the controller namespace alongside any resources, consider switching to an unstructured repository for more flexibility in your source structure.
KNV1036: InvalidMetadataNameError
The metadata.name supplied is of invalid format. A valid metadata.name must:
- Be shorter than 254 characters.
- Consist of lowercase alphanumeric characters, '-', or '.'.
- Start and end with an alphanumeric character.
To fix, change the metadata.name to satisfy the preceding conditions.
KNV1036: Configs MUST define a metadata.name that is shorter than 254
characters, consists of lower case alphanumeric characters, '-' or '.',
and must start and end with an alphanumeric character. Rename or remove
the config:
source: namespaces/foo/role.yaml
metadata.name: a`b.c
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1037: IllegalKindInClusterregistryError
Deprecated in Config Sync 1.3.
KNV1038: IllegalKindInNamespacesError
KNV1038: Configs of the below Kind may not be declared in `namespaces/`:
source: cluster/cr.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: ClusterRole
KNV1039: IllegalKindInClusterError
It is forbidden to declare a namespace-scoped object outside of namespaces/ or a cluster-scoped object outside of cluster/. To fix, relocate the problematic configs such that they are in a legal directory.
For more information about cluster-scoped objects, see Cluster-scoped objects.
For more information about namespace-scoped objects, see Namespace-scoped objects.
KNV1039: Namespace-scoped configs of the below Kind must not be declared in
cluster/:
source: namespaces/foo/role.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1040: UnknownResourceInHierarchyConfigError
Deprecated in Config Sync 1.3.
KNV1041: UnsupportedResourceInHierarchyConfigError
KNV1041: This Resource Kind MUST NOT be declared in a HierarchyConfig:
source: system/hc.yaml
group: configmanagement.gke.io
kind: Repo
KNV1042: IllegalHierarchyModeError
An illegal value for HierarchyMode was detected on a HierarchyConfig. HierarchyMode must be either none or inherit.
To read more about HierarchyConfigs, see Disabling Inheritance for an Object Type.
KNV1042: HierarchyMode invalid is not a valid value for the APIResource Role.rbac.authorization.k8s.io. Allowed values are [none,inherit].
source: system/hc.yaml
metadata.name: default-name
group: configmanagement.gke.io
version: v1
kind: HierarchyConfig
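A valid HierarchyConfig sets hierarchyMode to none or inherit for each resource entry, as in this sketch (group and kinds are illustrative):

```yaml
# system/hc.yaml (illustrative)
kind: HierarchyConfig
apiVersion: configmanagement.gke.io/v1
metadata:
  name: rbac
spec:
  resources:
  - group: rbac.authorization.k8s.io
    kinds: ["Role"]
    hierarchyMode: none   # valid values: none, inherit
```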
KNV1043: UnsupportedObjectError
KNV1043: Config Sync cannot configure this object. To fix, remove this
config from the repo.
source: namespaces/foo/role.yaml
metadata.name: role
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1044: UnsyncableResourcesErrorCode
KNV1044: An Abstract Namespace directory with configs MUST have at least one
Namespace subdirectory. To fix, do one of the following: add a Namespace
directory below "bar", add a Namespace config to "bar", or remove the configs in
"bar":
path: namespaces/foo/bar/
KNV1045: IllegalFieldsInConfigError
Example error message:
KNV1045: Configs with "metadata.ownerReference" specified are not allowed. To
fix, either remove the config or remove the "metadata.ownerReference" field in
the config:
source: namespaces/foo/replicaset.yaml
metadata.name: replicaSet
group: apps
version: v1
kind: ReplicaSet
The status field is one of the illegal fields that causes this error. Using Config Sync to sync the status field isn't allowed because another controller should manage and update the status field in the cluster dynamically. If Config Sync tries to control the wanted state of the status field, it fights with the controller responsible for managing the status field.
To fix this error, remove the status field from the source repository. For third-party configs that you don't own, use kustomize patches to remove status fields specified in your manifests in bulk.
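As a sketch, a JSON6902-style patch in a kustomization.yaml can remove status fields from third-party manifests in bulk (file and target names are illustrative):

```yaml
# kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- third-party-manifests.yaml
patches:
- target:
    kind: ReplicaSet   # illustrative target
  patch: |-
    - op: remove
      path: /status
```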
KNV1046: ClusterScopedResourceInHierarchyConfigError
KNV1046: This HierarchyConfig references the APIResource "ClusterSelector.configmanagement.gke.io" which has cluster scope. Cluster scoped objects are not permitted in HierarchyConfig.
source: system/hc.yaml
metadata.name: hierarchyconfig
group: configmanagement.gke.io
version: v1
kind: HierarchyConfig
KNV1047: UnsupportedCRDRemovalError
KNV1047: Removing a CRD and leaving the corresponding Custom Resources in the
repo is disallowed. To fix, remove the CRD along with the Custom Resources.
source: cluster/crd.yaml
metadata.name: customResourceDefinition
group: apiextensions.k8s.io
version: v1beta1
kind: CustomResourceDefinition
KNV1048: InvalidCRDNameError
KNV1048: The CustomResourceDefinition has an invalid name. To fix, change the
name to `spec.names.plural+"."+spec.group`.
source: cluster/crd.yaml
metadata.name: customResourceDefinition
group: apiextensions.k8s.io
version: v1beta1
kind: CustomResourceDefinition
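For example, a CRD with spec.names.plural: anvils and spec.group: example.com must be named anvils.example.com (all names here are illustrative):

```yaml
# cluster/crd.yaml (illustrative)
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: anvils.example.com   # spec.names.plural + "." + spec.group
spec:
  group: example.com
  names:
    plural: anvils
    singular: anvil
    kind: Anvil
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
```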
KNV1050: DeprecatedGroupKindError
KNV1050: The config is using a deprecated Group and Kind. To fix, set the Group and Kind to "Deployment.apps"
source: namespaces/deployment.yaml
metadata.name: default-name
group: extensions
version: v1beta1
kind: Deployment
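To fix, move the config to the supported group and version, for example (labels and image are illustrative):

```yaml
# namespaces/deployment.yaml (illustrative fix)
apiVersion: apps/v1   # previously extensions/v1beta1
kind: Deployment
metadata:
  name: default-name
spec:
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: example-image
```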
KNV1052: IllegalNamespaceOnClusterScopedResourceError
KNV1052: cluster-scoped resources MUST NOT declare metadata.namespace
namespace: foo
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: ClusterRole
KNV1053: MissingNamespaceOnNamespacedResourceError
KNV1053: namespace-scoped resources MUST either declare metadata.namespace or metadata.annotations.configmanagement.gke.io/namespace-selector
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1054: InvalidAnnotationValueError
This error occurs when the configs contain an invalid value for an annotation.
Example error:
KNV1054: Values in metadata.annotations MUST be strings. To fix, add quotes around the values. Non-string values for:
metadata.annotations.foo
metadata.annotations.bar
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1055: InvalidNamespaceError
This error indicates the value of metadata.namespace is not a valid Kubernetes Namespace name.
Example error:
KNV1055: metadata.namespace MUST be valid Kubernetes Namespace names. Rename "FOO" so that it:
1. has a length of 63 characters or fewer;
2. consists only of lowercase letters (a-z), digits (0-9), and hyphen '-'; and
3. begins and ends with a lowercase letter or digit.
namespace: FOO
metadata.name: repo
group: configmanagement.gke.io
version: v1
kind: Repo
KNV1056: ManagedResourceInUnmanagedNamespace
This error indicates a resource is declared in an unmanaged namespace. To fix it, either remove the configmanagement.gke.io/managed: disabled annotation from the Namespace, or add the annotation to the declared resource.
Example error:
KNV1056: Managed resources must not be declared in unmanaged Namespaces. Namespace "foo" is declared unmanaged but contains managed resources. Either remove the managed: disabled annotation from Namespace "foo" or declare its resources as unmanaged by adding configmanagement.gke.io/managed:disabled annotation.
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1057: IllegalDepthLabel
This error indicates a resource has an illegal label. To fix it, remove the labels ending with .tree.hnc.x-k8s.io/depth.
Example error:
KNV1057: Configs MUST NOT declare labels ending with ".tree.hnc.x-k8s.io/depth". The config has disallowed labels: "label.tree.hnc.x-k8s.io/depth"
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1058: BadScopeError
A Namespace repository can only declare namespace-scoped resources in the Namespace the repo applies to. For example, the repository for the shipping Namespace may only manage resources in the shipping namespace.
The value of metadata.namespace is optional. By default, Config Sync assumes that all resources in a Namespace repository belong in that Namespace. For example, if a config in the shipping Namespace repo declared metadata.namespace: billing, the nomos command prints the following error.
KNV1058: Resources in the "shipping" repo must either omit metadata.namespace or declare metadata.namespace="shipping"
namespace: billing
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
KNV1059: MultipleKptfilesError
A Namespace repository can declare at most one Kptfile resource.
For example, if a Namespace repository declared two Kptfiles, the nomos command prints the following error:
KNV1059: Namespace Repos may contain at most one Kptfile
metadata.name: package-a
group: kpt.dev
version: v1alpha1
kind: Kptfile
metadata.name: package-b
group: kpt.dev
version: v1alpha1
kind: Kptfile
For more information, see https://g.co/cloud/acm-errors#knv1059
KNV1060: ManagementConflictError
When managing objects in multiple sources of truth, conflicts can arise when the same object (matching group, kind, name, and namespace) is declared in more than one source.
The following are a few scenarios that can produce conflicts:
- RootSync vs. RepoSync: When the same object is managed by a RootSync and a RepoSync, the RootSync wins. If the RootSync applies first, the RepoSync will report a KNV1060 status error. If the RepoSync applies first, the RootSync will overwrite the RepoSync's object and the RepoSync will report a KNV1060 status error when it sees the update.
- RepoSync vs. RepoSync: When the same object is managed by two RepoSyncs, the first RepoSync to apply the object wins. The second RepoSync to apply will see that the object is already managed by the first RepoSync and report a KNV1060 status error.
- RootSync vs. RootSync when admission webhook is enabled: When the same object is managed by two RootSyncs and the admission webhook is enabled, the first RootSync to apply the object wins. The second RootSync to apply will receive an error from the admission webhook that the object is already managed and will report a KNV1060 status error.
- RootSync vs. RootSync when admission webhook is disabled: When the same object is managed by two RootSyncs and the admission webhook is disabled, the two RootSync objects will continuously fight to adopt ownership of the object and both will report the KNV1060 status error.
An example of the conflicting error:
KNV1060: The root reconciler detects a management conflict for a resource declared in another repository. Remove the declaration for this resource from either the current repository, or the repository managed by root-reconciler.
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
When the error happens, you can resolve the conflict by updating the config to match with the other source of truth, or by deleting the conflicting object from one of the sources.
KNV1061: InvalidRepoSyncError
RepoSync objects must be properly configured for Config Sync to sync configuration from Namespace repos. An InvalidRepoSyncError reports that a RepoSync is improperly configured, with a message explicitly stating how to fix it.
For example, if the shipping repository must have a RepoSync named repo-sync, but the RepoSync is named invalid, the nomos command prints the following error.
KNV1061: RepoSyncs must be named "repo-sync", but the RepoSync for Namespace "shipping" is named "invalid"
metadata.name: invalid
group: configsync.gke.io
version: v1alpha1
kind: RepoSync
KNV1062: InvalidKptfileError
This error occurs when the Kptfile doesn't have a valid inventory field. A Kptfile should have a non-empty inventory field with both identifier and namespace specified. To fix it, specify the values for .inventory.identifier and .inventory.namespace in the Kptfile.
Example errors:
KNV1062: Invalid inventory invalid name
metadata.name: default-name
group: kpt.dev
version: v1alpha1
kind: Kptfile
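A Kptfile with the inventory field populated might look like this sketch (the identifier and namespace values are assumptions):

```yaml
# Kptfile (illustrative)
apiVersion: kpt.dev/v1alpha1
kind: Kptfile
metadata:
  name: package-a
inventory:
  identifier: package-a-inventory   # assumed value
  namespace: shipping               # assumed value
```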
KNV1063: KptfileExistError
This error occurs when Kptfiles are found in the Root repository. Kptfiles are only supported in namespace-scoped repos.
To fix, remove the Kptfiles from the Root repo.
Example errors:
KNV1063: Found Kptfile(s) in the Root Repo. Kptfile(s) are only supported in Namespace Repos. To fix, remove the Kptfile(s) from the Root Repo.
namespace: namespace
metadata.name: default-name
group: kpt.dev
version: v1alpha1
kind: Kptfile
For more information, see https://g.co/cloud/acm-errors#knv1063
KNV1064: InvalidAPIResourcesError
This error indicates that the api-resources.txt file in a repository could not be parsed.
Example errors:
KNV1064: invalid NAMESPACED column value "other" in line:
rbac other Role
Re-run "kubectl api-resources > api-resources.txt" in the root policy directory
path: /api-resources.txt
For more information, see https://g.co/cloud/acm-errors#knv1064
KNV1064: unable to find APIVERSION column. Re-run "kubectl api-resources > api-resources.txt" in the root policy directory
path: /api-resources.txt
For more information, see https://g.co/cloud/acm-errors#knv1064
KNV1064: unable to read cached API resources: missing file permissions
path: /api-resources.txt
For more information, see https://g.co/cloud/acm-errors#knv1064
In nomos version 1.16.1 and earlier, you also see this error:
KNV1064: unable to find APIGROUP column. Re-run "kubectl api-resources > api-resources.txt" in the root policy directory
path: /api-resources.txt
For more information, see https://g.co/cloud/acm-errors#knv1064
This error is caused by the change of the column name from APIGROUP to APIVERSION. To mitigate this issue, manually replace APIVERSION in api-resources.txt back to APIGROUP.
KNV1065: MalformedCRDError
This error occurs when the CustomResourceDefinition is malformed. To fix, check the field specified by the error message and make sure its value is correctly formatted.
Example errors:
KNV1065: malformed CustomResourceDefinition: spec.names.shortNames accessor error: foo is of the type string, expected []interface{}.
path: namespaces/foo/crd.yaml
For more information, see https://g.co/cloud/acm-errors#knv1065
KNV1066: ClusterSelectorAnnotationConflictError
A config object MUST declare ONLY ONE cluster-selector annotation. This error occurs when both the legacy annotation (configmanagement.gke.io/cluster-selector) and the inline annotation (configsync.gke.io/cluster-name-selector) exist. To fix it, remove one of the annotations from the metadata.annotations field.
For example, if a Namespace config declared both annotations, the nomos command prints the following error:
KNV1066: Config "my-namespace" MUST declare ONLY ONE cluster-selector annotation, but has both inline annotation "configsync.gke.io/cluster-name-selector" and legacy annotation "configmanagement.gke.io/cluster-selector". To fix, remove one of the annotations from:
metadata.name: my-namespace
group:
version: v1
kind: Namespace
For more information, see https://g.co/cloud/acm-errors#knv1066
KNV1067: EncodeDeclaredFieldError
This error occurs when the reconciler fails to encode the declared fields into a format that is compatible with server-side apply. It could be caused by an out-of-date schema. To fix, check the field specified by the error message and make sure it matches the schema of the resource kind.
Example errors:
KNV1067: failed to encode declared fields: .spec.version not defined
metadata.name: my-namespace
group:
version: v1
kind: Namespace
For more information, see https://g.co/cloud/acm-errors#knv1067
There is a known issue for clusters running Kubernetes versions earlier than 1.27 that can result in this error. To resolve it, ensure the protocol field is explicitly set under spec: containers: ports:, even if you are using the default TCP.
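For example, this Deployment fragment (the name and image are hypothetical) sets the protocol explicitly:

```yaml
# Hypothetical Deployment fragment: `protocol` is set explicitly on the port.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app   # hypothetical name
spec:
  template:
    spec:
      containers:
      - name: app
        image: example-image   # hypothetical image
        ports:
        - containerPort: 80
          protocol: TCP   # explicit, even though TCP is the default
```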
KNV1068: ActionableRenderingError
This error indicates that the rendering process encountered a user-actionable issue. One example is that the Git repository contains Kustomize configurations, but no kustomization.yaml file exists in the Git sync directory:
KNV1068: Kustomization config file is missing from the sync directory 'foo/bar'. To fix, either add kustomization.yaml in the sync directory to trigger the rendering process, or remove kustomization.yaml from all sub directories to skip rendering.
For more information, see https://g.co/cloud/acm-errors#knv1068
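To trigger rendering, a minimal kustomization.yaml in the sync directory is enough; the resource file names in this sketch are assumptions:

```yaml
# foo/bar/kustomization.yaml (the sync directory named in the error)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml     # assumed file names
- deployment.yaml
```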
If the error is caused by kustomize build failures, you might need to update the Kustomize configurations in your Git repository. You can preview and validate the updated configs locally by using nomos hydrate and nomos vet respectively. If the updated configs render successfully, you can push a new commit to fix the KNV1068 error.
For more details, see Viewing the result of all configs in the repo and Checking for errors in the repo.
Example kustomize build error:
KNV1068: Error in the hydration-controller container: unable to render the source configs in /repo/source/3b724d1a17314c344fa24512239cb3b22b9d90ec: failed to run kustomize build ...
For more information, see https://g.co/cloud/acm-errors#knv1068
If a kustomize build error happens when pulling remote bases from public repositories, you need to set spec.override.enableShellInRendering to true.
Example kustomize build error:
KNV1068: failed to run kustomize build in /repo/source/0a7fd88d6c66362584131f9dfd024024352916af/remote-base, stdout:...
no 'git' program on path: exec: "git": executable file not found in $PATH
For more information, see https://g.co/cloud/acm-errors#knv1068
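The override lives on the RootSync or RepoSync object; here is a sketch for a RootSync named root-sync:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  override:
    # Allow the hydration-controller to shell out (for example, to git)
    # when kustomize pulls remote bases.
    enableShellInRendering: true
```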
KNV1069: SelfManageError
This error occurs when a reconciler reconciles its own RootSync or RepoSync object. A RootSync object can manage other RootSync and RepoSync objects, and a RepoSync object can manage other RepoSync objects, but neither can manage itself. To fix the issue, remove the RootSync or RepoSync object from the source of truth that the object syncs from.
Example errors:
KNV1069: RootSync config-management-system/root-sync must not manage itself in its repo
namespace: config-management-system
metadata.name: root-sync
group: configsync.gke.io
version: v1beta1
kind: RootSync
For more information, see https://g.co/cloud/acm-errors#knv1069
KNV2001: pathError
This error occurs when an OS-level system call accessing a file system resource fails.
Invalid YAML configuration
When you have an invalid configuration in your YAML file, you might see an error message similar to the following:
KNV2001: yaml: line 2: did not find expected node content path:...
To resolve this issue, check your YAML files and resolve any configuration problems. This can be caused by any YAML configuration within the repository.
Special characters in path name
If your file name or path name contains special characters, you might see an error message similar to the following:
KNV2001: yaml: control characters are not allowed path:
/repo/source/.../._pod.yaml
In this example, ._pod.yaml is not a valid file name.
To resolve this issue, remove special characters from your file and path names.
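One way to spot such files before committing is to search the repository locally; this sketch only looks for the "._" prefix (AppleDouble metadata files created by macOS), a common culprit:

```shell
# List files whose names start with "._" (AppleDouble metadata files
# created by macOS); such names trigger the control-character error.
find . -name '._*'
```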
KNV2002: apiServerError
This error occurs when a request accessing the API Server fails.
KNV2003: osError
This error occurs when a generic OS-level system call fails.
KNV2004: SourceError
This error indicates that Config Sync cannot read from the source of truth. It is usually caused by one of the following errors.
The Git repository is unreachable from within the cluster
The git-sync container throws an error in its logs that indicates it can't reach the repository, for example, ssh: connect to host source.developers.google.com port 2022: Network is unreachable. To fix the issue, adjust the firewall or network configuration of your cluster.
Invalid configuration directory
Check for mistakes such as an incorrect value for policyDir in the ConfigManagement object, or spec.git.dir or spec.oci.dir in the RootSync or RepoSync object. The value of the directory is included in the error; verify the value against your Git repository or OCI image.
Invalid chart name
When syncing from a Helm repository, make sure you set the correct value for spec.helm.chart. The chart name doesn't contain the chart version or .tgz. You can verify your chart name with the helm template command.
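For example, for a chart packaged as my-chart-1.2.3.tgz (a hypothetical chart and repository), the RootSync fields would look like this sketch:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceType: helm
  helm:
    repo: oci://us-docker.pkg.dev/PROJECT_ID/helm-charts   # hypothetical repository
    chart: my-chart    # name only: no version, no .tgz suffix
    version: 1.2.3     # the version goes in its own field
```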
Invalid Git or Helm credentials
Check the logs of the git-sync or helm-sync container for one of the following errors:
Could not read from remote repository. Ensure you have the correct access rights and the repository exists.
Invalid username or password. Authentication failed for ...
401 Unauthorized
For a Git repository, verify that the Git credentials and the git-creds Secret are configured correctly.
For a Helm repository, verify that Helm credentials are configured correctly.
Invalid Git repository URL
Check the logs of the git-sync container for an error such as Repository not found.
Check that you are using the right URL format. For example, if you are using an SSH key pair to authenticate to the Git repository, make sure that the URL that you enter for your Git repository when you configure Config Sync uses the SSH protocol.
Invalid Helm repository URL
Check the logs of the helm-sync container for an error such as ...not a valid chart repository.
Check that you are using the right URL format. For example, if you are syncing from an OCI registry, the URL should start with oci://. You can verify your Helm repository URL with the helm template command.
Invalid Git branch
Check the logs of the git-sync container for an error such as Remote branch BRANCH_NAME not found in upstream origin or warning: Could not find remote branch BRANCH_NAME to clone. Note that the default branch is set to master if not specified.
Server certificate verification failed
This failure can occur when connecting to a Git server over HTTPS if the Git server uses a certificate issued by a Certificate Authority (CA) that the git-sync client doesn't recognize.
If you are managing your own CA, verify that the Git server's certificate is issued by that CA and that the caCertSecretRef field is configured correctly.
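Here is a sketch of the relevant RootSync fields, assuming the CA certificate is stored under the cert key of a Secret named git-cert-pub (both names are assumptions):

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  git:
    repo: https://git.example.com/repo.git   # hypothetical HTTPS Git server
    caCertSecretRef:
      name: git-cert-pub   # Secret in config-management-system with key `cert`
```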
Permission issues when using a Google service account
Missing reader access
When using a Google service account (spec.git.gcpServiceAccountEmail, spec.oci.gcpServiceAccountEmail, or spec.helm.gcpServiceAccountEmail) to authenticate to Cloud Source Repositories or Artifact Registry, the Google service account requires the following reader access:
- Cloud Source Repositories: roles/source.reader
- Artifact Registry: roles/artifactregistry.reader
Otherwise, git-sync, oci-sync, or helm-sync fails with an error such as:
failed to pull image us-docker.pkg.dev/...: GET https://us-docker.pkg.dev/v2/token?scope=repository...: DENIED: Permission \"artifactregistry.repositories.downloadArtifacts\" denied on resource \"projects/.../locations/us/repositories/...\" (or it may not exist)
or
"Err":"failed to render the helm chart: exit status 1, stdout: Error: failed to download ...
To fix it, grant the service account the correct reader access.
Missing IAM policy binding with Workload Identity
When using a Google service account for authentication, an IAM policy binding is required between the Google service account and Kubernetes service account. If the IAM policy binding is missing, you get the following error:
KNV2004: unable to sync repo Error in the git-sync container: ERROR: failed to call ASKPASS callback URL: auth URL returned status 500, body: "failed to get token from credentials: oauth2/google: status code 403: {\n \"error\": {\n \"code\": 403,\n \"message\": \"The caller does not have permission\",\n \"status\": \"PERMISSION_DENIED\"\n }\n}\n"
To fix the issue, create the following IAM policy binding:
gcloud iam service-accounts add-iam-policy-binding \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[config-management-system/KSA_NAME]" \
GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
Replace the following:
- PROJECT_ID: If you're using GKE Workload Identity, add your organization's project ID. If you're using fleet Workload Identity, you can use two different project IDs. In serviceAccount:PROJECT_ID, add the project ID of the fleet that your cluster is registered to. In GSA_NAME@PROJECT_ID, add a project ID for any project that has read access to the repository in Cloud Source Repositories.
- KSA_NAME: the Kubernetes service account for the reconciler. For root repositories, if the RootSync name is root-sync, KSA_NAME is root-reconciler. Otherwise, it is root-reconciler-ROOT_SYNC_NAME. For namespace repositories, if the RepoSync name is repo-sync, KSA_NAME is ns-reconciler-NAMESPACE. Otherwise, it is ns-reconciler-NAMESPACE-REPO_SYNC_NAME.
- GSA_NAME: the custom Google service account that you want to use to connect to Cloud Source Repositories. Make sure that the Google service account that you select has the source.reader role.
Missing cloud-platform scope to access Cloud Source Repositories
When granting a Google service account access to your Git repository in Cloud Source Repositories, the read-only scope must be included in the access scopes for the nodes in the cluster.
The read-only scope can be added by including cloud-source-repos-ro in the --scopes list specified at cluster creation time, or by using the cloud-platform scope at cluster creation time. For example:
gcloud container clusters create CLUSTER_NAME --scopes=cloud-platform
If the read-only scope is missing, you'll see an error similar to the following:
Error in the git-sync container: {"Msg":"unexpected error syncing repo, will retry","Err":"Run(git clone -v --no-checkout -b main --depth 1 https://source.developers.google.com/p/PROJECT_ID/r/csr-auth-test /repo/source): exit status 128: { stdout: \"\", stderr: \"Cloning into '/repo/source'...\\nremote: INVALID_ARGUMENT: Request contains an invalid argument\\nremote: [type.googleapis.com/google.rpc.LocalizedMessage]\\nremote: locale: \\\"en-US\\\"\\nremote: message: \\\"Invalid authentication credentials. Please generate a new identifier: https://source.developers.google.com/new-password\\\"\\nremote: \\nremote: [type.googleapis.com/google.rpc.RequestInfo]\\nremote: request_id: \\\"fee01d10ba494552922d42a9b6c4ecf3\\\"\\nfatal: unable to access 'https://source.developers.google.com/p/PROJECT_ID/r/csr-auth-test/': The requested URL returned error: 400\\n\" }","Args":{}}
Missing storage-ro scope to access Artifact Registry
Image layers are kept in Cloud Storage buckets. When granting a Google service account access to your OCI image or Helm chart in Artifact Registry, the read-only scope must be included in the access scopes for the nodes in the cluster.
The read-only scope can be added by including storage-ro in the --scopes list specified at cluster creation time, or by using the cloud-platform scope at cluster creation time. For example:
gcloud container clusters create CLUSTER_NAME --scopes=cloud-platform
If the read-only scope is missing, you'll see an error similar to the following:
Error in the oci-sync container: {"Msg":"unexpected error fetching package, will retry","Err":"failed to pull image us-docker.pkg.dev/...: GET https://us-docker.pkg.dev/v2/token?scope=repository%3A...%3Apull\u0026service=us-docker.pkg.dev: UNAUTHORIZED: failed authentication","Args":{}}
The error message might not include full details about what caused the error, but it does provide a command that prints the logs from the git-sync container, which might have more information.
If you are using the RootSync or RepoSync APIs:
kubectl logs -n config-management-system -l app=reconciler -c git-sync
If you don't have the RootSync or RepoSync APIs enabled:
kubectl logs -n config-management-system -l app=git-importer -c git-sync
git fetch failed with the remote did not send all necessary objects error
Config Sync creates a shallow clone of your Git repository. In rare cases, Config Sync might not be able to find the commit in the shallow clone. When this happens, Config Sync increases the number of Git commits to fetch.
You can set the number of Git commits to fetch by setting the spec.override.gitSyncDepth field in a RootSync or RepoSync object.
- If this field is not provided, Config Sync configures it automatically.
- Config Sync does a full clone if this field is 0, and a shallow clone if this field is greater than 0.
- Setting this field to a negative value is not allowed.
If you installed Config Sync using the Google Cloud console or Google Cloud CLI, create an editable RootSync object so that you can set spec.override.gitSyncDepth. For details, see Configure Config Sync with kubectl commands.
Here is an example of setting the number of Git commits to fetch to 88:
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
name: root-sync
namespace: config-management-system
spec:
override:
gitSyncDepth: 88
git:
...
Run the following command to verify that the change is applied (GIT_SYNC_DEPTH should be set to 88 in the data field of the root-reconciler-git-sync ConfigMap):
kubectl get cm root-reconciler-git-sync -n config-management-system -o yaml
You can override the number of Git commits to fetch in a namespace reconciler similarly.
Unable to mount the Git Secret
If you receive the following error when the git-sync container tries to sync a repository with a Secret, the Git Secret isn't mounted successfully in the git-sync container:
KNV2004: unable to sync repo Error in the git-sync container: ERROR: can't configure SSH: can't access SSH key: stat /etc/git-secret/ssh: no such file or directory: lstat /repo/root/rev: no such file or directory
The error can be caused by switching your Git repository authentication type from none, gcenode, or gcpserviceaccount to a type that needs a Secret.
To resolve this issue, run the following commands to restart the Reconciler Manager and the Reconcilers:
# Stop the reconciler-manager Pod. The reconciler-manager Deployment will spin
# up a new Pod which can pick up the latest `spec.git.auth`.
kubectl delete po -l app=reconciler-manager -n config-management-system
# Delete the reconciler Deployments. The reconciler-manager will recreate the
# reconciler Deployments with correct volume mount.
kubectl delete deployment -l app=reconciler -n config-management-system
KNV2005: ResourceFightError
This error indicates that Config Sync is fighting with another controller over a resource. Such fights consume a high amount of resources and can degrade your performance. Fights are also known as resource contention.
If you are using the RootSync or RepoSync API with Config Sync version 1.15.0 or later, you can review the fight errors by using the nomos status command or by checking the status field of the RootSync or RepoSync object.
If you are using the RootSync or RepoSync API with a Config Sync version earlier than 1.15.0, you can review the logs of the Config Sync reconciler by running the following command:
kubectl logs -n config-management-system -l app=reconciler -c reconciler
If you don't have the RootSync or RepoSync APIs enabled, detect fights by checking the Config Sync git-importer logs with the following command:
kubectl logs --namespace config-management-system deployment/git-importer -c importer
If you see KNV2005 in the results, then there is a resource fight.
To find more information about any resource conflicts, watch updates to the resource's YAML file by running the following command:
kubectl get resource --watch -o yaml
Replace resource with the kind of resource that is being fought over. You can see which resource to add based on the log results.
This command returns a stream of the state of the resource after updates are applied to the API server. You can use a file comparison tool to compare the output.
Some resources should belong to other controllers (for example, some operators install or maintain CRDs). These other controllers automatically remove any metadata specific to Config Sync. If another component in your Kubernetes cluster removes Config Sync metadata, stop managing the resource. For information about how to do this, see Stop managing a managed object.
Alternatively, if you don't want Config Sync to maintain the state of the object in the cluster after it exists, you can add the client.lifecycle.config.k8s.io/mutation: ignore annotation to the object whose mutations you want Config Sync to ignore. For information about how to do this, see Ignore object mutations.
KNV2006: Config Management Errors
To help prevent accidental deletion, Config Sync does not let you remove all namespaces or cluster-scoped resources in a single commit.
If you have committed changes that attempt to remove all resources, revert the commit to restore the original state, and then delete the resources in two steps.
If the Config Sync admission webhook is disabled, you can revert the commit that deletes all resources.
If the Config Sync admission webhook is enabled, your namespace might be stuck terminating. To fix it, run the following steps:
- Disable Config Sync, and wait until all resources are cleaned up or in a stable status. For example, you can run kubectl get ns to make sure the namespaces are deleted.
- Re-enable Config Sync.
- Revert the commit that deletes all resources.
How to delete
Deleting the full set of resources under management requires two steps:
- Remove all but one namespace or cluster-scoped resource in a first commit and allow Config Sync to sync those changes.
- Remove the final resource in a second commit.
KNV2008: APIServerConflictError
This type of error occurs when a resource on the API Server is modified or deleted while Config Sync is also attempting to modify it. If this type of error only appears at startup or infrequently, you can ignore these errors.
If these errors are not transient (they persist for multiple minutes), they might indicate a serious issue, and nomos status reports resource conflicts.
Example errors:
KNV2008: tried to create resource that already exists: already exists
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
For more information, see https://g.co/cloud/acm-errors#knv2008
KNV2008: tried to update resource which does not exist: does not exist
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
For more information, see https://g.co/cloud/acm-errors#knv2008
KNV2008: tried to update with stale version of resource: old version
metadata.name: default-name
group: rbac.authorization.k8s.io
version: v1
kind: Role
For more information, see https://g.co/cloud/acm-errors#knv2008
KNV2009: ApplyError
This error is a generic error indicating that Config Sync failed to sync some configs to the cluster. Example error:
KNV2009: no matches for kind "Anvil" in group "acme.com".
Operation on certain resources is forbidden
If you receive an error similar to the following:
KNV2009: failed to list demo.foobar.com/v1, Kind=Cluster: clusters.demo.foobar.com is forbidden: User "system:serviceaccount:config-management-system:ns-reconciler-abc" cannot list resource "clusters" in API group "demo.foobar.com" in the namespace "abc"
This error means that the namespace reconciler doesn't have the necessary access to apply configs to the cluster. To fix it, follow the steps in Configure syncing from more than one source of truth to configure a RoleBinding that grants the namespace reconciler's service account the permissions shown as missing in the error message.
Admission webhook request i/o timeout
If you receive the following error when the reconciler tries to apply a config to the cluster, the admission webhook port 8676 might be blocked by the firewall to the control plane network:
KNV2009: Internal error occurred: failed calling webhook "v1.admission-webhook.configsync.gke.io": Post https://admission-webhook.config-management-system.svc:8676/admission-webhook?timeout=3s: dial tcp 10.1.1.186:8676: i/o timeout
To resolve this issue, add a firewall rule to allow port 8676, which the Config Sync admission webhook uses for drift prevention.
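On a GKE private cluster, for example, the rule might look like the following sketch; the rule name, network, and control plane CIDR are assumptions for your environment:

```shell
# Hypothetical example: allow the control plane to reach the
# Config Sync admission webhook on TCP port 8676.
gcloud compute firewall-rules create allow-config-sync-webhook \
    --network=NETWORK \
    --direction=INGRESS \
    --source-ranges=CONTROL_PLANE_CIDR \
    --allow=tcp:8676
```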
Admission webhook connection refused
You might receive the following error when the reconciler tries to apply a config to the cluster:
KNV2009: Internal error occurred: failed calling webhook "v1.admission-webhook.configsync.gke.io": Post "https://admission-webhook.config-management-system.svc:8676/admission-webhook?timeout=3s": dial tcp 10.92.2.14:8676: connect: connection refused
This error means that the admission webhook isn't ready yet. It's a transient error that you might see when bootstrapping Config Sync.
If the issue persists, look at the admission webhook Deployment to see if its Pods can be scheduled and are healthy.
kubectl describe deploy admission-webhook -n config-management-system
kubectl get pods -n config-management-system -l app=admission-webhook
ResourceGroup resource exceeds the etcd object size limit
If you receive the following error when the reconciler tries to apply configurations to the cluster:
KNV2009: too many declared resources causing ResourceGroup.kpt.dev, config-management-system/root-sync failed to be applied: task failed (action: "Inventory", name: "inventory-add-0"): Request entity too large: limit is 3145728. To fix, split the resources into multiple repositories.
This error means that the ResourceGroup resource exceeds the etcd object size limit. We recommend that you split your Git repository into multiple repositories.
If you are not able to break up the Git repository, in Config Sync v1.11.0 and later you can mitigate the issue by disabling surfacing status data: set the field spec.override.statusMode of the RootSync or RepoSync object to disabled. Config Sync then stops updating the managed resources' status in the ResourceGroup object, which reduces its size. However, you will not be able to view the status of managed resources from either nomos status or gcloud alpha anthos config sync.
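For example, on a RootSync named root-sync:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  override:
    # Stop surfacing managed-resource status in the ResourceGroup object.
    statusMode: disabled
```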
Dependency apply reconcile timeout
You might receive an error similar to the following example when the reconciler tries to apply objects with the config.kubernetes.io/depends-on annotation to the cluster:
KNV2009: skipped apply of Pod, bookstore/pod4: dependency apply reconcile timeout: bookstore_pod3__Pod For more information, see https://g.co/cloud/acm-errors#knv2009
This error means the dependency object did not reconcile within the default reconcile timeout of 5 minutes. Config Sync cannot apply the dependent object because, with the config.kubernetes.io/depends-on annotation, Config Sync applies objects only in the order you specify. You can override the default reconcile timeout with a longer duration by setting spec.override.reconcileTimeout.
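For example, to raise the timeout to 10 minutes (an assumed value) on a RootSync named root-sync:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  override:
    reconcileTimeout: 10m   # default is 5m; choose a value that fits your dependencies
```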
Inventory info is nil
If you receive the following error when the reconciler tries to apply configurations to the cluster, it is likely that your inventory has no resources or the manifest has an unmanaged annotation:
KNV2009: inventory info is nil\n\nFor more information, see https://g.co/cloud/acm-errors#knv2009
To resolve this issue, try the following:
- Avoid setting up syncs where all resources have the configmanagement.gke.io/managed: disabled annotation, by ensuring at least one resource is managed by Config Sync.
- Add the configmanagement.gke.io/managed: disabled annotation only after completing an initial sync of the resource without this annotation.
Multiple inventory object templates
If you receive the following error when the reconciler tries to apply configurations to the cluster, it is likely that you have an inventory config from kpt in the source of truth, for example a Git repository:
KNV2009: Package has multiple inventory object templates. The package should have one and only one inventory object template. For more information, see https://g.co/cloud/acm-errors#knv2009
The issue happens because Config Sync manages its own inventory config. To resolve this issue, delete the inventory config in your source of truth.
KNV2010: resourceError
This error is a generic error indicating a problem with a resource or set of resources. The message includes the specific resources which caused the error.
KNV2010: Resources were improperly formatted.
Affected resources:
source: system/hc.yaml
group: configmanagement.gke.io
kind: Repo
KNV2011: MissingResourceError
This error indicates a specific resource is required to proceed, but the resource was not found. For example, ConfigManagement Operator tried to update a resource, but the resource was deleted while calculating the update.
KNV2012: MultipleSingletonsError
This error reports that more than one instance of an APIResource was found in a context where exactly one of that APIResource is allowed. For example, only one Repo resource may exist on a cluster.
KNV2013: InsufficientPermissionError
This error occurs when a namespace reconciler has insufficient permissions to manage resources. To fix, make sure the reconciler has sufficient permissions.
Example errors:
KNV2013: could not create resources: Insufficient permission. To fix, make sure the reconciler has sufficient permissions.: deployments.apps is forbidden: User 'Bob' cannot create resources
For more information, see https://g.co/cloud/acm-errors#knv2013
KNV2014: InvalidWebhookWarning
This warning occurs when the Config Sync webhook configuration is illegally modified. The illegal webhook configurations are ignored.
Example warning:
KNV2014: invalid webhook
KNV2015: InternalRenderingError
This error indicates that the rendering process encountered an internal issue, for example, being unable to access the file system. It might indicate that the Pod is not healthy. You can restart the reconciler Pod by running the following commands:
# restart a root reconciler
kubectl delete pod -n config-management-system -l configsync.gke.io/reconciler=root-reconciler
# restart a namespace reconciler
kubectl delete pod -n config-management-system -l configsync.gke.io/reconciler=ns-reconciler-<NAMESPACE>
If the error still persists after restart, create a bug report.
KNV2016: TransientError
This error represents a transient issue that should automatically resolve at a later time.
For example, when the source commit changed while listing files, the following error may occur:
KNV2016: source commit changed while listing files, was 90c1d9e9633a988ee3c3fc4dd145e62af30e9d1f, now 1d60597c56ebe07b269cc0d5ff126f638626c3b7. It will be retried in the next sync
For more information, see https://g.co/cloud/acm-errors#knv2016
Another example is when the rendering state doesn't match the source configs:
sync source contains only wet configs and hydration-controller is running
For more information, see https://g.co/cloud/acm-errors#knv2016
KNV9998: InternalError
KNV9998 indicates a problem with the nomos command itself. Please file a bug report with the exact command you ran and the message you received.
Example errors:
KNV9998: we made a mistake: internal error
For more information, see https://g.co/cloud/acm-errors#knv9998
KNV9999: UndocumentedError
You've encountered an error with no documented error message. We haven't yet written documentation specific to the error you encountered.
Other error messages
The KNV prefix is unique to Config Sync, but you might occasionally see an error message without the KNV prefix.
Cannot build exporters
When a component in the Open Telemetry Collector can't access the default service account in the same namespace, you might notice that the otel-collector Pod under config-management-monitoring is in CrashLoopBackOff status, or you might see an error message similar to one of the following:
Error: cannot build exporters: error creating stackdriver exporter: cannot configure Google Cloud metric exporter: stackdriver: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
Error: Permission monitoring.timeSeries.create denied (or the resource may not exist).
This issue usually happens when Workload Identity is enabled in a cluster.
To solve this problem, follow the instructions in Monitoring Config Sync to grant metric write permission to the default service account.
If the error persists after setting up the IAM policy, restart the otel-collector Pod for the changes to take effect.
If you are using a custom monitoring solution but forked the default otel-collector-googlecloud ConfigMap, check and rebase any differences.
Server certificate verification failed
If the git-sync container fails to clone the Git repository, you might see the following error message:
server certificate verification failed. CAfile:/etc/ca-cert/cert CRLfile: none
This message indicates that the Git server is configured with certificates from a custom Certificate Authority (CA). However, the custom CA is not properly configured, so the git-sync container fails to clone the Git repository.
First, verify that the spec.git.caCertSecretRef.name field has been specified in your RootSync or RepoSync object, and check that the Secret object exists.
If the field has been configured and the Secret object exists, ensure that the Secret object contains the full certificates. Depending on how the custom CA is provisioned, the approaches for checking the full certificates may vary. Here is an example of how to list the server certificates:
echo -n | openssl s_client -showcerts -connect HOST:PORT -servername SERVER_NAME 2>/dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
You can request your network administration team to obtain the CA certificates for you.
Unable to retrieve pull secret, the image pull may not succeed
If you're using a private registry with GKE on VMware, the Config Sync installation or upgrade can get stuck. You'll see an error similar to the following:
Error message: "MESSAGE": "Unable to retrieve pull secret, the image pull may not succeed." pod="config-management-system/config-management-operator-7d84fccc5c-khrx4" secret="" err="secret \"private-registry-creds\" not found"",
To resolve this issue, follow the steps in Update Policy Controller, Config Sync and Config Controller using a private registry before installing or upgrading Config Sync.
Webhook errors
The following section details errors that you might encounter when using the Config Sync admission webhook. To learn more about the webhook, see Prevent config drift.
Failed to delete all resource types
A namespace stuck in the Terminating
phase should have the following condition:
message: 'Failed to delete all resource types, 1 remaining: admission webhook
"v1.admission-webhook.configsync.gke.io" denied the request: system:serviceaccount:kube-system:namespace-controller
is not authorized to delete managed resource "_configmap_bookstore_cm1"'
reason: ContentDeletionFailed
status: "True"
type: NamespaceDeletionContentFailure
This error happens when you try to delete a namespace from a root repository while some objects under that namespace are still actively managed by a namespace reconciler. When a namespace is deleted, the namespace controller, whose service account is system:serviceaccount:kube-system:namespace-controller, tries to delete all the objects in that namespace. However, the Config Sync admission webhook only allows the root or namespace reconciler to delete these objects, and denies the namespace controller's requests.
To work around this issue, delete the Config Sync admission webhook:
kubectl delete deployment.apps/admission-webhook -n config-management-system
The ConfigManagement Operator recreates the Config Sync admission webhook.
If this workaround does not work, you might need to reinstall Config Sync.
To avoid running into the error again, remove the namespace repository before removing the namespace.
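Removing the namespace repository typically means deleting its RepoSync object before deleting the namespace itself. For example, with hypothetical names:

```shell
# Delete the RepoSync first (name and namespace are examples) so the
# namespace reconciler stops managing objects, then delete the namespace:
kubectl delete reposync repo-sync -n bookstore
kubectl delete namespace bookstore
```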
Webhooks field not found in ValidatingWebhookConfiguration
The following errors appear in the Config Sync admission webhook logs
when you run kubectl logs -n config-management-system -l app=admission-webhook,
if the root-reconciler hasn't synced any resources to the cluster:
cert-rotation "msg"="Unable to inject cert to webhook." "error"="`webhooks` field not found in ValidatingWebhookConfiguration" "gvk"={"Group":"admissionregistration.k8s.io","Version":"v1","Kind":"ValidatingWebhookConfiguration"} "name"="admission-webhook.configsync.gke.io"
controller-runtime/manager/controller/cert-rotator "msg"="Reconciler error" "error"="`webhooks` field not found in ValidatingWebhookConfiguration" "name"="admission-webhook-cert" "namespace"="config-management-system"
This error can happen because the root-reconciler isn't ready yet, or because there is
nothing to sync from the Git repository (for example, the sync directory is
empty). If the issue persists, check the health of the root-reconciler:
kubectl get pods -n config-management-system -l configsync.gke.io/reconciler=root-reconciler
If the root-reconciler
is crashlooping or OOMKilled,
increase its resource limits.
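One way to check for these termination reasons (a sketch; requires access to the cluster):

```shell
# Print each root-reconciler container's last termination reason;
# "OOMKilled" means the container ran out of memory:
kubectl get pods -n config-management-system \
  -l configsync.gke.io/reconciler=root-reconciler \
  -o jsonpath='{range .items[*].status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
```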
Admission webhook denied a request
If you receive the following error when you try to apply a change to a field that Config Sync manages, then you might have made a conflicting change:
error: OBJECT could not be patched: admission webhook "v1.admission-webhook.configsync.gke.io"
denied the request: fields managed by Config Sync can not be modified
When you declare a field in a config and your repository is synced to a cluster, Config Sync manages that field. Any change you attempt to make to that field is a conflicting change.
For example, if a Deployment config in your repository
has the label environment:prod
and you try to change
that label to environment:dev
in your cluster, that is a conflicting
change and you receive the preceding error message. However, if you add
a new label (for example, tier:frontend)
to the Deployment, there is no conflict.
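To illustrate with hypothetical object names, the first command below changes a field declared in the synced config and is denied, while the second adds an undeclared label and succeeds:

```shell
# Denied: environment is declared in the synced config.
kubectl label deployment my-app environment=dev --overwrite -n prod

# Allowed: tier is not declared in the synced config.
kubectl label deployment my-app tier=frontend -n prod
```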
If you want Config Sync to ignore any changes to an object, you can add the annotation described in Ignoring object mutations.
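As a sketch, the annotation looks like the following when added to a managed object (the Deployment name is an example):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Tell Config Sync to ignore mutations to this object:
    client.lifecycle.config.k8s.io/mutation: ignore
```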
Permission denied
If you receive an error similar to the following example when you try to configure Config Sync, you might not have the GKE Hub Admin role:
Permission 'gkehub.features.create' denied on 'projects/PROJECT_ID/locations/global/features/configmanagement'
To ensure that you have the required permissions, make sure you have granted the required IAM roles.
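For example, the GKE Hub Admin role (roles/gkehub.admin) can be granted with a command like the following, where PROJECT_ID and USER_EMAIL are placeholders:

```shell
# Grant the GKE Hub Admin role to a user on the project:
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:USER_EMAIL" \
  --role="roles/gkehub.admin"
```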
What's next
- Learn about monitoring Config Sync
- If you need additional support, reach out to Cloud Customer Care.