Introduction to troubleshooting

If you're having difficulties with Config Sync, this page introduces common tools and procedures that can help you identify and resolve the problem.

Upgrade to a supported version

Consider upgrading Config Sync to a supported version. Upgrading often resolves common problems and gives you access to the latest features.

Use the nomos command-line tool

The nomos command-line tool provides essential insights into your Config Sync setup. The commands described in the following sections are particularly helpful when you're trying to determine the source of your problem or when you need to work with Cloud Customer Care.

View Config Sync status

The nomos status command provides you with aggregated data and errors to help you understand what's happening with your Config Sync installation. The following information is available with nomos status:

  • Installation status per cluster
  • Syncing errors (both when reading from Git and when reconciling the changes)
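
As a minimal sketch of how you might run the status check described above, the following shell function wraps nomos status and fails with a clear message if the tool isn't installed. The function name check_sync_status is hypothetical, and the --contexts flag shown in the comment is an assumption that you should confirm with nomos status --help for your version.

```shell
# Sketch: wrapper around `nomos status` (function name is hypothetical).
# Any extra arguments are passed through to nomos unchanged.
check_sync_status() {
  if ! command -v nomos >/dev/null 2>&1; then
    echo "nomos is not installed or not on PATH" >&2
    return 127
  fi
  nomos status "$@"
}

# Example invocations (these need access to a cluster):
#   check_sync_status
#   check_sync_status --contexts my-cluster-context   # assumed flag and context name
```

Wrapping the command this way is optional; running nomos status directly gives the same output.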

Create a bug report

If you have a problem with Config Sync that requires help from Cloud Customer Care, you can provide them with valuable debugging information by using the nomos bugreport command.

This command generates a timestamped zip file with information on the Kubernetes cluster set in your kubectl context. The file also contains logs from Config Sync Pods. It doesn't contain information from the resources synced with Config Sync.
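
The following sketch shows one way to keep bug reports in a dedicated directory. It assumes that nomos bugreport writes its zip file to the current working directory; the function name collect_bugreport and the default directory name are hypothetical.

```shell
# Sketch: run `nomos bugreport` inside a dedicated directory and list the result.
# Assumption: nomos writes the zip file to the current working directory.
collect_bugreport() {
  report_dir="${1:-./config-sync-bugreport}"
  mkdir -p "$report_dir" || return 1
  # Use a subshell so the caller's working directory is unchanged.
  ( cd "$report_dir" && nomos bugreport ) || return 1
  ls -1 "$report_dir"/*.zip
}

# Before running, you might confirm the target cluster with:
#   kubectl config current-context
```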

View the overview dashboard

The Config Sync dashboard provides you with an overview of the status of the packages that Config Sync manages and the status of the resources in these packages. Exploring this dashboard can help you to get a quick overview of the status of your Config Sync installation and discover any packages that have issues.

  • To access the dashboard, in the Google Cloud console, go to the Config page in the Features section.

Use monitoring and log analysis

Monitoring Config Sync and exploring its logs can help you determine the source of bugs and better understand any unexpected behavior.

Understand Config Sync metrics

Use Config Sync metrics to gain visibility into the health of Config Sync.

Monitor RootSync and RepoSync objects

When you install Config Sync using the Google Cloud console or Google Cloud CLI, Config Sync automatically creates a RootSync object for you. When you configure syncing from multiple repositories, you can create RepoSync objects that contain configuration information about your namespace repositories.

Monitoring these objects can reveal valuable information about the state of Config Sync. To learn more, see Monitor RootSync and RepoSync objects.
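
As a rough sketch, you might list these objects with kubectl as follows. The function name list_sync_objects is hypothetical. The sketch assumes that RootSync objects are in the config-management-system namespace and that RepoSync objects can exist in any namespace.

```shell
# Sketch: list RootSync and RepoSync objects (function name is hypothetical).
list_sync_objects() {
  # RootSync objects live in the Config Sync system namespace.
  kubectl get rootsyncs.configsync.gke.io -n config-management-system || return 1
  # RepoSync objects are namespace-scoped, so search every namespace.
  kubectl get reposyncs.configsync.gke.io --all-namespaces
}
```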

Use service level indicators (SLIs)

To receive notifications when Config Sync isn't working as intended, use Config Sync SLIs.

Query logs

You can use the Logs Explorer to retrieve, view, and analyze log data for Config Sync. These logs can contain valuable historical data that isn't captured by nomos bugreport when the operator or reconciler Pods are restarted. For examples of queries that might help you diagnose your issue, see Query Config Sync logs.
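
If you prefer the command line, the following sketch builds a log filter for containers in the config-management-system namespace, which you could paste into the Logs Explorer or pass to gcloud logging read. The function name config_sync_log_filter and the cluster name in the comment are hypothetical, and the filter is only a starting point, not an official query.

```shell
# Sketch: build a Cloud Logging filter for Config Sync containers.
# An optional first argument narrows the filter to one cluster.
config_sync_log_filter() {
  filter='resource.type="k8s_container" AND resource.labels.namespace_name="config-management-system"'
  if [ -n "${1:-}" ]; then
    filter="$filter AND resource.labels.cluster_name=\"$1\""
  fi
  printf '%s\n' "$filter"
}

# Example (needs gcloud and access to the project's logs):
#   gcloud logging read "$(config_sync_log_filter my-cluster)" --limit=20 --freshness=1d
```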

Examine resources with the kubectl command-line tool

Config Sync is composed of multiple custom resources that you can query by using kubectl commands. These commands help you understand the status of each of Config Sync's objects.

You should know the following information about the Kubernetes resources that Config Sync manages:

  • config-management-system is the namespace we use to run all core system components of Config Sync.
  • configmanagement.gke.io and configsync.gke.io are the API groups we use for all custom resources.

Examples

The following sections show you how you might use kubectl commands to examine Config Sync.

List custom resources

  • You can get a full list of the custom resources by running the following command:

    kubectl api-resources | grep -E "configmanagement.gke.io|configsync.gke.io"
    
  • You can view an individual custom resource by running the following command:

    kubectl get RESOURCE -o yaml
    

    Replace RESOURCE with the name of the resource that you want to query.

    For example, the output of the following command lets you check the status of a RootSync object:

    kubectl get rootsync -n config-management-system -o yaml
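
When the full YAML is more than you need, a JSONPath expression can narrow the output to a single field. The following is a minimal sketch: the function name get_rootsync_field is hypothetical, and the object name root-sync and the .status.conditions field path in the comment are assumptions that you should check against the YAML output of your own RootSync object.

```shell
# Sketch: print one field of a RootSync object by using a JSONPath expression.
# $1 is the RootSync name, $2 is the JSONPath expression including braces.
get_rootsync_field() {
  name="$1"
  path="$2"
  kubectl get rootsync "$name" -n config-management-system -o "jsonpath=$path"
}

# Example (object name and field path are assumptions):
#   get_rootsync_field root-sync '{.status.conditions[*].type}'
```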
    

Check an object's token annotation

You might want to know when a managed Kubernetes object was last updated by Config Sync. Each managed object is annotated with the hash of the Git commit that last modified it and the path to the config that contained the modification.

For example, to get the annotations of a ClusterRoleBinding named namespace-readers, run the following command:

kubectl get clusterrolebinding namespace-readers -o yaml

The output is similar to the following:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    configmanagement.gke.io/source-path: cluster/namespace-reader-clusterrolebinding.yaml
    configmanagement.gke.io/token: bbb6a1e2f3db692b17201da028daff0d38797771
  name: namespace-readers
...

For more information, see labels and annotations.
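
To pull only the commit hash out of YAML output like the preceding example, you could filter it with sed. This is a sketch: the function name extract_token is hypothetical, and it assumes that the annotation appears on a single line in the form shown in the example output.

```shell
# Sketch: read object YAML on stdin and print the value of the
# configmanagement.gke.io/token annotation.
extract_token() {
  sed -n 's|^[[:space:]]*configmanagement\.gke\.io/token:[[:space:]]*||p'
}

# Example (needs access to a cluster):
#   kubectl get clusterrolebinding namespace-readers -o yaml | extract_token
# In a local clone of the synced repository, you could then inspect that commit:
#   git log -1 COMMIT_HASH
```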

Accelerate diagnosis with Gemini Cloud Assist

Sometimes, the cause of your issue isn't immediately obvious, even after using the tools discussed in the preceding sections. Investigating complex cases can be time-consuming and requires deep expertise. For scenarios like this, Gemini Cloud Assist can help. It can automatically detect hidden patterns, surface anomalies, and provide summaries to help you quickly pinpoint a likely cause.

Access Gemini Cloud Assist

To access Gemini Cloud Assist, complete the following steps:

  1. In the Google Cloud console, go to any page.
  2. In the Google Cloud console toolbar, click Open or close Gemini Cloud Assist chat.

    The Cloud Assist panel opens. You can click example prompts if they are displayed, or you can enter a prompt in the Enter a prompt field.

Explore example prompts

To help you understand how Gemini Cloud Assist can help you, here are some example prompts:

Theme: Initial setup

  • Scenario: A platform engineer is setting up Config Sync for the first time so that they can manage GKE clusters from a Git repository.
  • Example prompt: How do I set up Config Sync to sync manifests from my GitHub repository to my GKE cluster?
  • How Gemini Cloud Assist can help: Gemini Cloud Assist provides a step-by-step guide to setting up Config Sync, covering fleet registration and enabling the feature, and explaining details like repository URL, branch, path, and authentication methods (for example, public, token, or ssh).

Theme: Troubleshooting sync errors

  • Scenario: A developer commits a new manifest, but the resource fails to apply to the cluster, and the sync status shows an error code.
  • Example prompt: My Config Sync RootSync object shows "KNV2009: the server could not find the requested resource". What does this mean and how do I fix it?
  • How Gemini Cloud Assist can help: Gemini Cloud Assist analyzes the error code, explaining that it generally indicates Config Sync cannot locate or interact with an expected Kubernetes resource. It then details common causes, including missing RBAC permissions, exceeding resource object size limits, incorrect directory paths, external inventory conflicts, and issues with unmanaged resources, providing specific troubleshooting steps for each cause.

Theme: Managing multiple teams

  • Scenario: An organization needs to allow application teams to manage their own configurations in specific namespaces without giving them access to the central platform repository.
  • Example prompt: What's the difference between a RootSync and a RepoSync object in Config Sync? When should I use RepoSync?
  • How Gemini Cloud Assist can help: Gemini Cloud Assist explains the core difference between RootSync and RepoSync objects: RootSync objects are cluster-scoped and typically used by administrators for cluster-wide configurations, while RepoSync objects are namespace-scoped and designed for application teams to manage resources within a specific namespace, promoting delegation and multi-tenancy.

    Gemini Cloud Assist also details scenarios where RepoSync objects should be used, emphasizing its benefits for multi-tenancy and reducing the affected area of misconfigurations.

Theme: Proactive validation

  • Scenario: A developer wants to ensure their new manifest is valid before committing it to the repository to avoid breaking the sync in production.
  • Example prompt: How can I check my Kubernetes manifests for Config Sync errors on my local machine before I push them to the Git repository?
  • How Gemini Cloud Assist can help: Gemini Cloud Assist explains how to check Kubernetes manifests for Config Sync errors by using the nomos command-line tool. It details how to use the nomos vet command for syntax validation and the nomos hydrate command for previewing rendered configurations from Kustomize or Helm. Gemini Cloud Assist also outlines a recommended workflow to integrate these checks before pushing to Git.
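
The local check mentioned in the proactive validation scenario might look like the following sketch. The function name validate_configs is hypothetical. The flags are assumptions based on common nomos vet usage for an unstructured repository, so confirm them with nomos vet --help before relying on them.

```shell
# Sketch: validate local configs before pushing them to Git.
# $1 is the directory that contains the configs (defaults to the current directory).
validate_configs() {
  config_dir="${1:-.}"
  # --no-api-server-check skips contacting a cluster (assumed flag).
  nomos vet --path "$config_dir" --source-format unstructured --no-api-server-check
}

# Example:
#   validate_configs ./my-config-repo && git push
```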

Use Gemini Cloud Assist Investigations

In addition to interactive chat, Gemini Cloud Assist can perform more automated, in-depth analysis through Gemini Cloud Assist Investigations. This feature is integrated directly into workflows like Logs Explorer, and is a powerful root-cause analysis tool.

When you initiate an investigation from an error or a specific resource, Gemini Cloud Assist analyzes logs, configurations, and metrics. It uses this data to produce ranked observations and hypotheses about probable root causes, and then provides you with recommended next steps. You can also transfer these results to a Google Cloud support case to provide valuable context that can help you resolve your issue faster.

For more information, see Gemini Cloud Assist Investigations in the Gemini documentation.

Read additional troubleshooting documentation

If you're still experiencing problems, the following resources might be helpful:

  • If you've received an error message, see the error reference page for advice on resolving the error.

  • Check to see if the problem you're having is caused by a known issue.

  • If you're having difficulties with a specific area, one of the targeted troubleshooting guides listed in the Troubleshoot by issue type section of the table of contents might help.

What's next

  • If you can't find a solution to your problem in the documentation, see Get support for further help.