Monitor Config Sync with Cloud Monitoring

This page describes how to send metrics from Config Sync to Cloud Monitoring.

Config Sync uses OpenTelemetry to create, record, and export its metrics. This page explains how to configure Cloud Monitoring metrics. For other ways to export metrics, see Monitor Config Sync with Prometheus or Monitor Config Sync with custom monitoring.

Configuring Cloud Monitoring metrics requires the iam.serviceAccounts.setIamPolicy permission on the project. You can view these metrics with Metrics Explorer or by using the Cloud Monitoring API. For examples of how to view these metrics, see Example debugging procedures.

Grant metric-writing permission for Cloud Monitoring

To configure Cloud Monitoring for Config Sync, you must grant metric-writing permission to a service account in your project. Which service account needs the permission depends on whether Workload Identity Federation for GKE is enabled.

Configure Cloud Monitoring with Workload Identity Federation for GKE

If Workload Identity Federation for GKE is enabled, allow Config Sync to send metrics by running this command:

gcloud projects add-iam-policy-binding PROJECT_ID \
   --role=roles/monitoring.metricWriter \
   --member="serviceAccount:PROJECT_ID.svc.id.goog[config-management-monitoring/default]"

Replace PROJECT_ID with the cluster's project ID.
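
As a minimal sketch, the member string from the command above can be built from the project ID so that it is not mistyped, and the resulting binding can be listed afterwards. The helper names wi_member and list_metric_writers are hypothetical; the namespace and service account name are taken from the command above, and the listing function assumes that gcloud is installed and authenticated.

```shell
# Hypothetical helper: builds the Workload Identity member string used in the
# command above for a given project ID.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[config-management-monitoring/default]\n' "$1"
}

# Hypothetical helper: lists the members that hold roles/monitoring.metricWriter
# on a project. Defined here but not called, because it needs gcloud access.
list_metric_writers() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/monitoring.metricWriter" \
    --format="value(bindings.members)"
}

# Example with a placeholder project ID:
wi_member "my-project"
```

You could then pass --member="$(wi_member PROJECT_ID)" to the command above and run list_metric_writers PROJECT_ID to check that the member appears.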

Configure Cloud Monitoring without Workload Identity Federation for GKE

If Workload Identity Federation for GKE is not enabled and Config Sync is running inside a Google Cloud environment, you can use the Compute Engine default service account. If automatic Editor role (roles/editor) grants are disabled, grant the service account the Monitoring Metric Writer (roles/monitoring.metricWriter) IAM role by running the following command:

gcloud projects add-iam-policy-binding PROJECT_ID \
   --role=roles/monitoring.metricWriter \
   --member=serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com

Replace the following:

  • PROJECT_ID: your project ID.
  • PROJECT_NUMBER: your project number.
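
If you only know the project ID, the project number can be looked up with gcloud. The following is a sketch under that assumption; the helper names project_number and compute_sa_member are hypothetical, and the lookup function needs permission to view the project.

```shell
# Hypothetical helper: builds the member string for the Compute Engine default
# service account from a project number.
compute_sa_member() {
  printf 'serviceAccount:%s-compute@developer.gserviceaccount.com\n' "$1"
}

# Hypothetical helper: prints the project number for a project ID.
# Defined here but not called, because it needs gcloud access.
project_number() {
  gcloud projects describe "$1" --format='value(projectNumber)'
}

# Example with a placeholder project number:
compute_sa_member "123456789012"
```

With gcloud access, the two helpers can be combined as --member="$(compute_sa_member "$(project_number PROJECT_ID)")".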

Default list of metrics in Cloud Monitoring

Name                            Type
api_duration_seconds            Distribution
apply_duration_seconds          Distribution
apply_operations_total          Count
declared_resources              Last Value
internal_errors_total           Count
last_sync_timestamp             Last Value
pipeline_error_observed         Last Value
reconciler_errors               Last Value
resource_fights_total           Count
reconcile_duration_seconds      Distribution
resource_group_total            Last Value
resource_count                  Last Value
ready_resource_count            Last Value
resource_ns_count               Last Value
cluster_scoped_resource_count   Last Value
kcc_resource_count              Gauge

To modify the metrics allowlist in Cloud Monitoring, follow the instructions to patch the otel-collector Deployment with a ConfigMap.

Example debugging procedures for Cloud Monitoring

The following Cloud Monitoring examples illustrate some patterns for using Config Sync metrics to detect and diagnose problems when you are using the RootSync and RepoSync APIs.

Metric format

In Cloud Monitoring, metrics have the following format: custom.googleapis.com/opencensus/config_sync/METRIC.

This metric name is composed of the following components:

  • custom.googleapis.com: all custom metrics have this prefix
  • opencensus: this prefix is added because Config Sync uses the OpenCensus library
  • config_sync/: metrics that Config Sync exports to Cloud Monitoring have this prefix
  • METRIC: the name of the metric that you want to query
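
The format above can be captured in a small shell helper, which is handy when scripting queries. This is a sketch only; the function name cs_metric_type is hypothetical.

```shell
# Hypothetical helper: prints the full Cloud Monitoring metric type for a
# Config Sync metric name, using the prefix described above.
cs_metric_type() {
  printf 'custom.googleapis.com/opencensus/config_sync/%s\n' "$1"
}

# Prints custom.googleapis.com/opencensus/config_sync/reconciler_errors
cs_metric_type "reconciler_errors"
```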

Query metrics by reconciler

RootSync and RepoSync objects are instrumented with high-level metrics that give you useful insight into how Config Sync is operating on the cluster. Almost all metrics are tagged by the reconciler name, so you can see if any errors have occurred and can set up alerts for them in Cloud Monitoring.

A reconciler is a Pod that is deployed as a Deployment. It syncs manifests from a source of truth to a cluster. When you create a RootSync object, Config Sync creates a reconciler called root-reconciler-ROOT_SYNC_NAME, or root-reconciler if the RootSync object is named root-sync. When you create a RepoSync object, Config Sync creates a reconciler called ns-reconciler-NAMESPACE-REPO_SYNC_NAME-REPO_SYNC_NAME_LENGTH, or ns-reconciler-NAMESPACE if the RepoSync object is named repo-sync. NAMESPACE is the namespace in which you created the RepoSync object.
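
The naming rules above can be sketched as two shell functions that predict the reconciler name to filter on. The function names are hypothetical, and the sketch assumes that REPO_SYNC_NAME_LENGTH is the number of characters in the RepoSync name.

```shell
# Hypothetical helper: reconciler name for a RootSync object.
root_reconciler_name() {
  if [ "$1" = "root-sync" ]; then
    printf 'root-reconciler\n'
  else
    printf 'root-reconciler-%s\n' "$1"
  fi
}

# Hypothetical helper: reconciler name for a RepoSync object.
# $1 is the namespace, $2 is the RepoSync name.
# Assumption: REPO_SYNC_NAME_LENGTH is the character count of the name.
ns_reconciler_name() {
  ns=$1
  name=$2
  if [ "$name" = "repo-sync" ]; then
    printf 'ns-reconciler-%s\n' "$ns"
  else
    printf 'ns-reconciler-%s-%s-%s\n' "$ns" "$name" "${#name}"
  fi
}

# Examples with placeholder names:
root_reconciler_name "root-sync"        # prints root-reconciler
ns_reconciler_name "gamestore" "team-a" # prints ns-reconciler-gamestore-team-a-6
```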

The following diagram shows you how reconciler Pods function when the source of truth is a Git repository:

[Diagram: Reconciler flow]

For example, to filter by the reconciler name when you are using Cloud Monitoring, complete the following tasks:

  1. In the Google Cloud console, go to Monitoring.

  2. In the Monitoring navigation pane, click Metrics explorer.

  3. In the Select a metric drop-down list, add custom.googleapis.com/opencensus/config_sync/reconciler_errors.

  4. In the Filter drop-down list, select reconciler. A filter fields box appears.

  5. In the filter fields box, select = in the first field and the reconciler name (for example, root-reconciler) in the second.

  6. Click Apply.

You can now see the reconciler_errors metric for the reconciler that you selected.

For more instructions on how to filter by a specific data type, see Filtering the data.
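
If you query through the Cloud Monitoring API instead of Metrics Explorer, the same selection is expressed as a filter string. The following sketch builds one; the function name cs_filter is hypothetical, and it assumes that the label key in the API matches the reconciler label shown in the Filter drop-down list.

```shell
# Hypothetical helper: builds a Cloud Monitoring filter for one Config Sync
# metric and one label. $1: metric name, $2: label key, $3: label value.
# Assumption: the API label key matches the name shown in Metrics Explorer.
cs_filter() {
  printf 'metric.type="custom.googleapis.com/opencensus/config_sync/%s" AND metric.labels.%s="%s"\n' "$1" "$2" "$3"
}

# Filter equivalent to the console steps above:
cs_filter "reconciler_errors" "reconciler" "root-reconciler"
```

The resulting string can be supplied as the filter value when you list time series with the Cloud Monitoring API.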

Query Config Sync operations by component and status

When you have enabled the RootSync and RepoSync APIs, importing and sourcing from a source of truth and syncing to a cluster are handled by the reconcilers. The reconciler_errors metric is labeled by component so you can see where any errors occurred.

For example, to filter by component when you are using Cloud Monitoring, complete the following tasks:

  1. In the Google Cloud console, go to Monitoring.

  2. In the Monitoring navigation pane, click Metrics explorer.

  3. In the Select a metric drop-down list, add custom.googleapis.com/opencensus/config_sync/reconciler_errors.

  4. In the Filter drop-down list, select component. A filter fields box appears.

  5. In the filter fields box, select = in the first field and source in the second.

  6. Click Apply.

You can now see errors that occurred when sourcing from a source of truth for your reconcilers.
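
To narrow the result to source errors for a single reconciler, several label conditions can be joined with AND in a Cloud Monitoring filter. The following is a sketch only; the function name cs_filter_all is hypothetical, and the label keys are assumed to match the names shown in the Filter drop-down list.

```shell
# Hypothetical helper: builds a filter for one Config Sync metric and any
# number of key=value label conditions, joined with AND.
# $1: metric name; remaining arguments: key=value pairs.
cs_filter_all() {
  metric=$1
  shift
  filter="metric.type=\"custom.googleapis.com/opencensus/config_sync/${metric}\""
  for pair in "$@"; do
    key=${pair%%=*}    # text before the first "="
    value=${pair#*=}   # text after the first "="
    filter="${filter} AND metric.labels.${key}=\"${value}\""
  done
  printf '%s\n' "$filter"
}

# Source errors for the root-reconciler only:
cs_filter_all "reconciler_errors" "component=source" "reconciler=root-reconciler"
```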

You can also check the metrics for the source and sync processes themselves by querying the following metrics and filtering by the status tag:

custom.googleapis.com/opencensus/config_sync/parser_duration_seconds
custom.googleapis.com/opencensus/config_sync/apply_duration_seconds
custom.googleapis.com/opencensus/config_sync/remediate_duration_seconds