In some situations, you might need to quickly stop Config Sync from syncing configs from your repo. One such scenario is if someone commits a syntactically valid but incorrect config to the repo, and you want to limit its effects on your running clusters while the config is removed or fixed.
This topic applies to syncing from a single repository. It shows you how to quickly stop syncing and how to resume syncing after the problem is fixed. To learn how to stop syncing for multiple repositories, see Syncing from multiple repositories.
Prerequisites
The user running the commands discussed in this topic needs the following Kubernetes RBAC permissions in the kube-system and config-management-system namespaces on all clusters where you want to stop syncing:
- apiGroups: ["extensions"]
  resources: ["deployments", "deployments/scale"]
  verbs: ["get", "update"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["list", "watch"]
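Before you need to stop syncing in a hurry, it can help to confirm that your account holds these permissions. The following is a minimal sketch that uses kubectl auth can-i against the current kubectl context. The function name check_stop_sync_permissions is hypothetical. Note that kubectl resolves deployments to whichever API group the cluster serves, which might differ from the group named in the rules above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Config Sync): prints each permission
# from the list above that the current user lacks, and returns non-zero
# if any is missing. Checks the current kubectl context only.
check_stop_sync_permissions() {
  local ns verb missing=0
  for ns in kube-system config-management-system; do
    for verb in get update; do
      kubectl auth can-i "$verb" deployments -n "$ns" >/dev/null 2>&1 \
        || { echo "missing: $verb deployments in $ns"; missing=1; }
      # The scale subresource is checked through --subresource.
      kubectl auth can-i "$verb" deployments --subresource=scale -n "$ns" >/dev/null 2>&1 \
        || { echo "missing: $verb deployments/scale in $ns"; missing=1; }
    done
    for verb in list watch; do
      kubectl auth can-i "$verb" pods -n "$ns" >/dev/null 2>&1 \
        || { echo "missing: $verb pods in $ns"; missing=1; }
    done
  done
  return "$missing"
}
```

To use it, source the file and run check_stop_sync_permissions. No output and a zero exit status suggest that the permissions are in place.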
Stopping syncing
To stop syncing, run the following commands, which are provided as a single command for your convenience, but can also be run separately:
kubectl -n kube-system scale deployment config-management-operator --replicas=0 \
  && kubectl wait -n kube-system --for=delete pods -l k8s-app=config-management-operator \
  && kubectl scale deployment -n config-management-system --replicas=0 --all \
  && kubectl wait -n config-management-system --for=delete pods --all
The commands do the following, in sequence. If a command fails, the remaining commands do not run.
- Reduce the replicas count in the Config Sync Operator Deployment to 0, then wait for the Operator Pods to be deleted.
- Reduce the replicas count of all Deployments in the config-management-system namespace to 0, then wait for their Pods to be deleted. The exact set of Pods affected varies by product version.
All deployments are still in the cluster, but no replicas of the Operator or any of the processes responsible for syncing are available, so configs are not synced from the repo.
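To confirm that syncing is stopped, you can read the replica counts back from the cluster. The following is a minimal sketch, assuming the Deployment and namespace names used in the commands above. The function name sync_is_stopped is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the Operator Deployment and every
# Deployment in config-management-system have spec.replicas set to 0.
sync_is_stopped() {
  local operator others
  operator="$(kubectl get deployment config-management-operator -n kube-system \
    -o jsonpath='{.spec.replicas}')" || return 1
  # One replica count per line, one line per Deployment.
  others="$(kubectl get deployments -n config-management-system \
    -o jsonpath='{range .items[*]}{.spec.replicas}{"\n"}{end}')" || return 1
  [ "$operator" = "0" ] || return 1
  # Any digit from 1 to 9 means some Deployment still has replicas.
  if printf '%s\n' "$others" | grep -qE '[1-9]'; then
    return 1
  fi
  return 0
}
```

For example, after sourcing the file, run: sync_is_stopped && echo "syncing is stopped"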
Stopping syncing on all enrolled clusters
If you need to stop syncing on all enrolled clusters in a single Google Cloud project, rather than a single cluster at a time, you can create a script that uses the nomos status command to get the list of all enrolled clusters. The script then creates a kubectl context for each cluster using the gcloud container clusters get-credentials command and runs the commands from the previous section on each of them. The following is a naive example of such a script:
#!/bin/bash
# Naive example: stop syncing on every cluster that nomos status reports as SYNCED.
nomos status | grep SYNCED | awk '{print $1}' | while read -r i; do
  # Create a kubectl context for the cluster, then scale everything to 0.
  gcloud container clusters get-credentials "$i"
  kubectl -n kube-system scale deployment config-management-operator --replicas=0 \
    && kubectl wait -n kube-system --for=delete pods -l k8s-app=config-management-operator \
    && kubectl scale deployment -n config-management-system --replicas=0 --all \
    && kubectl wait -n config-management-system --for=delete pods --all
done
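Because the script acts on every matching cluster, you might want to preview the list of targets first. The following is a minimal sketch of a dry run. It assumes, as the script above does, that the cluster name is the first field of each nomos status line that contains SYNCED. The function names synced_clusters and preview_targets are hypothetical.

```shell
#!/bin/bash
# Prints the first field of every stdin line that contains SYNCED.
# Does the same work as the grep and awk stages of the script above.
synced_clusters() {
  awk '/SYNCED/ { print $1 }'
}

# Hypothetical dry-run helper: reads names from stdin and prints what
# would be targeted, without calling gcloud or kubectl.
preview_targets() {
  local name
  while read -r name; do
    echo "would stop syncing on: $name"
  done
}
```

For example, after sourcing the file, run: nomos status | synced_clusters | preview_targets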
Resuming syncing
To resume syncing, run the following command:
kubectl -n kube-system scale deployment config-management-operator --replicas=1
This command scales the Operator Deployment to 1 replica. The Operator then notices that the Deployments in the config-management-system namespace are scaled incorrectly and scales them to their appropriate replica counts.
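To check that the Operator came back after you resume syncing, you can wait for its rollout and then list the Deployments it manages. The following is a minimal sketch, assuming the names used in the commands above. The function name wait_for_resume and the default timeout of 120s are hypothetical choices.

```shell
#!/bin/bash
# Hypothetical helper: waits for the Operator Deployment to become ready,
# then lists the Deployments in config-management-system so that you can
# inspect their replica counts.
wait_for_resume() {
  local timeout="${1:-120s}"  # assumed default; adjust for your cluster
  kubectl rollout status deployment/config-management-operator \
    -n kube-system --timeout="$timeout" || return 1
  kubectl get deployments -n config-management-system
}
```

For example, after sourcing the file, run: wait_for_resume 60s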
Resuming syncing on all enrolled clusters
If you need to resume syncing on all enrolled clusters in a Google Cloud project, rather than a single cluster at a time, you can create a script that uses nomos status to get the list of all enrolled clusters. The script then creates a kubectl context for each cluster using the gcloud container clusters get-credentials command and runs the resume command from the previous section on each of them. The following is a naive example of such a script:
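#!/bin/bash
# Naive example: resume syncing on every cluster that nomos status reports as SYNCED.
nomos status | grep SYNCED | awk '{print $1}' | while read -r i; do
  # Create a kubectl context for the cluster, then scale the Operator back to 1.
  gcloud container clusters get-credentials "$i"
  kubectl -n kube-system scale deployment config-management-operator --replicas=1
done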