[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2024-11-26。"],[],[],null,["# Configure Config Controller for high-availability\n\nThis page shows you how best to use Config Controller when operating\nhighly-available services or managing resources in multiple Google Cloud regions.\n\nThis page is for Admins and architects and Operators who\nmanage the lifecycle of the underlying tech infrastructure and plan capacity and\ninfrastructure needs. To learn more about common roles and example tasks that we\nreference in Google Cloud content, see\n[Common GKE user roles and tasks](/kubernetes-engine/enterprise/docs/concepts/roles-tasks).\n\nConfig Controller runs in a single region,\nso it can tolerate the failure of an availability zone, but if an entire region\nfails, Config Controller loses availability. There are two different\nstrategies to deal with regional failure, and your choice depends on what you\nwould do if a region fails:\n\n- If you would make configuration changes in response to a regional failure,\n [create a second Config Controller instance](#second-config-controller-manual-failover).\n\n- If you would not make configuration changes,\n [use a single Config Controller instance](#single-config-controller).\n\n| **Note:** While planning for failure is important, [regions offer sufficient availability for many applications](/compute/docs/regions-zones).\n\nUnderstand failure scenarios\n----------------------------\n\nConfig Controller uses a\n[regional GKE cluster](/kubernetes-engine/docs/concepts/regional-clusters).\nAlthough the regional cluster can tolerate the failure of a single zone in a\nregion, the cluster becomes unavailable if multiple zones in the region fail.\n\nIf your Config Controller instance fails, your existing Google Cloud\nresources remain in their current state. However, even if your applications are\nstill running, you cannot change their configuration when Config Controller\nis unavailable. This applies to resources in the same region and to\nresources in other regions that you are managing from the Config Controller\nin the unavailable region.\n\nBecause you can't reconfigure resources in the same region, if a regional\nfailure *does* affect existing Google Cloud resources in the Config Controller\nregion, you cannot repair those resources.\n\nBecause you also can't reconfigure resources in other regions, a failure\nin one region has now affected your ability to make changes in another region.\n\nOther failure scenarios are also possible. For example, if you configure\nConfig Sync to pull from an external Git provider, you should consider the\nfailure modes of that Git provider. You might not be able to make configuration\nchanges because you cannot push changes to that Git provider. Or if\nConfig Sync cannot read from Git, then any Git changes aren't applied to the\ncluster, and so Config Connector does not apply them. 
However, regional failure is often the most important failure scenario, because other failure scenarios are typically uncorrelated with Config Controller failure.

Use a single cluster for regional availability
----------------------------------------------

In many scenarios, you would not perform any reconfiguration if a region fails. In that case, you might choose to accept that a regional failure makes your configuration control plane unavailable.

For example, if you only operate in a single region, there might not be any useful reconfiguration you can do if that region fails. Similarly, if you have a single-point-of-failure database in one region, you might not be able to recover until that region recovers. For applications that do not need the absolute highest availability, this can be a reasonable trade-off against cost and complexity.

Locating the Config Controller instance in the same region gives you a shared fate: Config Controller is available as long as your primary region is available. Locating the Config Controller instance in a *different* region can also be a good choice; although you now have to think about potential failures in two regions, you avoid having your configuration control plane fail at the same time as your primary region.

Alternatively, if you have a multi-regional redundant configuration, your system might automatically steer traffic away from failed regions. Here too, you might not want to do any reconfiguration if a region fails, so you might choose a single Config Controller instance.

Manually fail over to a second Config Controller instance
----------------------------------------------------------

You might want to do some reconfiguration if a region fails so that you can remedy the failure. You might also want to continue configuring resources in other regions, even if your Config Controller instance is located in a failed region. In this case, we recommend using a second Config Controller instance in a second region.

Though it is not recommended, two Config Controller instances can run with identical configurations. Both instances race to read from the same Git repository and apply the same changes to Google Cloud. However, numerous edge cases make this configuration unreliable. The two Config Controller instances observe the Git repository at slightly different times, so they might attempt to apply slightly different versions of your Google Cloud configuration. Multiple active writers to Google Cloud also make it more likely that you encounter quotas or rate limits. A small number of Config Connector resources are also [not idempotent](/config-connector/docs/how-to/managing-deleting-resources#resources_with_restrictions_around_acquisition), and need extra care as discussed in the rest of this section. We therefore recommend against having two Config Controller clusters both actively reconciling.

Instead, we recommend that if the region running your Config Controller instance fails, you run another Config Controller instance in a second region. The new instance should be configured identically to the first one, reading from the same Git repository. Preparing a script in advance to bring up and configure the replacement Config Controller instance can be useful in this scenario, as in the sketch that follows.
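The following is a minimal sketch of such a failover script, assuming an unstructured Config Sync repository. The standby region, instance name, and repository URL are hypothetical placeholders; in practice, reuse whatever RootSync definition and authentication method your first instance uses rather than the public-repository settings shown here.

```sh
#!/bin/bash
# Failover sketch: create a standby Config Controller instance and point its
# Config Sync at the same Git repository as the primary instance.
# All names, regions, and URLs below are hypothetical placeholders.
set -euo pipefail

STANDBY_REGION="us-east1"
INSTANCE_NAME="config-controller-dr"
GIT_REPO="https://github.com/example-org/config-repo"
GIT_BRANCH="main"

# Create the standby Config Controller instance.
gcloud anthos config controller create "${INSTANCE_NAME}" \
    --location="${STANDBY_REGION}"

# Fetch credentials so that kubectl targets the new instance.
gcloud anthos config controller get-credentials "${INSTANCE_NAME}" \
    --location="${STANDBY_REGION}"

# Configure Config Sync identically to the first instance by applying the
# same RootSync definition (auth: none assumes a public repository).
kubectl apply -f - <<EOF
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: ${GIT_REPO}
    branch: ${GIT_BRANCH}
    dir: /
    auth: none
EOF
```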
When you create the new Config Controller instance, Config Sync reads and applies the desired state from Git to Kubernetes, and Config Connector synchronizes that desired state to Google Cloud.

There are two things to be careful of in this situation:

- If the first Config Controller cluster is still running, or starts running again when the first region recovers, it might attempt to apply the old state to Google Cloud. If you can stop the Config Controller cluster in the first region before starting the second one, you avoid this potential conflict.

- Not all Config Connector resources can be seamlessly reapplied from Git. For the list of resources that need special care, see [resources with restrictions around acquisition](/config-connector/docs/how-to/managing-deleting-resources#resources_with_restrictions_around_acquisition). In particular, we recommend being careful around `Folder` resources and avoiding `IAMServiceAccountKey` resources (for example, by using Workload Identity Federation for GKE instead).

| **Note:** If you need to reconfigure resources when a region fails, consider running Config Controller in a *different* region from your primary Google Cloud services. This configuration helps you avoid a single regional failure affecting your Config Controller instance at the exact time you need it.

One Config Controller instance per region
-----------------------------------------

If you want to avoid a Config Controller instance in one region affecting another region, you might also consider running one Config Controller instance per region, where each instance manages the resources in its own region.

This configuration is workable, but it isn't one of our recommended options, for the following reasons:

- Some resources span multiple regions (such as Cloud DNS), which limits this strategy.

- Having a Config Controller cluster in the same region as the resources it manages runs into the correlated-failure problem: you want to reconfigure resources exactly when a regional failure affects the Config Controller instance in that region.

- You have to split up your Config Connector resources by region.

- Config Controller is not currently available in all regions.

Directly configuring Google Cloud resources
-------------------------------------------

In exceptional circumstances, you might make changes directly to the underlying Google Cloud resources, without going through Git or Config Connector. Config Connector tries to remediate any "drift", so if your Config Controller instance is still running, Config Connector considers any changes you make manually to be drift and tries to revert them.

However, if you stop your Config Controller instance, or if the region is offline, making direct changes can be a useful stop-gap measure.

When your Config Controller instance recovers, Config Connector will likely try to revert your manual changes. To avoid this, make corresponding changes in Git for any changes you make manually.
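For example, the following sketch shows an urgent change made directly with gcloud during an outage, followed by the matching edit committed to the Git repository that Config Sync watches. The topic name, label, and repository layout are hypothetical.

```sh
# Sketch: make an urgent manual change, then record it in Git so that
# Config Connector does not revert it when the instance recovers.
# The topic name, label, and repository layout are hypothetical.

# 1. Make the change directly against Google Cloud.
gcloud pubsub topics update my-topic --update-labels=failover=active

# 2. As soon as you can, make the matching change in the repository that
#    Config Sync watches, for example by adding `failover: active` under
#    metadata.labels in the PubSubTopic manifest.
cd config-repo
"${EDITOR:-vi}" namespaces/my-project/pubsub-topic.yaml
git add namespaces/my-project/pubsub-topic.yaml
git commit -m "Record manual failover label on my-topic"
git push
```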