Last updated (UTC): 2025-09-05.

# About manual failover

This page gives an overview of **manual failover** for Memorystore for Redis. To
learn how to perform a failover, see [Initiating a manual failover](/memorystore/docs/redis/initiating-manual-failover).

What is a manual failover?
--------------------------

A standard tier Memorystore for Redis instance uses a replica node to back up
the primary node. A normal failover occurs when the primary node becomes
unhealthy, causing the replica to be designated as the new primary. A manual
failover differs from a normal failover because you initiate it yourself. For
more information on how Memorystore for Redis replication works, see [High availability](/memorystore/docs/redis/high-availability).

Why initiate a manual failover?
-------------------------------

Initiating a manual failover allows you to test how your application responds to
a failover.
This knowledge can ensure a smoother failover process if an
unexpected failover occurs later on.

Optional data protection mode
-----------------------------

The two available data protection modes are:

- `limited-data-loss` mode (default).
- `force-data-loss` mode.

To set the data protection mode, use one of the following commands:

    gcloud redis instances failover INSTANCE_NAME --data-protection-mode=limited-data-loss

or

    gcloud redis instances failover INSTANCE_NAME --data-protection-mode=force-data-loss

How data protection modes work
------------------------------

The `limited-data-loss` mode minimizes data loss by verifying that the
difference in data between the primary and replica is below 30 MB before
initiating the failover. The offset on the primary is incremented for each byte
of data that must be synchronized to its replicas. In `limited-data-loss`
mode, the failover aborts if the greatest offset delta between the primary
and any replica is 30 MB or greater. If you can tolerate more data loss and want
to execute the failover aggressively, set the data protection mode to
`force-data-loss`.

The `force-data-loss` mode employs a chain of failover strategies to
execute the failover aggressively. It does not check the offset delta between
the primary and replicas before initiating the failover, so you can potentially
lose more than 30 MB of data changes.

Bytes pending replication metric
--------------------------------

The **bytes pending replication** metric tells you how many remaining bytes the
replica needs to copy before the primary is fully backed up. You may observe
an increase in bytes pending replication as the primary replicates to the
replica during a failover.
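The offset check described above can be modeled as a small decision function. This is an illustrative sketch, not Memorystore code: the 30 MB threshold and the skip-check behavior of `force-data-loss` come from this page, while the function and parameter names are invented for the example (Memorystore performs this check server-side).

```python
# Illustrative model of the data protection mode check; names are hypothetical.
LIMITED_DATA_LOSS_THRESHOLD = 30 * 1024 * 1024  # 30 MB

def failover_allowed(primary_offset: int, replica_offsets: list[int],
                     data_protection_mode: str = "limited-data-loss") -> bool:
    """Return True if a manual failover would proceed under the given mode."""
    if data_protection_mode == "force-data-loss":
        # force-data-loss skips the offset-delta check entirely.
        return True
    # limited-data-loss aborts if the greatest offset delta between the
    # primary and any replica is 30 MB or greater.
    max_delta = max(primary_offset - offset for offset in replica_offsets)
    return max_delta < LIMITED_DATA_LOSS_THRESHOLD
```

In this model, the same replication state that blocks a `limited-data-loss` failover is allowed through when you opt into `force-data-loss`.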
If the failover is triggered by a hardware error, the **bytes pending
replication** metric may be empty, because the offset value cannot be obtained
until the new replica recovers from the host error.

You can access this
metric in the Google Cloud console on the instance details page. To view the
instance details page, click the instance ID in your project's [instances list](http://console.cloud.google.com/memorystore/redis/instances?memorystore=true)
page.

Alternatively, access the Metrics Explorer for your
project, and search for the **redis.googleapis.com/replication/offset_diff**
metric.

When to run a manual failover
-----------------------------

Manual failovers using the default `limited-data-loss` protection mode only
succeed if the **bytes pending replication** metric is less than 30 MB. If you
want to run a manual failover with **bytes pending replication** higher than
30 MB, use the `force-data-loss` protection mode.

If you are trying to preserve as much data as possible, temporarily stop your
application from writing to the Redis instance, and wait to run your manual
failover until the **bytes pending replication** metric is as low as you deem
acceptable.

Potential issues blocking a manual failover
-------------------------------------------

- Running a manual failover on a Basic Tier instance does not work because Basic
  Tier instances do not have replicas to which the primary can fail over.

- If your Redis instance is unhealthy, a `limited-data-loss` manual failover
  operation fails because it is blocked to minimize data loss.

- If you are running a Lua script that executes indefinitely, you must
  use `force-data-loss` to initiate a failover. In this situation a
  `limited-data-loss` failover operation will not complete successfully.

- If your instance has incomplete operations pending, such as scaling or
  updating, the manual failover operation is blocked.
  You must wait until your
  instance is in the `READY` state to run a manual failover.

Client application connection
-----------------------------

When your primary node fails over to the replica, existing connections to
Memorystore for Redis are dropped. However, on reconnect, your application is
automatically redirected to the new primary node using the same connection string
or IP address.

Verifying a manual failover
---------------------------

You can verify the success of a manual failover operation with the
Google Cloud console or `gcloud`.

### Google Cloud console verification

Before you start a manual failover, go to the Memorystore for Redis [instances
list page](http://console.cloud.google.com/memorystore/instances?memorystore=true),
and click the name of your instance.

Then, in the **Configuration** tab, next to **Primary Location**, view which zone
your primary node is in. Make a note of the zone. Check this page again when you
complete your manual failover to confirm that the primary node switched zones.

### Cloud Monitoring verification

To view the metrics for a monitored resource by using the
Metrics Explorer, do the following:

1. In the Google Cloud console, go to the **Metrics explorer** page:

   [Go to **Metrics explorer**](https://console.cloud.google.com/monitoring/metrics-explorer)

   If you use the search bar to find this page, then select the result whose
   subheading is **Monitoring**.
2. In the toolbar of the Google Cloud console, select your Google Cloud project.
   For [App Hub](/app-hub/docs/overview) configurations, select the App Hub host
   project or the app-enabled folder's management project.
3. In the **Metric** element, expand the **Select a metric** menu, enter
   `Node role` in the filter bar, and then use the submenus to select a specific
   resource type and metric:
   1. In the **Active resources** menu, select **Cloud Memorystore Redis**.
   2.
      In the **Active metric categories** menu, select **replication**.
   3. In the **Active metrics** menu, select **Node role**.
   4. Click **Apply**.
4. To remove time series from the display, use the
   [**Filter** element](/monitoring/charts/metrics-selector#filter-option).

5. To combine time series, use the menus on the
   [**Aggregation** element](/monitoring/charts/metrics-selector#select_display).
   For example, to display the CPU utilization for your VMs, based on their zone,
   set the first menu to **Mean** and the second menu to **zone**.

   All time series are displayed when the first menu of the **Aggregation**
   element is set to **Unaggregated**. The default settings for the
   **Aggregation** element are determined by the metric type you selected.
6. For quota and other metrics that report one sample per day, do the following:
   1. In the **Display** pane, set the **Widget type** to **Stacked bar chart**.
   2. Set the time period to at least one week.

The Cloud Monitoring chart represents the primary and replica nodes with two
lines. When a node's line has a value of zero on the chart, it is the replica
node. When a node's line has a value of one on the chart, it is the primary node.
The chart represents a failover by showing how the lines switch from one to
zero, and zero to one, respectively.

### `gcloud` verification

Before you initiate a manual failover, use the following command to check which
zone your primary node is in:

```
gcloud redis instances describe [INSTANCE_ID] --region=[REGION]
```

Your primary node is in the zone labeled `currentLocationId`. Make a note of the
zone.

After you complete a manual failover, you can confirm that your primary node
switched to a new zone by running the `gcloud redis instances describe` command
again and checking that the `currentLocationId` changed zones.

Additionally, the `locationId` label tells you the zone in which you originally
provisioned your primary node.
The `alternativeLocationId` label tells you the
zone in which the system originally provisioned your replica node. Each time a
failover occurs, the primary and replica switch between these two zones. However,
the zones associated with `locationId` and `alternativeLocationId` do not
change.
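The zone check described above can be scripted by capturing `gcloud redis instances describe [INSTANCE_ID] --region=[REGION] --format=json` before and after the failover and comparing `currentLocationId`. The sketch below is illustrative: the helper name is invented, and the sample payloads show only the two relevant fields of the much larger real `describe` output.

```python
import json

def primary_zone_changed(describe_before: str, describe_after: str) -> bool:
    """Compare currentLocationId between two JSON-formatted `gcloud redis
    instances describe` outputs; True means the primary switched zones."""
    before = json.loads(describe_before)
    after = json.loads(describe_after)
    return before["currentLocationId"] != after["currentLocationId"]

# Hypothetical trimmed-down describe outputs captured around a failover.
before = '{"currentLocationId": "us-central1-a", "locationId": "us-central1-a"}'
after = '{"currentLocationId": "us-central1-b", "locationId": "us-central1-a"}'
print(primary_zone_changed(before, after))  # True: the primary changed zones
```

Note that `locationId` stays the same in both payloads, matching the behavior described above: only `currentLocationId` reflects which zone currently hosts the primary.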