This document describes how to set up an application-based health check to autoheal VMs in a managed instance group (MIG). It also describes how to do the following: use a health check without autohealing, remove a health check, view autohealing policy, and check the health state of each VM.
You can configure an application-based health check to verify that your application on a VM is responding as expected. If the health check that you configure detects that your application on a VM isn't responding, then the MIG marks that VM as unhealthy and repairs it. Repairing a VM based on an application-based health check is called autohealing.
You can also turn off repairs in a MIG so that you can use a health check without triggering autohealing.
To learn more about repairs in a MIG, see About repairing VMs for high availability.
Before you begin
If you haven't already, then set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
- Set a default region and zone.
Terraform
To use the Terraform samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
gcloud init
- If you're using a local shell, then create local authentication credentials for your user account:
gcloud auth application-default login
You don't need to do this if you're using Cloud Shell.
For more information, see Set up authentication for a local development environment.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Pricing
When you set up an application-based health check, Compute Engine by default writes a log entry in Cloud Logging whenever a VM's health state changes. Cloud Logging provides a free monthly allotment, after which logging is priced by data volume. To avoid costs, you can disable the health state change logs.
Set up an application-based health check and autohealing
To set up an application-based health check and autohealing in a MIG, you must do the following:
- Create a health check, if you haven't already.
- Configure an autohealing policy in the MIG to apply the health check.
Create a health check
You can apply a single health check to a maximum of 50 MIGs. If you have more than 50 groups, create multiple health checks.
The following example shows how to create a health check for autohealing. You can create either a regional or a global health check for autohealing in MIGs. In this example, you create a global health check that looks for a web server response on port 80. To enable the health check probes to reach the web server, configure a firewall rule.
Console
Create a health check for autohealing that is more conservative than a load balancing health check.
For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if the health check returns successfully once. The VM is marked as unhealthy if the health check returns unsuccessfully 3 consecutive times.
- In the Google Cloud console, go to the Create a health check page.
- Give the health check a name, such as example-check.
- Select a Scope. You can select either Regional or Global. For this example, select Global.
- For Protocol, make sure that HTTP is selected.
- For Port, enter 80.
- In the Health criteria section, provide the following values:
  - For Check interval, enter 5.
  - For Timeout, enter 5.
  - Set a Healthy threshold to determine how many consecutive successful health checks must be returned before an unhealthy VM is marked as healthy. Enter 1 for this example.
  - Set an Unhealthy threshold to determine how many consecutive unsuccessful health checks must be returned before a healthy VM is marked as unhealthy. Enter 3 for this example.
- Click Create to create the health check.
Create a firewall rule to allow health check probes to connect to your app.
Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your network firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
- In the Google Cloud console, go to the Firewall policies page.
- Click Create firewall rule.
- Enter a Name for the firewall rule. For example, allow-health-check.
- For Network, select the default network.
- For Targets, select All instances in the network.
- For Source filter, select IPv4 ranges.
- For Source IPv4 ranges, enter 130.211.0.0/22 and 35.191.0.0/16.
- In Protocols and ports, select Specified protocols and ports and do the following:
  - Select TCP.
  - In the Ports field, enter 80.
- Click Create.
gcloud
Create a health check for autohealing that is more conservative than a load balancing health check.
For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following command creates a global health check.
gcloud compute health-checks create http example-check --port 80 \
    --check-interval 30s \
    --healthy-threshold 1 \
    --timeout 10s \
    --unhealthy-threshold 3 \
    --global
Create a firewall rule to allow health check probes to connect to your app.
Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network, and its VMs listen on port 80. If port 80 isn't already open on the default network, create a firewall rule.
gcloud compute firewall-rules create allow-health-check \
    --allow tcp:80 \
    --source-ranges 130.211.0.0/22,35.191.0.0/16 \
    --network default
Terraform
Create a health check using the google_compute_http_health_check resource.
For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following request creates a global health check.
Create a firewall using the google_compute_firewall resource.
Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
REST
Create a health check for autohealing that is more conservative than a load balancing health check.
For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following request creates a global health check.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/global/healthChecks
{
  "name": "example-check",
  "type": "http",
  "port": 80,
  "checkIntervalSec": 30,
  "healthyThreshold": 1,
  "timeoutSec": 10,
  "unhealthyThreshold": 3
}
Create a firewall rule to allow health check probes to connect to your app.
Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/global/firewalls
{
  "name": "allow-health-check",
  "network": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/networks/default",
  "sourceRanges": [
    "130.211.0.0/22",
    "35.191.0.0/16"
  ],
  "allowed": [
    {
      "ports": [
        "80"
      ],
      "IPProtocol": "tcp"
    }
  ]
}
Replace PROJECT_ID with your project ID.
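When you debug a firewall rule, it can help to confirm that a request seen in your application logs really came from a health check prober. The following Python sketch tests a source address against the two probe ranges named in this section. It uses only the standard library; the function name is_health_check_probe is illustrative and is not part of any Google Cloud library.

```python
import ipaddress

# Source ranges for health check probes, as listed in this document.
PROBE_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_health_check_probe(source_ip: str) -> bool:
    """Return True if source_ip falls inside one of the probe ranges.

    Hypothetical helper for log inspection; not a Google Cloud API.
    """
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in PROBE_RANGES)
```

For example, is_health_check_probe("35.191.200.1") is true, while an address from a private range such as 10.0.0.5 is not matched.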
Configure an autohealing policy in a MIG
In a MIG, you can set up only one autohealing policy to apply a health check.
You can use either a regional or a global health check for autohealing in MIGs. Regional health checks reduce cross-region dependencies and help to achieve data residency. Global health checks are convenient if you want to use the same health check for MIGs in multiple regions.
Before you configure an autohealing policy:
- If you don't have a health check already, then create one.
- If you want to prevent false-triggering of autohealing while setting up a new health check, then you must first turn off repairs in the MIG and then configure the autohealing policy.
Console
In the Google Cloud console, go to the Instance groups page.
Under the Name column of the list, click the name of the MIG in which you want to apply the health check.
Click Edit to modify this MIG.
In the VM instance lifecycle section, under Autohealing, select a global or a regional Health check.
Change or keep the Initial delay setting.
The initial delay is the number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. In the console, the default value is 300 seconds.
Click Save to apply your changes.
gcloud
To configure autohealing policy in an existing MIG, use the update command.
For example, use the following command to configure autohealing policy in an existing zonal MIG:
gcloud compute instance-groups managed update MIG_NAME \
    --health-check HEALTH_CHECK_URL \
    --initial-delay INITIAL_DELAY \
    --zone ZONE
To configure autohealing policy when creating a MIG, use the create command.
For example, use the following command to configure autohealing policy when creating a zonal MIG:
gcloud compute instance-groups managed create MIG_NAME \
    --size SIZE \
    --template INSTANCE_TEMPLATE_URL \
    --health-check HEALTH_CHECK_URL \
    --initial-delay INITIAL_DELAY \
    --zone ZONE
Replace the following:
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The partial URL of the instance template that you want to use to create the VMs in the group. For example:
  - Regional instance template: projects/example-project/regions/us-central1/instanceTemplates/example-template.
  - Global instance template: projects/example-project/global/instanceTemplates/example-template.
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. If you want to use a regional health check, you must provide the partial URL of the regional health check. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use the --region flag.
Terraform
To configure an autohealing policy in a MIG, use the auto_healing_policies block.
The following sample configures autohealing policy in a zonal MIG. For more information about the resource used in the sample, see google_compute_instance_group_manager. For a regional MIG, use the google_compute_region_instance_group_manager resource.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
REST
To configure autohealing policy in an existing MIG, use the patch method as follows:
- For a zonal MIG, use the instanceGroupManagers.patch method.
- For a regional MIG, use the regionInstanceGroupManagers.patch method.
For example, make the following call to set up autohealing in an existing zonal MIG:
PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
{
  "autoHealingPolicies": [
    {
      "healthCheck": "HEALTH_CHECK_URL",
      "initialDelaySec": INITIAL_DELAY
    }
  ]
}
To configure autohealing policy when creating a MIG, use the insert method as follows:
- For a zonal MIG, use the instanceGroupManagers.insert method.
- For a regional MIG, use the regionInstanceGroupManagers.insert method.
For example, make the following call to configure autohealing policy when creating a zonal MIG:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
{
  "name": "MIG_NAME",
  "targetSize": SIZE,
  "instanceTemplate": "INSTANCE_TEMPLATE_URL",
  "autoHealingPolicies": [
    {
      "healthCheck": "HEALTH_CHECK_URL",
      "initialDelaySec": INITIAL_DELAY
    }
  ]
}
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The partial URL of the instance template that you want to use to create the VMs in the group. For example:
  - Regional instance template: projects/example-project/regions/us-central1/instanceTemplates/example-template.
  - Global instance template: projects/example-project/global/instanceTemplates/example-template.
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION in the URL.
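If you generate the patch request body from a script, you can validate the initial delay before sending anything. The following Python sketch builds the autoHealingPolicies body shown in the PATCH example and rejects an initial delay outside the 0 to 3600 second range stated in this document. The function name build_autohealing_patch is hypothetical, and the sketch only produces JSON; sending the request and authenticating are left out.

```python
import json


def build_autohealing_patch(health_check_url: str, initial_delay_sec: int) -> str:
    """Return a JSON body for the patch request shown in this section.

    Hypothetical helper: it only builds the body, it does not call the API.
    """
    # This document states that the initial delay must be between 0 and 3600 seconds.
    if not 0 <= initial_delay_sec <= 3600:
        raise ValueError("initial delay must be between 0 and 3600 seconds")
    body = {
        "autoHealingPolicies": [
            {
                "healthCheck": health_check_url,
                "initialDelaySec": initial_delay_sec,
            }
        ]
    }
    return json.dumps(body, indent=2)
```

Calling build_autohealing_patch("projects/example-project/global/healthChecks/example-health-check", 300) returns a body with one policy entry, and a delay of 3601 raises ValueError.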
After the autohealing setup is complete, it can take 10 minutes before autohealing begins monitoring VMs in the group. After the monitoring begins, Compute Engine begins to mark VMs as healthy (or else recreates them) based on your autohealing configuration. For example, if you configure an initial delay of 5 minutes, a health check interval of 1 minute, and a healthy threshold of 1 check, the timeline looks like the following:
- 10 minute delay before autohealing begins monitoring VMs in the group
- + 5 minutes for the configured initial delay
- + 1 minute for the check interval * healthy threshold (60s * 1)
- = 16 minutes before the VM is either marked as healthy or is recreated
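The timeline arithmetic can be written as a small Python sketch. The function and parameter names are illustrative, and the 600 second default for the monitoring startup delay is taken from the "can take 10 minutes" figure in this section, so treat the result as an upper-bound estimate and not a guarantee.

```python
def minutes_until_first_verdict(
    initial_delay_sec: int,
    check_interval_sec: int,
    healthy_threshold: int,
    monitoring_startup_sec: int = 600,  # assumption: the 10 minute delay described above
) -> float:
    """Estimate the minutes before a VM is marked healthy or is recreated."""
    total_sec = (
        monitoring_startup_sec
        + initial_delay_sec
        + check_interval_sec * healthy_threshold
    )
    return total_sec / 60
```

With the values from the example (an initial delay of 300 seconds, a check interval of 60 seconds, and a healthy threshold of 1), the function returns 16.0.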
If you had turned off repairs in the MIG before configuring the autohealing policy, then you can monitor the VM health states to confirm that the health check is working as expected and then set the MIG back to repairing VMs.
Use a health check without autohealing
You can use the health check that is configured in a MIG without autohealing by turning off repairs in the MIG. This is useful in scenarios when you want to use the health check only to monitor your application health or when you want to implement your own repair logic based on the health check.
To set the MIG back to repairing unhealthy VMs, see Set a MIG to repair failed and unhealthy VMs.
Remove a health check
You can remove a health check configured in an autohealing policy as follows:
Console
- In the Google Cloud console, go to the Instance groups page.
- Click the name of the MIG from which you want to remove the health check.
- Click Edit to modify this MIG.
- In the VM instance lifecycle section, under Autohealing, select No health check.
- Click Save to apply the changes.
gcloud
To remove the health check configuration in an autohealing policy, use the --clear-autohealing flag in the update command as follows:
gcloud compute instance-groups managed update MIG_NAME \
    --clear-autohealing
Replace MIG_NAME with the name of a MIG.
REST
To remove the health check configuration in an autohealing policy, set the autohealing policy to an empty value.
- For a zonal MIG, use the instanceGroupManagers.patch method.
- For a regional MIG, use the regionInstanceGroupManagers.patch method.
For example, to remove the health check in a zonal MIG, make the following request:
PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
{
  "autoHealingPolicies": [
    {}
  ]
}
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG from which you want to remove the health check.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION.
View autohealing policy in a MIG
You can view the autohealing policy of a MIG as follows:
Console
In the Google Cloud console, go to the Instance groups page.
Click the name of the MIG whose autohealing policy you want to view.
Go to the Details tab.
In the VM instance lifecycle section, the Autohealing field displays the health check and the initial delay configured in the autohealing policy.
gcloud
To view the autohealing policy in a MIG, use the following command:
gcloud compute instance-groups managed describe MIG_NAME \
    --format="(autoHealingPolicies)"
Replace MIG_NAME with the name of a MIG.
The following is a sample output:
autoHealingPolicies:
  healthCheck: https://www.googleapis.com/compute/v1/projects/example-project/global/healthChecks/example-health-check
  initialDelaySec: 300
REST
To view the autohealing policy in a MIG, use the REST methods as follows:
- For a zonal MIG, use the instanceGroupManagers.get method.
- For a regional MIG, use the regionInstanceGroupManagers.get method.
For example, make the following request to view the autohealing policy in a zonal MIG:
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
In the response body, check for the autoHealingPolicies[] object.
The following is a sample response:
{
  ...
  "autoHealingPolicies": [
    {
      "healthCheck": "https://www.googleapis.com/compute/v1/projects/example-project/global/healthChecks/example-health-check",
      "initialDelaySec": 300
    }
  ],
  ...
}
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG whose autohealing policy you want to view.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION.
Check the status
After you set up an application-based health check in a MIG, you can verify that a VM is running and its application is responding in the following ways:
Check whether VMs are healthy
If you have configured an application-based health check in your MIG, you can review the health state of each managed instance.
Inspect your managed instance health states to:
- Identify unhealthy VMs that are not being repaired. A VM might not be repaired immediately even if it has been diagnosed as unhealthy in the following situations:
  - The VM is still booting, and its initial delay has not passed.
  - A significant share of unhealthy instances is being repaired. The MIG delays further autohealing to ensure that the group keeps running a subset of instances.
- Detect health check configuration errors. For example, you can detect misconfigured firewall rules or an invalid application health checking endpoint if the instance reports a health state of TIMEOUT.
- Determine the initial delay value to configure by measuring the amount of time between when the VM transitions to a RUNNING status and when the VM transitions to a HEALTHY health state. You can measure this gap by polling the list-instances method or by observing the time between the instances.insert operation and the first healthy signal received.
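As a rough illustration of the last point, the following Python sketch turns two observed timestamps into a startup duration and then suggests an initial delay. The function names, the ISO 8601 timestamp format, and the 20 percent safety margin are all assumptions made for this example; only the 0 to 3600 second limit comes from this document.

```python
import math
from datetime import datetime
from typing import Sequence


def observed_startup_seconds(running_at: str, healthy_at: str) -> float:
    """Seconds between the RUNNING status and the first HEALTHY state.

    Both arguments are ISO 8601 timestamps that you recorded while polling.
    """
    started = datetime.fromisoformat(running_at)
    healthy = datetime.fromisoformat(healthy_at)
    return (healthy - started).total_seconds()


def suggested_initial_delay(samples: Sequence[float], margin_percent: int = 20) -> int:
    """Slowest observed startup plus a margin, capped at 3600 seconds.

    The margin is an assumption for this sketch, not a documented rule.
    """
    padded = max(samples) * (100 + margin_percent) / 100
    return min(math.ceil(padded), 3600)
```

Measuring several VMs and passing all of the durations to suggested_initial_delay bases the value on the slowest startup that you observed.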
Use the console, the gcloud command-line tool, or REST to view health states.
Console
In the Google Cloud console, go to the Instance groups page.
Under the Name column of the list, click the name of the MIG that you want to examine. A page opens with the instance group properties and a list of VMs that are included in the group.
If a VM is unhealthy, you can see its health state in the Health check status column.
gcloud
Use the list-instances sub-command.
gcloud compute instance-groups managed list-instances instance-group
NAME              ZONE          STATUS   HEALTH_STATE  ACTION  INSTANCE_TEMPLATE  VERSION_NAME  LAST_ERROR
igm-with-hc-fvz6  europe-west1  RUNNING  HEALTHY       NONE    my-template
igm-with-hc-gtz3  europe-west1  RUNNING  HEALTHY       NONE    my-template
The HEALTH_STATE column shows each VM's health state.
REST
For a regional MIG, construct a POST request to the listManagedInstances method:
POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/region/instanceGroupManagers/instance-group/listManagedInstances
For a zonal MIG, use the zonal MIG listManagedInstances method:
POST https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/instanceGroupManagers/instance-group/listManagedInstances
The request returns a response similar to the following, which includes an instanceHealth field for each managed instance.
{
  "managedInstances": [
    {
      "instance": "https://www.googleapis.com/compute/v1/projects/project-id/zones/zone/instances/example-group-5485",
      "instanceStatus": "RUNNING",
      "currentAction": "NONE",
      "lastAttempt": {},
      "id": "6159431761228150698",
      "instanceTemplate": "https://www.googleapis.com/compute/v1/projects/project-id/global/instanceTemplates/example-template",
      "version": {
        "instanceTemplate": "https://www.googleapis.com/compute/v1/projects/project-id/global/instanceTemplates/example-template"
      },
      "instanceHealth": [
        {
          "healthCheck": "https://www.googleapis.com/compute/v1/projects/project-id/global/healthChecks/http-basic-check",
          "detailedHealthState": "HEALTHY"
        }
      ]
    },
    {
      "instance": "https://www.googleapis.com/compute/v1/projects/project-id/zones/zone/instances/example-group-sfdp",
      "instanceStatus": "STOPPING",
      "currentAction": "DELETING",
      "lastAttempt": {},
      "id": "6622324799312181783",
      "instanceHealth": [
        {
          "healthCheck": "https://www.googleapis.com/compute/v1/projects/project-id/global/healthChecks/http-basic-check",
          "detailedHealthState": "TIMEOUT"
        }
      ]
    }
  ]
}
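If you save a response like this one, a short script can reduce it to one health state per VM. The following Python sketch assumes the response has the shape shown in the sample (a managedInstances list whose entries carry instance and instanceHealth fields). The function name health_by_instance is hypothetical, and the sketch reads only the first instanceHealth entry of each VM.

```python
import json
from typing import Dict, Optional


def health_by_instance(response_text: str) -> Dict[str, Optional[str]]:
    """Map each VM name to its detailedHealthState, or None if none is reported."""
    response = json.loads(response_text)
    states: Dict[str, Optional[str]] = {}
    for managed in response.get("managedInstances", []):
        # The instance field is a URL; the VM name is its last path segment.
        name = managed["instance"].rsplit("/", 1)[-1]
        entries = managed.get("instanceHealth", [])
        states[name] = entries[0].get("detailedHealthState") if entries else None
    return states
```

Applied to the sample response, the result maps example-group-5485 to HEALTHY and example-group-sfdp to TIMEOUT.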
Health states
The following VM health states are available:
- HEALTHY: The VM is reachable, a connection to the application health checking endpoint can be established, and the response conforms to the requirements defined by the health check.
- DRAINING: The VM is being drained. Existing connections to the VM have time to complete, but new connections are being refused.
- UNHEALTHY: The VM is reachable, but does not conform to the requirements defined by the health check.
- TIMEOUT: The VM is unreachable, a connection to the application health checking endpoint cannot be established, or the server on a VM does not respond within the specified timeout. For example, this may be caused by misconfigured firewall rules or an overloaded server application on a VM.
- UNKNOWN: The health checking system is not aware of the VM or its health is not known at the moment. It can take 10 minutes for monitoring to begin on new VMs in a MIG.
New VMs return an UNHEALTHY state until they are verified by the health checking system.
Whether a VM is repaired depends on its health state:
- If a VM has a health state of UNHEALTHY or TIMEOUT, and it has passed its initialization period, then the MIG immediately attempts to repair it.
- If a VM has a health state of UNKNOWN, then the MIG doesn't repair it immediately. This is to prevent an unnecessary repair of a VM for which the health checking signal is temporarily unavailable.
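If you turn off repairs and implement your own repair logic, you might mirror the two rules above. The following Python sketch is only a simplified model of those rules: the function name is hypothetical, and it does not account for the other conditions under which the MIG delays autohealing.

```python
def should_repair_now(health_state: str, past_initial_delay: bool) -> bool:
    """Simplified model of the two repair rules described above.

    Illustrative only; this is not how Compute Engine is implemented.
    """
    if health_state in ("UNHEALTHY", "TIMEOUT"):
        # Repair only after the VM has passed its initialization period.
        return past_initial_delay
    # UNKNOWN and every other state do not trigger an immediate repair.
    return False
```

For example, a VM in the TIMEOUT state that is still inside its initial delay is left alone, and a VM in the UNKNOWN state is never repaired immediately.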
Autohealing attempts can be delayed if:
- A VM remains unhealthy after multiple consecutive repairs.
- A significant overall share of unhealthy VMs exists in the group.
We want to learn about your use cases, challenges, or feedback about VM health state values. You can share your feedback with our team at mig-discuss@google.com.
Check current actions on VMs
When a MIG is in the process of creating a VM instance, the MIG sets that instance's read-only currentAction field to CREATING. If an autohealing policy is attached to the group, after the VM is created and running, the MIG sets the instance's current action to VERIFYING and the health checker begins to probe the VM's application. If the application passes this initial health check within the time that it takes for the application to start, then the VM is verified and the MIG changes the VM's currentAction field to NONE.
To check the current actions on VMs, see View current actions on VMs.
Check whether the MIG is stable
At the group level, Compute Engine populates a read-only field called status that contains an isStable flag.
If all VMs in the group are running and healthy (that is, the currentAction field for each managed instance is set to NONE), then the MIG sets the status.isStable field to true. Remember that the stability of a MIG depends on group configurations beyond the autohealing policy; for example, if your group is autoscaled, and if it is being scaled in or out, then the MIG sets the status.isStable field to false due to the autoscaler operation.
To check the values of your MIG's status.isStable field, see Check whether a MIG is stable.
View historical autohealing operations
You can use the gcloud CLI or REST to view past autohealing events.
gcloud
Use the gcloud compute operations list command with a filter to see only the autohealing repair events in your project.
gcloud compute operations list --filter='operationType~compute.instances.repair.*'
For more information about a specific repair operation, use the describe command. For example:
gcloud compute operations describe repair-1539070348818-577c6bd6cf650-9752b3f3-1d6945e5 --zone us-east1-b
REST
For regional MIGs, submit a GET request to the regionOperations resource and include a filter to scope the output list to compute.instances.repair.* events.
GET https://compute.googleapis.com/compute/v1/projects/project-id/regions/region/operations?filter=operationType+%3D+%22compute.instances.repair.*%22
For zonal MIGs, use the zoneOperations resource.
GET https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/operations?filter=operationType+%3D+%22compute.instances.repair.*%22
For more information about a specific repair operation, submit a GET request for that specific operation. For example:
GET https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/operations/repair-1539070348818-577c6bd6cf650-9752b3f3-1d6945e5
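The filter in these URLs is the expression operationType = "compute.instances.repair.*" in URL-encoded form. The following Python sketch shows one way to produce that encoding with the standard library for the zonal request. The function name repair_operations_url is hypothetical, and the sketch only builds the URL; it does not send the request or handle authentication.

```python
from urllib.parse import quote_plus


def repair_operations_url(project_id: str, zone: str) -> str:
    """Build the zonal operations list URL with the repair filter applied."""
    filter_expression = 'operationType = "compute.instances.repair.*"'
    # Keep "*" unescaped so that the result matches the URLs shown above.
    encoded_filter = quote_plus(filter_expression, safe="*")
    return (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project_id}/zones/{zone}/operations?filter={encoded_filter}"
    )
```

Calling repair_operations_url("project-id", "zone") reproduces the zonal GET URL from this section.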
What makes a good autohealing health check
Health checks used for autohealing should be conservative so they don't preemptively delete and recreate your instances. When an autohealer health check is too aggressive, the autohealer might mistake busy instances for failed instances and unnecessarily restart them, reducing availability.
- unhealthy-threshold. Should be more than 1. Ideally, set this value to 3 or more. This protects against rare failures like a network packet loss.
- healthy-threshold. A value of 2 is sufficient for most apps.
- timeout. Set this time value to a generous amount (five times or more than the expected response time). This protects against unexpected delays like busy instances or a slow network connection.
- check-interval. This value should be between 1 second and two times the timeout (not too long nor too short). When a value is too long, a failed instance is not caught soon enough. When a value is too short, the instances and the network can become measurably busy, given the high number of health check probes being sent every second.
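These guidelines can be turned into a quick pre-flight check. The following Python sketch compares a set of health check settings against the recommendations in this section and returns a list of findings. The function name, the parameter names, and the wording of the findings are illustrative, and expected_response_sec is a value that you must estimate for your own application.

```python
from typing import List


def review_health_check(
    check_interval_sec: float,
    timeout_sec: float,
    unhealthy_threshold: int,
    expected_response_sec: float,
) -> List[str]:
    """Return findings for settings that deviate from the guidelines above."""
    findings: List[str] = []
    if unhealthy_threshold <= 1:
        findings.append("unhealthy-threshold should be more than 1")
    elif unhealthy_threshold < 3:
        findings.append("unhealthy-threshold of 3 or more is ideal")
    if timeout_sec < 5 * expected_response_sec:
        findings.append("timeout should be at least five times the expected response time")
    if not 1 <= check_interval_sec <= 2 * timeout_sec:
        findings.append("check-interval should be between 1 second and two times the timeout")
    return findings
```

An empty list means that the settings are within the recommendations listed here; it does not guarantee that the health check suits your application.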
What's next
- Try the tutorial, Using autohealing for highly available apps.
- Monitor VM health state changes.
- Apply configuration updates during repairs.