This document describes how a cluster behaves if vCenter Server is down.
While vCenter Server is down:
- The machines are in the `Available` state.
- The nodes are in the `Ready` state.
- The Pods are in the `Running` state.
- There are some expected errors in Pods that connect to vCenter Server; for example, the `vsphere-controller-manager` and `cluster-health-controller` Pods.
- Stateless Pods can be created and deleted.
- The creation of a stateful Pod will fail, because attaching a disk requires access to vCenter Server. These Pods will be in the `Pending` state.
- The `gkectl diagnose` command will fail with an error similar to the following:

      Exit with error: failed to prepare diagnose parameters: failed to create vSphere client: Post "https://my-server": dial tcp 203.0.113.1:443: connect: connection timed out

- Auto repair is not triggered, because machine and node states do not change on connection errors to vCenter Server.
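
When scripting around `gkectl diagnose`, it can help to tell a vCenter Server outage apart from other failures. The following is a minimal sketch, not part of any official tooling: the function name `looks_like_vcenter_outage` is hypothetical, the vSphere marker is copied from the sample error above, and the additional network phrases (`connection refused`, `no such host`) are assumptions about what a Go-based client might report. Real messages may differ between versions.

```python
# Marker taken from the sample error in this document. Actual wording may
# vary, so treat this as an assumption to adjust for your environment.
VCENTER_ERROR_MARKERS = ("failed to create vsphere client",)

# Common network failure phrases. Only "connection timed out" appears in
# the sample error; the other two are assumed variants.
NETWORK_ERROR_MARKERS = (
    "connection timed out",
    "connection refused",
    "no such host",
)


def looks_like_vcenter_outage(output: str) -> bool:
    """Return True if the text suggests vCenter Server is unreachable.

    Both a vSphere client failure and a network-level failure must be
    present, so that errors such as bad credentials are not misclassified.
    """
    text = output.lower()
    has_vcenter_marker = any(m in text for m in VCENTER_ERROR_MARKERS)
    has_network_marker = any(m in text for m in NETWORK_ERROR_MARKERS)
    return has_vcenter_marker and has_network_marker


if __name__ == "__main__":
    sample = (
        "Exit with error: failed to prepare diagnose parameters: "
        "failed to create vSphere client: "
        'Post "https://my-server": dial tcp 203.0.113.1:443: '
        "connect: connection timed out"
    )
    print(looks_like_vcenter_outage(sample))
```

Matching on substrings is deliberately loose. It is a heuristic for triage, and a positive result should still be confirmed by checking vCenter Server directly.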
After vCenter Server comes back online (versions < 7.0U2)
- The machines go to the `Unavailable` state, and auto repair or a manual workaround is needed to restore the correct states.
- The cluster functions correctly even though the machines are in the `Unavailable` state.
After vCenter Server comes back online (versions >= 7.0U2)
- No extra steps are needed, and the cluster is healthy again.
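
The version boundary above can be checked programmatically. The sketch below assumes that the versions in the headings refer to the vCenter Server release and that version strings follow the `MAJOR.MINOR` or `MAJOR.MINORU<update>` pattern used in this document (for example, `7.0U2`). The function names are hypothetical, and other formats, such as build numbers, are not handled.

```python
import re
from typing import Tuple

# Assumed format: "7.0", "7.0U2", "6.7u3". Anything else is rejected.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:U(\d+))?$", re.IGNORECASE)

# Boundary stated in this document: 7.0U2.
_FIXED_IN = (7, 0, 2)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a version string into (major, minor, update)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version format: {version!r}")
    major, minor, update = match.groups()
    # A missing update suffix is treated as update 0.
    return int(major), int(minor), int(update or 0)


def needs_repair_after_outage(version: str) -> bool:
    """Return True if machines are expected to need auto repair or a
    manual workaround after vCenter Server comes back online."""
    return parse_version(version) < _FIXED_IN


if __name__ == "__main__":
    for v in ("6.7U3", "7.0U1", "7.0U2", "8.0"):
        print(v, needs_repair_after_outage(v))
```

Comparing tuples keeps the ordering explicit: `(7, 0, 1)` sorts before `(7, 0, 2)`, while any later major or minor release sorts after it.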