This page explains how to run preflight checks against your GKE on-prem configuration file.
Overview
During installation, you run gkectl create-config to generate a cluster configuration file. The configuration file drives your installation: you provide information about your vSphere environment, your network and load balancer, and how you'd like your clusters to look. You can generate a configuration file before or after you've created an admin workstation, but certain checks pass only when they are run from the admin workstation.
After you've modified the file to meet the needs of your environment and your clusters, you use the file to create your clusters in your on-prem environment.
Before you create clusters, you run gkectl check-config to validate the configuration file with several preflight checks. If the command returns any FAILURE messages, you must fix the issues and validate the file again.
Preflight check modes and skipping validations
gkectl check-config has a default mode and a fast mode:
- In default mode, the command comprehensively validates each field. The default mode also creates temporary vSphere virtual machines (VMs) as part of its validations, which can take more time.
- In fast mode, the command skips checks that create test VMs and runs only the fast checks. You enable fast mode by passing in the --fast flag.
You can skip specific validations by passing in other flags, which are described in gkectl check-config --help.
Traffic between the admin workstation and the test VMs
In default mode, the preflight check creates test VMs for the admin cluster and the user cluster. Each test VM runs an HTTP server that listens on port 443 and on the node ports that you specified in your configuration file.
Several IP addresses are assigned to the test VMs. If your configuration file indicates that your cluster nodes will get their IP addresses from a DHCP server, then the preflight check uses a DHCP server to assign IP addresses to the test VMs. If your configuration file indicates that your cluster nodes will be assigned static IP addresses, then the preflight check assigns static IP addresses that you specified in your hostconfig files to the test VMs.
The preflight check, running on the admin workstation, sends HTTP requests to the test VMs using the various IP addresses that are assigned to the VMs. The requests are sent to port 443 and to the node ports that you specified in your configuration file.
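To see which connections must be allowed between the admin workstation and a test VM, the following sketch enumerates the IP and port combinations described above. The function name probe_urls, the example IP address, and the node port values are hypothetical. gkectl performs the real requests itself, so this is only an illustration of the traffic pattern.

```shell
# Print one URL per port for a single test VM IP address:
# port 443 first, then each node port passed as an extra argument.
# Illustration only; gkectl check-config sends the real requests.
probe_urls() {
  ip="$1"
  shift
  for port in 443 "$@"; do
    printf 'http://%s:%s/\n' "$ip" "$port"
  done
}
```

For example, probe_urls 10.0.0.5 30243 30879 prints three URLs, one for port 443 and one for each node port. Reviewing such a list against your firewall rules can help explain a failed check.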
When should I run preflight checks?
It is a best practice to run preflight checks early and before attempting to create clusters. Running preflight checks early can help confirm that you've configured your vSphere environment and your network correctly.
If you are using GKE on-prem version 1.2.0-gke.6, run gkectl check-config twice:
1. Run gkectl check-config --fast.
2. Run gkectl prepare.
3. Run gkectl check-config again, without the --fast flag.
The reason for running twice is that gkectl prepare uploads the VM template for the cluster node OS image to your vSphere environment. That VM template must be in place before you run the full set of validations.
In GKE on-prem version 1.2.1 and later, the check-config command itself uploads the VM template, so you can run the full set of validations before you run gkectl prepare:
1. Run gkectl check-config, without the --fast flag.
2. Run gkectl prepare.
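The version-dependent ordering above can be sketched as a small helper that prints the command sequence for a given version. The function name preflight_steps is hypothetical. Only 1.2.0-gke.6 is treated as the two-pass case, and any other value is assumed to be 1.2.1 or later, because this page does not describe earlier versions. Arguments to gkectl prepare are omitted because this page does not show them.

```shell
# Print the preflight command sequence for a GKE on-prem version.
# $1: version string, $2: path to the configuration file.
# Assumption: any version other than 1.2.0-gke.6 is 1.2.1 or later.
preflight_steps() {
  version="$1"
  config="$2"
  case "$version" in
    1.2.0-gke.6)
      # The VM template is uploaded by gkectl prepare, so the full
      # validation has to wait until after that step.
      echo "gkectl check-config --config $config --fast"
      echo "gkectl prepare"
      echo "gkectl check-config --config $config"
      ;;
    *)
      # check-config uploads the VM template itself.
      echo "gkectl check-config --config $config"
      echo "gkectl prepare"
      ;;
  esac
}
```

Calling preflight_steps 1.2.0-gke.6 my-config.yaml prints three commands, while any later version prints two.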
The preflight checks validate the values you've provided in the file. You don't need to fill every field in the configuration file to run preflight checks against the file; rather, you can validate the file iteratively as you populate its fields. For example, if you only wanted to validate your vCenter configuration, you could fill only the vcenter fields and run checks against those.
Keep in mind that your GKE on-prem configuration becomes immutable after you've created your clusters. Running preflight checks helps you discover and resolve issues in your configuration before creating your clusters.
Preserving the test VM for debugging
Starting with GKE on-prem version 1.2.1, the gkectl check-config command has a --cleanup flag.
When gkectl check-config performs a full set of validations, it creates a test VM and an associated SSH key. If you want to preserve the test VM and the SSH key for debugging purposes, set --cleanup to false.
The default value of --cleanup is true.
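As a minimal sketch, a wrapper that keeps the test VM and SSH key might look like the following. The function name check_config_keep_vm is hypothetical, and the --cleanup=false spelling is an assumption about how the boolean flag is written on the command line; confirm it with gkectl check-config --help.

```shell
# Run the full validations and keep the test VM and SSH key afterwards.
# Assumption: the boolean flag is written as --cleanup=false.
check_config_keep_vm() {
  gkectl check-config --config "$1" --cleanup=false
}
```

Remember to delete the preserved test VM yourself once you have finished debugging.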
List of preflight checks
The preflight checks validate each field in the configuration file. Where a row says a check can be skipped with a flag but does not name it, see gkectl check-config --help for the flag name. Here are the current checks:
Category | Description
---|---
Configuration file | Generally validates that each field and specification has the expected format and values. Can be skipped with a flag.
Internet | Validates internet access to required domains. Validates proxy configuration based on where you are running gkectl. Can be skipped with a flag.
OS image | Validates that OS images exist. Can be skipped with a flag.
Windows OS version | Validates that the Windows version is supported when creating admin workstations with the command-line tool.
Cluster version | Validates version requirements involving the admin cluster version and the user cluster version. Can be skipped with a flag.
Cluster health | Validates that the admin or user cluster is healthy before upgrade. Can be skipped with a flag.
Reserved IP | Validates that enough IP addresses are available for create and upgrade. Can be skipped with a flag.
Google Cloud | Skipped with the --skip-validation-gcp flag.
Access to gcr.io/gke-on-prem-release | Validates access to GKE on-prem's container image registry hosted in Container Registry. Can be skipped with a flag.
Docker registry | If privateregistryconfig is configured, validates access to the Docker registry. Can be skipped with a flag.
vCenter | Checks that all vcenter fields are present, and performs additional vCenter checks. Can be skipped with a flag.
Hosts for anti-affinity groups | Validates that the number of physical vCenter hosts is at least three if anti-affinity groups are enabled. Can be skipped with a flag.
Load balancer | Validates load balancing configuration. Skipped with the --skip-validation-load-balancer flag.
Networking | Validates that the provided CIDR ranges, VIPs, and static IPs (if configured) are available. Checks that IP addresses don't overlap. Can be skipped with a flag.
DNS | Validates that the provided DNS server is available. Can be skipped with a flag.
NTP | Validates that the provided Network Time Protocol (NTP) server is available. Can be skipped with a flag.
VIPs | Pings the VIPs provided. This check is successful if the ping fails, indicating that the VIP is not already taken. Can be skipped with a flag.
Node IPs | Pings the node IP addresses provided. This check is successful if the ping fails, indicating that the node IP is not already taken. Can be skipped with a flag.
Preflight check results
Preflight checks can return the following results:
- SUCCESS: The field and its value passed the check.
- FAILURE: The field and/or its value did not pass the check. If a check returns a FAILURE message, fix the issues and validate the file again.
- SKIPPED: The check was skipped, likely because the check is not relevant to your configuration. For example, if you are using a DHCP server, the DNS and node IP checks, which are relevant only to a static IP configuration, are skipped. If you pass in a flag that skips a validation, the skipped check does not return a SKIPPED result; rather, the validation isn't run and doesn't appear in the command output at all.
- UNKNOWN: The check returned a non-zero code. You can consider UNKNOWN results to be failed checks. UNKNOWN usually indicates that the check failed to run some system package, such as failing to run nslookup or failing to run gcloud.
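If you save the command output to a file, a short script can tally the results and treat FAILURE and UNKNOWN alike, as recommended above. This is a sketch under two assumptions: each result appears in the output as a bracketed label such as [SUCCESS], as in the example output on this page, and the output has been captured to a file (for instance by redirecting the command's output). The function names count_result and preflight_passed are hypothetical.

```shell
# Count occurrences of a result label (for example SUCCESS) in a saved
# output file. grep -o prints every match on its own line, so the count
# is correct even when several results share one line.
count_result() {
  grep -o "\[$1\]" "$2" | wc -l | tr -d ' '
}

# Succeed only if the saved output contains no FAILURE and no UNKNOWN
# results. SKIPPED results are not treated as failures.
preflight_passed() {
  [ "$(count_result FAILURE "$1")" = "0" ] &&
    [ "$(count_result UNKNOWN "$1")" = "0" ]
}
```

For example, after saving the output to preflight.log, count_result SUCCESS preflight.log prints the number of passed checks, and preflight_passed preflight.log can gate a later cluster creation step.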
Coming soon
The following preflight checks will be added in a future release:
- NTP server
Running preflight checks
You run preflight checks by running the following command:
gkectl check-config --config [CONFIG]
where [CONFIG] is the path to your GKE on-prem configuration file.
Running in fast mode
If you prefer, you can run preflight checks in "fast mode," which skips the validations that create temporary test VMs, such as the load balancing VIP and node IP validations. To do so, pass in --fast:
gkectl check-config --config [CONFIG] --fast
Skipping specific validations
You can pass in flags to skip specific validations, such as DNS, proxy, and networking. Each skip flag begins with --skip-, for example --skip-validation-load-balancer.
To learn about the available skip flags, see the gkectl check-config reference or run the following command:
gkectl check-config --help
For example, to skip the load balancer validations:
gkectl check-config --config my-config.yaml --skip-validation-load-balancer
Cancelling preflight checks
If you started running preflight checks and want to cancel, press CTRL + C twice. If a preflight check created a test VM, cancelling should also clean up the VM automatically.
Cleaning up a test VM
If a test VM is left over after preflight checks are complete, you can delete the VM from vCenter. A test VM has a name like this:
check-config-[dhcp|static]-[random number]
To delete the VM:
1. Right-click the VM, and click Power > Power Off.
2. After the VM has powered off, right-click the VM again, and click Delete from Disk.
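Before deleting anything, it can help to confirm that a VM name really follows the test VM naming pattern shown above. The following sketch assumes that the random number consists only of decimal digits; the function name is_test_vm_name is hypothetical.

```shell
# Succeed if the name matches check-config-[dhcp|static]-[random number].
# Assumption: the random number is made up of decimal digits only.
is_test_vm_name() {
  printf '%s\n' "$1" | grep -Eq '^check-config-(dhcp|static)-[0-9]+$'
}
```

For example, is_test_vm_name check-config-static-48213 succeeds, while a name that lacks the check-config prefix does not.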
Example
Below is an example of the command's output. In this example, the configuration being validated uses integrated load balancing mode and static IPs without an external Docker registry:
- Validation Category: Config Check
    - [SUCCESS] Config
- Validation Category: Internet Access
    - [SUCCESS] Internet access to required domains
- Validation Category: GCP
    - [SUCCESS] GCP Service
    - [SUCCESS] GCP Service Account
- Validation Category: Docker Registry
    - [SUCCESS] gcr.io/gke-on-prem-release access
- Validation Category: vCenter
    - [SUCCESS] Credentials
    - [SUCCESS] Version
    - [SUCCESS] Datacenter
    - [SUCCESS] Datastore
    - [SUCCESS] Data Disk
    - [SUCCESS] Resource Pool
    - [SUCCESS] Network
- Validation Category: F5 BIG-IP
    - [SUCCESS] Admin Cluster F5 (credentials, partition and user role)
    - [SUCCESS] User Cluster F5 (credentials, partition and user role)
- Validation Category: Network Configuration
    - [SUCCESS] CIDR, VIP and static IP (availability and overlapping)
- Validation Category: DNS
    - [SUCCESS] DNS (availability)
- Validation Category: VIPs
    - [SUCCESS] ping (availability)
- Validation Category: Node IPs
    - [SUCCESS] ping (availability)
Now running slow validation checks. ...
Reusing VM template "gke-on-prem-osimage-xxx" that already exists in vSphere.
Creating test VMs with admin cluster configuration... DONE
Waiting to get IP addresses from test VMs... DONE
Waiting for test VMs to become ready... DONE
Reusing VM template "gke-on-prem-osimage-xxx" that already exists in vSphere.
Creating test VMs with user cluster configuration... DONE
Waiting to get IP addresses from test VMs... DONE
Waiting for test VMs to become ready... DONE
- Validation Category: F5 BIG-IP
    - [SUCCESS] Admin Cluster VIP and NodeIP
    - [SUCCESS] Admin Cluster F5 Access
    - [SUCCESS] User Cluster VIP and NodeIP
    - [SUCCESS] User Cluster F5 Access
- Validation Category: Internet Access
    - [SUCCESS] Internet access to required domains
- Validation Category: vCenter on test VMs
    - [SUCCESS] Test VM: VCenter Access and Permission
- Validation Category: DNS on test VMs
    - [SUCCESS] Test VM: DNS Availability
- Validation Category: TOD on test VMs
    - [SUCCESS] Test VM: TOD Availability
- Validation Category: Docker Registry
    - [SUCCESS] gcr.io/gke-on-prem-release access
Deleting test VMs with admin cluster configuration... DONE
Deleting test VMs with user cluster configuration... DONE
Known issues
For version 1.3.0-gke.16:
You must run fast validation checks, gkectl check-config --fast, for your preflight checks if both of the following apply:
- You configured GKE on-prem to use a proxy (static IP | DHCP).
- You installed one of the following bundles:
  - The /var/lib/gke/bundles/gke-onprem-vsphere-1.3.0-gke.16.tgz bundle from the Downloads page.
  - The /var/lib/gke/bundles/gke-onprem-vsphere-1.3.0-gke.16.tgz bundle from the admin workstation.
You can run the full set of validations only if you installed the full bundle (static IP | DHCP). For example:
/var/lib/gke/bundles/gke-onprem-vsphere-1.3.0-gke.16-full.tgz
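The two bundle paths above differ only in the -full suffix before the .tgz extension. As a rough sketch, a script could use that suffix to decide whether the full set of validations applies. The function name is_full_bundle is hypothetical, and treating the file name suffix as the distinguishing mark is an assumption drawn only from the example paths on this page.

```shell
# Succeed if the bundle path ends in -full.tgz, as in the example above.
# Assumption: the file name suffix is what identifies the full bundle.
is_full_bundle() {
  case "$1" in
    *-full.tgz) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script might run gkectl check-config with --fast whenever is_full_bundle fails for the installed bundle and a proxy is configured.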
For version 1.2.0-gke.6:
If you are using nested resource pools or the default resource pool, gkectl check-config fails when you attempt to do a full set of validations. However, you can do a smaller set of validations by passing the --fast flag:
gkectl check-config --config [CONFIG] --fast