Version 1.6. This version is supported as outlined in the Anthos version support policy, offering the latest patches and updates for security vulnerabilities, exposures, and issues impacting Anthos clusters on VMware (GKE on-prem). Refer to the release notes for more details. This is not the most recent version.

Creating a user cluster

This page shows how to create a user cluster for Anthos clusters on VMware (GKE on-prem).

The instructions here are complete. For a shorter introduction to creating a user cluster, see Create a user cluster (quickstart).

Before you begin

Create an admin cluster.

Get an SSH connection to your admin workstation

Get an SSH connection to your admin workstation.

Recall that gkeadm activated your component access service account on the admin workstation.

Do all the remaining steps in this topic on your admin workstation in the home directory.

Credentials configuration file

When you used gkeadm to create your admin workstation, you filled in a credentials configuration file named credential.yaml. This file holds the username and password for your vCenter server.

User cluster configuration file

When gkeadm created your admin workstation, it generated a configuration file named user-cluster.yaml. This configuration file is for creating your user cluster.

Filling in your configuration file

name

Set the name field to a name of your choice for the user cluster.

gkeOnPremVersion

This field is already filled in for you.

vCenter

The values you set in the vCenter section of your admin cluster configuration file are global. That is, they apply to your admin cluster and your user clusters.

For each user cluster that you create, you have the option of overriding some of the global vCenter values.

If you want to override any of the global vCenter values, fill in the relevant fields in the vCenter section of your user cluster configuration file.

network

Set network.ipMode.type to the same value that you set in your admin cluster configuration file: either "dhcp" or "static".

If you set ipMode.type to "static", create an IP block file that provides the static IP addresses for the nodes in your user cluster. Then set network.ipMode.ipBlockFilePath to the path of your IP block file.

Provide values for the remaining fields in the network section.

Regardless of whether you rely on a DHCP server or specify a list of static IP addresses, you need enough IP addresses to satisfy the following:

  • The nodes in your user cluster

  • An additional node in the user cluster to be used temporarily during upgrades

As mentioned previously, if you want to use static IP addresses, then you need to provide an IP block file. Here is an example of an IP block file with six hosts. This is enough addresses for a cluster that has five nodes and an occasional sixth node for upgrades:

blocks:
  - netmask: 255.255.252.0
    gateway: 172.16.23.254
    ips:
    - ip: 172.16.20.21
      hostname: user-host1
    - ip: 172.16.20.22
      hostname: user-host2
    - ip: 172.16.20.23
      hostname: user-host3
    - ip: 172.16.20.24
      hostname: user-host4
    - ip: 172.16.20.25
      hostname: user-host5
    - ip: 172.16.20.26
      hostname: user-host6
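As a rough sanity check, you can compare the number of addresses in an IP block file against the number of nodes you plan to run plus the one temporary upgrade node. The sketch below assumes the file layout shown above, with one "- ip:" entry per address. The function names count_block_ips and has_enough_ips are hypothetical helpers, not part of gkectl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of gkectl.

# Print the number of "- ip:" entries in an IP block file.
count_block_ips() {
  grep -cE '^[[:space:]]*-[[:space:]]*ip:' "$1"
}

# Succeed if the file holds at least (node count + 1) addresses.
# The extra address is for the temporary node used during upgrades.
# $1: path of the IP block file
# $2: number of nodes planned for the user cluster
has_enough_ips() {
  local file="$1" nodes="$2"
  local available needed
  available="$(count_block_ips "$file")"
  needed=$((nodes + 1))
  [ "$available" -ge "$needed" ]
}
```

For example, assuming your IP block file is named ip-block.yaml (a made-up name), has_enough_ips ip-block.yaml 5 succeeds for the six-host file above and fails if an entry is removed.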

loadBalancer

Set aside a VIP for the Kubernetes API server of your user cluster. Set aside another VIP for the ingress service of your user cluster. Provide your VIPs as values for loadBalancer.vips.controlPlaneVIP and loadBalancer.vips.ingressVIP.

Set loadBalancer.kind to the same value that you set in your admin cluster configuration file: "ManualLB", "F5BigIP", or "Seesaw". Then fill in the corresponding section: manualLB, f5BigIP, or seesaw.

proxy

If the network that will have your user cluster nodes is behind a proxy server, fill in the proxy section.

masterNode

Fill in the masterNode section.

nodePools

Fill in the nodePools section.

antiAffinityGroups

Set antiAffinityGroups.enabled to true or false.

authentication

If you want to use OpenID Connect (OIDC) to authenticate users, fill in the authentication.oidc section.

If you want to provide an additional serving certificate for your user cluster's Kubernetes API server, fill in the authentication.sni section.

stackdriver

Fill in the stackdriver section.

gkeConnect

Fill in the gkeConnect section.

cloudRun

Set cloudRun.enabled to true or false.

usageMetering

If you want to enable usage metering for your cluster, then fill in the usageMetering section.

cloudAuditLogging

If you want to integrate the audit logs from your cluster's Kubernetes API server with Cloud Audit Logs, fill in the cloudAuditLogging section.

Validating your configuration file

After you've filled in your user cluster configuration file, run gkectl check-config to verify that the file is valid:

gkectl check-config --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [CONFIG_PATH]

where:

  • [ADMIN_CLUSTER_KUBECONFIG] is the path of the kubeconfig file for your admin cluster.

  • [CONFIG_PATH] is the path of your user cluster configuration file.

If the command returns any failure messages, fix the issues and validate the file again.

If you want to skip the more time-consuming validations, pass the --fast flag. To skip individual validations, use the --skip-validation-xxx flags. To learn more about the check-config command, see Running preflight checks.
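One way to combine these options is to run the quick checks first and the full validation only if they pass. The following is a minimal sketch under that assumption. validate_user_cluster_config is a hypothetical wrapper name, and the sketch assumes that gkectl check-config exits with a non-zero status when a validation fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the fast validations first, then the full set.
# Assumes gkectl check-config returns a non-zero status on failure.
# $1: path of the admin cluster kubeconfig file
# $2: path of the user cluster configuration file
validate_user_cluster_config() {
  local kubeconfig="$1" config="$2"

  # Quick pass: skips the more time-consuming validations.
  gkectl check-config --kubeconfig "$kubeconfig" --config "$config" --fast || return 1

  # Full pass: only reached when the quick pass reports no failures.
  gkectl check-config --kubeconfig "$kubeconfig" --config "$config"
}
```

Call the function with the same two paths you would pass for [ADMIN_CLUSTER_KUBECONFIG] and [CONFIG_PATH].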

Creating a Seesaw load balancer for your user cluster

If you have chosen to use the bundled Seesaw load balancer, do the step in this section. Otherwise, skip this section.

Create and configure the VMs for your Seesaw load balancer:

gkectl create loadbalancer --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [CONFIG_PATH]

Creating the user cluster

Create the user cluster:

gkectl create cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] \
  --config [CONFIG_PATH] --skip-validation-all

where:

  • [CONFIG_PATH] is the path of your user cluster configuration file.

  • [ADMIN_CLUSTER_KUBECONFIG] is the path of the kubeconfig file for your admin cluster.

The gkectl create cluster command creates a kubeconfig file named [USER_CLUSTER_NAME]-kubeconfig in the current directory. You will need this kubeconfig file later to interact with your user cluster.

Verifying that your user cluster is running

Verify that your user cluster is running:

kubectl get nodes --kubeconfig [USER_CLUSTER_KUBECONFIG]

where [USER_CLUSTER_KUBECONFIG] is the path of the kubeconfig file for your user cluster.

The output shows the user cluster nodes.
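If you want to check the result in a script rather than by eye, a small filter over the node list can flag nodes that are not ready. This is a sketch, not an official check. not_ready_nodes is a hypothetical helper, and it assumes the usual kubectl get nodes column layout in which the first column is the node name and the second is its status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the name of every node whose STATUS column is not exactly "Ready".
# Note: a cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage (USER_CLUSTER_KUBECONFIG is a shell variable you set yourself):
#   kubectl get nodes --no-headers --kubeconfig "$USER_CLUSTER_KUBECONFIG" | not_ready_nodes
```

An empty result means every listed node reported a status of Ready.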

Troubleshooting

Diagnosing cluster issues using gkectl

Use gkectl diagnose commands to identify cluster issues and share cluster information with Google. See Diagnosing cluster issues.

Default logging behavior

For gkectl and gkeadm it is sufficient to use the default logging settings:

  • By default, log entries are saved as follows:

    • For gkectl, the default log file is /home/ubuntu/.config/gke-on-prem/logs/gkectl-$(date).log, and the file is symlinked with the logs/gkectl-$(date).log file in the local directory where you run gkectl.
    • For gkeadm, the default log file is logs/gkeadm-$(date).log in the local directory where you run gkeadm.
  • All log entries are saved in the log file, even if they are not printed in the terminal (when --alsologtostderr is false).
  • The -v5 verbosity level (default) covers all the log entries needed by the support team.
  • The log file also contains the command executed and the failure message.

We recommend that you send the log file to the support team when you need help.
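To find the log file to send, you can pick the most recently modified gkectl log under the logs directory described above. The helper below is a minimal sketch. latest_gkectl_log is a hypothetical name, and the sketch relies only on the logs/gkectl-*.log naming pattern given in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified gkectl log file
# in the logs/ subdirectory of the given directory (default: current directory).
# Prints nothing if no matching file exists.
latest_gkectl_log() {
  local dir="${1:-.}"
  # ls -t sorts by modification time, newest first.
  ls -t "$dir"/logs/gkectl-*.log 2>/dev/null | head -n 1
}
```

Run latest_gkectl_log from the directory where you ran gkectl, or pass that directory as the first argument.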

Specifying a non-default location for the log file

To specify a non-default location for the gkectl log file, use the --log_file flag. The log file that you specify will not be symlinked with the local directory.

To specify a non-default location for the gkeadm log file, use the --log_file flag.

Locating Cluster API logs in the admin cluster

If a VM fails to start after the admin control plane has started, you can try debugging this by inspecting the Cluster API controllers' logs in the admin cluster:

  1. Find the name of the Cluster API controllers Pod in the kube-system namespace, where [ADMIN_CLUSTER_KUBECONFIG] is the path to the admin cluster's kubeconfig file:

    kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system get pods | grep clusterapi-controllers
  2. Open the Pod's logs, where [POD_NAME] is the name of the Pod. Optionally, use grep or a similar tool to search for errors:

    kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system logs [POD_NAME] vsphere-controller-manager
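The two steps above can be chained so that the Pod name does not have to be copied by hand. The following is a sketch under the assumption that a single Cluster API controllers Pod is of interest. clusterapi_pod_name is a hypothetical helper, and ADMIN_CLUSTER_KUBECONFIG in the usage comment is a shell variable you set yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the name of the first Pod whose name contains "clusterapi-controllers".
clusterapi_pod_name() {
  awk '$1 ~ /clusterapi-controllers/ { print $1; exit }'
}

# Usage:
#   POD_NAME="$(kubectl --kubeconfig "$ADMIN_CLUSTER_KUBECONFIG" -n kube-system get pods | clusterapi_pod_name)"
#   kubectl --kubeconfig "$ADMIN_CLUSTER_KUBECONFIG" -n kube-system logs "$POD_NAME" vsphere-controller-manager | grep -i error
```

The final grep is optional and only narrows the output to lines that mention errors.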

Debugging F5 BIG-IP issues using the admin cluster control plane node's kubeconfig

After an installation, Anthos clusters on VMware generates a kubeconfig file in the home directory of your admin workstation named internal-cluster-kubeconfig-debug. This kubeconfig file is identical to your admin cluster's kubeconfig, except that it points directly at the admin cluster's control plane node, where the admin control plane runs. You can use the internal-cluster-kubeconfig-debug file to debug F5 BIG-IP issues.

gkectl check-config validation fails: can't find F5 BIG-IP partitions

Symptoms

Validation fails because F5 BIG-IP partitions can't be found, even though they exist.

Potential causes

An issue with the F5 BIG-IP API can cause validation to fail.

Resolution

Try running gkectl check-config again.

gkectl prepare --validate-attestations fails: could not validate build attestation

Symptoms

Running gkectl prepare with the optional --validate-attestations flag returns the following error:

could not validate build attestation for gcr.io/gke-on-prem-release/.../...: VIOLATES_POLICY

Potential causes

An attestation might not exist for the affected image(s).

Resolution

Try downloading and deploying the admin workstation OVA again, as instructed in Creating an admin workstation. If the issue persists, reach out to Google for assistance.

Debugging using the bootstrap cluster's logs

During installation, Anthos clusters on VMware creates a temporary bootstrap cluster. After a successful installation, Anthos clusters on VMware deletes the bootstrap cluster, leaving you with your admin cluster and user cluster. Generally, you should have no reason to interact with this cluster.

If something goes wrong during an installation, and you did pass --cleanup-external-cluster=false to gkectl create cluster, you might find it useful to debug using the bootstrap cluster's logs. You can find the Pod, and then get its logs:

kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl get pods -n kube-system
kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl -n kube-system logs [POD_NAME]

For more information, refer to Troubleshooting.

What's next