This page explains how to install GKE on-prem to a VMware vSphere 6.5 or 6.7 Update 3 environment using an existing Dynamic Host Configuration Protocol (DHCP) server to assign IP addresses to cluster nodes. You can also install using static IPs.
Overview
This page shows how to create an admin cluster and one user cluster with three nodes. Each node runs on a virtual machine (VM) in a vSphere cluster, and each node has an IP address assigned to it by a DHCP server in your environment.
After you've created the clusters, you can create additional user clusters and add or remove nodes in a user cluster.
Before you begin
Set up your on-premises environment as described in vSphere requirements.
Create an admin workstation in vSphere.
SSH into your admin workstation:
ssh -i ~/.ssh/vsphere_workstation ubuntu@[IP_ADDRESS]
If you are behind a proxy, all gkectl commands automatically use the same proxy that is set in your config.yaml for internet requests from the admin workstation. This is the recommended environment, where your admin workstation and all of your clusters use the same proxy. In this use case, you do not need to set proxy environment variables.
Manual proxy options: If your admin workstation is not located behind the same proxy, you must manually configure your environment to ensure it has access to the internet. You can set the HTTPS_PROXY environment variable to proxy all HTTPS requests, including your gkectl commands, but you must also configure the NO_PROXY environment variable for all requests that you want to exclude from being proxied.
If you also need to individually run the gcloud commands, you can configure the Google Cloud CLI to always use a specific proxy. For instructions, see Configuring gcloud CLI for use behind a proxy/firewall.
Use the following options to manually set a proxy for your gkectl commands:
- All gkectl commands: You can use the HTTPS_PROXY and NO_PROXY environment variables to manually set how all of your gkectl commands are proxied:
  - Set a different proxy for your gkectl commands. Example:
    HTTPS_PROXY="http://my.other.proxy"
    NO_PROXY="10.0.1.0/24,private-registry.example,10.0.2.1"
  - Exclude your gkectl commands from being proxied. Example:
    HTTPS_PROXY=""
  export HTTP_PROXY="http://[PROXY_ADDRESS]"
  export HTTPS_PROXY="http://[PROXY_ADDRESS]"
  export NO_PROXY="[NO_PROXY_ADDRESSES]"
  where
  - [PROXY_ADDRESS] can be empty (""), a proxy IP address, or the hostname of the proxy.
  - [NO_PROXY_ADDRESSES] can be a comma-separated list of URLs, IP addresses, or hostnames that you want to exclude from being proxied, but it cannot contain spaces or tabs.
- Single gkectl commands: You can also prefix an individual gkectl command with the environment variable to use a specified proxy for that call only. Examples:
  To proxy your gkectl commands through a proxy that is different from what is specified in your configuration file (config.yaml), use the HTTPS_PROXY environment variable:
  - To use the http://my.other.proxy proxy:
    HTTPS_PROXY="http://my.other.proxy" gkectl create cluster --config config.yaml
    HTTPS_PROXY="http://my.other.proxy" gkectl prepare --config config.yaml
  - Use an empty value to exclude a proxy:
    HTTPS_PROXY="" gkectl create cluster --config config.yaml
    HTTPS_PROXY="" gkectl check-config --config config.yaml
Log in to Google Cloud using your Google Cloud user account credentials. The user account must hold at least the Viewer IAM role:
gcloud auth login
If you are using a proxy, you must configure Google Cloud CLI so that the gcloud commands use that proxy. For instructions, see Configuring gcloud CLI for use behind a proxy/firewall.
Set a default project. Setting a default Google Cloud project causes all gcloud CLI commands to run against that project, so that you don't need to specify your project for each command:
gcloud config set project [PROJECT_ID]
Replace [PROJECT_ID] with your project ID. (You can find your project ID in Google Cloud console, or by running gcloud config get-value project.)
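The proxy step earlier in this list says that the NO_PROXY value cannot contain spaces or tabs. A minimal sketch of a pre-check for that rule follows. The function name valid_no_proxy is hypothetical and is not part of gkectl; it only checks for whitespace and does not validate the addresses themselves.

```shell
# valid_no_proxy is a hypothetical helper, not a gkectl feature.
# Succeeds only if the given NO_PROXY value contains no spaces or tabs.
valid_no_proxy() {
  # Remove spaces and tabs, then compare with the original value.
  stripped=$(printf '%s' "$1" | tr -d ' \t')
  [ "$stripped" = "$1" ]
}

# Example: export the value only after it passes the check.
candidate="10.0.1.0/24,private-registry.example,10.0.2.1"
if valid_no_proxy "$candidate"; then
  export NO_PROXY="$candidate"
fi
```

Running the check before exporting the variable catches a stray space after a comma, which is easy to introduce when pasting a list.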
Using DHCP reservations for cluster nodes
In Kubernetes, it's important that node IP addresses never change. If a node IP address changes or becomes unavailable, it can break the cluster. To prevent this, consider using DHCP reservations to assign permanent addresses to the nodes in your admin and user clusters. Using DHCP reservations ensures that each node is assigned the same IP address after a restart or lease renewal.
IP addresses needed for admin and user clusters
Your DHCP server must be able to provide enough IP addresses for your admin and user cluster nodes.
IP addresses needed for the admin cluster
The admin cluster needs addresses for the following nodes:
- One node for the admin cluster control plane
- Two nodes for add-ons in the admin cluster
- An occasional temporary node during an upgrade of the admin cluster
- For each associated user cluster, one or three nodes
For a high availability (HA) user cluster, the admin cluster has three nodes that run control plane components for the user cluster. For a non-HA user cluster, the admin cluster has one node that runs control plane components for the user cluster.
Suppose N is the number of non-HA user clusters you intend to create, and H is the number of HA user clusters you intend to create. Then your DHCP server must be able to provide at least this many IP addresses for admin cluster nodes:
4 + N + 3 x H
For example, suppose you intend to create an admin cluster and one HA user cluster. Then your DHCP server would need to provide seven IP addresses for your admin cluster.
IP addresses needed for a user cluster
A user cluster needs an IP address for each node and one additional IP address to be used for a temporary node during an upgrade of the user cluster.
For example, suppose you intend to create a user cluster that has five nodes. Then your DHCP server would need to provide six IP addresses for your user cluster.
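The two sizing rules above (4 + N + 3 x H for the admin cluster, and nodes plus one for a user cluster) can be sketched as small shell functions. The function names are hypothetical and exist only for illustration; the arithmetic is taken directly from the text.

```shell
# Hypothetical helpers that restate the sizing rules from this page.

# Addresses needed for admin cluster nodes: 4 + N + 3 x H,
# where $1 is N (non-HA user clusters) and $2 is H (HA user clusters).
admin_cluster_ip_count() {
  echo $(( 4 + $1 + 3 * $2 ))
}

# Addresses needed for one user cluster: one per node, plus one
# temporary node used during an upgrade. $1 is the node count.
user_cluster_ip_count() {
  echo $(( $1 + 1 ))
}

admin_cluster_ip_count 0 1   # one HA user cluster: prints 7
user_cluster_ip_count 5      # five-node user cluster: prints 6
```

Adding the two results gives the minimum pool size the DHCP server needs for that admin cluster and user cluster combined.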
Choosing a container image registry for installation
To install, GKE on-prem needs to know where to pull its containerized cluster components. You have two options:
Container Registry
By default, GKE on-prem uses an existing, Google-owned container image registry hosted by Container Registry. Apart from setting up your proxy to allow traffic from gcr.io, this doesn't require additional setup.
Private Docker registry
You can choose to use a private Docker registry for installation. GKE on-prem pushes its cluster components to that Docker registry. To specify a private Docker registry, set the privateregistryconfig field.
Configuring a private Docker registry for installation (optional)
This section explains how to configure an existing Docker registry for
installing GKE on-prem. To learn how to create a Docker registry, see
Run an externally-accessible registry.
After you've configured the registry, you populate the
privateregistryconfig
field of the
GKE on-prem configuration file.
If you want to use your private Docker registry for installation, your admin workstation VM must trust the CA that signed your certificate. GKE on-prem does not support unsecured Docker registries. When you start your Docker registry, you must provide a certificate and a key. The certificate can be signed by a public certificate authority (CA), or it can be self-signed.
To establish this trust, perform the following steps from your admin workstation VM:
Create a folder to hold the certificate:
sudo mkdir -p /etc/docker/certs.d/[REGISTRY_SERVER]
where [REGISTRY_SERVER] is the IP address or hostname of the VM that runs your Docker registry.
Copy your certificate file to /etc/docker/certs.d/[REGISTRY_SERVER]/ca.crt. You must name the file ca.crt, even if it had a different name originally.
Restart the Docker service:
sudo service docker restart
Verify that you can log in to Docker:
docker login -u [USERNAME] -p [PASSWORD] [REGISTRY_SERVER]
where [USERNAME] and [PASSWORD] are the credentials for logging in to the Docker registry.
Now, when you run gkectl prepare
during installation, the images needed for
installation are pushed to your Docker registry.
Troubleshooting registry configuration
- GET https://[REGISTRY_SERVER]/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Make sure you have the correct IP address for the VM that runs your Docker registry.
- login attempt to https://[REGISTRY_SERVER]/v2/ failed with status: 401 Unauthorized
  Make sure your username and password are correct.
- GET https://[REGISTRY_SERVER]/v1/users/: x509: certificate signed by unknown authority
  Your admin workstation VM doesn't trust the certificate.
Create service accounts' private keys in your admin workstation
If you have not already created JSON keys for your service accounts, create them now.
Component access service account
gcloud iam service-accounts keys create component-access-key.json \
    --iam-account [COMPONENT_ACCESS_SERVICE_ACCOUNT_EMAIL]
where [COMPONENT_ACCESS_SERVICE_ACCOUNT_EMAIL] is the email address of your component access service account.
Connect-register service account
gcloud iam service-accounts keys create connect-register-key.json \
    --iam-account [CONNECT_REGISTER_SERVICE_ACCOUNT_EMAIL]
where [CONNECT_REGISTER_SERVICE_ACCOUNT_EMAIL] is the email address of your connect-register service account.
Connect-agent service account
gcloud iam service-accounts keys create connect-agent-key.json \
    --iam-account [CONNECT_AGENT_SERVICE_ACCOUNT_EMAIL]
where [CONNECT_AGENT_SERVICE_ACCOUNT_EMAIL] is the email address of your connect-agent service account.
Logging-monitoring service account
gcloud iam service-accounts keys create logging-monitoring-key.json \
    --iam-account [LOGGING_MONITORING_SERVICE_ACCOUNT_EMAIL]
where [LOGGING_MONITORING_SERVICE_ACCOUNT_EMAIL] is the email address of your logging-monitoring service account.
Generating a configuration file
To start an installation, you run gkectl create-config
to generate a
configuration file. You modify the file with your environment's specifications
and with the cluster specifications you want.
To generate the file, run the following command, where
--config [PATH]
is optional and accepts a path and
name for the configuration file. Omitting
--config
creates config.yaml
in the current working directory:
gkectl create-config [--config [PATH]]
Modifying the configuration file
Now that you've generated the configuration file, you need to modify it to be suitable for your environment and to meet your expectations for your clusters. The following sections explain each field, the values it expects, and where you might find the information. Some fields are commented out by default. If any of those fields are relevant to your installation, uncomment them and provide values.
The instructions in this section show how to use a single command that creates an admin cluster and one user cluster. Starting with version 1.2, you can create your admin and user clusters separately.
bundlepath
The GKE on-prem bundle file
contains all of the components in a particular release of GKE on-prem.
When you create an admin workstation, it comes with a
full bundle at
/var/lib/gke/bundles/gke-onprem-vsphere-[VERSION]-full.tgz
. This bundle's
version matches the version of
the OVA you imported
to create the admin workstation.
Set the value of bundlepath
to the path of your admin workstation's bundle
file. That is, set bundlepath
to:
/var/lib/gke/bundles/gke-onprem-vsphere-[VERSION]-full.tgz
where [VERSION] is the version of GKE on-prem that you are installing. The latest version is 1.5.2-gke.3.
Note that you are free to keep your bundle file in a different location or give
it a different name. Just make sure that in your configuration file, the value
of bundlepath
is the path to your bundle file, whatever that might be.
vCenter specification
The vCenter Server specification, vcenter
, holds information about your
vCenter Server instance that GKE on-prem needs to install to your
environment.
vcenter.credentials.address
The vcenter.credentials.address
field holds the IP address or the hostname
of your vCenter server.
Before you fill in the vcenter.credentials.address field, download and inspect the serving certificate of your vCenter server. Enter the following command to download the certificate and save it to a file named vcenter.pem.
true | openssl s_client -connect [VCENTER_IP]:443 -showcerts 2>/dev/null | sed -ne '/-BEGIN/,/-END/p' > vcenter.pem
Open the certificate file to see the Subject Common Name and the Subject Alternative Name:
openssl x509 -in vcenter.pem -text -noout
The output shows the Subject
Common Name (CN). This might be an IP address, or
it might be a hostname. For example:
Subject: ... CN = 203.0.113.100
Subject: ... CN = my-host.my-domain.example
The output might also include one or more DNS names under
Subject Alternative Name
:
X509v3 Subject Alternative Name: DNS:vcenter.my-domain.example
Choose the Subject
Common Name or one of the DNS names under
Subject Alternative Name
to use as the value of vcenter.credentials.address
in your configuration file. For example:
vcenter:
  credentials:
    address: "203.0.113.100"
    ...

vcenter:
  credentials:
    address: "my-host.my-domain.example"
    ...
You must choose a value that appears in the certificate. For example, if the IP
address does not appear in the certificate, you cannot use it for
vcenter.credentials.address
.
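Because the chosen value must appear in the certificate, it can help to check a candidate against the openssl output before editing the configuration file. The following is a rough sketch under stated assumptions: address_in_cert is a hypothetical helper (not part of gkectl or openssl), and it relies on the Subject, DNS, and IP Address labels that openssl x509 -text prints. It uses whole-word matching, so treat a match as a hint and still read the certificate yourself.

```shell
# address_in_cert is a hypothetical helper, not part of gkectl or openssl.
# It reads `openssl x509 -text -noout` output on standard input and succeeds
# if the candidate appears as a whole word on a Subject, DNS, or IP Address line.
address_in_cert() {
  grep -E 'Subject:|DNS:|IP Address:' | grep -qwF -- "$1"
}

# Usage sketch (requires the vcenter.pem file created earlier):
#   openssl x509 -in vcenter.pem -text -noout | address_in_cert "my-host.my-domain.example"
```

Whole-word matching keeps a shorter address such as 203.0.113.1 from being accepted just because the certificate contains 203.0.113.100.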
vcenter.credentials
GKE on-prem needs to know your vCenter Server username and password. To provide this information, set the username and password values under vcenter.credentials. For example:
vcenter:
  credentials:
    ...
    username: "my-name"
    password: "my-password"
vcenter.datacenter
, .datastore
, .cluster
, .network
GKE on-prem needs some information about the structure of your
vSphere environment. Set the values under vcenter
to provide this information.
For example:
vcenter:
  ...
  datacenter: "MY-DATACENTER"
  datastore: "MY-DATASTORE"
  cluster: "MY-VSPHERE-CLUSTER"
  network: "MY-VIRTUAL-NETWORK"
vcenter.resourcepool
A vSphere resource pool
is a logical grouping of vSphere VMs in your vSphere cluster. If you are using
a resource pool other than the default, provide its name to
vcenter.resourcepool
. For example:
vcenter:
  ...
  resourcepool: "my-pool"
If you want
GKE on-prem to deploy its nodes to the vSphere cluster's default
resource pool, provide an empty string to vcenter.resourcepool
. For example:
vcenter:
  ...
  resourcepool: ""
vcenter.datadisk
GKE on-prem creates a virtual machine disk (VMDK) to hold the
Kubernetes object data for the admin cluster. The installer creates the VMDK for
you, but you must provide a name for the VMDK in the vcenter.datadisk
field.
For example:
vcenter:
  ...
  datadisk: "my-disk.vmdk"
vSAN datastore: Creating a folder for the VMDK
If you are using a vSAN datastore, you need to put the VMDK in a folder. You must manually create the folder ahead of time. To do so, you could use govc to create a folder:
govc datastore.mkdir -namespace=true my-gke-on-prem-folder
Then set vcenter.datadisk to the path of the VMDK, including the folder. For example:
vcenter:
  ...
  datadisk: "my-gke-on-prem-folder/my-disk.vmdk"
In version 1.1.1 and earlier, a known issue requires that you provide the folder's universally unique identifier (UUID) path, rather than its file path, to vcenter.datadisk. Copy the UUID from the output of the govc command above, then provide it in the vcenter.datadisk field. Do not put a forward slash in front of the UUID. For example:
vcenter:
  ...
  datadisk: "14159b5d-4265-a2ba-386b-246e9690c588/my-disk.vmdk"
This issue has been fixed in versions 1.1.2 and later.
vcenter.cacertpath
When a client, like GKE on-prem, sends a request to vCenter Server, the server must prove its identity to the client by presenting a certificate or a certificate bundle. To verify the certificate or bundle, GKE on-prem must have the root certificate in the chain of trust.
Set vcenter.cacertpath
to the path of the root certificate. For example:
vcenter:
  ...
  cacertpath: "/my-cert-folder/the-root.crt"
Your VMware installation has a certificate authority (CA) that issues a certificate to your vCenter server. The root certificate in the chain of trust is a self-signed certificate created by VMware.
If you do not want to use the VMware CA, which is the default, you can configure VMware to use a different certificate authority.
If your vCenter server uses a certificate issued by the default VMware CA, there are several ways you can get the root certificate:
curl -k "https://[SERVER_ADDRESS]/certs/download.zip" > download.zip
where [SERVER_ADDRESS] is the address of your vCenter server.
In a browser, enter the address of your vCenter server. In the gray box at the right, click Download trusted root CA certificates.
Enter this command to get the serving certificate:
true | openssl s_client -connect [SERVER_ADDRESS]:443 -showcerts
In the output, find a URL like this: https://[SERVER_ADDRESS]/afd/vecs/ca. Enter the URL in a browser. This downloads the root certificate.
The downloaded file is named download.zip.
Unzip the file:
unzip download.zip
If the unzip command doesn't work the first time, enter the command again.
Find the certificate file in certs/lin
.
Proxy specification
If your network is behind a proxy server, populate the proxy
field with HTTPS
proxy and the addresses that should be excluded from proxying. For example:
proxy:
  url: "https://username:password@domain"
  noproxy: "10.0.1.0/24,private-registry.example,10.0.2.1"
- proxy.url is the URL of the HTTPS proxy.
- proxy.noproxy includes the CIDR ranges, IP addresses, domains, and hostnames that should not be proxied. For example, calls to the IP addresses of cluster nodes should not be proxied. So if you have a subnet that contains only cluster nodes, you could list the CIDR range of the subnet in the noproxy field. Note that if privateregistryconfig is specified, that address is automatically added to prevent calls to your private registry from being proxied.
Admin cluster specification
The admin cluster specification, admincluster
, holds information that
GKE on-prem needs to create the admin cluster.
admincluster.vcenter.network
In admincluster.vcenter.network
, you can specify a vCenter network
for your admin cluster nodes. Note that this overrides the global setting you
provided in vcenter
. For example:
admincluster:
  vcenter:
    network: MY-ADMIN-CLUSTER-NETWORK
admincluster.ipblockfilepath
This field is used if you are using static IPs. Since you are using a DHCP
server to allocate IP addresses, leave the admincluster.ipblockfilepath
field
commented out.
admincluster.bigip.credentials
(integrated load balancing mode)
If you are using integrated load balancing mode, GKE on-prem needs to
know the IP address or hostname, username, and password of your F5 BIG-IP load balancer. Set
the values under admincluster.bigip
to provide this information. For example:
admincluster:
  ...
  bigip:
    credentials:
      address: "203.0.113.2"
      username: "my-admin-f5-name"
      password: "rJDlm^%7aOzw"
admincluster.bigip.partition
(integrated load balancing mode)
If you are using integrated load balancing mode, you must create a BIG-IP
partition for your admin cluster. Set admincluster.bigip.partition
to the name
of your partition. For example:
admincluster:
  ...
  bigip:
    partition: "my-admin-f5-partition"
admincluster.vips
Set the value of admincluster.vips.controlplanevip
to the
IP address that you have chosen to configure on the load balancer
for the Kubernetes API server of the admin cluster. Set the value of
ingressvip
to the IP address you have chosen to configure on the load balancer
for the admin cluster's ingress controller. For example:
admincluster:
  ...
  vips:
    controlplanevip: 203.0.113.3
    ingressvip: 203.0.113.4
admincluster.serviceiprange
and admincluster.podiprange
The admin cluster must have a
range of IP addresses
to use for Services and a range of IP addresses to use for Pods. These ranges
are specified by the admincluster.serviceiprange
and admincluster.podiprange
fields. These fields are populated when you run gkectl create-config
. If you
like, you can change the populated values to values of your choice.
The Service and Pod ranges must not overlap. Also, the Service and Pod ranges must not overlap with IP addresses that are used for nodes in any cluster.
Example:
admincluster:
  ...
  serviceiprange: 10.96.232.0/24
  podiprange: 192.168.0.0/16
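The rule that the Service and Pod ranges must not overlap can be checked before you run the installer. The following is a sketch only, assuming IPv4 CIDR notation and a shell with 64-bit arithmetic; the function names are hypothetical and are not part of gkectl. The same check applies to the user cluster ranges and to any subnet used for node addresses.

```shell
# Hypothetical helpers for checking IPv4 CIDR overlap; not part of gkectl.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address of a CIDR block as integers.
cidr_bounds() {
  addr=${1%/*}
  len=${1#*/}
  base=$(ip_to_int "$addr")
  size=$(( 1 << (32 - $len) ))
  first=$(( $base / $size * $size ))
  echo "$first $(( $first + $size - 1 ))"
}

# Succeed if the two CIDR blocks share at least one address.
cidrs_overlap() {
  # Both substitutions are expanded before set replaces $1..$4.
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap "10.96.232.0/24" "192.168.0.0/16"; then
  echo "ranges overlap"
else
  echo "ranges are disjoint"
fi
```

With the example values from this section, the two ranges are disjoint, so the script prints "ranges are disjoint".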
User cluster specification
The user cluster specification, usercluster
, holds information that
GKE on-prem needs to create the initial user cluster.
Disabling VMware DRS anti-affinity rules (optional)
GKE on-prem automatically creates VMware Distributed Resource Scheduler (DRS) anti-affinity rules for your user cluster's nodes, causing them to be spread across at least three physical hosts in your datacenter.
This feature requires that your vSphere environment meets the following conditions:
- VMware DRS is enabled. VMware DRS requires vSphere Enterprise Plus license edition. To learn how to enable DRS, see Enabling VMware DRS in a cluster.
- The vSphere user account provided in the vcenter field has the Host.Inventory.EditCluster permission.
- There are at least three physical hosts available.
Recall that if you have a vSphere Standard license, you cannot enable VMware DRS.
If you do not have DRS enabled, or if you do not have at least three hosts to
which vSphere VMs can be scheduled, add
usercluster.antiaffinitygroups.enabled: false
to your configuration file.
For example:
usercluster:
  ...
  antiaffinitygroups:
    enabled: false
For more information, see the release notes for version 1.1.0-gke.6.
For clusters running more than three nodes: If vSphere vMotion moves a node to a different host, the node's workloads will need to be restarted before they are distributed across hosts again.
usercluster.vcenter.network
In usercluster.vcenter.network
, you can specify a vCenter network
for your user cluster nodes. Note that this overrides the global setting you
provided in vcenter
. For example:
usercluster:
  vcenter:
    network: MY-USER-CLUSTER-NETWORK
usercluster.ipblockfilepath
This field is used if you are using static IPs. Since you are using a DHCP
server to allocate IP addresses, leave the usercluster.ipblockfilepath
field
commented out.
usercluster.bigip.credentials
(integrated load balancing mode)
If you are using integrated load balancing mode, GKE on-prem needs to
know the IP address or hostname, username, and password of the F5 BIG-IP load
balancer that you intend to use for the user cluster. Set the values under
usercluster.bigip
to provide this information. For example:
usercluster:
  ...
  bigip:
    credentials:
      address: "203.0.113.5"
      username: "my-user-f5-name"
      password: "8%jfQATKO$#z"
  ...
usercluster.bigip.partition
(integrated load balancing mode)
You must create a BIG-IP partition for your user cluster.
Set usercluster.bigip.partition
to the name of your partition. For example:
usercluster:
  ...
  bigip:
    partition: "my-user-f5-partition"
  ...
usercluster.vips
Set the value of usercluster.vips.controlplanevip
to the
IP address that you have chosen to configure on the load balancer
for the Kubernetes API server of the user cluster. Set the value of
ingressvip
to the IP address you have chosen to configure on the load balancer
for the user cluster's ingress controller. For example:
usercluster:
  ...
  vips:
    controlplanevip: 203.0.113.6
    ingressvip: 203.0.113.7
usercluster.serviceiprange
and usercluster.podiprange
The user cluster must have a
range of IP addresses
to use for Services and a range of IP addresses to use for Pods. These ranges
are specified by the usercluster.serviceiprange
and usercluster.podiprange
fields. These fields are populated when you run gkectl create-config
. If you
like, you can change the populated values to values of your choice.
The Service and Pod ranges must not overlap. Also, the Service and Pod ranges must not overlap with IP addresses that are used for nodes in any cluster.
Example:
usercluster:
  ...
  serviceiprange: 10.96.233.0/24
  podiprange: 172.16.0.0/12
usercluster.clustername
Set the value of usercluster.clustername
to a name of your choice. Choose a
name that is no longer than 40 characters. For example:
usercluster:
  ...
  clustername: "my-user-cluster-1"
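The 40-character limit on usercluster.clustername is easy to check before you write the value into the configuration file. A minimal sketch follows; valid_cluster_name is a hypothetical helper name, and it checks only that the name is non-empty and within the length limit, not any other naming rule that GKE on-prem might apply.

```shell
# valid_cluster_name is a hypothetical helper, not part of gkectl.
# Succeeds if the name is non-empty and no longer than 40 characters.
valid_cluster_name() {
  [ -n "$1" ] && [ "${#1}" -le 40 ]
}

if valid_cluster_name "my-user-cluster-1"; then
  echo "name length is acceptable"
fi
```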
usercluster.masternode.replicas
The usercluster.masternode.replicas
field specifies how many control plane nodes you
want the user cluster to have. A user cluster's control plane node runs the user
control plane, the Kubernetes control plane components. This value must be 1 or 3:
- Set this field to 1 to run one user control plane.
- Set this field to 3 if you want to have a high availability (HA) user control plane composed of three control plane nodes that each run a user control plane.
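Because usercluster.masternode.replicas accepts only 1 or 3, a script that generates or reviews configuration files can reject other values early. The following sketch uses a hypothetical function name and is not part of gkectl.

```shell
# valid_masternode_replicas is a hypothetical helper, not part of gkectl.
# Succeeds only for the two values this page allows: 1 (non-HA) or 3 (HA).
valid_masternode_replicas() {
  case "$1" in
    1|3) return 0 ;;
    *)   return 1 ;;
  esac
}

if ! valid_masternode_replicas 2; then
  echo "replicas must be 1 or 3"
fi
```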
usercluster.masternode.cpus
and usercluster.masternode.memorymb
The usercluster.masternode.cpus and usercluster.masternode.memorymb fields specify how many CPUs and how much memory, in megabytes, are allocated to each control plane node of the user cluster. For example:
usercluster:
  ...
  masternode:
    cpus: 4
    memorymb: 8192
usercluster.workernode.replicas
The usercluster.workernode.replicas
field specifies how many worker nodes you
want the user cluster to have. The worker nodes run the cluster workloads.
usercluster.workernode.cpus
and usercluster.workernode.memorymb
The usercluster.workernode.cpus and usercluster.workernode.memorymb fields specify how many CPUs and how much memory, in megabytes, are allocated to each worker node of the user cluster. For example:
usercluster:
  ...
  workernode:
    cpus: 4
    memorymb: 8192
    replicas: 3
usercluster.oidc
If you intend for clients of the user cluster to use OIDC authentication, set
values for the fields under usercluster.oidc
. Configuring OIDC is optional.
To learn how to configure OIDC, see Authenticating with OIDC.
About installing version 1.0.2-gke.3
Version 1.0.2-gke.3 introduces the following OIDC fields (usercluster.oidc). These fields enable logging in to a cluster from Google Cloud console:
- usercluster.oidc.kubectlredirecturl
- usercluster.oidc.clientsecret
- usercluster.oidc.usehttpproxy
In version 1.0.2-gke.3, if you want to use OIDC, the clientsecret field is required even if you don't want to log in to a cluster from Google Cloud console. In that case, you can provide a placeholder value for clientsecret:
oidc:
  clientsecret: "secret"
usercluster.sni
Server Name Indication (SNI), an extension to Transport Layer Security (TLS), allows servers to present multiple certificates on a single IP address and TCP port, depending on the client-indicated hostname.
If your CA is already distributed as a trusted CA to clients outside your user cluster and you want to rely on this chain to identify trusted clusters, you can configure the Kubernetes API server with an additional certificate that is presented to external clients of the load balancer IP address.
To use SNI with your user clusters, you need to have your own CA and Public Key Infrastructure (PKI). You provision a separate serving certificate for each user cluster, and GKE on-prem adds each additional serving certificate to its respective user cluster.
To configure SNI for the Kubernetes API server of the user cluster, provide
values for usercluster.sni.certpath
(path to the external certificate) and
usercluster.sni.keypath
(path to the external certificate's private key file).
For example:
usercluster:
  ...
  sni:
    certpath: "/my-cert-folder/example.com.crt"
    keypath: "/my-cert-folder/example.com.key"
lbmode
You can use integrated load balancing with DHCP. Integrated load balancing mode applies to your admin cluster and your initial user cluster. It also applies to any additional user clusters that you create in the future. Integrated load balancing mode supports using F5 BIG-IP as your load balancer.
Set the value of lbmode
to Integrated
. For example:
lbmode: Integrated
gkeconnect
The gkeconnect
specification holds information that GKE on-prem
needs to set up management of your on-prem clusters from Google Cloud console.
Set gkeconnect.projectid
to the project ID of the Google Cloud project
where you want to manage your on-prem clusters.
Set the value of gkeconnect.registerserviceaccountkeypath to the path of the JSON key file for your connect-register service account.
Set the value of gkeconnect.agentserviceaccountkeypath to the path of the JSON key file for your connect-agent service account.
Example:
gkeconnect:
  projectid: "my-project"
  registerserviceaccountkeypath: "/my-key-folder/register-key.json"
  agentserviceaccountkeypath: "/my-key-folder/connect-key.json"
stackdriver
The stackdriver
specification holds information that GKE on-prem
needs to store log entries generated by your on-prem clusters.
Set stackdriver.projectid
to the project ID of the Google Cloud project
where you want to view Stackdriver logs that pertain to your on-prem clusters.
Set stackdriver.clusterlocation
to a Google Cloud region where you want
to store Stackdriver logs. It is a good idea to choose a region that is near
your on-premises data center.
Set stackdriver.enablevpc
to true
if you have your cluster's network
controlled by a VPC. This ensures that all
telemetry flows through Google's restricted IP addresses.
Set stackdriver.serviceaccountkeypath
to the path of the JSON key file for
your
Stackdriver Logging service account.
For example:
stackdriver:
  projectid: "my-project"
  clusterlocation: "us-west1"
  enablevpc: false
  serviceaccountkeypath: "/my-key-folder/stackdriver-key.json"
privateregistryconfig
If you have a
private Docker registry,
the privateregistryconfig
field holds information that GKE on-prem
uses to push images to your private registry. If you don't specify a private
registry, gkectl
pulls GKE on-prem's container images from its
Container Registry repository, gcr.io/gke-on-prem-release
, during installation.
Under privateregistryconfig.credentials, set address to the IP address of the machine that runs your private Docker registry. Set username and password to the username and password of your private Docker registry. The value that you set for address is automatically added to proxy.noproxy.
When Docker pulls an image from your private registry, the registry must prove its identity by presenting a certificate. The registry's certificate is signed by a certificate authority (CA). Docker uses the CA's certificate to validate the registry's certificate.
Set privateregistryconfig.cacertpath
to the path of the CA's certificate. For
example:
privateregistryconfig:
  ...
  cacertpath: /my-cert-folder/registry-ca.crt
gcrkeypath
Set the value of gcrkeypath
to the path of the JSON key file for your
component access service account.
For example:
gcrkeypath: "/my-key-folder/component-access-key.json"
cloudauditlogging
If you want to send your Kubernetes audit logs to your Google Cloud
project, populate the cloudauditlogging
specification. For example:
cloudauditlogging:
  projectid: "my-project"
  # A Google Cloud region where you would like to store audit logs for this cluster.
  clusterlocation: "us-west1"
  # The absolute or relative path to the key file for a Google Cloud service account used to
  # send audit logs from the cluster
  serviceaccountkeypath: "/my-key-folder/audit-logging-key.json"
Learn more about using audit logging.
Validating the configuration file
Complete this step from your admin workstation.
After you've modified the configuration file, run gkectl check-config
to
verify that the file is valid and can be used for installation:
gkectl check-config --config config.yaml
If the command returns any FAILURE
messages, fix the issues and validate the
file again.
If you want to skip the more time-consuming validations, pass the --fast
flag.
To skip individual validations, use the --skip-validation-xxx
flags. To
learn more about the check-config
command, see
Running preflight checks.
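If you rerun the validation several times while fixing issues, it can be handy to count the remaining FAILURE messages. The following is a sketch under assumptions: count_failures is a hypothetical helper, and the exact layout of the check-config output (including whether messages go to standard output or standard error) is not specified on this page, so the usage line merges both streams.

```shell
# count_failures is a hypothetical helper, not part of gkectl.
# Prints the number of input lines that contain the word FAILURE.
count_failures() {
  # grep -c exits non-zero when the count is 0, so force a zero status.
  grep -c 'FAILURE' || true
}

# Usage sketch (assumes FAILURE messages appear one per line):
#   gkectl check-config --config config.yaml 2>&1 | count_failures
```

A count of 0 suggests the file is ready for installation, but read the full command output before proceeding.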
Running gkectl prepare
Before you install, you need to run gkectl prepare
on your admin workstation
to initialize your vSphere environment. The gkectl prepare command performs the following tasks:
Import the node OS image to vSphere and mark it as a template.
Optionally, validate the container images' build attestations, thereby verifying the images were built and signed by Google and are ready for deployment.
Run gkectl prepare
with the GKE on-prem configuration file, where
--validate-attestations
is optional:
gkectl prepare --config [CONFIG_FILE] --validate-attestations
Positive output from --validate-attestations
is Image [IMAGE_NAME] validated
.
Installing GKE on-prem
You've created a configuration file that specifies how your environment looks
and how you'd like your clusters to look, and you've validated the file. You ran
gkectl prepare
to initialize your environment with the GKE on-prem
software. Now you're ready to initiate a fresh installation of
GKE on-prem.
To install GKE on-prem, you create the admin and user clusters. The following steps create both the admin cluster and the user cluster during the same process. If you want to create each cluster separately, see Creating admin and user clusters separately for details.
To create the admin and user clusters:
Create the admin cluster and the user cluster by running the
gkectl create cluster
command:
gkectl create cluster --config [CONFIG_FILE]
where [CONFIG_FILE] is the configuration file you created earlier.
The
gkectl create cluster
command creates kubeconfig
files named [CLUSTER_NAME]-kubeconfig
in the current directory, where [CLUSTER_NAME] is the name that you set for cluster
. Example: MY-VSPHERE-CLUSTER-kubeconfig
The GKE on-prem documentation uses the following placeholders to refer to these
kubeconfig
files:
- Admin cluster: [ADMIN_CLUSTER_KUBECONFIG]
- User cluster: [USER_CLUSTER_KUBECONFIG]
Verify that the clusters are created and running:
To verify the admin cluster, run the following command:
kubectl get nodes --kubeconfig [ADMIN_CLUSTER_KUBECONFIG]
The output shows the admin cluster nodes.
To verify the user cluster, run the following command:
kubectl get nodes --kubeconfig [USER_CLUSTER_KUBECONFIG]
The output shows the user cluster nodes. For example:
NAME                        STATUS   ROLES    AGE   VERSION
xxxxxx-1234-ipam-15008527   Ready    <none>   12m   v1.14.7-gke.24
xxxxxx-1234-ipam-1500852a   Ready    <none>   12m   v1.14.7-gke.24
xxxxxx-1234-ipam-15008536   Ready    <none>   12m   v1.14.7-gke.24
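To check the node list without reading every row, you can filter the kubectl get nodes output for nodes that are not Ready. This is a minimal sketch; the helper name not_ready_nodes is hypothetical, and it assumes the default column layout shown above, with NAME in the first column and STATUS in the second.

```shell
#!/usr/bin/env bash
# Sketch: list nodes whose STATUS column is anything other than "Ready".
# Assumption: input is the default `kubectl get nodes` table (NAME, STATUS, ...).

not_ready_nodes() {
  # Skip the header row, then print the NAME of each node whose STATUS is not
  # exactly "Ready". A cordoned node ("Ready,SchedulingDisabled") is also listed.
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Typical use (shown as a comment, not run here):
#   kubectl get nodes --kubeconfig [USER_CLUSTER_KUBECONFIG] | not_ready_nodes
```

An empty result means every listed node reported Ready at the time of the query.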
Learn how to reuse the configuration file to create additional user clusters.
Resuming an installation
If your installation is interrupted after the admin cluster is created, you can resume the installation with the following steps:
- Remove the
admincluster
specification from the configuration file.
- Run
gkectl create cluster
with both the --kubeconfig
and --skip-validation-all
flags to pass in the admin cluster's kubeconfig file and skip the preflight checks:
gkectl create cluster \
  --config [CONFIG_FILE] \
  --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] \
  --skip-validation-all
where [ADMIN_CLUSTER_KUBECONFIG] is the admin cluster's kubeconfig, which was created in the working directory when you started the installation.
Connecting clusters to Google
When you populate the
gkeconnect
specification, your user cluster is automatically registered with Google Cloud console. You can view a registered GKE on-prem cluster in Google Cloud console's Kubernetes clusters menu. From there, you can sign into the cluster to view its workloads.
If you don't see your cluster in Google Cloud console within one hour of creating it, refer to Connect troubleshooting.
Enabling ingress
After your user cluster is running, you must enable ingress by creating a Gateway object. The first part of the Gateway manifest is always this:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-autogenerated-k8s-ingress
  namespace: gke-system
spec:
  selector:
    istio: ingress-gke-system
You can tailor the rest of the manifest according to your needs. For example, this manifest says that clients can send requests on port 80 using the HTTP/2 protocol and any hostname:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-autogenerated-k8s-ingress
  namespace: gke-system
spec:
  selector:
    istio: ingress-gke-system
  servers:
  - port:
      number: 80
      protocol: HTTP2
      name: http
    hosts:
    - "*"
If you want to accept HTTPS requests, then you must provide one or more certificates that your ingress controller can present to clients.
To provide a certificate:
- Create a Secret that holds your certificate and key.
- Create a Gateway object, or modify an existing Gateway object, that refers
to your Secret. The name of the Gateway object must be
istio-autogenerated-k8s-ingress
.
For example, suppose you have already created a certificate file,
ingress-wildcard.crt
, and a key file ingress-wildcard.key
.
Create a Secret named ingressgateway-wildcard-certs
:
kubectl create secret tls \
  --namespace gke-system \
  ingressgateway-wildcard-certs \
  --cert ./ingress-wildcard.crt \
  --key ./ingress-wildcard.key
Here's a manifest for a Gateway that refers to your Secret. Clients can call on port 443 using the HTTPS protocol and any hostname that matches *.example.com. Note that the hostname in the certificate must match the hostname in the manifest, *.example.com in this example:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-autogenerated-k8s-ingress
  namespace: gke-system
spec:
  selector:
    istio: ingress-gke-system
  servers:
  - port:
      number: 80
      protocol: HTTP2
      name: http
    hosts:
    - "*"
  - hosts:
    - "*.example.com"
    port:
      name: https-demo-wildcard
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: ingressgateway-wildcard-certs
You can create multiple TLS certs for different hosts by modifying your Gateway manifest.
Save your manifest to a file named my-gateway.yaml
, and create the Gateway:
kubectl apply -f my-gateway.yaml
Now you can use Kubernetes Ingress objects in the standard way.
Creating admin and user clusters separately
Starting with GKE on-prem version 1.2, you can create your admin and user clusters separately. That is, you can start by creating only an admin cluster. Then you can create one or more user clusters as needed.
Before version 1.2:
Your first user cluster always used the admin cluster's datastore. User clusters created subsequently could use a datastore that was separate from the admin cluster's datastore.
If you specified a separate datastore for a user cluster, the user cluster worker nodes and PersistentVolumes (PVs) for the worker nodes used the separate datastore. But the user control-plane VMs and associated PVs used the admin cluster's datastore.
Starting with version 1.2:
Any user cluster, even your first user cluster, can use a datastore that is separate from the admin cluster's datastore.
If you specify a separate datastore for a user cluster, the user cluster worker nodes, PVs for the user cluster worker nodes, user control-plane VMs, and PVs for the user control-plane VMs all use the separate datastore.
To create only an admin cluster, remove the entire usercluster
section from
your cluster configuration file. Then enter the gkectl create cluster
command:
gkectl create cluster --config [ADMIN_CONFIG_FILE]
where [ADMIN_CONFIG_FILE] is the path of your configuration file that
has the usercluster
section removed.
Next, create a configuration file that has the entire admincluster
section
removed. In this file, you can specify a vSphere datastore that is different
from the admin cluster's datastore. To specify a datastore, enter a value for
vcenter.datastore
. For example:
vcenter:
  credentials:
    ...
  ...
  datastore: "my-user-cluster-datastore"
To create a user cluster, enter this command:
gkectl create cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [USER_CLUSTER_CONFIG]
where:
- [ADMIN_CLUSTER_KUBECONFIG] is your admin cluster's kubeconfig file.
- [USER_CLUSTER_CONFIG] is the configuration file for your user cluster.
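Before you run either command, it can help to confirm that each configuration file contains only the section you intend. The following is a minimal sketch under stated assumptions: the helper name has_section is hypothetical, and the check assumes that admincluster and usercluster appear as top-level keys at the start of a line in the configuration file.

```shell
#!/usr/bin/env bash
# Sketch: test whether a configuration file still has a given top-level section.
# Assumption: sections such as admincluster and usercluster are top-level YAML
# keys that start in column 0.

has_section() {
  # $1: path to the configuration file
  # $2: top-level key to look for, for example usercluster
  grep -q "^$2:" "$1"
}

# Typical use (shown as comments, not run here):
#   if has_section [ADMIN_CONFIG_FILE] usercluster; then
#     echo "Remove the usercluster section before creating only an admin cluster." >&2
#   fi
#   if has_section [USER_CLUSTER_CONFIG] admincluster; then
#     echo "Remove the admincluster section before creating a user cluster." >&2
#   fi
```

The check is purely textual, so a commented-out or indented key does not count as a section.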
Limitations
Limitation | Description |
---|---|
Maximum and minimum limits for clusters and nodes | See Quotas and limits. Your environment's performance might impact these limits. |
Uniqueness for user cluster names | All user clusters registered to the same Google Cloud project must have unique names. |
Cannot deploy to more than one vCenter and/or vSphere datacenter | Currently, you can only deploy an admin cluster and a set of associated user clusters to a single vCenter and/or vSphere datacenter. You cannot deploy the same admin and user clusters to more than one vCenter and/or vSphere datacenter. |
Cannot declaratively change cluster configurations after creation | While you can create additional clusters and resize existing clusters, you cannot change an existing cluster through its configuration file. |
Troubleshooting
For more information, refer to Troubleshooting.
Diagnosing cluster issues using gkectl
Use gkectl diagnose
commands to identify cluster issues
and share cluster information with Google. See
Diagnosing cluster issues.
Default logging behavior
For gkectl
and gkeadm
it is sufficient to use the
default logging settings:
- By default, log entries are saved as follows:
  - For gkectl, the default log file is /home/ubuntu/.config/gke-on-prem/logs/gkectl-$(date).log, and the file is symlinked with the logs/gkectl-$(date).log file in the local directory where you run gkectl.
  - For gkeadm, the default log file is logs/gkeadm-$(date).log in the local directory where you run gkeadm.
- All log entries are saved in the log file, even if they are not printed in the terminal (when --alsologtostderr is false).
- The -v5 verbosity level (default) covers all the log entries needed by the support team.
- The log file also contains the command executed and the failure message.
We recommend that you send the log file to the support team when you need help.
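To find the most recent gkectl log file to send, you can sort the default log directory by modification time. This is a minimal sketch; the helper name latest_log is hypothetical, and it assumes that log files follow the gkectl-*.log naming described above.

```shell
#!/usr/bin/env bash
# Sketch: print the most recently modified gkectl log file in a directory.
# Assumption: log files are named gkectl-*.log, as in the default location.

latest_log() {
  # $1: directory that holds the log files.
  # ls -t sorts newest first; head keeps only the first entry.
  # Prints nothing if the directory has no matching files.
  ls -t "$1"/gkectl-*.log 2>/dev/null | head -n 1
}

# Typical use on the admin workstation (shown as a comment, not run here):
#   latest_log /home/ubuntu/.config/gke-on-prem/logs
```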
Specifying a non-default location for the log file
To specify a non-default location for the gkectl
log file, use
the --log_file
flag. The log file that you specify will not be
symlinked with the local directory.
To specify a non-default location for the gkeadm
log file, use
the --log_file
flag.
Locating Cluster API logs in the admin cluster
If a VM fails to start after the admin control plane has started, you can try debugging this by inspecting the Cluster API controllers' logs in the admin cluster:
Find the name of the Cluster API controllers Pod in the
kube-system
namespace, where [ADMIN_CLUSTER_KUBECONFIG] is the path to the admin cluster's kubeconfig file:
kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system get pods | grep clusterapi-controllers
Open the Pod's logs, where [POD_NAME] is the name of the Pod. Optionally, use
grep
or a similar tool to search for errors:
kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system logs [POD_NAME] vsphere-controller-manager
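The two steps can be chained so that the Pod name does not have to be copied by hand. The following is a minimal sketch; the helper name clusterapi_pod_name is hypothetical, and it assumes that the Pod name is the first column of the kubectl get pods output.

```shell
#!/usr/bin/env bash
# Sketch: extract the Cluster API controllers Pod name from `kubectl get pods` output.
# Assumption: the Pod name is in the first column and contains "clusterapi-controllers".

clusterapi_pod_name() {
  # Reads the pod table on stdin and prints the first matching Pod name.
  awk '$1 ~ /clusterapi-controllers/ { print $1; exit }'
}

# Typical use (shown as comments, not run here):
#   POD_NAME="$(kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system get pods | clusterapi_pod_name)"
#   kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -n kube-system logs "$POD_NAME" vsphere-controller-manager | grep -i error
```

The grep -i error filter is only a starting point; adjust the pattern to the failure you are investigating.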
Debugging F5 BIG-IP issues using the admin cluster control plane node's kubeconfig
After an installation, GKE on-prem generates a kubeconfig file in
the home directory of your admin workstation named
internal-cluster-kubeconfig-debug
. This kubeconfig file is
identical to your admin cluster's kubeconfig, except that it points directly at
the admin cluster's control plane node, where the admin control plane runs. You can use
the internal-cluster-kubeconfig-debug
file to debug F5 BIG-IP
issues.
gkectl check-config
validation fails: can't find F5 BIG-IP partitions
- Symptoms
Validation fails because F5 BIG-IP partitions can't be found, even though they exist.
- Potential causes
An issue with the F5 BIG-IP API can cause validation to fail.
- Resolution
Try running
gkectl check-config
again.
gkectl prepare --validate-attestations
fails: could not validate build attestation
- Symptoms
Running
gkectl prepare
with the optional --validate-attestations
flag returns the following error:
could not validate build attestation for gcr.io/gke-on-prem-release/.../...: VIOLATES_POLICY
- Potential causes
An attestation might not exist for the affected image(s).
- Resolution
Try downloading and deploying the admin workstation OVA again, as instructed in Creating an admin workstation. If the issue persists, reach out to Google for assistance.
Debugging using the bootstrap cluster's logs
During installation, GKE on-prem creates a temporary bootstrap cluster. After a successful installation, GKE on-prem deletes the bootstrap cluster, leaving you with your admin cluster and user cluster. Generally, you should have no reason to interact with this cluster.
If something goes wrong during an installation, and you did pass
--cleanup-external-cluster=false
to gkectl create cluster
,
you might find it useful to debug using the bootstrap cluster's logs. You can
find the Pod, and then get its logs:
kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl get pods -n kube-system
kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl -n kube-system logs [POD_NAME]
What's next
- Learn how to create additional user clusters.
- View your clusters in Google Cloud console.
- Log in to your clusters.