Best practices for GKE networking

This document outlines the best practices for configuring networking options for Google Kubernetes Engine (GKE) clusters. It is intended to be an architecture planning guide for cloud architects and network engineers with recommendations that are applicable to most GKE clusters. Before you create clusters, we recommend that you review all sections to understand the networking options and implications. This document is not intended to introduce Kubernetes networking concepts or terminology and assumes that you already have some level of Kubernetes networking knowledge. For more information, see the GKE network overview.

While reviewing this document, consider the exposure level of your cluster and cluster type, how you plan to expose workloads internally to your Virtual Private Cloud (VPC) network or externally to the internet, how you plan to scale your workloads, and what types of Google services your workloads will consume.

VPC design

When designing your VPC networks, follow best practices for VPC design. The following section provides some GKE-specific recommendations for VPC network design.

Use VPC-native clusters

Before you create a cluster, you need to choose either a routes-based or a VPC-native cluster. We recommend choosing a VPC-native cluster because VPC-native clusters use alias IP address ranges on GKE nodes and scale more easily than routes-based clusters. VPC-native clusters are required for private GKE clusters and for creating clusters on Shared VPC networks. For clusters created in Autopilot mode, VPC-native mode is always on and cannot be turned off.

VPC-native clusters scale more easily than routes-based clusters because they don't consume Google Cloud routes, so they are less susceptible to hitting routing limits. The advantages of VPC-native clusters go hand in hand with alias IP support. For example, network endpoint groups (NEGs) can only be used with secondary IP address ranges, so they are only supported on VPC-native clusters.

Use Shared VPC networks

GKE clusters require careful IP address planning. Most organizations have a centralized management structure with a network administrator who allocates IP address space for clusters and a platform administrator who operates the clusters. This type of organization structure works well with Google Cloud's Shared VPC network architecture. In the Shared VPC network architecture, a network administrator can create subnets and share them with particular principals. You can then create GKE clusters in service projects in those subnets.

A Shared VPC network is a common architecture that suits most organizations with a centralized management team. We recommend using Shared VPC networks to create the subnets for your GKE clusters and to avoid IP address conflicts across your organization.

IP address management strategies

Kubernetes clusters require a unique IP address for every Pod. In GKE, all these addresses are routable throughout the VPC network. Therefore, IP address planning is necessary because addresses cannot overlap with private IP address space used on-premises or in other connected environments. The following sections suggest strategies for IP address management with GKE.

Plan the required IP address allotment

Private clusters are recommended and are further discussed in the Network security section. In the context of private clusters, only VPC-native clusters are supported and require the following IP ranges to be defined:

  • Control plane IP address range: an RFC 1918 /28 range that shouldn't overlap with any other classless inter-domain routing (CIDR) range in the VPC network.
  • Node subnet: the subnet with the primary IP range that you want to allocate for all the nodes in your cluster. Services with the type LoadBalancer that use the cloud.google.com/load-balancer-type: "Internal" annotation also use this subnet.
  • Pod IP address range: the IP range that you allocate for all Pods in your cluster, also known as the cluster CIDR.
  • Service IP address range: the IP address range that you allocate for all services in your cluster, also known as the services CIDR.

The control plane IP address range is dedicated to the GKE-managed control plane that resides in a Google-managed tenant project peered with your VPC. This IP address range shouldn't overlap with any IP addresses in your VPC peering group.

The subnet's primary range is used for the nodes of the cluster, and the subnet must exist before you create the cluster. The subnet should accommodate the maximum number of nodes that you expect in your cluster. You can use cluster autoscaler limits to cap the maximum number of nodes.

The Pod and service IP address ranges are represented as distinct secondary ranges of your subnet, implemented as alias IP addresses in VPC-native clusters.

Choose address ranges that are wide enough to accommodate all of the nodes, Pods, and Services for the cluster.
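For example, the following commands sketch one way to create a subnet with secondary ranges for Pods and Services and then create a VPC-native private cluster that uses them. The network, subnet, and cluster names, the region, and the IP ranges are placeholder values for illustration only:

# Node subnet with secondary ranges for Pods and Services (example values)
gcloud compute networks subnets create gke-subnet-1 \
    --network my-vpc \
    --region us-central1 \
    --range 10.0.0.0/22 \
    --secondary-range pods=10.8.0.0/14,services=10.0.16.0/20

# VPC-native private cluster that uses those ranges
gcloud container clusters create my-private-cluster \
    --region us-central1 \
    --network my-vpc \
    --subnetwork gke-subnet-1 \
    --enable-ip-alias \
    --cluster-secondary-range-name pods \
    --services-secondary-range-name services \
    --enable-private-nodes \
    --master-ipv4-cidr 172.16.0.0/28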

Use non-RFC 1918 space if needed

For some environments, RFC 1918 space in large contiguous CIDR blocks might already be allocated in an organization. You can use non-RFC 1918 space for additional CIDR ranges for GKE clusters, as long as they don't overlap with Google-owned public IP addresses. We recommend using part of the RFC 6598 address space (100.64.0.0/10) because Class E address space can present interoperability issues with on-premises hardware. If you privately reuse public IP addresses, do so with caution and control how these routes are advertised to on-premises networks and to the internet.

Avoid using SNAT for Pod-to-Pod and Pod-to-Service traffic within the cluster. When you use privately reused public IP addresses with private clusters, the default behavior is to apply SNAT to all non-RFC 1918 address space. You can fix this by configuring the IP masquerade agent: the nonMasqueradeCIDRs list should contain at least your cluster (Pod) CIDR and your services CIDR.
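As a sketch, an ip-masq-agent ConfigMap that prevents SNAT for these ranges might look like the following; the CIDR values are placeholders for your own cluster and services ranges:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
    - 100.64.0.0/16   # example cluster (Pod) CIDR
    - 100.65.0.0/20   # example services CIDR
    masqLinkLocal: false
    resyncInterval: 60s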

Use custom subnet mode

When you set up the network, you also select the subnet mode: auto (default) or custom (recommended). The auto mode leaves the subnet allocation up to Google and is a good option to get started without IP address planning. However, we recommend selecting the custom mode because this mode lets you choose IP address ranges that won't overlap other ranges in your environment. If you are using a Shared VPC, either an organizational admin or network admin can select this mode.

The following example creates a network called my-net-1 with custom subnet mode:

gcloud compute networks create my-net-1 --subnet-mode custom

Plan Pod density per node

By default, Standard clusters reserve a /24 range for every node out of the Pod address space in the subnet, which allows for up to 110 Pods per node. Depending on the size of your nodes and the application profile of your Pods, you might run considerably fewer Pods on each node.

If you don't expect to run more than 64 Pods per node, we recommend that you adjust the maximum Pods per node to preserve IP address space in your Pod subnet.
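For example, setting a default maximum of 64 Pods per node causes GKE to reserve a /25 range per node instead of a /24, halving the Pod address space that each node consumes. A sketch with placeholder names:

gcloud container clusters create my-cluster \
    --enable-ip-alias \
    --default-max-pods-per-node 64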

For Autopilot clusters, the maximum number of Pods per node is set to 32, reserving a /26 range for every node. This setting is non-configurable in Autopilot clusters.

Avoid overlaps with IP addresses used in other environments

You can connect your VPC network to an on-premises environment or other cloud service providers through Cloud VPN or Cloud Interconnect. These environments can share routes, making the on-premises IP address management scheme important in IP address planning for GKE. We recommend making sure that the IP addresses don't overlap with the IP addresses used in other environments.

Create a load balancer subnet

Create a separate load balancer subnet to expose services through internal TCP/UDP load balancing. If you don't use a separate load balancer subnet, these services are exposed by using IP addresses from the node subnet, which can exhaust the allocated space in that subnet earlier than expected and can prevent you from scaling your GKE cluster to the expected number of nodes.

Using a separate load balancer subnet also means that you can filter traffic to and from the GKE nodes separately from traffic to services that are exposed through internal TCP/UDP load balancing, which lets you set stricter security boundaries.
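As a sketch, after you create the dedicated subnet, a Service can reference it with the networking.gke.io/internal-load-balancer-subnet annotation (available in recent GKE versions); the subnet name, selector, and ports below are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    networking.gke.io/internal-load-balancer-subnet: "lb-subnet"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080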

Reserve enough IP address space for cluster autoscaler

You can use the cluster autoscaler to dynamically add and remove nodes in the cluster so that you can control costs and improve utilization. However, when you use the cluster autoscaler, make sure that your IP address planning accounts for the maximum size of all node pools. Each new node requires its own node IP address as well as its own allocatable set of Pod IP addresses based on the configured Pods per node. The number of Pods per node can be configured per node pool, overriding the cluster-level setting, but you cannot change it after you create the cluster or node pool. Consider your workload types and assign them to distinct node pools for optimal IP address allocation.
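For example, the following sketch (with placeholder names) creates a node pool with its own Pods-per-node limit and autoscaling bounds, which together determine the IP address space that the pool can consume:

gcloud container node-pools create batch-pool \
    --cluster my-cluster \
    --region us-central1 \
    --max-pods-per-node 32 \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 10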

Network security options

This section outlines a few key recommendations for cluster isolation. Network security for GKE clusters is a shared responsibility between Google and your cluster administrators.

Choose a private cluster type

Two network isolation types exist for clusters: public and private. Public clusters have both private and public IP addresses on nodes and only a public endpoint for control plane nodes. Private clusters provide more isolation by only having private IP addresses on nodes, and by having private and public endpoints for control plane nodes (which can be further isolated, as discussed in the Minimize the cluster control plane exposure section). In private clusters, you can still access Google APIs with Private Google Access. We recommend choosing private clusters for network isolation.

In a private cluster, Pods are isolated from inbound and outbound communication (the cluster perimeter). You can control these directional flows by exposing services by using load balancing and Cloud NAT, discussed in the cluster connectivity section in this document. The following diagram shows this kind of setup:

Diagram 1: Private cluster communication

This diagram shows how a private cluster can communicate. On-premises clients can connect to the cluster with the kubectl client. Access to Google Services is provided through Private Google Access, and communication to the internet is available only by using Cloud NAT.

For more information, review the requirements, restrictions, and limitations of private clusters.

Minimize the cluster control plane exposure

In a private cluster, the GKE API server can be exposed as a public or a private endpoint. You decide which endpoint to use when you create the cluster. You can control access with authorized networks; with both the public and private endpoints, communication from Pod and node IP addresses within the cluster is allowed by default. To enable a private endpoint when you create a cluster, use the --enable-private-endpoint flag.

Authorize access to the control plane

Authorized networks help dictate which IP address subnets are able to access the GKE control plane nodes. After enabling these networks, you can restrict access to specific source IP address ranges. If the public endpoint is disabled, these source IP address ranges should be private. If a public endpoint is enabled, you can allow public or private IP address ranges. Configure custom route advertisements to allow the private endpoint of the cluster control plane to be reachable from an on-premises environment. You can make the private GKE API endpoint globally reachable by using the --enable-master-global-access option when you create a cluster.
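The following sketch shows these options together when creating a private cluster; the authorized network range and the control plane range are placeholder values:

gcloud container clusters create my-private-cluster \
    --enable-ip-alias \
    --enable-private-nodes \
    --enable-private-endpoint \
    --master-ipv4-cidr 172.16.0.0/28 \
    --enable-master-authorized-networks \
    --master-authorized-networks 10.100.0.0/24 \
    --enable-master-global-access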

The following diagram shows typical control plane connectivity using authorized networks:

Diagram 2: Control plane connectivity using authorized networks

This diagram shows trusted users being able to communicate with the GKE control plane through the public endpoint as they are part of authorized networks, while access from untrusted actors is blocked. Communication to and from the GKE cluster happens through the private endpoint of the control plane.

Allow control plane connectivity

Certain system Pods on every worker node need to reach services such as the Kubernetes API server (kube-apiserver), Google APIs, or the metadata server. The kube-apiserver also needs to communicate with some system Pods, such as event-exporter. This communication is allowed by default. If you deploy VPC firewall rules within the projects (more details in the Restrict cluster traffic section), ensure that those Pods can keep communicating with the kube-apiserver as well as with Google APIs.

Deploy proxies for control plane access from peered networks

Access to the control plane for private GKE clusters is through VPC Network Peering. VPC Network Peering is non-transitive; therefore, you cannot access the cluster's control plane from another peered network.

If you want direct access from another peered network or from on-premises when using a hub-and-spoke architecture, deploy proxies for control plane traffic. For more information, see Creating Kubernetes private clusters with network proxies for control plane access.

Restrict cluster traffic using network policies

Multiple levels of network security are possible for cluster workloads and can be combined: VPC firewall rules, hierarchical firewall policies, and GKE network policies. VPC firewall rules and hierarchical firewall policies apply at the virtual machine (VM) level, that is, the worker nodes on which the Pods of the GKE cluster reside. GKE network policies apply at the Pod level to enforce Pod traffic paths. For more information, see the Anthos security blueprint: restricting traffic.

If you implement VPC firewall rules, you can break required default control plane communication, for example, kubelet communication with the control plane. GKE creates the required firewall rules by default, but they can be overridden. Some deployments might require the control plane to reach the cluster on a service port; in that case, you can use VPC firewall rules to configure an ingress policy that makes the service accessible.

GKE network policies are configured through the Kubernetes Network Policy API to enforce a cluster's Pod and service communication. You can enable network policies when you create a cluster by using the gcloud container clusters create option --enable-network-policy. To restrict traffic using network policies, you can follow the Anthos restricting traffic blueprint implementation guide.
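As a sketch, the following NetworkPolicy allows ingress to Pods labeled app: backend only from Pods labeled app: frontend on TCP port 8080; the namespace, labels, and port are hypothetical:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080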

GKE Dataplane V2 is based on eBPF and provides an integrated network security and visibility experience. When you create a cluster using GKE Dataplane V2, you don't need to explicitly enable network policies because GKE Dataplane V2 manages service routing, network policy enforcement and logging. Enable the new dataplane with the gcloud --enable-dataplane-v2 option when creating a cluster. After network policies are configured, a default NetworkLogging CRD object can be configured to log allowed and denied network connections.
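As a sketch, a NetworkLogging object that logs both allowed and denied connections might look like the following; this assumes the networking.gke.io/v1alpha1 API version used by GKE Dataplane V2 network policy logging:

kind: NetworkLogging
apiVersion: networking.gke.io/v1alpha1
metadata:
  name: default
spec:
  cluster:
    allow:
      log: true
      delegate: false
    deny:
      log: true
      delegate: false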

Enable Google Cloud Armor security policies for Ingress

Using Google Cloud Armor security policies, you can protect applications that are using External HTTP(S) Load Balancing from DDoS attacks and other web-based attacks by blocking such traffic at the network edge. In GKE, enable Google Cloud Armor security policies for applications by using Ingress for External HTTP(S) Load Balancing and adding a security policy to the BackendConfig attached to the Ingress object.
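As a sketch, a BackendConfig that references an existing Google Cloud Armor security policy might look like the following; the policy name is a placeholder, and the Service backing the Ingress must reference this BackendConfig through the cloud.google.com/backend-config annotation:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: armor-backendconfig
spec:
  securityPolicy:
    name: my-cloud-armor-policy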

Use Identity-Aware Proxy to provide authentication for applications with IAM users

If you want to deploy services that should be accessed only by users within the organization, without requiring those users to be on the corporate network, you can use Identity-Aware Proxy to create an authentication layer for these applications. To enable Identity-Aware Proxy for GKE, follow the configuration steps to add Identity-Aware Proxy as part of the BackendConfig for your service's Ingress. Identity-Aware Proxy can be combined with Google Cloud Armor.
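A minimal sketch, assuming you have already created an OAuth client and stored its credentials in a Kubernetes Secret (the Secret name is a placeholder):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: iap-backendconfig
spec:
  iap:
    enabled: true
    oauthclientCredentials:
      secretName: my-iap-oauth-secret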

Use organization policy constraints to further enhance security

Using organization policy constraints, you can set policies to further enhance your security posture. For example, you can use constraints to restrict load balancer creation to certain types, such as internal load balancers only, or to restrict external IP address usage.

Scaling cluster connectivity

This section covers scalable options for DNS and outbound connectivity from your clusters towards the internet and Google services.

Enable NodeLocal DNSCache

GKE uses kube-dns, a default cluster add-on, to provide the cluster's local DNS service. kube-dns is replicated across the cluster as a function of the total number of cores and nodes in the cluster.

You can improve DNS performance with NodeLocal DNSCache. NodeLocal DNSCache is an add-on that is deployed as a DaemonSet and doesn't require any Pod configuration changes. DNS lookups from Pods to the local DNS cache don't create open connections that need to be tracked on the node, which allows for greater scale. External hostname lookups are forwarded to Cloud DNS, whereas all other DNS queries go to kube-dns.

Enable NodeLocal DNSCache for more consistent DNS query lookup times and improved cluster scale. For Autopilot clusters, NodeLocal DNSCache is enabled by default and cannot be overridden.

The following gcloud command-line tool option enables NodeLocal DNSCache when you create a cluster: --addons NodeLocalDNS.

If you have control over the names that applications need to resolve, there are ways to improve DNS scaling. For example, use a fully qualified domain name (end the hostname with a period) or disable search path expansion through the Pod.dnsConfig manifest option.
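For example, the following Pod spec sketch lowers the ndots resolver option so that external hostnames are looked up without search path expansion; the Pod name and image are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: dns-tuned-pod
spec:
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/my-app:latest
  dnsConfig:
    options:
    - name: ndots
      value: "1"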

Use Cloud NAT for internet access from private clusters

By default, private clusters don't have internet access. To allow Pods to reach the internet, enable Cloud NAT for each region. At a minimum, enable Cloud NAT for the primary and secondary ranges in the GKE subnet. Make sure that you allocate enough NAT IP addresses and ports per VM for Cloud NAT.

Avoid double SNAT for Pod traffic (SNAT first at the GKE node and then again with Cloud NAT). Unless you require SNAT to hide the Pod IP addresses from on-premises networks connected over Cloud VPN or Cloud Interconnect, use the --disable-default-snat flag and offload SNAT tracking to Cloud NAT for scalability. This solution works for all primary and secondary subnet IP ranges. Use network policies to restrict external traffic after enabling Cloud NAT. Cloud NAT is not required to access Google services.
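As a sketch, the following commands create a Cloud Router and a Cloud NAT gateway that covers all subnet IP ranges in a region; the names and region are placeholders:

gcloud compute routers create nat-router \
    --network my-vpc \
    --region us-central1

gcloud compute routers nats create nat-config \
    --router nat-router \
    --region us-central1 \
    --nat-all-subnet-ip-ranges \
    --auto-allocate-nat-external-ips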

Use Private Google Access for access to Google services

In private clusters, Pods do not have public IP addresses to reach out to public services, including Google APIs and services. Private Google Access lets private Google Cloud resources reach Google services. Typical Google services supported by Private Google Access include BigQuery, Binary Authorization, Artifact Registry, and more.

This option is off by default and needs to be enabled on the subnet associated with the cluster. You can enable it when you create the subnet or by updating an existing subnet.

The --enable-private-ip-google-access gcloud command-line tool option enables Private Google Access when you create the subnet.
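For example, to enable Private Google Access on an existing subnet (names are placeholders):

gcloud compute networks subnets update gke-subnet-1 \
    --region us-central1 \
    --enable-private-ip-google-access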

Serving applications

When creating applications that are reachable externally or internally to your organization, make sure that you use the right load balancer type and options. This section gives some recommendations on exposing and scaling applications with Cloud Load Balancing.

Use container-native load balancing

Use container-native load balancing when exposing services over HTTP(S) externally. Container-native load balancing allows for fewer network hops, lower latency, and more exact traffic distribution. It also increases visibility into round-trip time and lets you use load-balancing features such as Cloud CDN and Google Cloud Armor.
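On recent VPC-native clusters, container-native load balancing is typically used by default for Ingress; you can also request it explicitly with the cloud.google.com/neg annotation on the Service, as in this sketch (the selector and ports are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: neg-service
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080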

Choose the correct GKE resource to expose your application

Depending on the scope of your clients (internal, external, or even cluster-internal), the regionality of your application, and the protocols that you use, there are different GKE resources that you can choose to use to expose your application. The Service networking overview explains these options and can help you choose the best resource to expose each part of your application by using Google Cloud load balancing options.

Create health checks based on BackendConfig

If you use an Ingress to expose services, define a health check in a BackendConfig CRD to use the health check functionality of HTTP(S) Load Balancing. You can direct the health check to the appropriate endpoint and set your own thresholds. Without a BackendConfig CRD, health checks are inferred from readiness probe parameters or use default parameters.
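A sketch of a BackendConfig with an explicit health check; the path, port, and thresholds are illustrative values:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz
    port: 8080
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2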

Use local traffic policy to preserve original IP addresses

When you use an internal TCP/UDP load balancer with GKE, set the externalTrafficPolicy option to Local to preserve the source IP address of requests. Use this option if your application requires the original source IP address. However, externalTrafficPolicy: Local can lead to less optimal load spreading, so only use this feature when required. For HTTP(S) services, you can use Ingress controllers and get the original IP address by reading the X-Forwarded-For header in the HTTP request.
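For example, an internal LoadBalancer Service that preserves client source IP addresses might look like this sketch (the selector and ports are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-internal-service
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080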

Operations and administration

The following sections contain operational best practices that help you ensure granular authorization for your workloads and avoid creating firewall rules manually. They also include recommendations for distributing your workloads and for monitoring and logging in GKE.

Use Workload Identity to authenticate your workloads towards Google APIs

Use Workload Identity to access other Google Cloud services from within your GKE cluster. Workload Identity helps you maintain the principle of least privilege by letting you link Kubernetes service accounts to Google service accounts. Kubernetes service accounts can be assigned to Pods, which gives you granular permissions to access Google APIs per workload. Workload Identity is enabled by default in Autopilot clusters.
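As a sketch, after Workload Identity is enabled on the cluster, you bind a Kubernetes service account to a Google service account and annotate it; the project, namespace, and account names are placeholders:

# Allow the Kubernetes service account to impersonate the Google service account
gcloud iam service-accounts add-iam-policy-binding \
    my-gsa@my-project.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-project.svc.id.goog[my-namespace/my-ksa]"

# Annotate the Kubernetes service account with the Google service account
kubectl annotate serviceaccount my-ksa \
    --namespace my-namespace \
    iam.gke.io/gcp-service-account=my-gsa@my-project.iam.gserviceaccount.com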

Use IAM for GKE permissions to control policies in Shared VPC networks

When using Shared VPC networks, firewall rules for load balancers are automatically created in the host project. To avoid having to manually create firewall rules, grant the Compute Security Admin role to the GKE service account in the host project named service-HOST_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com. Replace HOST_PROJECT_NUMBER with the project number of the host project for the Shared VPC.
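A sketch of that IAM binding, with placeholder project values:

gcloud projects add-iam-policy-binding HOST_PROJECT_ID \
    --member "serviceAccount:service-HOST_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
    --role roles/compute.securityAdmin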

In addition, firewall rules created by GKE always have the default priority of 1000, so you can disallow specific traffic from flowing by creating firewall rules at a higher priority.

If you want to restrict creation of certain load balancer types, use organizational policies to restrict load balancer creation.

Use regional clusters and distribute your workloads for high availability

Regional clusters can increase the availability of applications in a cluster because the cluster control plane and nodes are spread across multiple zones.

However, to have the best possible user experience in case of a zone failure, use the cluster autoscaler to make sure that your cluster can handle the required load at any time. Also use Pod anti-affinity to ensure that Pods of a given service are scheduled in multiple zones. For more information about how to configure these settings for high availability and cost optimization, see Best practices for highly available GKE clusters.
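For example, the following Deployment sketch uses preferred Pod anti-affinity on the zone topology key to spread replicas across zones; the names, labels, and image are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: topology.kubernetes.io/zone
      containers:
      - name: app
        image: us-docker.pkg.dev/my-project/my-repo/my-app:latest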

Use Cloud Operations for GKE and network policy logging

While each organization has different requirements for visibility and auditing, we recommend enabling network policy logging (beta). This feature is only available with GKE Dataplane v2 (also beta). Network policy logging provides visibility into policy enforcement and Pod traffic patterns. Be aware that there are costs involved for network policy logging.

For GKE clusters using version 1.14 or later, Cloud Operations for GKE enables both monitoring and logging by default and provides a dashboard for your GKE clusters. It also enables GKE annotations for VPC Flow Logs. By default, Cloud Operations for GKE collects logs for all workloads deployed to the cluster, but a system-only logs option also exists. Use the Cloud Operations for GKE dashboard to observe and set alerts. Be aware that there are costs involved for Google Cloud's operations suite. For clusters created in Autopilot mode, Cloud Operations for GKE is automatically enabled and not configurable.

Checklist summary

The following table summarizes the best practices for configuring networking options for GKE clusters.

VPC design
  • Use VPC-native clusters
  • Use Shared VPC networks

IP address management strategies
  • Plan the required IP address allotment
  • Use non-RFC 1918 space if needed
  • Use custom subnet mode
  • Plan Pod density per node
  • Avoid overlaps with IP addresses used in other environments
  • Create a load balancer subnet
  • Reserve enough IP address space for cluster autoscaler

Security
  • Choose a private cluster type
  • Minimize the cluster control plane exposure
  • Authorize access to the control plane
  • Allow control plane connectivity
  • Deploy proxies for control plane access from peered networks
  • Restrict cluster traffic using network policies
  • Enable Google Cloud Armor security policies for Ingress
  • Use Identity-Aware Proxy to provide authentication for applications with IAM users
  • Use organization policy constraints to further enhance security

Scaling
  • Enable NodeLocal DNSCache
  • Use Cloud NAT for internet access from private clusters
  • Use Private Google Access for access to Google services

Serving applications
  • Use container-native load balancing
  • Choose the correct GKE resource to expose your application
  • Create health checks based on BackendConfig
  • Use local traffic policy to preserve original IP addresses

Operations and administration
  • Use Workload Identity to authenticate your workloads towards Google APIs
  • Use IAM for GKE permissions to control policies in Shared VPC networks
  • Use regional clusters and distribute your workloads for high availability
  • Use Cloud Operations for GKE and network policy logging

What's next