Best practices for GKE networking


This document outlines the best practices for configuring networking options for Google Kubernetes Engine (GKE) clusters. It is intended to be an architecture planning guide for cloud architects and network engineers with cluster configuration recommendations that are applicable to most GKE clusters. Before you create your GKE clusters, we recommend that you review all the sections in this document to understand the networking options that GKE supports and their implications.

The networking options that you choose impact the architecture of your GKE clusters. Some of these options cannot be changed once configured without recreating the cluster.

This document is not intended to introduce Kubernetes networking concepts or terminology, and it assumes that you already have some knowledge of general networking concepts and Kubernetes networking. For more information, see the GKE network overview.

While reviewing this document, consider the following:

  • How you plan to expose workloads internally to your Virtual Private Cloud (VPC) network, other workloads in the cluster, other GKE clusters, or externally to the internet.
  • How you plan to scale your workloads.
  • What types of Google services you want to consume.

For a summarized checklist of all the best practices, see the Checklist summary.

VPC design

When designing your VPC networks, follow best practices for VPC design.

The following section provides some GKE-specific recommendations for VPC network design.

Use VPC-native clusters

We recommend that you use VPC-native clusters. VPC-native clusters use alias IP address ranges on GKE nodes, are required for private GKE clusters and for creating clusters on Shared VPC networks, and have many other benefits. For clusters created in Autopilot mode, VPC-native mode is always on and cannot be turned off.

VPC-native clusters scale more easily than routes-based clusters because they don't consume Google Cloud routes, so they are less susceptible to hitting routing limits.

The advantages of using VPC-native clusters go hand-in-hand with alias IP support. For example, network endpoint groups (NEGs) can only be used with secondary IP addresses, so they are only supported on VPC-native clusters.
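
As a minimal sketch (the cluster, network, subnet, and secondary range names are placeholders), you can create a VPC-native cluster by referencing existing secondary ranges:

gcloud container clusters create example-vpc-native-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --network my-net-1 \
    --subnetwork my-subnet \
    --cluster-secondary-range-name pods \
    --services-secondary-range-name services

If you omit the secondary range flags, GKE can create the ranges for you, but referencing pre-created ranges keeps IP address planning explicit.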

Use Shared VPC networks

GKE clusters require careful IP address planning. Most organizations tend to have a centralized management structure with a network administration team that allocates IP address space for clusters and a platform administration team that operates the clusters. This type of organization structure works well with Google Cloud's Shared VPC network architecture. In the Shared VPC network architecture, a network administrator creates subnets in the host project and shares them with the service projects. You can create GKE clusters in a service project by using the subnets shared from the Shared VPC network in the host project. The IP address components stay in the host project, and your other cluster components live in the service project.

In general, a Shared VPC network is a frequently used architecture that is suitable for most organizations with a centralized management team. We recommend using Shared VPC networks to create the subnets for your GKE clusters and to avoid IP address conflicts across your organization. You might also want to use Shared VPCs for governance of operational functions. For example, you can have a network team that works only on network components and reliability, and another team that works on GKE.
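
For illustration, the following sketch creates a cluster in a service project against a subnet shared from the host project; the project IDs, network, subnet, and range names are placeholders, and the usual Shared VPC IAM bindings are assumed to be in place:

gcloud container clusters create example-service-cluster \
    --project SERVICE_PROJECT_ID \
    --region us-central1 \
    --enable-ip-alias \
    --network projects/HOST_PROJECT_ID/global/networks/shared-net \
    --subnetwork projects/HOST_PROJECT_ID/regions/us-central1/subnetworks/tier-1 \
    --cluster-secondary-range-name tier-1-pods \
    --services-secondary-range-name tier-1-services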

IP address management strategies

All Kubernetes clusters, including GKE clusters, require a unique IP address for every Pod.

To learn more, see the GKE networking model.

In GKE, all these IP addresses are routable throughout the VPC network. Therefore, IP address planning is necessary because addresses cannot overlap with internal IP address space used on-premises or in other connected environments. The following sections suggest strategies for IP address management with GKE.

Plan the required IP address allotment

Private clusters are recommended and are discussed further in the Network security section. Private clusters must be VPC-native clusters and require the following IP address ranges:

  • Control plane IP address range: use a /28 subnet within the private IP address ranges defined in RFC 1918. You must ensure that this subnet doesn't overlap with any other classless inter-domain routing (CIDR) range in the VPC network.
  • Node subnet: the subnet with the primary IP address range that you want to allocate for all the nodes in your cluster. Services with the type LoadBalancer that use the cloud.google.com/load-balancer-type: "Internal" annotation also use this subnet by default. You can also use a dedicated subnet for internal load balancers.
  • Pod IP address range: the IP address range that you allocate for all Pods in your cluster. GKE provisions this range as an alias of the subnet. For more information, see IP address ranges for VPC-native clusters.
  • Service IP address range: the IP address range that you allocate for all Services in your cluster. GKE provisions this range as an alias of the subnet.

For private clusters, you must define a node subnet, a Pod IP address range, and a Service IP address range.
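
As a sketch of how these ranges fit together (all names and CIDRs are example values), a subnet with Pod and Service secondary ranges is created first, and the private cluster then references them along with a /28 control plane range:

gcloud compute networks subnets create gke-subnet \
    --network my-net-1 \
    --region us-central1 \
    --range 10.10.0.0/22 \
    --secondary-range pods=10.64.0.0/14,services=10.68.0.0/20

gcloud container clusters create example-private-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --enable-private-nodes \
    --master-ipv4-cidr 172.16.0.0/28 \
    --subnetwork gke-subnet \
    --cluster-secondary-range-name pods \
    --services-secondary-range-name services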

If you want to use IP address space more efficiently, see Reduce internal IP address usage in GKE.

The control plane IP address range is dedicated to the GKE-managed control plane that resides in a Google-managed tenant project peered with your VPC. This IP address range shouldn't overlap with any IP addresses in your VPC peering group because GKE imports this route into your project. This means that if you have any routes to the same CIDR in your project, you might experience routing issues.

The subnet's primary range provides IP addresses for the cluster's nodes and must exist before you create the cluster. The subnet should accommodate the maximum number of nodes that you expect in the cluster, plus the internal load balancer IP addresses across the cluster that use the subnet.

You can use the cluster autoscaler to limit the maximum number of nodes.

The Pod and service IP address ranges are represented as distinct secondary ranges of your subnet, implemented as alias IP addresses in VPC-native clusters.

Choose wide enough IP address ranges so that you can accommodate all nodes, Pods, and Services for the cluster.

The sizes of these ranges limit the maximum number of nodes, Pods, and Services that you can create in the cluster.

Use more than private RFC 1918 IP addresses

For some environments, RFC 1918 space in large contiguous CIDR blocks might already be allocated in an organization. You can use non-RFC 1918 space for additional CIDR ranges for GKE clusters, as long as the ranges don't overlap with Google-owned public IP addresses. We recommend using the 100.64.0.0/10 range of the RFC 6598 address space, because Class E address space can present interoperability issues with on-premises hardware. You can also use privately reused public IP addresses (PUPI).

Use privately reused public IP addresses with caution, and control route advertisements to on-premises networks and to the internet when you choose this option.

You shouldn't use source network address translation (SNAT) in a cluster for Pod-to-Pod and Pod-to-Service traffic, because it breaks the Kubernetes networking model.

Kubernetes assumes that all non-RFC 1918 IP addresses are privately reused public IP addresses and uses SNAT for all traffic originating from these addresses.

If you are using non-RFC 1918 IP addresses for your GKE Standard cluster, you need to either explicitly disable SNAT or configure the IP masquerade agent to exclude your cluster's Pod IP addresses and the secondary IP address ranges for Services from SNAT. Autopilot clusters don't require any extra steps.
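
For Standard clusters, a minimal ip-masq-agent configuration is sketched below; the CIDRs shown are placeholders for your own Pod and Service secondary ranges and must be adapted before use:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 100.64.0.0/14   # example Pod secondary range
      - 100.68.0.0/20   # example Service secondary range
    resyncInterval: 60s

Traffic to the listed CIDRs leaves the node with the Pod IP address preserved instead of being masqueraded to the node IP address.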

Use custom subnet mode

When you set up the network, you also select the subnet mode: auto (default) or custom (recommended). The auto mode leaves the subnet allocation up to Google and is a good option to get started without IP address planning. However, we recommend selecting the custom mode because this mode lets you choose IP address ranges that won't overlap other ranges in your environment. If you are using a Shared VPC, either an organizational administrator or network administrator can select this mode.

The following example creates a network called my-net-1 with custom subnet mode:

gcloud compute networks create my-net-1 --subnet-mode custom

Plan Pod density per node

By default, Standard clusters reserve a /24 range for every node out of the Pod address space in the subnet, which allows for up to 110 Pods per node. However, you can configure a Standard cluster to support up to 256 Pods per node, with a /23 range reserved for every node. Depending on the size of your nodes and the application profile of your Pods, you might run considerably fewer Pods on each node.

If you don't expect to run more than 64 Pods per node, we recommend that you adjust the maximum Pods per node to preserve IP address space in your Pod subnet.

If you expect to run more than the default 110 Pods per node, you can increase the maximum Pods per node up to 256, with /23 reserved for every node. With this type of high Pod density configuration, we recommend using instances with 16 or more CPU cores to ensure the scalability and performance of your cluster.
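
As a sketch (cluster, node pool, and value choices are examples), the maximum Pods per node can be set at the cluster level and overridden for individual node pools:

gcloud container clusters create example-efficient-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --default-max-pods-per-node 64

gcloud container node-pools create small-pod-pool \
    --cluster example-efficient-cluster \
    --region us-central1 \
    --max-pods-per-node 32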

For Autopilot clusters, the maximum number of Pods per node is set to 32, reserving a /26 range for every node. This setting is non-configurable in Autopilot clusters.

Avoid overlaps with IP addresses used in other environments

You can connect your VPC network to an on-premises environment or other cloud service providers through Cloud VPN or Cloud Interconnect. These environments can share routes, making the on-premises IP address management scheme important in IP address planning for GKE. We recommend making sure that the IP addresses don't overlap with the IP addresses used in other environments.

Create a load balancer subnet

Create a separate load balancer subnet to expose Services through internal TCP/UDP load balancing. If you don't use a separate load balancer subnet, these Services are exposed by using IP addresses from the node subnet, which can consume all of the allocated space in that subnet earlier than expected and can prevent you from scaling your GKE cluster to the expected number of nodes.

Using a separate load balancer subnet also means that you can filter traffic to and from the GKE nodes separately from traffic to the Services exposed by internal TCP/UDP load balancing, which lets you set stricter security boundaries.
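
A minimal sketch of a dedicated load balancer subnet follows; the subnet name, region, and range are placeholders. Internal LoadBalancer Services can then reference the subnet, for example with the networking.gke.io/internal-load-balancer-subnet annotation (check that your GKE version supports it):

gcloud compute networks subnets create ilb-subnet \
    --network my-net-1 \
    --region us-central1 \
    --range 10.10.30.0/26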

Reserve enough IP address space for cluster autoscaler

You can use the cluster autoscaler to dynamically add and remove nodes in the cluster so that you can control costs and improve utilization. However, when you are using the cluster autoscaler, make sure that your IP address planning accounts for the maximum size of all node pools. Each new node requires its own node IP address as well as its own allocatable set of Pod IP addresses based on the configured Pods per node. The number of Pods per node can be configured differently from the cluster-level setting, but it cannot be changed after you create the cluster or node pool. Consider your workload types and assign them to distinct node pools for optimal IP address allocation.

Consider using node auto-provisioning, with the cluster autoscaler, particularly if you're using VPC-native clusters. For more information, see Node limiting ranges.
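
The arithmetic matters when you size ranges: with 64 Pods per node, GKE reserves a /25 (128 addresses) per node, so a node pool that can scale to 40 nodes needs 40 node IP addresses and roughly 40 x 128 Pod addresses from the Pod secondary range. A sketch follows; the names and limits are examples, and with a regional cluster the --max-nodes value applies per zone:

gcloud container node-pools create example-scalable-pool \
    --cluster example-private-cluster \
    --region us-central1 \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 40 \
    --max-pods-per-node 64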

Share IP addresses across clusters

You might need to share IP addresses across clusters if you have a centralized team that manages the infrastructure for clusters. To share IP addresses across GKE clusters, see Sharing IP address ranges across GKE clusters. You can reduce IP address exhaustion by creating ranges for Pods, Services, and nodes and reusing or sharing them across clusters, especially in a Shared VPC model. This setup can also make it easier for network administrators to manage IP addresses because they don't need to create specific subnets for each cluster.

Consider the following:

  • As a best practice, use separate subnets and IP address ranges for all clusters.
  • You can share the secondary Pod IP address range, but it is not recommended because one cluster might use all of the IP addresses.
  • You can share secondary Service IP address ranges, but this feature does not work with VPC-scope Cloud DNS for GKE.

If you run out of IP addresses, you can create additional Pod IP address ranges using discontiguous multi-Pod CIDR.

Share IP addresses for internal LoadBalancer Services

You can share a single IP address with up to 50 backends using different ports. This lets you reduce the number of IP addresses you need for internal LoadBalancer Services.

For more information, see Shared IP.
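
As a sketch (the address, subnet, and Service names, the address value, labels, and ports are placeholders), reserve an internal address with the SHARED_LOADBALANCER_VIP purpose and reference it from each internal LoadBalancer Service; another Service can reuse the same address as long as its ports don't overlap:

gcloud compute addresses create shared-ilb-ip \
    --region us-central1 \
    --subnet gke-subnet \
    --addresses 10.10.0.10 \
    --purpose SHARED_LOADBALANCER_VIP

apiVersion: v1
kind: Service
metadata:
  name: example-tcp-service
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerIP: 10.10.0.10   # the reserved shared address (example value)
  selector:
    app: backend-one           # example label
  ports:
  - port: 80
    targetPort: 8080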

Network security options

This section outlines a few key recommendations for cluster isolation. Network security for GKE clusters is a shared responsibility between Google and your cluster administrators.

Use GKE Dataplane V2

GKE Dataplane V2 is based on eBPF and provides an integrated network security and visibility experience. When you create a cluster with GKE Dataplane V2, you don't need to explicitly enable network policies because GKE Dataplane V2 manages service routing, network policy enforcement, and logging. Enable the new dataplane with the Google Cloud CLI --enable-dataplane-v2 option when you create a cluster. After network policies are configured, a default NetworkLogging CRD object can be configured to log allowed and denied network connections. We recommend creating clusters with GKE Dataplane V2 to take full advantage of built-in features such as network policy logging.
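
As a sketch, the cluster is created with the flag mentioned above, and network policy logging is then turned on through the cluster-scoped NetworkLogging object named default (the v1alpha1 schema shown here might evolve):

gcloud container clusters create example-dpv2-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --enable-dataplane-v2

apiVersion: networking.gke.io/v1alpha1
kind: NetworkLogging
metadata:
  name: default
spec:
  cluster:
    allow:
      log: true
      delegate: false
    deny:
      log: true
      delegate: false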

Choose a private cluster type

Public clusters have both private and public IP addresses on nodes and only a public endpoint for the control plane. Private clusters provide more isolation by having only internal IP addresses on nodes and by having a private or public endpoint for the control plane (which can be further isolated, as discussed in the Minimize the cluster control plane exposure section). In private clusters, you can still access Google APIs with Private Google Access. We recommend choosing private clusters.

In a private cluster, Pods are isolated from inbound and outbound communication (the cluster perimeter). You can control these directional flows by exposing services by using load balancing and Cloud NAT, discussed in the cluster connectivity section in this document. The following diagram shows this kind of setup:

Diagram 1: Private cluster communication

This diagram shows how a private cluster can communicate. On-premises clients can connect to the cluster with the kubectl client. Access to Google Services is provided through Private Google Access, and communication to the internet is available only by using Cloud NAT.

For more information, review the requirements, restrictions, and limitations of private clusters.

Minimize the cluster control plane exposure

In a private cluster, the GKE API server can be exposed as a public or a private endpoint. You can decide which endpoint to use when you create the cluster. You can control access with authorized networks, where both the public and private endpoints default to allowing all communication between the Pod and the node IP addresses in the cluster. To enable a private endpoint when you create a cluster, use the --enable-private-endpoint flag.

Authorize access to the control plane

Authorized networks can help dictate which IP address subnets are able to access the GKE control plane. After enabling these networks, you can restrict access to specific source IP address ranges. If the public endpoint is disabled, these source IP address ranges should be private. If a public endpoint is enabled, you can allow public or internal IP address ranges. Configure custom route advertisements to allow the private endpoint of the cluster control plane to be reachable from an on-premises environment. You can make the private GKE API endpoint globally reachable by using the --enable-master-global-access option when you create a cluster.
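
A sketch of a locked-down control plane follows: private nodes, a private-only endpoint, an authorized internal range, and global access for the private endpoint. The cluster name, region, control plane CIDR, and the 10.0.0.0/8 authorized range are placeholders:

gcloud container clusters create example-restricted-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --enable-private-nodes \
    --enable-private-endpoint \
    --master-ipv4-cidr 172.16.0.16/28 \
    --enable-master-authorized-networks \
    --master-authorized-networks 10.0.0.0/8 \
    --enable-master-global-access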

The following diagram shows typical control plane connectivity using authorized networks:

Diagram 2: Control plane connectivity using authorized networks

This diagram shows trusted users being able to communicate with the GKE control plane through the public endpoint as they are part of authorized networks, while access from untrusted actors is blocked. Communication to and from the GKE cluster happens through the private endpoint of the control plane.

Allow control plane connectivity

Certain system Pods on every worker node need to reach services such as the Kubernetes API server (kube-apiserver), Google APIs, or the metadata server. The kube-apiserver also needs to communicate with some system Pods, such as event-exporter. This communication is allowed by default. If you deploy VPC firewall rules within the projects (more details in the Restrict cluster traffic section), ensure that those Pods can keep communicating with the kube-apiserver and with Google APIs.

Deploy proxies for control plane access from peered networks

Access to the control plane for private GKE clusters is through VPC Network Peering. VPC Network Peering is non-transitive, therefore you cannot access the cluster's control plane from another peered network.

If you want direct access from another peered network or from on-premises when using a hub-and-spoke architecture, deploy proxies for control plane traffic.

Restrict cluster traffic using network policies

Multiple levels of network security are possible for cluster workloads and can be combined: VPC firewall rules, Hierarchical firewall policies, and Kubernetes network policies. VPC firewall rules and Hierarchical firewall policies apply at the virtual machine (VM) level, that is, the worker nodes on which the Pods of the GKE cluster reside. Kubernetes network policies apply at the Pod level to enforce Pod-to-Pod traffic paths.

Implementing VPC firewall rules can break the default, required control plane communication, for example the kubelet's communication with the control plane. GKE creates the required firewall rules by default, but they can be overwritten. Some deployments might also require the control plane to reach Services in the cluster. You can use VPC firewall rules to configure an ingress rule that makes the Service reachable.

GKE network policies are configured through the Kubernetes Network Policy API to enforce a cluster's Pod communication. You can enable network policies when you create a cluster by using the gcloud container clusters create option --enable-network-policy. To restrict traffic using network policies, you can follow the Anthos restricting traffic blueprint implementation guide.
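
For illustration, the following policy (the namespace, labels, and port are hypothetical) allows ingress to backend Pods only from frontend Pods and implicitly denies all other ingress to those Pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080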

Enable Google Cloud Armor security policies for Ingress

Using Google Cloud Armor security policies, you can protect applications that are using external Application Load Balancers from DDoS attacks and other web-based attacks by blocking such traffic at the network edge. In GKE, enable Google Cloud Armor security policies for applications by using Ingress for external Application Load Balancers and adding a security policy to the BackendConfig attached to the Ingress object.
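
As a sketch (the BackendConfig, policy, and Service names are placeholders, and the Google Cloud Armor policy must already exist), attach the policy through a BackendConfig and reference it from the Service behind the Ingress:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: example-armor-backendconfig
spec:
  securityPolicy:
    name: example-cloud-armor-policy
---
apiVersion: v1
kind: Service
metadata:
  name: example-web-service
  annotations:
    cloud.google.com/backend-config: '{"default": "example-armor-backendconfig"}'
spec:
  type: ClusterIP
  selector:
    app: web          # example label
  ports:
  - port: 80
    targetPort: 8080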

Use Identity-Aware Proxy to provide authentication for applications with IAM users

If you want to deploy services to be accessed only by users within the organization, but without the need of being on the corporate network, you can use Identity-Aware Proxy to create an authentication layer for these applications. To enable Identity-Aware Proxy for GKE, follow the configuration steps to add Identity-Aware Proxy as part of the BackendConfig for your service Ingress. Identity-Aware Proxy can be combined with Google Cloud Armor.
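
A minimal sketch, assuming you have already created an OAuth client and stored its credentials in a Secret (the names here are placeholders):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: example-iap-backendconfig
spec:
  iap:
    enabled: true
    oauthclientCredentials:
      secretName: example-iap-oauth-secret   # Secret with client_id and client_secret keys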

Use organization policy constraints to further enhance security

Using organizational policy constraints, you can set policies to further enhance your security posture. For example, you can use constraints to restrict Load Balancer creation to certain types, such as internal load balancers only.
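
For example, the following sketch restricts load balancer creation in an organization to internal types; the organization ID is a placeholder, and you should confirm the allowed value names against the current constraint documentation:

gcloud resource-manager org-policies allow \
    compute.restrictLoadBalancerCreationForTypes \
    INTERNAL_TCP_UDP INTERNAL_HTTP_HTTPS \
    --organization ORGANIZATION_ID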

Scaling cluster connectivity

This section covers scalable options for DNS and outbound connectivity from your clusters towards the internet and Google services.

Use Cloud DNS for GKE

You can use Cloud DNS for GKE to provide Pod and Service DNS resolution with managed DNS without a cluster-hosted DNS provider. Cloud DNS removes the overhead of managing a cluster-hosted DNS server and requires no scaling, monitoring, or managing of DNS instances because it is a hosted Google service.
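
As a sketch, Cloud DNS for GKE with cluster scope can be enabled at cluster creation (the cluster name and region are placeholders):

gcloud container clusters create example-dns-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --cluster-dns clouddns \
    --cluster-dns-scope cluster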

Enable NodeLocal DNSCache

GKE uses kube-dns to provide the cluster's local DNS service as a default cluster add-on. kube-dns is replicated across the cluster as a function of the total number of cores and nodes in the cluster.

You can improve DNS performance with NodeLocal DNSCache. NodeLocal DNSCache is an add-on that is deployed as a DaemonSet and doesn't require any Pod configuration changes. DNS lookups from Pods to the DNS cache on the same node don't create open connections that need to be tracked on the node, which allows for greater scale. External hostname lookups are forwarded to Cloud DNS, whereas all other DNS queries go to kube-dns.

Enable NodeLocal DNSCache for more consistent DNS query lookup times and improved cluster scale. For Autopilot clusters, NodeLocal DNSCache is enabled by default and cannot be overridden.

The following Google Cloud CLI option enables NodeLocal DNSCache when you create a cluster: --addons NodeLocalDNS.

If you have control over the name that applications are looking to resolve, there are ways to improve DNS scaling. For example, use an FQDN (end the hostname with a period) or disable search path expansion through the Pod.dnsConfig manifest option.
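
For illustration, the Pod below (the name and image are hypothetical) lowers ndots so that most lookups are treated as absolute names, which limits search path expansion; using FQDNs with a trailing period avoids expansion entirely:

apiVersion: v1
kind: Pod
metadata:
  name: example-dns-tuned-pod
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
  dnsConfig:
    options:
    - name: ndots
      value: "1"   # names containing a dot are resolved as absolute names first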

Use Cloud NAT for internet access from private clusters

By default, private clusters don't have internet access. To allow Pods to reach the internet, enable Cloud NAT for each region. At a minimum, enable Cloud NAT for the primary and secondary ranges in the GKE subnet. Make sure that you allocate enough IP addresses for Cloud NAT and enough ports per VM.

Apply the following Cloud NAT gateway configuration best practices when you use Cloud NAT for private clusters (a configuration sketch follows these recommendations):

  • When you create your Cloud NAT gateway, enable it only for the subnet ranges used by your clusters. By counting all the nodes in all the clusters, you can determine how many NAT consumer VMs you have in the project.
  • Use dynamic port allocation to allocate different numbers of ports per VM, based on the VM's usage. Start with minimum ports of 64 and maximum ports of 2048.

  • If you need to manage many simultaneous connections to the same destination 3-tuple, lower the TCP TIME_WAIT timeout from its default value of 120s to 5s. For more information, see Specify different timeouts for NAT.

  • Enable Cloud NAT error logging to check related logs.

  • Check the Cloud NAT gateway logs after configuring the gateway. To reduce connections that are dropped because of port allocation, you might need to increase the maximum number of ports per VM.

Avoid double SNAT for Pod traffic (SNAT first at the GKE node and then again with Cloud NAT). Unless you require SNAT to hide the Pod IP addresses from on-premises networks connected over Cloud VPN or Cloud Interconnect, use the --disable-default-snat flag and offload SNAT tracking to Cloud NAT for scalability. This solution works for all primary and secondary subnet IP ranges. Use network policies to restrict external traffic after enabling Cloud NAT. Cloud NAT is not required to access Google services.
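
The following sketch ties these recommendations together; the router, gateway, cluster, subnet, and range names are placeholders, and the port and timeout values are the starting points suggested above:

gcloud compute routers create example-nat-router \
    --network my-net-1 \
    --region us-central1

gcloud compute routers nats create example-gke-nat \
    --router example-nat-router \
    --region us-central1 \
    --nat-custom-subnet-ip-ranges gke-subnet,gke-subnet:pods \
    --auto-allocate-nat-external-ips \
    --enable-dynamic-port-allocation \
    --min-ports-per-vm 64 \
    --max-ports-per-vm 2048 \
    --tcp-time-wait-timeout 5s \
    --enable-logging \
    --log-filter ERRORS_ONLY

gcloud container clusters create example-private-nat-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --enable-private-nodes \
    --master-ipv4-cidr 172.16.0.32/28 \
    --subnetwork gke-subnet \
    --disable-default-snat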

Use Private Google Access for access to Google services

In private clusters, Pods don't have public IP addresses to reach out to public services, including Google APIs and services. Private Google Access lets private Google Cloud resources reach Google services.

This option is off by default and needs to be enabled on the subnet associated with the cluster, either when you create the subnet or by updating the subnet later.

The --enable-private-ip-google-access Google Cloud CLI option enables Private Google Access when you create the subnet.
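
If the subnet already exists, the option can be enabled afterwards, as in this sketch (the subnet name and region are placeholders):

gcloud compute networks subnets update gke-subnet \
    --region us-central1 \
    --enable-private-ip-google-access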

Serving applications

When creating applications that are reachable externally or internally to your organization, make sure that you use the right load balancer type and options. This section gives some recommendations on exposing and scaling applications with Cloud Load Balancing.

Use container-native load balancing

Use container-native load balancing when exposing services by using HTTP(S) externally. Container-native load balancing allows for fewer network hops, lower latency, and more exact traffic distribution. It also increases visibility into round-trip time and lets you use load-balancing features such as Google Cloud Armor.
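
For Ingress on VPC-native clusters, container-native load balancing is used when the backing Service creates network endpoint groups; recent GKE versions do this by default, and the annotation below makes it explicit (the Service name, labels, and ports are examples):

apiVersion: v1
kind: Service
metadata:
  name: example-neg-service
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    app: web          # example label
  ports:
  - port: 80
    targetPort: 8080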

Choose the correct GKE resource to expose your application

Depending on the scope of your clients (internal, external, or even cluster-internal), the regionality of your application, and the protocols that you use, there are different GKE resources that you can choose to use to expose your application. The Service networking overview explains these options and can help you choose the best resource to expose each part of your application by using Google Cloud load balancing options.

Create health checks based on BackendConfig

If you use an Ingress to expose services, use a health check configuration in a BackendConfig CRD to use the health check functionality of the external Application Load Balancer. You can direct the health check to the appropriate endpoint and set your own thresholds. Without a BackendConfig CRD, health checks are inferred from readiness probe parameters or use default parameters.
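
A minimal sketch of a BackendConfig health check follows; the path, port, and thresholds are example values, and the BackendConfig is attached to the Service with the cloud.google.com/backend-config annotation:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: example-healthcheck-backendconfig
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz   # example health endpoint
    port: 8080
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2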

Use local traffic policy to preserve original IP addresses

When you use an internal passthrough Network Load Balancer with GKE, set the externalTrafficPolicy option to Local to preserve the source IP address of the requests. Use this option if your application requires the original source IP address. However, the externalTrafficPolicy local option can lead to less optimal load spreading, so only use this feature when required. For HTTP(S) services, you can use Ingress controllers and get the original IP address by reading the X-Forwarded-For header in the HTTP request.
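
As a sketch (the Service name, labels, and ports are examples), an internal LoadBalancer Service that preserves the client source IP address looks like this:

apiVersion: v1
kind: Service
metadata:
  name: example-internal-lb
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # preserve client source IP; traffic only goes to nodes with local endpoints
  selector:
    app: web          # example label
  ports:
  - port: 80
    targetPort: 8080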

Use Private Service Connect

You can use Private Service Connect to share internal passthrough Network Load Balancer Services across other VPC networks. This is useful for Services that are hosted on GKE clusters but are serving customers that are running in different projects and different VPCs.

You can use Private Service Connect to reduce IP address consumption by providing connectivity between VPCs with overlapping IP addresses.

Operations and administration

The following sections contain operational best practices that help you set up granular authorization for your workloads and avoid creating manual firewall rules. They also include recommendations for distributing your workloads and for monitoring and logging in GKE.

Use IAM for GKE permissions to control policies in Shared VPC networks

When using Shared VPC networks, firewall rules for load balancers are not automatically created in the host project.

To avoid having to manually create firewall rules, assign a least-privilege custom role to the GKE service account in the host project named service-HOST_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com.

Replace HOST_PROJECT_NUMBER with the project number of the host project for the Shared VPC.

The custom role that you create should have the following permissions (a gcloud sketch follows the list):

  • compute.firewalls.create
  • compute.firewalls.get
  • compute.firewalls.list
  • compute.firewalls.delete
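
A sketch of creating and binding such a role follows; the role ID and title are hypothetical, and the member shown is the GKE service account named earlier in this section:

gcloud iam roles create gkeFirewallAdmin \
    --project HOST_PROJECT_ID \
    --title "GKE firewall admin" \
    --permissions compute.firewalls.create,compute.firewalls.get,compute.firewalls.list,compute.firewalls.delete

gcloud projects add-iam-policy-binding HOST_PROJECT_ID \
    --member serviceAccount:service-HOST_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com \
    --role projects/HOST_PROJECT_ID/roles/gkeFirewallAdmin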

In addition, firewall rules created by GKE always have the default priority of 1000, so you can block specific traffic by creating firewall rules with a higher priority (a lower priority number).

If you want to restrict creation of certain load balancer types, use organizational policies to restrict load balancer creation.

Use regional clusters and distribute your workloads for high availability

Regional clusters can increase the availability of applications in a cluster because the cluster control plane and nodes are spread across multiple zones.

However, to have the best possible user experience in case of a zone failure, use the cluster autoscaler to make sure that your cluster can handle the required load at any time.

You can also use Pod anti-affinity to ensure that Pods of a given service are scheduled in multiple zones.
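
For illustration (the Deployment name, labels, and image are hypothetical), a preferred Pod anti-affinity rule spreads replicas across zones:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-spread-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-spread-app
  template:
    metadata:
      labels:
        app: example-spread-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: example-spread-app
              topologyKey: topology.kubernetes.io/zone   # prefer a different zone for each replica
      containers:
      - name: app
        image: registry.example.com/app:latest   # hypothetical image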

For more information about how to configure these settings for high availability and cost optimizations, see the Best practices for highly-available GKE clusters.

Use Cloud Logging and Cloud Monitoring and enable network policy logging

While each organization has different requirements for visibility and auditing, we recommend enabling network policy logging. This feature is only available with GKE Dataplane V2. Network policy logging provides visibility into policy enforcement and Pod traffic patterns. Be aware that there are costs involved for network policy logging.

For GKE clusters using version 1.14 or later, Logging and Monitoring are both enabled by default. Monitoring provides a dashboard for your GKE clusters. Logging also enables GKE annotations for VPC Flow Logs. By default, Logging collects logs for all workloads deployed to the cluster but a system-only logs option also exists. Use the GKE dashboard to observe and set alerts. For clusters created in the Autopilot mode, monitoring and logging are automatically enabled and not configurable.

Be aware that there are costs involved for Google Cloud Observability.

Checklist summary

The following checklist summarizes the best practices in this document by area:

  • VPC design: Use VPC-native clusters. Use Shared VPC networks.
  • IP address management strategies: Plan the required IP address allotment. Use more than private RFC 1918 IP addresses. Use custom subnet mode. Plan Pod density per node. Avoid overlaps with IP addresses used in other environments. Create a load balancer subnet. Reserve enough IP address space for the cluster autoscaler. Share IP addresses across clusters. Share IP addresses for internal LoadBalancer Services.
  • Network security options: Use GKE Dataplane V2. Choose a private cluster type. Minimize the cluster control plane exposure. Authorize access to the control plane. Allow control plane connectivity. Deploy proxies for control plane access from peered networks. Restrict cluster traffic using network policies. Enable Google Cloud Armor security policies for Ingress. Use Identity-Aware Proxy to provide authentication for applications with IAM users. Use organization policy constraints to further enhance security.
  • Scaling cluster connectivity: Use Cloud DNS for GKE. Enable NodeLocal DNSCache. Use Cloud NAT for internet access from private clusters. Use Private Google Access for access to Google services.
  • Serving applications: Use container-native load balancing. Choose the correct GKE resource to expose your application. Create health checks based on BackendConfig. Use local traffic policy to preserve original IP addresses. Use Private Service Connect.
  • Operations and administration: Use IAM for GKE permissions to control policies in Shared VPC networks. Use regional clusters and distribute your workloads for high availability. Use Cloud Logging and Cloud Monitoring and enable network policy logging.

What's next