NAT for a GKE Pod CIDR block


This document discusses the Google Kubernetes Engine (GKE) IP address-management options available to organizations that are IPv4-address constrained.

This document describes a solution where the Pod CIDR block is assigned an RFC 1918 address block that is already in use in the on-premises network. This CIDR block is then translated with the ip-masq-agent feature. This approach effectively hides the Pod IPv4 addresses behind the Node IP addresses.

CIDR block considerations

When you create a GKE cluster, you must allocate three CIDR blocks: a Node, a Pod, and a Service block. After reading the first two documents in this series, you now have a list of the netmasks required for these blocks, and you know that you must reuse address space for the Pod CIDR block.

When you select an address range, follow these block characteristics and rules:

  • The Pod CIDR block is considered an isolated-VPC reused address range. This is because you must select the CIDR block from address space that's already assigned on-premises, and you must configure the block in the isolated VPC.
  • The Service CIDR block can be considered an isolated-VPC reused address range. It makes sense to assign reused address space to this block because Service IP addresses are used only for communication within the cluster.
  • The Node CIDR block must be a routed-domain allocated address.
  • The Pod and Service CIDR blocks must not be a subset of the CIDR block assigned to the isolated VPC or a CIDR block assigned to any peered VPCs.
  • The Pod and Service CIDR blocks must not be reused in any peered VPC.
  • The Pod, Node, and Service CIDR blocks must not overlap.
  • When selecting an address range to reuse for the Pod CIDR block, you must select a range assigned to resources that the Pods will never need to connect to. This approach avoids the need to assign routed-domain reused addresses for those on-premises resources.
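The overlap and subset rules above can be checked mechanically before you create the cluster. The following sketch uses Python's ipaddress module with hypothetical example ranges (substitute the allocations from your own plan):

```python
import ipaddress

# Hypothetical example ranges; substitute the allocations from your own plan.
node_cidr    = ipaddress.ip_network("10.128.0.0/20")  # routed-domain allocated
pod_cidr     = ipaddress.ip_network("10.0.0.0/11")    # reused on-premises range
service_cidr = ipaddress.ip_network("10.64.0.0/20")   # reused range
vpc_cidr     = ipaddress.ip_network("10.128.0.0/16")  # isolated-VPC primary range

blocks = {"Node": node_cidr, "Pod": pod_cidr, "Service": service_cidr}

# Rule: the Pod, Node, and Service CIDR blocks must not overlap.
names = list(blocks)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        assert not blocks[a].overlaps(blocks[b]), f"{a} overlaps {b}"

# Rule: the Pod and Service blocks must not be subsets of the
# CIDR block assigned to the isolated VPC (the Node block may be).
for name in ("Pod", "Service"):
    assert not blocks[name].subnet_of(vpc_cidr), f"{name} falls inside the VPC range"
```

The same subset check should be repeated against the CIDR block of every peered VPC.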

Solution

The following figure shows a high-level view of the solution. On the right of the diagram is your on-premises network. On the left is a GKE cluster inside your isolated VPC. The cluster Nodes have the ip-masq-agent feature configured. The isolated VPC is advertising the GKE Node CIDR to the on-premises network. Each functional component is discussed in the following sections.

Figure 1. NAT for a Pod CIDR block in a GKE cluster. (For a screen-readable PDF version, click the image.)

Isolated VPC

The isolated VPC is a regular VPC. It is called isolated to denote that the reused address space needs to be isolated from the on-premises global routing table. This VPC has the following characteristics:

  • It has the GKE API enabled.
  • It contains the GKE cluster.
  • It contains internal load balancers for routed-domain access to the GKE Service IP addresses.

In this VPC, Pod CIDR ranges are allocated reused address space. The Service CIDR ranges can be allocated reused or allocated address space. The Node CIDR blocks must be routed-domain allocated address space.

Routing

When you advertise routes to the on-premises network, you must not advertise any isolated-VPC reused address ranges—that is, the Pod and possibly the Service CIDR ranges. On-premises routers or firewalls that are peering with the cloud routers should not accept any isolated-VPC reused address blocks from their routed peers.

NAT configuration

You must configure the GKE cluster to use the ip-masq-agent feature. The cluster Nodes should not translate the Pod or Service CIDR block ranges in the VPC or any peered VPCs.
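The ip-masq-agent feature is driven by a small configuration that lists the ranges to leave untranslated (its nonMasqueradeCIDRs field). The following sketch generates such a configuration; the ranges are illustrative, and in practice you would list your Pod and Service CIDR blocks plus any peered-VPC ranges, then supply the result to the agent as described in the ip-masq-agent documentation:

```python
import json

# Illustrative ip-masq-agent configuration sketch. The ranges below are
# examples only; list your own Pod and Service CIDR blocks, plus any
# peered-VPC ranges, so the agent leaves that traffic untranslated.
config = {
    "nonMasqueradeCIDRs": [
        "10.0.0.0/11",   # Pod CIDR block (reused range)
        "10.64.0.0/20",  # Service CIDR block
    ],
    "masqLinkLocal": False,
    "resyncInterval": "60s",
}
print(json.dumps(config, indent=2))
```

Traffic to any destination outside the listed ranges is masqueraded behind the Node IP address.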

Traffic originating in the isolated VPC going to the routed domain

Traffic sourced from Pod isolated-VPC reused addresses is translated at the Node interface, so all traffic from a Pod carries the source address of the Node that it runs on. A Pod also cannot initiate communication with on-premises resources whose addresses fall in the identical CIDR block that is used as routed-domain allocated addresses. Those addresses are said to be tethered to the isolated VPC because the subnets are always considered directly connected, so that traffic is never routed to the isolated-VPC gateway. This is a routing issue, not a NAT issue.
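The translation decision can be modeled as follows. This is a toy sketch with hypothetical addresses, not the agent's actual implementation: destinations inside the non-masquerade ranges keep the Pod source address, and everything else leaves with the Node's own IP address:

```python
import ipaddress

# Toy model of the masquerade decision at a Node (illustrative addresses).
NON_MASQUERADE = [ipaddress.ip_network(c) for c in ("10.0.0.0/11", "10.64.0.0/20")]
NODE_IP = "10.128.0.5"  # hypothetical Node address from the routed-domain block

def egress_source(pod_ip: str, dst_ip: str) -> str:
    dst = ipaddress.ip_address(dst_ip)
    if any(dst in net for net in NON_MASQUERADE):
        return pod_ip   # intra-cluster traffic is not translated
    return NODE_IP      # traffic to the routed domain is hidden behind the Node

print(egress_source("10.0.0.2", "10.0.0.9"))      # Pod-to-Pod: Pod source kept
print(egress_source("10.0.0.2", "192.168.10.7"))  # to on-premises: Node source
```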

Traffic originating in the routed domain going to the isolated VPC

Traffic from the routed domain connects to the Pods through the internal load balancer IP addresses. These IP addresses are assigned from the routed-domain allocated Node CIDR block.

Resources in the on-premises network that share the same CIDR block as the Pods—that is, a CIDR block that's used as both a routed-domain allocated address and as an isolated-VPC reused address—cannot communicate with the Pods because the Pods see that CIDR range as locally connected. The addresses are tethered to the cluster. The following figure shows how address tethering occurs.

Figure 2. Internal load balancer IP addresses tethered to a cluster. (For a screen-readable PDF version, click the image.)

In the figure, because the routed-domain 10.0.0.2 address is considered part of the Pod 10.0.0.0/11 address space in the isolated VPC, the following events occur:

  1. Traffic leaves the Host1 Node destined for the Host2 Node with a source address of 10.0.0.2.
  2. The packet enters Host2 in the isolated VPC as expected and is processed, but the reply cannot leave the VPC. In a process referred to as black-holing, the reply is tethered to the isolated VPC.
  3. The reply packet with a destination address of 10.0.0.2 is misrouted to a Pod address on the Host3 Node because the routing decision at the Host2 Node sees the network as local. To avoid this issue, you must select reused addresses from resources that don't need to connect to the Pods. If that's not possible, you must use the NAT for all GKE CIDR blocks solution.
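The misrouting in step 3 is a consequence of ordinary longest-prefix-match routing. The following sketch, with illustrative routes and addresses, shows why the reply to 10.0.0.2 stays local instead of reaching the gateway:

```python
import ipaddress

# Minimal longest-prefix-match sketch of the Host2 routing decision
# (routes and addresses are illustrative).
routes = {
    ipaddress.ip_network("10.0.0.0/11"): "local (Pod range, directly connected)",
    ipaddress.ip_network("0.0.0.0/0"): "isolated-VPC gateway (toward on-premises)",
}

def next_hop(dst_ip: str) -> str:
    dst = ipaddress.ip_address(dst_ip)
    best = max((net for net in routes if dst in net), key=lambda net: net.prefixlen)
    return routes[best]

# The reply to on-premises host 10.0.0.2 matches the more specific local
# Pod route (/11 beats /0), so it never reaches the gateway:
print(next_hop("10.0.0.2"))
print(next_hop("192.168.10.7"))
```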

Load balancing

Using the GKE Service configuration file, you make your Service IP addresses available to the routed domain through internal load balancer IPv4 addresses. You select these internal load balancing addresses from the Node IPv4 CIDR range that's assigned to the primary subnet in the isolated VPC. Because the Node block is a routed-domain allocated address, there are no additional configuration steps.
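A Service exposed this way is an ordinary Service of type LoadBalancer annotated for an internal load balancer. The following sketch shows the shape of such a manifest as a Python dict; the Service name, labels, and ports are illustrative, and the annotation follows GKE's documented convention for internal load balancers:

```python
import json

# Sketch of a GKE Service exposed on an internal load balancer.
# Name, selector labels, and ports are illustrative placeholders.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "my-app-ilb",
        "annotations": {
            "networking.gke.io/load-balancer-type": "Internal",
        },
    },
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "my-app"},
        "ports": [{"port": 80, "targetPort": 8080, "protocol": "TCP"}],
    },
}
print(json.dumps(service, indent=2))
```

Because the load balancer address is drawn from the Node CIDR block, which is routed-domain allocated, on-premises clients can reach it without any additional translation.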

Other issues

This solution translates addresses only at Layer 3. Application Layer Gateway (ALG) NAT, which rewrites IP addresses carried inside the packet payload, is not supported.

Additional considerations

This section discusses additional factors to consider when configuring your GKE cluster.

Identity and Access Management

Review the following IAM roles and procedures and assign them as appropriate within your organization:

Cost

With billing enabled for the project, the accompanying tutorial incurs the following additional costs; exact costs are implementation dependent:

To help estimate costs, you can use the Google Cloud pricing calculator.

Quota and limits

Keep the following quotas and limits in mind:

APIs

You must enable the GKE API in the isolated VPC.

Security

We recommend that you discuss the following security points with your network and security team:

  • Because misconfiguration in address reuse scenarios can potentially cause network outages, set up automation and email aliases such as ipv4addrspaceviolation@example.com and ipv4routingviolation@example.com to alert on problems.
  • Track translations for any future data analysis.
  • To ensure that your firewall rules are in compliance with your established standards, review any external connection to the internet from the isolated VPC.
  • Before applying firewall rules in production, review each rule.

What's next