This document discusses the Google Kubernetes Engine (GKE) IP address-management options available to organizations that are IPv4-address constrained.
This document describes a solution in which the Pod CIDR block is assigned an RFC 1918 address block that is already in use in the on-premises network. This CIDR block is then translated with the ip-masq-agent feature.
This approach effectively hides the Pod IPv4 addresses behind the Node IP
addresses.
CIDR block considerations
When you create a GKE cluster, you must allocate three CIDR blocks: a Node block, a Pod block, and a Service block. After reading the first two documents in this series, you have a list of the netmasks required for these blocks, and you know that you must reuse address space for the Pod CIDR block.
When you select an address range, follow these block characteristics and rules:
- The Pod CIDR block is considered an isolated-VPC reused address range. This is because you must select the CIDR block from address space that's already assigned on-premises, and you must configure the block in the isolated VPC.
- The Service CIDR block can be considered an isolated-VPC reused address range. It makes sense to assign reused address space to this block because it's used only for communication within the cluster.
- The Node CIDR block must be a routed-domain allocated address.
- The Pod and Service CIDR blocks must not be a subset of the CIDR block assigned to the isolated VPC or a CIDR block assigned to any peered VPCs.
- The Pod and Service CIDR blocks must not be reused in any peered VPC.
- The Pod, Node, and Service CIDR blocks must not overlap.
- When selecting an address range to reuse for the Pod CIDR block, you must select a range assigned to resources that the Pods will never need to connect to. This approach avoids the need to assign routed-domain reused addresses for those on-premises resources.
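As a sketch of how these rules come together, the following commands create a cluster with explicit CIDR assignments. The cluster name, zone, network names, and CIDR ranges are hypothetical examples chosen to satisfy the rules above, not values from this document:

```shell
# Hypothetical CIDR choices that follow the rules above:
#   Node CIDR     10.128.0.0/20  (routed-domain allocated; advertised on-premises)
#   Pod CIDR      10.0.0.0/11    (isolated-VPC reused; already assigned on-premises)
#   Service CIDR  10.64.0.0/20   (isolated-VPC reused; intra-cluster only)
# The three blocks don't overlap, and the reused blocks are configured only
# inside the isolated VPC.
gcloud container clusters create isolated-cluster \
    --network=isolated-vpc \
    --subnetwork=isolated-subnet \
    --cluster-ipv4-cidr=10.0.0.0/11 \
    --services-ipv4-cidr=10.64.0.0/20 \
    --zone=us-central1-a
```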
Solution
The following figure shows a high-level view of the solution. On the right of
the diagram is your on-premises network. On the left is a GKE
cluster inside your isolated VPC. The cluster Nodes have the ip-masq-agent
feature configured. The isolated VPC is advertising the GKE Node
CIDR to the on-premises network. Each functional component is discussed in the
following sections.
Isolated VPC
The isolated VPC is a regular VPC. It is called isolated to denote that the reused address space needs to be isolated from the on-premises global routing table. This VPC has the following characteristics:
- It has the GKE API enabled.
- It contains the GKE cluster.
- It contains internal load balancers for routed-domain access to the GKE Service IP addresses.
In this VPC, the Pod CIDR ranges are assigned reused address space. The Service CIDR ranges can be assigned either reused or allocated address space. The Node CIDR blocks must be routed-domain allocated address space.
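A minimal sketch of provisioning this VPC follows. The network, subnet, region, and range names are hypothetical; the assumption is that the subnet's primary range holds the routed-domain allocated Node CIDR, and the reused Pod and Service CIDRs are carried as secondary (alias IP) ranges:

```shell
# Create the isolated VPC with custom subnets.
gcloud compute networks create isolated-vpc --subnet-mode=custom

# Primary range = Node CIDR (routed-domain allocated).
# Secondary ranges = Pod and Service CIDRs (reused address space).
gcloud compute networks subnets create isolated-subnet \
    --network=isolated-vpc \
    --region=us-central1 \
    --range=10.128.0.0/20 \
    --secondary-range=pods=10.0.0.0/11,services=10.64.0.0/20
```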
Routing
When you advertise routes to the on-premises network, you must not advertise any isolated-VPC reused address ranges—that is, the Pod and possibly the Service CIDR ranges. On-premises routers or firewalls that are peering with the cloud routers should not accept any isolated-VPC reused address blocks from their routed peers.
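One way to enforce this on the cloud side is custom route advertisement on the Cloud Router, so that only the Node CIDR is announced toward on-premises. The router name, region, and range below are hypothetical:

```shell
# Advertise only the routed-domain allocated Node CIDR to on-premises peers.
# The reused Pod (and, if reused, Service) CIDRs are deliberately omitted.
gcloud compute routers update isolated-vpc-router \
    --region=us-central1 \
    --advertisement-mode=custom \
    --set-advertisement-ranges=10.128.0.0/20
```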
NAT configuration
You must configure the GKE cluster to use the ip-masq-agent
feature. The cluster Nodes should not translate the Pod or Service CIDR block
ranges in the VPC or any peered VPCs.
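The ip-masq-agent feature reads its non-masquerade list from a ConfigMap named ip-masq-agent in the kube-system namespace. The following is a minimal sketch of that configuration; the CIDR values are hypothetical examples:

```shell
# Traffic to CIDRs listed in nonMasqueradeCIDRs keeps the Pod source address;
# all other egress traffic is masqueraded to the Node's IP address.
cat <<'EOF' > config
nonMasqueradeCIDRs:
  - 10.0.0.0/11     # Pod CIDR (reused) - intra-cluster traffic is not translated
  - 10.64.0.0/20    # Service CIDR - intra-cluster traffic is not translated
resyncInterval: 60s
EOF

kubectl create configmap ip-masq-agent \
    --from-file=config \
    --namespace=kube-system
```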
Traffic originating in the isolated VPC going to the routed domain
Traffic sourced from Pod isolated-VPC reused addresses is translated at the Node interface. Therefore, all traffic from a Pod leaves with the source address of the Node that it resides on. However, a Pod cannot initiate communication with on-premises resources whose routed-domain allocated addresses fall within the reused CIDR block, because those subnets are always considered directly connected to the isolated VPC. The addresses are said to be tethered to the isolated VPC: traffic destined for them is never routed to the isolated-VPC gateway. This is a routing issue, not a NAT issue.
Traffic originating in the routed domain going to the isolated VPC
Traffic from the routed domain connects to the Pods through the internal load balancer IP addresses. These IP addresses are assigned from the routed-domain allocated Node CIDR block.
Resources in the on-premises network that share the same CIDR block as the Pods—that is, a CIDR block that's used as both a routed-domain allocated address and as an isolated-VPC reused address—cannot communicate with the Pods because the Pods see that CIDR range as locally connected. The addresses are tethered to the cluster. The following figure shows how address tethering occurs.
In the figure, because the routed-domain `10.0.0.2` address is considered part of the Pod `10.0.0.0/11` address space in the isolated VPC, the following events occur:

- Traffic leaves the `Host1` Node destined for the `Host2` Node with a source address of `10.0.0.2`.
- The packet enters `Host2` in the isolated VPC as expected and is processed, but then, in a process referred to as black-holing, the reply is tethered to the isolated VPC.
- The reply packet with a destination address of `10.0.0.2` is misrouted to a Pod address on the `Host3` Node because the routing decision at the `Host2` Node sees the network as local.

To avoid this issue, you must select reused addresses from resources that don't need to connect to the Pods. If that's not possible, you must use the NAT for all GKE CIDR blocks solution.
Load balancing
Using the GKE Service configuration file, you make your Service IP addresses available to the routed domain through internal load balancer IPv4 addresses. You select these internal load balancing addresses from the Node IPv4 CIDR range that's assigned to the primary subnet in the isolated VPC. Because the Node block is a routed-domain allocated address, there are no additional configuration steps.
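A minimal sketch of such a Service manifest follows. The Service name, selector, and ports are hypothetical; the annotation shown is the GKE internal load balancer annotation (older clusters use `cloud.google.com/load-balancer-type` instead), and the assumption is that GKE assigns the load balancer an address from the subnet's primary (Node) range:

```shell
# Expose a Service through an internal load balancer so the routed domain
# can reach it on an address drawn from the routed-domain allocated Node CIDR.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: my-app-ilb
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
EOF
```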
Other issues
This solution translates addresses only at Layer 3. Application Layer Gateway (ALG) functionality is not provided, so IPv4 addresses embedded in packet payloads are not translated.
Additional considerations
This section discusses additional factors to consider when configuring your GKE cluster.
Identity and Access Management
Review the following IAM roles and procedures and assign them as appropriate within your organization:
Cost
With billing enabled for the project, the accompanying tutorial incurs the following additional costs; exact costs are implementation dependent:
To help estimate costs, you can use the Google Cloud pricing calculator.
Quota and limits
Keep the following quotas and limits in mind:
APIs
You must enable the GKE API in the isolated VPC.
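Enabling the API is a single command; the project ID below is a hypothetical placeholder for the project that hosts the isolated VPC:

```shell
# Enable the GKE (Kubernetes Engine) API in the isolated-VPC project.
gcloud services enable container.googleapis.com \
    --project=isolated-vpc-project
```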
Security
We recommend that you discuss the following security points with your network and security team:
- Because misconfiguration in address reuse scenarios can potentially cause network outages, set up automation and email aliases such as ipv4addrspaceviolation@example.com and ipv4routingviolation@example.com to alert on problems.
- Track translations for any future data analysis.
- To ensure that your firewall rules are in compliance with your established standards, review any external connection to the internet from the isolated VPC.
- Before applying firewall rules in production, review each rule.
What's next
- For a better understanding of the underlying technology discussed in this document, read the following references:
- Read the associated tutorial.