Cloud NAT

Cloud NAT (network address translation) allows Google Cloud Platform (GCP) virtual machine (VM) instances without external IP addresses and private Google Kubernetes Engine (GKE) clusters to connect to the Internet.

Cloud NAT implements outbound NAT in conjunction with a default route to allow your instances to reach the Internet. It does not implement inbound NAT. Hosts outside of your VPC network can only respond to established connections initiated by your instances; they cannot initiate their own, new connections to your instances via NAT.

Cloud NAT is a regional resource. You can configure it to allow traffic from all primary and secondary IP ranges of subnets in a region, or you can configure it to apply to only some of those ranges.

The following is an example of a VPC network containing three subnets in two regions:

  • us-east1
    • subnet 1 (10.240.0.0/16)
    • subnet 2 (172.16.0.0/16)
  • europe-west1
    • subnet 3 (192.168.1.0/24)

The instances in subnet 1 (10.240.0.0/16) and subnet 3 (192.168.1.0/24) do not have external IP addresses, but need to fetch periodic updates from an external server at 203.0.113.1.

The instances in subnet 2 (172.16.0.0/16) do not need these updates and should not be allowed to connect to the Internet at all, even through NAT.

[Figure: Cloud NAT]

To achieve this setup, configure one NAT gateway per region for each network in that region (a sketch modelling the resulting configuration follows these steps):

  1. Configure NAT-GW-US-EAST for subnet 1 (10.240.0.0/16).
  2. Configure NAT-GW-EU for subnet 3 (192.168.1.0/24).
  3. On each NAT gateway, configure the subnets it should translate traffic for.
    • subnet 1 (10.240.0.0/16)
    • subnet 3 (192.168.1.0/24)
  4. Do not configure NAT for subnet 2. This isolates it from the Internet, which is required for this example.
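
Which traffic gets translated is determined entirely by the subnet ranges each gateway is configured to serve. The following Python sketch models the configuration above; the dictionary and helper function are illustrative only, not a Cloud NAT API:

    import ipaddress

    # Hypothetical model of the gateway configuration above. Cloud NAT itself
    # exposes no such API; this only illustrates which source ranges each
    # gateway translates and why subnet 2 stays offline.
    NAT_GATEWAYS = {
        "NAT-GW-US-EAST": ["10.240.0.0/16"],   # subnet 1
        "NAT-GW-EU": ["192.168.1.0/24"],       # subnet 3
        # subnet 2 (172.16.0.0/16) is listed in no gateway: it stays isolated.
    }

    def gateway_for(instance_ip):
        """Return the gateway (if any) that would translate this instance."""
        addr = ipaddress.ip_address(instance_ip)
        for gateway, ranges in NAT_GATEWAYS.items():
            if any(addr in ipaddress.ip_network(r) for r in ranges):
                return gateway
        return None   # no NAT: the instance cannot reach the Internet

    assert gateway_for("10.240.0.4") == "NAT-GW-US-EAST"
    assert gateway_for("192.168.1.7") == "NAT-GW-EU"
    assert gateway_for("172.16.0.5") is None   # subnet 2 is isolated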

Cloud NAT benefits

Security

You can provision your application servers without public IP addresses. These servers can access the Internet for updates and patches, and in some cases, for bootstrapping. The NAT IP addresses can be whitelisted on the Internet servers.

High availability

Cloud NAT is a managed service that provides high availability without user management or intervention. This applies to both automatically allocated and manually allocated NAT IP addresses.

A failure of a Cloud Router or a NAT gateway does not affect the NAT configuration or prevent hosts from performing NAT.

Scalability

Cloud NAT scales seamlessly with the number of instances and the volume of network traffic. Cloud NAT supports autoscaled managed instance groups. The network bandwidth available for each instance is not affected by the number of instances that use a NAT gateway and is similar to instances with an external IP address.

Cloud NAT features

Cloud NAT allows instances without an external IP address to access the Internet. Cloud NAT allows outbound connections only. Inbound traffic is allowed only if it is in response to a connection initiated by an instance.

NAT type

There are many ways of classifying NATs, based on different RFCs. As per RFC 5128, Cloud NAT is an "Endpoint-Independent Mapping, Endpoint-Dependent Filtering" NAT. This means that if an instance tries to contact different destinations using the same source IP:port, then these connections are allocated the same NAT IP:port, because the mapping is endpoint independent. Response packets from the Internet are allowed through only if they are from an IP:port where an instance previously sent packets because the filtering is endpoint dependent.

Using the terminology in RFC 3489, Cloud NAT is a Port Restricted Cone NAT. It's not a symmetric NAT because the mapping is endpoint independent.
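
The two properties can be made concrete with a small, self-contained Python simulation of an endpoint-independent-mapping, endpoint-dependent-filtering NAT (class and method names are illustrative, not Cloud NAT internals):

    # Illustrative simulation of RFC 5128 "EIM, EDF" behavior.
    class EimEdfNat:
        def __init__(self, nat_ip, ports):
            self.nat_ip = nat_ip
            self.free_ports = list(ports)
            self.mapping = {}      # (src_ip, src_port) -> NAT port
            self.allowed = set()   # (NAT port, destination) seen outbound

        def outbound(self, src, dst):
            # Endpoint-independent mapping: the same internal src gets the
            # same NAT port regardless of destination.
            if src not in self.mapping:
                self.mapping[src] = self.free_ports.pop(0)
            nat_port = self.mapping[src]
            # Record the destination so its replies pass the filter.
            self.allowed.add((nat_port, dst))
            return (self.nat_ip, nat_port), dst

        def inbound_allowed(self, remote, nat_port):
            # Endpoint-dependent filtering: only remotes previously
            # contacted from this NAT port may send packets back.
            return (nat_port, remote) in self.allowed

    nat = EimEdfNat("192.0.2.50", range(34000, 34064))
    a = nat.outbound(("10.240.0.3", 24000), ("203.0.113.1", 80))
    b = nat.outbound(("10.240.0.3", 24000), ("198.51.100.7", 443))
    assert a[0] == b[0]   # same NAT IP:port for both destinations
    assert nat.inbound_allowed(("203.0.113.1", 80), a[0][1])       # reply OK
    assert not nat.inbound_allowed(("203.0.113.9", 80), a[0][1])   # dropped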

NAT traversal

Because the NAT mapping is endpoint independent, Cloud NAT supports widely used NAT traversal protocols like STUN and TURN, which allow clients behind NATs to communicate with each other.

  • STUN (Session Traversal Utilities for NAT, RFC 5389) allows direct communication between the peers once the communication channel is established. This is lightweight, but works only when the mapping is endpoint independent.
  • TURN (Traversal Using Relays around NAT, RFC 5766) requires all communication to happen via a relay server with a public IP address to which both peers connect. TURN is more robust and works for all types of NAT, but consumes a lot of bandwidth and resources on the TURN server.

Note that Cloud NAT does not supply the servers, configuration, or protocols required to use STUN/TURN. Customers need to provide their own STUN/TURN servers.

NAT timeouts

Cloud NAT uses the following default timeout values, which you can override. The RFCs cited below recommend higher values but do not enforce them (a bookkeeping sketch follows the list):

  • UDP Mapping Idle Timeout: 30s (RFC 4787 REQ-5)
  • ICMP Mapping Idle Timeout: 30s (RFC 5508 REQ-2)
  • TCP Established Connection Idle Timeout: 1200s (RFC 5382 REQ-5)
  • TCP Transitory Connection Idle Timeout: 30s (RFC 5382 REQ-5)
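
As a rough sketch of what these timeouts imply (illustrative names, not Cloud NAT internals): a mapping that sees no traffic for longer than its class's timeout is dropped, and its NAT port becomes reusable.

    import time

    # Default idle timeouts from the list above, in seconds.
    TIMEOUTS = {"udp": 30, "icmp": 30,
                "tcp_established": 1200, "tcp_transitory": 30}

    class MappingTable:
        """Illustrative idle-timeout bookkeeping, not Cloud NAT internals."""
        def __init__(self):
            self.last_seen = {}   # flow key -> (timeout class, last activity)

        def touch(self, flow, timeout_class):
            self.last_seen[flow] = (timeout_class, time.monotonic())

        def expire(self):
            now = time.monotonic()
            for flow, (cls, seen) in list(self.last_seen.items()):
                if now - seen > TIMEOUTS[cls]:
                    del self.last_seen[flow]   # NAT port becomes reusable

    table = MappingTable()
    table.touch(("10.240.0.3", 24000, "203.0.113.1", 80), "tcp_established")
    table.expire()   # nothing expires yet; the flow was just seen
    assert len(table.last_seen) == 1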

Bandwidth limitations

An instance with Cloud NAT has as much external bandwidth as an instance with an external IP.

Cases where NAT is not performed on traffic

GCP allows instances without external IP addresses to reach destinations that normally require an external IP address. Cloud NAT is not used in the following cases, even if you have configured it:

  • If your VM has an external IP address, then packets sent from the primary IP address of that interface do not use Cloud NAT. The existence of the external IP address on the interface takes precedence. However, alias IP addresses on the interface can use Cloud NAT, provided you have configured your NAT gateway to handle them.

  • Regular (non-private) GKE clusters assign each node an external IP address, so such clusters cannot use Cloud NAT to send packets from the node's primary interface. Pods can still use Cloud NAT if they send packets with source IP addresses set to the pod IP.

  • Backend VMs for HTTP(S), SSL Proxy, and TCP Proxy load balancers do not need external IP addresses themselves, nor do they need Cloud NAT to send replies to the load balancer. HTTP(S), SSL Proxy, and TCP Proxy load balancers communicate with backend VMs using their primary internal IP addresses.

  • Cloud NAT does not process traffic for Google APIs and services. However, when you enable Cloud NAT, GCP automatically enables Private Google Access for the subnets to which NAT applies in the region. See Private Google Access for details.

  • If you have overridden your default Internet gateway next hop with a more specific custom route, such as one pointing to an instance-based proxy or gateway, Cloud NAT does not apply to packets that use those routes. The following examples illustrate how custom routes can interfere with Cloud NAT (a route-selection sketch follows them):

    • If you create custom static routes with next hops set to other instances, packets with destination IPs matching the destination of the route are sent to the other VMs. For example, if you use VM instances to provide NAT services, you would necessarily create custom static routes to direct traffic to those VMs as the next hop. Those NAT gateway VMs require external IP addresses themselves. Thus, neither traffic from the VMs that rely upon your NAT gateway VMs nor the NAT gateway VMs themselves would use Cloud NAT.

    • If you create a custom static route whose next hop is a Cloud VPN tunnel, Cloud NAT does not apply to that route. For example, a custom static route with destination 0.0.0.0/0 and next hop being a Cloud VPN tunnel directs traffic to that tunnel, not to the default Internet gateway. Consequently, Cloud NAT would not apply to the destination (0.0.0.0/0 in this example). This holds true even for more specific destinations.

    • If your on-premises router advertises a custom dynamic route to a Cloud Router managing a Cloud VPN tunnel or Cloud Interconnect, Cloud NAT does not apply to that route. For example, if your on-premises router advertises a custom dynamic route with destination 0.0.0.0/0 and next hop being its BGP IP address, traffic to 0.0.0.0/0 would be directed to the Cloud VPN tunnel or Cloud Interconnect. This holds true even for more specific destinations.
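
All three cases reduce to route selection: Cloud NAT translates only packets whose selected route has the default Internet gateway as its next hop. A simplified longest-prefix-match sketch (the route table and names are illustrative):

    import ipaddress

    # Simplified VPC route table: (destination range, next hop). In this
    # sketch the longest (most specific) matching prefix wins.
    ROUTES = [
        ("0.0.0.0/0", "default-internet-gateway"),
        ("198.51.100.0/24", "proxy-vm"),       # custom static route to a VM
        ("203.0.113.0/24", "vpn-tunnel-1"),    # custom route to Cloud VPN
    ]

    def cloud_nat_applies(dst_ip):
        addr = ipaddress.ip_address(dst_ip)
        matches = [(ipaddress.ip_network(d), hop) for d, hop in ROUTES
                   if addr in ipaddress.ip_network(d)]
        _, hop = max(matches, key=lambda m: m[0].prefixlen)
        # Cloud NAT only translates traffic egressing via the default
        # Internet gateway.
        return hop == "default-internet-gateway"

    assert cloud_nat_applies("8.8.8.8")
    assert not cloud_nat_applies("198.51.100.5")   # goes to the proxy VM
    assert not cloud_nat_applies("203.0.113.1")    # goes into the VPN tunnel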

Translation example

In this example, instance 10.240.0.4 needs to download an update from external server 203.0.113.1.

[Figure: Cloud NAT translation example]

Each instance in subnet 1 has been allocated 64 ports for NAT translation, drawn from NAT IP 192.0.2.50 or 192.0.2.60.

In the example, suppose instance 10.240.0.3 has been allocated port range 34000 to 34063 for NAT IP 192.0.2.50.

A flow from this instance has:

  • Source IP: 10.240.0.3 (instance IP)
  • Source port: 24000 (instance port)
  • Destination IP: 203.0.113.1 (Update server)
  • Destination port: 80 (Update service port)

Because you have configured this subnet for NAT and the associated NAT-GW-US-EAST has NAT IP 192.0.2.50, this flow is translated to:

  • Source IP: 192.0.2.50 (NAT IP)
  • Source port: 34022 (NAT port - one of the ports allocated to this instance)
  • Destination IP: 203.0.113.1 (Update server)
  • Destination port: 80 (Update service port)

The flow is then sent to the Internet after the translation. The response has the following characteristics:

  • Source IP: 203.0.113.1 (Update server)
  • Source port: 80 (Update service port)
  • Destination IP: 192.0.2.50 (NAT IP)
  • Destination port: 34022 (NAT port - one of the ports allocated to this instance)

This packet is translated and given to the instance. To the instance, the packet looks like this:

  • Source IP: 203.0.113.1 (Update server)
  • Source port: 80 (Update service port)
  • Destination IP: 10.240.0.3 (instance IP)
  • Destination port: 24000 (instance port)

Packets sent from the same instance to different destinations can use the same NAT IP address and NAT port at the same time. This lets an instance hold more simultaneous connections than the number of ports assigned to it, as long as the connections go to distinct destinations.
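
The walk-through above can be written as two table lookups against a single connection-tracking entry. This sketch hard-codes the example's values; the structure, not the names, is the point:

    # Illustrative connection-tracking entry for the example flow above.
    NAT_IP = "192.0.2.50"
    conntrack = {(("10.240.0.3", 24000), ("203.0.113.1", 80)): 34022}

    def translate_out(src, dst):
        nat_port = conntrack[(src, dst)]
        return (NAT_IP, nat_port), dst          # only the source is rewritten

    def translate_in(remote, nat_port):
        for (src, dst), port in conntrack.items():
            if port == nat_port and dst == remote:
                return remote, src              # destination rewritten back
        return None   # unknown remote: endpoint-dependent filtering drops it

    assert translate_out(("10.240.0.3", 24000), ("203.0.113.1", 80)) == (
        ("192.0.2.50", 34022), ("203.0.113.1", 80))
    assert translate_in(("203.0.113.1", 80), 34022) == (
        ("203.0.113.1", 80), ("10.240.0.3", 24000))
    assert translate_in(("198.51.100.9", 80), 34022) is None   # filtered out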

IP address allocation

Specifying IP addresses in NAT config

A Cloud NAT IP pool contains one or more IP addresses used to translate the internal addresses of instances. See Number of NAT ports and connections for a detailed explanation of the relationship between NAT IP addresses, ports, and instances.

You can choose between two modes for NAT pool IP allocation:

  • Recommended: Configure auto-allocation of IPs by a NAT gateway

    • If you do not specify the IP addresses for NAT, the system uses auto-allocation. Allowing GCP to allocate IPs automatically is the best way to ensure that enough IPs are always available for NAT, regardless of the number of VMs created.
    • When you use auto-allocation, GCP reserves IP addresses in your project automatically. These addresses count against the static IP address quotas in the project.
    • The drawback is that maintaining external "allow" lists becomes harder: automatically allocated IP addresses may be released when they are no longer required, and new IPs may be allocated later as needed.

OR

  • Configure specific NAT pool IP(s) to be used by a NAT gateway

    • You manually specify one or more NAT IPs for the NAT gateway to use. These must be reserved static IP addresses.
    • In this case, no auto-allocation of IPs is performed.
    • If the NAT IPs do not provide enough ports for all the instances configured to use NAT, some instances cannot use NAT. Cloud Router logs and status report when more NAT IPs are required.

Cloud NAT: Under the hood

Cloud NAT is a distributed, fully managed, software-defined service, not an instance or appliance-based solution.

[Figure: Cloud NAT]

The Cloud NAT architecture differs from traditional NAT proxy solutions: there are no NAT proxy instances in the path from an instance to its destination. Instead, each instance is allocated a globally unique set of NAT IPs and associated port ranges, which Andromeda, Google's network virtualization stack, uses to perform NAT.

NAT Gateway: A "NAT Gateway" is a configuration object that operates in the control plane, and represents the NAT mapping required to be set on a host/s, essentialy the NAT Gateway represents an internal process. The NAT configuration is software-defined and pushed down to the host/s. Once that takes place the host/s contains their configuration regardless of the NAT Gateways's state or status.

Cloud Router (with regard to NAT gateways): A Cloud Router aggregates one or more NAT gateway configurations. Because it operates only in the control plane, a failure in a Cloud Router has no impact on NAT gateways or on NAT that is already in operation.

[Figure: Traditional NAT vs. Cloud NAT]

As a result, with Cloud NAT, there are no choke points in the path between your internal instances and external destinations. This leads to better scalability, performance, throughput and availability.

Number of NAT ports and connections

Every Cloud NAT IP address has 64K (65,536) ports available for TCP and another 64K for UDP. Cloud NAT does not use the first 1024 well-known ports, leaving 64,512 usable ports per NAT IP address. By default, each instance using Cloud NAT gets a NAT IP and 64 of that NAT IP address's 64,512 ports, so a single NAT IP address can serve up to 1008 instances. If you have between 1 and 1008 instances, you need one NAT IP address; for 1009 to 2016 instances, you need two NAT IP addresses; and so on.

If you use the VM instance with Google Kubernetes Engine (GKE) containers, the per-VM number of ports is the number of ports available to all containers on that VM.

If you allocate more ports per VM, say 4096, then you need a NAT IP address for every 15 VMs.
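
Both figures fall out of one formula: divide a NAT IP's 64,512 usable ports by the ports allocated per VM to get the number of VMs one IP can serve, then round the instance count up. A small sketch of that arithmetic (the function name is illustrative):

    import math

    PORTS_PER_NAT_IP = 65536 - 1024   # 64,512 usable ports per NAT IP

    def nat_ips_needed(num_instances, ports_per_vm=64):
        # Illustrative arithmetic only: VMs that fit on one NAT IP,
        # rounded down, then instances divided across IPs, rounded up.
        vms_per_ip = PORTS_PER_NAT_IP // ports_per_vm
        return math.ceil(num_instances / vms_per_ip)

    assert nat_ips_needed(1008) == 1   # up to 1008 default VMs per NAT IP
    assert nat_ips_needed(1009) == 2
    assert nat_ips_needed(15, ports_per_vm=4096) == 1   # 15 VMs per IP
    assert nat_ips_needed(16, ports_per_vm=4096) == 2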

With the default 64 ports, a VM can open 64 TCP and 64 UDP connections to a given destination IP address and port. It can open another 64 TCP and 64 UDP connections to a different destination IP address and port, and so on, as long as the VM has fewer than 64,000 total connections.

If you want to open 1500 connections simultaneously to a given destination IP address and port, you will need to allocate 1500 ports to that VM. This is because all the fields of the IP packets from the destination to the VM will be identical, except for the NAT port which can be different. To know which connection a packet belongs to, the NAT port must be different for each connection.

However, if you want to open 1500 connections to 1500 different destination IP:ports, then you just need 1 port. All the response packets for the 1500 connections (from different destinations to the VM) will look different even if we use the same NAT port.
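
In other words, the ports a VM needs scale with the number of simultaneous connections per distinct destination, not with total connections. A toy calculation, assuming connections are spread evenly across destinations:

    import math

    def ports_needed(total_connections, distinct_destinations):
        # Each distinct destination IP:port can reuse the same NAT port, so
        # only connections sharing a destination need separate ports (even
        # spread across destinations assumed for this illustration).
        return math.ceil(total_connections / distinct_destinations)

    assert ports_needed(1500, 1) == 1500   # one destination: 1500 ports
    assert ports_needed(1500, 1500) == 1   # all-distinct destinations: 1 port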

You can set the number of ports per VM to be anything from 2 through 57344 (=56K). With static NAT IP address allocation, you need to make sure that enough NAT IP addresses exist to cover all the VM instances that need NAT.

Cloud NAT with Google Kubernetes Engine

You can support Cloud NAT for Google Kubernetes Engine (GKE) containers by configuring Cloud NAT to NAT-translate all ranges in the subnet.

Both Nodes and Pods can use Cloud NAT. If you do not want Pods to be able to use NAT, you can create a Cluster Network Policy to prevent it.

[Figure: Cloud NAT with GKE]

In the example above, you want your containers to be NAT-translated. To enable NAT for all the containers and the GKE node, you must choose all the IP ranges of Subnet 1 as the NAT candidates. It is not possible to NAT only container1 or container2.

Cloud NAT does not affect the total bandwidth of individual VM nodes. Each node has the same external bandwidth whether using Cloud NAT or an external IP address.

When using Cloud NAT, each VM node is allocated a certain number of ports that can be used for outgoing connections. Containers on a node cannot use more than this number of ports collectively. If a node has, for example, 64 NAT ports, all the containers together cannot have more than 64 simultaneous connections to the same external IP address and port pair. See Number of NAT ports and connections for more details.
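
Because the port budget belongs to the node, all containers on it draw from the same budget when talking to the same destination. A toy check of that shared budget (names are illustrative):

    # Toy accounting of a node's NAT-port budget shared by its containers.
    NODE_PORTS = 64   # default allocation per VM/node

    def can_open(existing_conns_to_dest, new_conns):
        # Connections from any container on the node to the SAME destination
        # IP:port each need a distinct NAT port from the node's budget.
        return existing_conns_to_dest + new_conns <= NODE_PORTS

    assert can_open(60, 4)        # fits within the node's 64 ports
    assert not can_open(60, 5)    # the 65th simultaneous connection must wait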

Cloud NAT with other GCP services

Shared VPC

Shared VPC enables multiple projects belonging to a single organization to use a single shared common network. In a Shared VPC setup, there is a Shared VPC host project where the shared network is configured by a Network Admin. The Shared VPC host project is associated with one or more service projects where resources can be created and attached to the shared network configured in the Shared VPC host project.

You have several choices for customizing Cloud NAT deployments with Shared VPC based on your use cases.

Use Case 1:

You want a centralized Cloud NAT gateway that can be used by instances of the host project as well as all of the service projects in a shared network (within a region).

To achieve this, create the Cloud NAT gateway as part of the shared network configured in the host project. Note that one Cloud NAT gateway is required per network per region, so if you span multiple regions, the network needs one NAT gateway in each of them. All instances in this shared network and region (in both host and service projects) can use this Cloud NAT gateway.

[Figure: Cloud NAT use case 1]

In the example above, Network 1 is the shared network created in host project A and shared by instances in host project A as well as in service projects B and C. The Cloud NAT gateway, NAT-GW-1, is configured in the host project for Network 1 (in region us-central1). Instances from projects A, B, and C that are part of Network 1 in region us-central1 are able to use this NAT gateway to access the Update server.

Use Case 2: You want a centralized Cloud NAT gateway for all associated instances in a Shared VPC network. You want a separate gateway in an unrelated network in a service project.

[Figure: Cloud NAT use case 2]

In the example above, instances using Network 1 and belonging to projects A, B, and C in region us-central1 use NAT-GW-1 to access the update server. Instances in project C belonging to Network 2 in region us-central1 use NAT-GW-2 to access the log server.

VPC Network Peering

VPC Network Peering allows GCP VPCs to peer with each other so that instances in each of these peered VPC networks can communicate. Although certain configurations like internal load balancing rules are imported into a peered network, NAT configuration is not. For example, if network N1 peers with network N2, and N1 has a NAT gateway, only subnetworks in N1 can use the NAT gateway to reach the Internet. If N2 also wants its subnets to use NAT, then another NAT gateway has to be independently configured in network N2.

Multiple network interfaces

Cloud NAT supports multiple network interface instances, with or without alias IP ranges.

Firewall rules

Firewall rules are applied to instances. For egress traffic, firewall rules are applied before NAT is performed. For ingress traffic, firewall rules are applied after NAT IP addresses are translated to instance internal IPs.
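
A toy model of this ordering: both directions evaluate firewall rules against the instance's internal address, because egress rules run before the source is translated and ingress rules run after the destination is translated back (all names are illustrative):

    # Toy model of rule ordering relative to NAT; not GCP firewall semantics
    # beyond the ordering itself.
    INTERNAL, NAT_IP = "10.240.0.3", "192.0.2.50"

    def apply_egress(src_ip, allow):
        # Firewall first (sees 10.240.0.3), then NAT rewrites the source.
        return NAT_IP if allow(src_ip) else None

    def apply_ingress(dst_ip, allow):
        # NAT rewrites 192.0.2.50 back to 10.240.0.3, then the firewall runs.
        internal = INTERNAL if dst_ip == NAT_IP else dst_ip
        return internal if allow(internal) else None

    assert apply_egress(INTERNAL, lambda ip: ip.startswith("10.")) == NAT_IP
    assert apply_ingress(NAT_IP, lambda ip: ip.startswith("10.")) == INTERNAL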

Private Google Access

Cloud NAT never applies to traffic sent to the public IP addresses for Google APIs and services. Requests sent to Google APIs and services never use the external IPs configured for Cloud NAT as their sources.

When you configure Cloud NAT to provide NAT for the primary IP range of a subnet, GCP automatically enables Private Google Access for that subnet. As long as Cloud NAT provides NAT for the primary IP ranges of a subnet, Private Google Access remains enabled (and cannot be manually disabled).

Private Google Access allows VM instances without external IP addresses to reach the public IPs for certain Google APIs and services.

Using Private Google Access along with Cloud NAT does not change the behavior of Private Google Access:

  • If the VM has an external IP address, it can access the public IPs for Google APIs and services directly, as long as the Internet access requirements are met. Private Google Access does not apply to VMs that have external IP addresses.
  • If the VM only has an internal IP, it can access the public IPs for Google APIs and services only if Private Google Access is enabled for its subnet and the requirements for Private Google Access are met.
