This tutorial shows you how to configure a solution in which you assign the Pod
CIDR block an RFC 1918 address block that is already in use in an on-premises
network. You then translate the Pod CIDR block by NAT, using the ip-masq-agent
feature. This approach effectively hides the Pod IPv4 addresses behind the node
IP addresses. You then use Terraform to automate the infrastructure build, and
you use the Google Cloud CLI to inspect the components shown in the following figure.
You set up the following components with Terraform:
- A Google Cloud project with a VPC-native GKE cluster that hosts a Hello World app. This app is exposed through an internal load balancer.
- A subnetwork for the GKE cluster.
- A subnetwork simulating an on-premises CIDR block.
For a more in-depth discussion of the overall solution and individual components, see NAT for a GKE Pod CIDR block.
This tutorial assumes you are familiar with the following:
- Linux sysadmin commands
- GKE
- Compute Engine
- NAT
- Terraform
Objectives
With Terraform, deploy the following:
- A project with a VPC-native GKE cluster
- A subnetwork for the GKE cluster
- A subnetwork simulating an on-premises CIDR block
- A Hello World app
- An internal load balancer to expose the Hello World app
With the Google Cloud CLI, do the following:
- Inspect each solution component.
- Verify the Hello World app.
- Verify the translation of the Pod CIDR block from the simulated on-premises machine.
Costs
This tutorial uses the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage,
use the pricing calculator.
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Clean up.
Before you begin
In this section, you prepare Cloud Shell, set up your environment variables, and deploy the supporting infrastructure.
Prepare Cloud Shell
In the Google Cloud console, open Cloud Shell.
You complete most of this tutorial from the Cloud Shell terminal using HashiCorp's Terraform and the Google Cloud CLI.
In Cloud Shell, clone the GitHub repository and change to the local working directory:
git clone https://github.com/GoogleCloudPlatform/terraform-gke-nat-connectivity.git kam
cd kam/podnat
The repository contains all the files that you need to complete this tutorial. For a complete description of each file, see the README.md file in the repository.
Make all shell scripts executable:
sudo chmod 755 *.sh
Initialize Terraform:
terraform init
The output is similar to the following:
...
Initializing provider plugins...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.google: version = "~> 2.5"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
...
Set environment variables
Set and verify the TF_VAR_org_id variable, replacing your-organization-name with the Google Cloud organization name you want to use in this tutorial:
export TF_VAR_org_id=$(gcloud organizations list | \
    awk '/your-organization-name/ {print $2}')
Verify that the environment variable is set correctly:
echo $TF_VAR_org_id
The command output lists your numeric organization ID and looks similar to the following:
... 123123123123 ...
Set the remaining environment variables:
source set_variables.sh
Verify that the environment variables are set correctly:
env | grep TF_
The output is similar to the following:
...
TF_VAR_zone=us-west1-b
TF_VAR_cluster_password=ThanksForAllTheFish
TF_VAR_node_cidr=10.32.1.0/24
TF_VAR_region=us-west1
TF_VAR_billing_account=QQQQQQ-XAAAAA-E87690
TF_VAR_cluster_cidr=192.168.1.0/24
TF_VAR_org_id=406999999999
TF_VAR_ilb_ip=10.32.1.49
TF_VAR_isolated_vpc_pid=ivpc-pid--999999999
TF_VAR_gcp_user=user@example
TF_VAR_on_prem_cidr=10.32.2.0/24
TF_VAR_cluster_username=dolphins
TF_VAR_pod_cidr=172.16.0.0/16
...
Create an environment variable file:
env | grep TF_ | sed 's/^/export /' > TF_ENV_VARS
This command chain redirects the environment variables you created into a file called TF_ENV_VARS. Each variable is prepended with the export command. You can use this file to reset the environment variables in case your Cloud Shell session is terminated. These variables are used by the Terraform scripts, the Cloud Shell scripts, and the gcloud command-line tool.
If you need to reinitialize the variables later, you can run the following command from the directory where the file resides:
source TF_ENV_VARS
Deploy supporting infrastructure
In Cloud Shell, deploy the Terraform supporting infrastructure:
terraform apply
Terraform prompts for confirmation before making any changes. Answer yes to apply the configuration.
The terraform apply command instructs Terraform to deploy all the solution's components. To better understand how the infrastructure is declaratively defined, you can read through the Terraform manifests (the files with the .tf extension).
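If you want to preview the changes before you apply them, you can optionally create and inspect a plan first. This is standard Terraform workflow rather than a required tutorial step:
terraform plan -out=tfplan
terraform show tfplan
Running terraform apply tfplan afterwards applies exactly the plan you reviewed.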
Inspecting the supporting infrastructure
You now use the Google Cloud CLI to view and verify the infrastructure that Terraform created. Verification involves running a command to check whether each resource responds and was created correctly.
Verify the projects
In Cloud Shell, list the project:
gcloud projects list | grep ivpc-pid
The output is similar to the following:
... isolated-vpc-pid isolated-vpc-pname 777999333962 ...
List the API status:
gcloud services list --project=$TF_VAR_isolated_vpc_pid \
    | grep -E "compute|container"
The output is similar to the following:
...
compute.googleapis.com    Compute Engine API
container.googleapis.com  Google Kubernetes Engine API
...
Verify the networks and subnetworks
In Cloud Shell, verify the networks and subnetworks:
gcloud compute networks describe ivpc \
    --project=$TF_VAR_isolated_vpc_pid

gcloud compute networks subnets describe node-cidr \
    --project=$TF_VAR_isolated_vpc_pid \
    --region=$TF_VAR_region

gcloud compute networks subnets describe simulated-on-prem \
    --project=$TF_VAR_isolated_vpc_pid \
    --region=$TF_VAR_region
The output is similar to the following:
...
kind: compute#network
name: ivpc
routingConfig:
  routingMode: GLOBAL
...
subnetworks:
- https://www.googleapis.com/compute/v1/projects/ivpc-pid--695116665/regions/us-west1/subnetworks/node-cidr
x_gcloud_bgp_routing_mode: GLOBAL
...
gatewayAddress: 10.32.1.1
...
ipCidrRange: 10.32.1.0/24
kind: compute#subnetwork
name: node-cidr
...
secondaryIpRanges:
- ipCidrRange: 172.16.0.0/16
  rangeName: pod-cidr
...
subnetworks:
- https://www.googleapis.com/compute/v1/projects/ivpc-pid--695116665/regions/us-west1/subnetworks/simulated-on-prem
x_gcloud_bgp_routing_mode: GLOBAL
...
gatewayAddress: 10.32.2.1
...
ipCidrRange: 10.32.2.0/24
kind: compute#subnetwork
name: simulated-on-prem
...
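If you only want to confirm the Pod secondary range on the node subnetwork, you can optionally narrow the output with a --format projection. This is a convenience check based on the secondaryIpRanges field shown above:
gcloud compute networks subnets describe node-cidr \
    --project=$TF_VAR_isolated_vpc_pid \
    --region=$TF_VAR_region \
    --format="yaml(secondaryIpRanges)"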
Verify the firewall rules
In Cloud Shell, verify the firewall rules in the isolated VPC:
gcloud compute firewall-rules list --project=$TF_VAR_isolated_vpc_pid
The output is similar to the following:
...
NAME                  NETWORK           DIRECTION  PRIORITY  ALLOW  DENY  DISABLED
allow-rfc1918-in-fwr  isolated-vpc-net  INGRESS    1000      all          False
allow-ssh-in-fwr      isolated-vpc-net  INGRESS    1000      22           False
...
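To see the source ranges and protocols behind one of these rules, you can optionally describe it by name. The rule name below is taken from the listing above:
gcloud compute firewall-rules describe allow-rfc1918-in-fwr \
    --project=$TF_VAR_isolated_vpc_pid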
Verify the virtual machines
In Cloud Shell, verify the virtual machines:
gcloud compute instances list --project=$TF_VAR_isolated_vpc_pid
The output is similar to the following:
...
NAME                                     ZONE        MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
gke-cluster1-default-pool-fc9ba891-4xhj  us-west1-b  n1-standard-1               10.32.1.4    34.83.33.188    RUNNING
gke-cluster1-default-pool-fc9ba891-d0bd  us-west1-b  n1-standard-1               10.32.1.3    34.83.48.81     RUNNING
gke-cluster1-default-pool-fc9ba891-xspg  us-west1-b  n1-standard-1               10.32.1.2    35.247.62.159   RUNNING
simulated-on-prem-host                   us-west1-b  n1-standard-1               10.32.2.2    35.227.173.106  RUNNING
...
Verify the GKE cluster and its resources
In Cloud Shell, get the cluster credentials:
gcloud container clusters get-credentials cluster1 \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone $TF_VAR_zone
The output is similar to the following:
...
Fetching cluster endpoint and auth data.
kubeconfig entry generated for cluster1.
...
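As an optional sanity check, you can confirm that kubectl now targets the new cluster; the active context name includes cluster1 and the isolated-VPC project ID:
kubectl config current-context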
Verify the cluster:
gcloud container clusters list \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone=$TF_VAR_zone
The output is similar to the following:
...
NAME      LOCATION    MASTER_VERSION  MASTER_IP   MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
cluster1  us-west1-b  1.11.8-gke.6    192.0.2.58  n1-standard-1  1.11.8-gke.6  3          RUNNING
...
Verify the Hello World app:
kubectl get deployment my-app
The output is similar to the following:
...
NAME    DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
my-app  3        3        3           3          118m
...
Verify the internal load balancer service:
kubectl get service hello-server
The output is similar to the following:
...
NAME          TYPE          CLUSTER-IP   EXTERNAL-IP  PORT(S)         AGE
hello-server  LoadBalancer  10.32.11.49  <pending>    8080:30635/TCP  3m
...
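The EXTERNAL-IP column can show <pending> while the internal load balancer is still being provisioned. If you want to confirm that the address (10.32.1.49 in this tutorial) has been assigned before continuing, you can optionally query the service status with a standard jsonpath expression:
kubectl get service hello-server \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'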
Verifying the solution
To verify the solution, you must verify the following:
- The ip-masq-agent feature
- That the Hello World app is externally accessible
- The Pod CIDR NAT
Verify the ip-masq-agent feature
In Cloud Shell, get a node name:
export NODE_NAME=$(kubectl get nodes | awk '/gke/ {print $1}' | head -n 1)
Use SSH to connect to a cluster node:
gcloud compute ssh $NODE_NAME \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone=$TF_VAR_zone
Verify the ip-masq-agent configuration:
sudo iptables -t nat -L
The output is similar to the following:
...
Chain IP-MASQ (2 references)
target      prot  opt  source    destination
RETURN      all   --   anywhere  169.254.0.0/16  /* ip-masq-agent: local traffic is not subject to MASQUERADE */
RETURN      all   --   anywhere  10.32.1.0/24    /* ip-masq-agent: local traffic is not subject to MASQUERADE */
RETURN      all   --   anywhere  172.16.0.0/16   /* ip-masq-agent: local traffic is not subject to MASQUERADE */
RETURN      all   --   anywhere  192.168.1.0/24  /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE  all   --   anywhere  anywhere        /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
...
Exit the SSH session.
exit
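Back in Cloud Shell, if you want to see the configuration that produces the RETURN rules above, you can optionally inspect the agent's ConfigMap. The command below assumes the conventional name ip-masq-agent in the kube-system namespace; its nonMasqueradeCIDRs list corresponds to the CIDR blocks that are not translated:
kubectl get configmap ip-masq-agent \
    --namespace=kube-system \
    -o yaml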
Verify that the Hello World app is externally accessible
Use SSH to connect to the simulated on-premises VM:
gcloud compute ssh simulated-on-prem-host \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone=$TF_VAR_zone
Verify the Hello World app:
curl http://10.32.1.49:8080
The output is similar to the following:
...
Hello, world!
Version: 1.0.0
Hostname: my-app-77748bfbd8-nqwl2
...
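Because the Deployment runs three replicas behind the load balancer, repeated requests are typically answered by different Pods. To see the Hostname value change across requests, you can optionally loop the request from the simulated on-premises VM:
for i in $(seq 1 5); do curl -s http://10.32.1.49:8080; done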
Verify the Pod CIDR NAT
On the simulated on-premises VM, run tcpdump:
sudo tcpdump -n icmp
The output is similar to the following:
...
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
...
This command starts the tcpdump utility, which captures the ICMP packets that arrive at the simulated on-premises VM so that you can inspect their source addresses.
In the Google Cloud console, open a new Cloud Shell terminal.
From the new terminal, change to the kam/podnat directory where you created the TF_ENV_VARS file, and then set the environment variables:
source TF_ENV_VARS
Get the cluster credentials:
gcloud container clusters get-credentials cluster1 \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone $TF_VAR_zone
Get the Pod name:
export POD_NAME=$(kubectl get pods | awk '/my-app/ {print $1}' | head -n 1)
Connect to the Pod shell:
kubectl exec -it $POD_NAME -- /bin/sh
From the Pod shell in the new terminal, ping the simulated on-premises VM:
ping 10.32.2.2
In the original terminal, the output is similar to the following:
...
05:43:40.669371 IP 10.32.1.3 > 10.32.2.2: ICMP echo request, id 3328, seq 0, length 64
05:43:40.669460 IP 10.32.2.2 > 10.32.1.3: ICMP echo reply, id 3328, seq 0, length 64
...
Notice that the source IP address is from the node 10.32.1.0/24 CIDR block. The Pod 172.16.0.0/16 CIDR block has been translated behind the node addresses. Press Control+C to stop the ping, and then exit all terminal sessions.
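As an additional optional check before you stop tcpdump and close the sessions, you can run a stricter capture filter on the simulated on-premises VM. Because the Pod addresses are masqueraded behind the node addresses, a capture that matches only the Pod CIDR as the source should show no packets while the ping is running. The filter assumes the 172.16.0.0/16 Pod CIDR used in this tutorial:
sudo tcpdump -n 'icmp and src net 172.16.0.0/16'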
Clean up
Destroy the infrastructure
- From the first Cloud Shell terminal, exit from the SSH session to the simulated on-premises VM by typing exit.
Destroy all of the tutorial's components:
terraform destroy
Terraform prompts for confirmation before making the change. Answer yes to destroy the configuration.
You might see the following Terraform error:
...
* google_compute_network.ivpc (destroy): 1 error(s) occurred:
* google_compute_network.ivpc: Error waiting for Deleting Network: The network resource 'projects/ivpc-pid--1058675427/global/networks/isolated-vpc-net' is already being used by 'projects/ivpc-pid--1058675427/global/firewalls/k8s-05693142c93de80e-node-hc'
...
This error occurs when the command attempts to destroy the isolated-VPC network before destroying the GKE firewall rules. Run the following script to remove the non-default firewall rules from the isolated VPC:
./k8-fwr.sh
The output shows you which firewall rules will be removed.
Review the rules and, when prompted, type yes.
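Alternatively, if you prefer to remove the GKE-created rules manually rather than with the script, you can list and delete them with gcloud. The name~^k8s- filter below is an assumption based on the rule name shown in the error message:
gcloud compute firewall-rules list \
    --project=$TF_VAR_isolated_vpc_pid \
    --filter="name~^k8s-"
# Delete each rule that the listing returns, for example:
# gcloud compute firewall-rules delete RULE_NAME --project=$TF_VAR_isolated_vpc_pid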
From the first Cloud Shell terminal, reissue the following command:
terraform destroy
Terraform prompts for confirmation before making the change. Answer yes to destroy the configuration.
From the original Cloud Shell terminal, issue the following command:
cd ../..
rm -rf kam
What's next
- Read the associated conceptual document that describes this tutorial.
- Explore reference architectures, diagrams, and best practices for Google Cloud in the Cloud Architecture Center.