This tutorial is intended for network architects who are struggling to allocate RFC 1918 IPv4 addresses for GKE Pods, Nodes, and Services due to private address space exhaustion or fragmentation. The tutorial shows you how to log the NAT translations of GKE CIDR blocks. You use Terraform to automate the infrastructure build, the Google Cloud CLI to inspect the database-related components, and the `psql` client utility to build and inspect the associated database tables.
The following diagram shows the overall solution.
At a high level, you implement the infrastructure described in depth in NAT for all GKE CIDR blocks. You then add the following:
- A Cloud SQL for PostgreSQL resource.
- A reserved RFC 1918 address block for a private IP address connection.
- A private IP connection to the Cloud SQL for PostgreSQL resource.
- The `psql` client and `ulogd2` utility to the isolated-VPC gateway.
- An `iptables` configuration to log NAT entries in the connection tables.
This tutorial assumes you are familiar with the following:
- Linux sysadmin commands
- GKE
- Compute Engine
- Cloud SQL for PostgreSQL
- The `ulogd2` utility
- Terraform
Objectives
- Create and inspect a Cloud SQL for PostgreSQL resource.
- Create and inspect a reserved RFC 1918 address block for a private IP address connection.
- Create and inspect a private IP connection to the Cloud SQL for PostgreSQL resource.
- Install the `psql` client and `ulogd2` utility on the isolated-VPC gateway.
- Apply an `iptables` configuration to log NAT entries in the connection table on the isolated-VPC gateway.
Costs
This tutorial uses the following billable components of Google Cloud:

- Compute Engine
- Google Kubernetes Engine
- Cloud SQL for PostgreSQL

To generate a cost estimate based on your projected usage, use the pricing calculator.
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Clean up.
Before you begin
In this section, you prepare Cloud Shell, set up your environment variables, and deploy the supporting infrastructure.
Prepare Cloud Shell
In the Google Cloud console, open Cloud Shell.
You complete most of this tutorial from the Cloud Shell terminal using HashiCorp's Terraform and the Google Cloud CLI.
From the Cloud Shell terminal, clone this solution's GitHub repository:
git clone https://github.com/GoogleCloudPlatform/terraform-gke-nat-connectivity.git kam
cd kam/logging
The repository contains all the files you need to complete this tutorial. For a complete description of each file, see the `README.md` file in the repository.

Make all shell scripts executable:
sudo chmod 755 *.sh
Set up Terraform:

- Install Terraform.
- Initialize Terraform:
terraform init
The output is similar to the following:
...
Initializing provider plugins...

The following providers do not have any version constraints in configuration, so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking changes, it is recommended to add version = "..." constraints to the corresponding provider blocks in configuration, with the constraint strings suggested below.

* provider.google: version = "~> 2.5"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure. All Terraform commands should now work.

If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.
...
Set environment variables
In Cloud Shell, set your environment variables:
Set and verify the `TF_VAR_org_id` variable, replacing `your-organization-name` with the Google Cloud organization name that you want to use in this tutorial:

export TF_VAR_org_id=$(gcloud organizations list | \
    awk '/your-organization-name/ {print $2}')
Verify that the environment variable is set correctly:
echo $TF_VAR_org_id
The command output lists your numeric organization ID and looks similar to the following:
... 123123123123 ...
Set the environment variables for the remaining resources:
source set_variables.sh
Verify that the environment variables are set correctly:
env | grep TF_
The output is similar to the following:
...
TF_VAR_isolated_net_name=isolated-vpc
TF_VAR_isolated_pname=isolated-vpc-project
TF_VAR_isolated_vpc_pid=isolated-vpc-project-id
TF_VAR_shared_net_name=routed-domain-vpc
TF_VAR_shared_pname=routed-domain-project
TF_VAR_shared_vpc_pid=routed-domain-project-id
TF_VAR_isolated_vpc_gw_host_nic_ip=10.97.0.2
TF_VAR_isolated_vpc_gw_service_nic_ip=10.32.0.2
TF_VAR_isolated_vpc_gw_service_dgw_ip=10.32.0.1
TF_VAR_org_id=123123123123
TF_VAR_region=us-west1
TF_VAR_zone=us-west1-b
TF_VAR_user_account=user@example.com
TF_VAR_node_cidr=10.32.0.0/19
TF_VAR_pod_cidr=10.0.0.0/11
TF_VAR_service_cidr=10.224.0.0/20
TF_VAR_shared_cidr=10.97.0.0/24
TF_VAR_test1_cidr=10.0.0.0/11
TF_VAR_test2_cidr=10.160.0.0/11
TF_VAR_test3_cidr=10.64.0.0/19
TF_VAR_ilb_cidr=10.32.31.0/24
TF_VAR_masquerade=true
TF_VAR_db_username=ulog2
TF_VAR_db_password=ThanksForAllTheFish
TF_VAR_private_access_cidr=192.168.0.0
...
Create an environment variable file:
env | grep TF_ | sed 's/^/export /' > TF_ENV_VARS
This command chain redirects the environment variables that you created into a file called `TF_ENV_VARS`. Each variable is prepended with the `export` command. You can use this file to reset the environment variables if your Cloud Shell session is terminated. These variables are used by the Terraform scripts, shell scripts, and SDK commands in this tutorial.

If you need to reinitialize the variables, you can run the following command from the directory where the file resides:
source TF_ENV_VARS
Deploy the supporting infrastructure
In Cloud Shell, deploy the Terraform supporting infrastructure:
terraform apply
Terraform prompts for confirmation before making any changes. Answer `yes`.

The preceding command instructs Terraform to deploy all of the tutorial's components. To better understand how the infrastructure is declaratively defined, you can read through the Terraform manifests, that is, the files with the `.tf` extension.

At the end of the command output, you see the following Terraform output:
...
Apply complete! Resources: 45 added, 0 changed, 0 destroyed.

Outputs:

password = 5daa52500549753f
...
Note the password. This is the database password that you use later to connect to the database instance with the Postgres client.
Terraform might display an error and stop deploying. This error is the result of a race condition in resource creation. If you see this error, rerun the `terraform apply` command.
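If you'd rather not rerun the command by hand, the following sketch retries the apply a few times. It is an optional convenience, not part of the tutorial's scripts; the `-auto-approve` flag skips Terraform's confirmation prompt, so only use it if you're comfortable applying without review.

```bash
# Hypothetical retry helper for the resource-creation race condition.
# -auto-approve skips the interactive "yes" prompt.
for attempt in 1 2 3; do
  terraform apply -auto-approve && break
  echo "terraform apply failed (attempt ${attempt}); retrying in 30 seconds..."
  sleep 30
done
```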
Inspecting the supporting infrastructure
You now use the Google Cloud CLI to view and verify the infrastructure that Terraform created. Verification involves running a command to confirm that each resource was created and responds correctly.
Verify the private access configuration
In Cloud Shell, list the reserved private address space:
gcloud compute addresses list --project=$TF_VAR_isolated_vpc_pid
The output is similar to the following:
...
NAME                ADDRESS/RANGE  TYPE      PURPOSE      NETWORK           REGION  SUBNET  STATUS
private-ip-address  10.231.0.0/16  INTERNAL  VPC_PEERING  isolated-vpc-net                  RESERVED
...
List the VPC peering:
gcloud beta compute networks peerings list --project=$TF_VAR_isolated_vpc_pid
The output is similar to the following:
...
NAME                              NETWORK           PEER_PROJECT           PEER_NETWORK                                    IMPORT_CUSTOM_ROUTES  EXPORT_CUSTOM_ROUTES  STATE   STATE_DETAILS
cloudsql-postgres-googleapis-com  isolated-vpc-net  speckle-umbrella-pg-5  cloud-sql-network-88508764482-eb87f4a6a6dc2193  False                 False                 ACTIVE  [2019-06-06T09:59:57.053-07:00]: Connected.
servicenetworking-googleapis-com  isolated-vpc-net  k5370e732819230f0-tp   servicenetworking                               False                 False                 ACTIVE  [2019-06-06T09:57:05.900-07:00]: Connected.
...
Verify the database
In Cloud Shell, set the database instance name:
export DB_INSTANCE=$(gcloud sql instances list \
    --project=$TF_VAR_isolated_vpc_pid \
    | grep master-instance \
    | awk '{print $1}')
Verify the instance:
gcloud sql instances describe $DB_INSTANCE --project=$TF_VAR_isolated_vpc_pid
The output is similar to the following:
...
backendType: SECOND_GEN
connectionName: ivpc-pid-1812005657:us-west1:master-instance-b2aab5f6
databaseVersion: POSTGRES_9_6
etag: 6e3f96efff84e69da0a0c10e5e6cab7232aa2f4b2b803080950685a2a2517747
gceZone: us-west1-b
instanceType: CLOUD_SQL_INSTANCE
ipAddresses:
- ipAddress: 10.231.0.3
  type: PRIVATE
kind: sql#instance
name: master-instance-b2aab5f6
project: ivpc-pid-1812005657
region: us-west1
selfLink: https://www.googleapis.com/sql/v1beta4/projects/ivpc-pid-1812005657/instances/master-instance-b2aab5f6
serverCaCert:
...
Write down the `ipAddress` of the database. In the preceding output, the address is `10.231.0.3`. Also note the CIDR block used by the Cloud SQL for PostgreSQL address. You derive this block by applying a `/24` netmask to the Cloud SQL for PostgreSQL `ipAddress` value. Using this output, the CIDR block would be `10.231.0.0/24`.
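If you want to derive the block in the shell instead of by hand, a minimal sketch follows (the `DB_IP` variable is assumed here for illustration; it is not set by the tutorial's scripts):

```bash
# Zero the last octet of the instance address to get the /24 block.
DB_IP=10.231.0.3   # replace with your ipAddress value
DB_CIDR="$(echo "$DB_IP" | cut -d. -f1-3).0/24"
echo "$DB_CIDR"    # prints 10.231.0.0/24
```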
You need this information to connect to the database.

Verify the database:
gcloud sql databases list \
    --project=$TF_VAR_isolated_vpc_pid \
    --instance=$DB_INSTANCE
The output is similar to the following:
...
NAME      CHARSET  COLLATION
postgres  UTF8     en_US.UTF8
ulog2     UTF8     en_US.UTF8
...
Verify the database user:
gcloud sql users list \
    --project=$TF_VAR_isolated_vpc_pid \
    --instance=$DB_INSTANCE
The output is similar to the following:
...
NAME      HOST
postgres
ulog2
...
Configure NAT logging
In this section, you connect to the isolated-VPC gateway and install the utilities and databases that you use for the rest of the tutorial.
In Cloud Shell, connect to the isolated-VPC gateway by using `ssh`:

gcloud compute ssh isolated-vpc-gw \
    --project=$TF_VAR_isolated_vpc_pid \
    --zone=$TF_VAR_zone
Install the `ulogd2` utility:

sudo apt-get install -y ulogd2 ulogd2-pgsql nfacct
The output is similar to the following:
...
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libnetfilter-acct1 libnetfilter-log1
Suggested packages:
  ulogd2-dbi ulogd2-json ulogd2-mysql ulogd2-pcap ulogd2-pgsql ulogd2-sqlite3
The following NEW packages will be installed:
  libnetfilter-acct1 libnetfilter-log1 ulogd2
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 125 kB of archives.
After this operation, 529 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian stretch/main amd64 libnetfilter-log1 amd64 1.0.1-1.1 [9,582 B]
Get:2 http://deb.debian.org/debian stretch/main amd64 libnetfilter-acct1 amd64 1.0.2-1.1 [6,724 B]
Get:3 http://deb.debian.org/debian stretch/main amd64 ulogd2 amd64 2.0.5-5 [109 kB]
Fetched 125 kB in 0s (1,604 kB/s)
Selecting previously unselected package libnetfilter-log1:amd64.
(Reading database ... 35862 files and directories currently installed.)
Preparing to unpack .../libnetfilter-log1_1.0.1-1.1_amd64.deb ...
Unpacking libnetfilter-log1:amd64 (1.0.1-1.1) ...
Selecting previously unselected package libnetfilter-acct1:amd64.
Preparing to unpack .../libnetfilter-acct1_1.0.2-1.1_amd64.deb ...
Unpacking libnetfilter-acct1:amd64 (1.0.2-1.1) ...
Selecting previously unselected package ulogd2.
Preparing to unpack .../ulogd2_2.0.5-5_amd64.deb ...
Unpacking ulogd2 (2.0.5-5) ...
Setting up libnetfilter-log1:amd64 (1.0.1-1.1) ...
Processing triggers for systemd (232-25+deb9u11) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up libnetfilter-acct1:amd64 (1.0.2-1.1) ...
Setting up ulogd2 (2.0.5-5) ...
adduser: Warning: The home directory `/var/log/ulog' does not belong to the user you are currently creating.
Created symlink /etc/systemd/system/ulogd.service → /lib/systemd/system/ulogd2.service.
Created symlink /etc/systemd/system/multi-user.target.wants/ulogd2.service → /lib/systemd/system/ulogd2.service.
Processing triggers for systemd (232-25+deb9u11) ...
...
Install the Postgres client:
sudo apt-get install -y postgresql-client
The output is similar to the following:
...
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libpq5 postgresql-client-9.6 postgresql-client-common
Suggested packages:
  postgresql-9.6 postgresql-doc-9.6
The following NEW packages will be installed:
  libpq5 postgresql-client postgresql-client-9.6 postgresql-client-common
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,549 kB of archives.
After this operation, 6,624 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian stretch/main amd64 postgresql-client-common all 181+deb9u2 [79.2 kB]
Get:2 http://security.debian.org stretch/updates/main amd64 libpq5 amd64 9.6.13-0+deb9u1 [136 kB]
Get:3 http://deb.debian.org/debian stretch/main amd64 postgresql-client all 9.6+181+deb9u2 [55.8 kB]
Get:4 http://security.debian.org stretch/updates/main amd64 postgresql-client-9.6 amd64 9.6.13-0+deb9u1 [1,278 kB]
Fetched 1,549 kB in 0s (2,609 kB/s)
Selecting previously unselected package libpq5:amd64.
(Reading database ... 35927 files and directories currently installed.)
Preparing to unpack .../libpq5_9.6.13-0+deb9u1_amd64.deb ...
Unpacking libpq5:amd64 (9.6.13-0+deb9u1) ...
Selecting previously unselected package postgresql-client-common.
Preparing to unpack .../postgresql-client-common_181+deb9u2_all.deb ...
Unpacking postgresql-client-common (181+deb9u2) ...
Selecting previously unselected package postgresql-client-9.6.
Preparing to unpack .../postgresql-client-9.6_9.6.13-0+deb9u1_amd64.deb ...
Unpacking postgresql-client-9.6 (9.6.13-0+deb9u1) ...
Selecting previously unselected package postgresql-client.
Preparing to unpack .../postgresql-client_9.6+181+deb9u2_all.deb ...
Unpacking postgresql-client (9.6+181+deb9u2) ...
Setting up libpq5:amd64 (9.6.13-0+deb9u1) ...
Processing triggers for libc-bin (2.24-11+deb9u4) ...
Setting up postgresql-client-common (181+deb9u2) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up postgresql-client-9.6 (9.6.13-0+deb9u1) ...
update-alternatives: using /usr/share/postgresql/9.6/man/man1/psql.1.gz to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode
Setting up postgresql-client (9.6+181+deb9u2) ...
...
Add a route to the private access subnet, replacing `replace-with-cloud-sql-cidr` with the Cloud SQL for PostgreSQL CIDR block `10.231.0.0/24` that you derived earlier:

sudo ip route add to replace-with-cloud-sql-cidr \
    via 10.32.0.1 dev eth1 table routed-domain
ip route show table routed-domain
The output is similar to the following:
...
10.0.0.0/8 via 10.97.0.1 dev eth0
10.32.31.0/24 via 10.32.0.1 dev eth1
10.231.0.0/24 via 10.32.0.1 dev eth1
172.16.0.0/12 via 10.97.0.1 dev eth0
192.168.0.0/16 via 10.97.0.1 dev eth0
...
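Before configuring the database table, you can optionally confirm that the Cloud SQL address is reachable over the new route. This check is not a tutorial step; it uses bash's built-in `/dev/tcp` redirection so no extra tools are needed (the address and port below are the instance IP noted earlier and the PostgreSQL default port):

```bash
# Attempt a TCP connection to the Cloud SQL instance on port 5432.
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/10.231.0.3/5432' \
  && echo "Cloud SQL is reachable" \
  || echo "no route or connection refused"
```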
Configure the database table, replacing `replace-with-cloud-sql-ip-address` with the Cloud SQL for PostgreSQL `ipAddress` value `10.231.0.3` that you noted earlier:

cd /usr/share/doc/ulogd2
sudo gzip -d /usr/share/doc/ulogd2/pgsql-ulogd2-flat.sql.gz
psql -h replace-with-cloud-sql-ip-address -U ulog2 -f pgsql-ulogd2-flat.sql
When prompted, enter the database password you noted from the Terraform output.
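If you prefer not to type the password at each prompt, `psql` honors the standard `PGPASSWORD` environment variable. This is an optional convenience, shown here with a placeholder; substitute the password from the Terraform output:

```bash
# Supply the database password non-interactively for subsequent psql commands.
export PGPASSWORD="replace-with-database-password"
```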
Log in to the database and add the `plpgsql` language for `ulog2`, replacing `replace-with-cloud-sql-ip-address` with the value `10.231.0.3`:

psql -h replace-with-cloud-sql-ip-address -U ulog2
When prompted, enter the database password you noted from the Terraform output.
Add the Postgres procedural language to the `ulog2` database:

CREATE EXTENSION plpgsql FROM unpackaged;
Verify the database table:
select * from ulog2_ct;
The output is similar to the following:
...
 _ct_id | oob_family | orig_ip_saddr_str | orig_ip_daddr_str | orig_ip_protocol | orig_l4_sport | orig_l4_dport | orig_raw_pktlen | orig_raw_pktcount | reply_ip_saddr_str | reply_ip_daddr_str | reply_ip_protocol | reply_l4_sport | reply_l4_dport | reply_raw_pktlen | reply_raw_pktcount | icmp_code | icmp_type | ct_mark | flow_start_sec | flow_start_usec | flow_end_sec | flow_end_usec | ct_event
--------+------------+-------------------+-------------------+------------------+---------------+---------------+-----------------+-------------------+--------------------+--------------------+-------------------+----------------+----------------+------------------+--------------------+-----------+-----------+---------+----------------+-----------------+--------------+---------------+----------
(0 rows)
...
Exit the database:
\q
Configure the `ulogd2` utility:

sudo sed -i 's/^#plugin="\/usr\/lib\/x86_64-linux-gnu\/ulogd\/ulogd_output_PGSQL.so"/plugin="\/usr\/lib\/x86_64-linux-gnu\/ulogd\/ulogd_output_PGSQL.so"/' /etc/ulogd.conf
sudo sed -i 's/#stack=ct1:NFCT,ip2str1:IP2STR,pgsql2:PGSQL/stack=ct2:NFCT,ip2str1:IP2STR,pgsql2:PGSQL/' /etc/ulogd.conf
sudo sed -i 's/^stack=log1:NFLOG,base1:BASE,ifi1:IFINDEX,ip2str1:IP2STR,print1:PRINTPKT,emu1:LOGEMU/#stack=log1:NFLOG,base1:BASE,ifi1:IFINDEX,ip2str1:IP2STR,print1:PRINTPKT,emu1:LOGEMU/' /etc/ulogd.conf
sudo iptables -A OUTPUT -m state --state NEW -j NFLOG --nflog-group 1
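These commands enable the PGSQL output plugin, switch the active stack from packet logging to connection tracking, and add an `iptables` rule that sends new-connection events to netfilter's NFLOG target. To confirm that the rule was installed, you can optionally list the OUTPUT chain (an extra check, not a tutorial step):

```bash
# List the OUTPUT chain and filter for the NFLOG rule added above.
sudo iptables -L OUTPUT -n -v | grep NFLOG
```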
Modify the `/etc/ulogd.conf` file by using a text editor and the `sudo` command. Change the `[pgsql2]` section to the following:

[pgsql2]
db="ulog2"
host="replace-with-cloud-sql-ip-address"
user="ulog2"
table="ulog2_ct"
#schema="public"
pass="replace-with-database-password"
procedure="INSERT_CT"
connstring="hostaddr=replace-with-cloud-sql-ip-address port=5432 dbname=ulog2 user=ulog2 password=replace-with-database-password"

[ct2]
event_mask=0x00000001
hash_enable=0
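Before starting the daemon, you can optionally verify that the plugin and stack lines are now active. This grep is a convenience check, not a tutorial requirement:

```bash
# Show the uncommented plugin and stack directives that ulogd2 will load.
grep -E '^(plugin|stack)=' /etc/ulogd.conf
```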
Start the `ulogd2` daemon and enable it across system reboots:

sudo systemctl start ulogd2
sudo systemctl enable ulogd2
The output is similar to the following:
...
Synchronizing state of ulogd2.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable ulogd2
...
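To confirm that the daemon started cleanly, you can optionally check its status. This is an extra verification step, not part of the original flow:

```bash
# Verify that ulogd2 is active and review its most recent log lines.
sudo systemctl status ulogd2 --no-pager
```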
Verifying the solution
In this section, you test the app and verify NAT logging.
Test the app
- Start a new Cloud Shell terminal.
Connect to the `test-10-11-vm` Compute Engine instance by using `ssh`, and change to the Git repository working directory:

cd kam/logging
source TF_ENV_VARS
gcloud compute ssh test-10-11-vm \
    --project=$TF_VAR_shared_vpc_pid \
    --zone=$TF_VAR_zone
Connect to the app:
curl http://10.32.31.49:8080/
The output is similar to the following:
...
Hello, world!
Version: 1.0.0
Hostname: my-app-6597cdc789-d6phf
...
This command retrieves the webpage from the app running on the GKE cluster.
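If you want more than one entry to show up in the logging table, you can optionally generate several flows before checking the database. This is a sketch; each request creates a new connection-tracking event:

```bash
# Issue five separate requests so that multiple NAT entries are logged.
for i in $(seq 1 5); do
  curl -s http://10.32.31.49:8080/ > /dev/null
done
```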
Verify NAT logging
In Cloud Shell, connect to the Cloud SQL for PostgreSQL instance:
psql -h replace-with-cloud-sql-ip-address -U ulog2
When prompted, enter the database password you noted previously.
Query the database:
SELECT * FROM ulog2_ct WHERE orig_ip_saddr_str='10.0.0.2';
The output is similar to the following:
...
 _ct_id | oob_family | orig_ip_saddr_str | orig_ip_daddr_str | orig_ip_protocol | orig_l4_sport | orig_l4_dport | orig_raw_pktlen | orig_raw_pktcount | reply_ip_saddr_str | reply_ip_daddr_str | reply_ip_protocol | reply_l4_sport | reply_l4_dport | reply_raw_pktlen | reply_raw_pktcount | icmp_code | icmp_type | ct_mark | flow_start_sec | flow_start_usec | flow_end_sec | flow_end_usec | ct_event
--------+------------+-------------------+-------------------+------------------+---------------+---------------+-----------------+-------------------+--------------------+--------------------+-------------------+----------------+----------------+------------------+--------------------+-----------+-----------+---------+----------------+-----------------+--------------+---------------+----------
  12113 |          2 | 10.0.0.2          | 10.32.31.1        |                6 |         58404 |          8080 |               0 |                 0 | 10.32.31.1         | 10.32.0.2          |                 6 |           8080 |          58404 |                0 |                  0 |           |           |       0 |     1560205157 |          950165 |              |               |        1
     63 |          2 | 10.0.0.2          | 10.32.31.1        |                6 |         35510 |          8080 |               0 |                 0 | 10.32.31.1         | 10.32.0.2          |                 6 |           8080 |          35510 |                0 |                  0 |           |           |       0 |     1559949207 |          180828 |              |               |        1
     14 |          2 | 10.0.0.2          | 10.32.31.1        |                6 |         35428 |          8080 |               0 |                 0 | 10.32.31.1         | 10.32.0.2          |                 6 |           8080 |          35428 |                0 |                  0 |           |           |       0 |     1559948312 |          193507 |              |               |        1
(3 rows)
...
Exit the database:
\q
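For a quick summary rather than raw rows, you can optionally run a grouped query from the shell. This is a sketch; `GROUP BY 1` groups by the first selected column:

```bash
# Count logged flows per original source address.
psql -h replace-with-cloud-sql-ip-address -U ulog2 \
    -c "SELECT orig_ip_saddr_str, count(*) AS flows FROM ulog2_ct GROUP BY 1 ORDER BY 2 DESC;"
```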
If you don't see entries in the database, run the following commands:
\q
sudo systemctl stop ulogd2
sudo systemctl disable ulogd2
sudo systemctl start ulogd2
sudo systemctl enable ulogd2
These commands restart the `ulogd2` daemon. Then recheck the entries by reconnecting to the database and rerunning the query in this section.
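If restarting doesn't help, the daemon's own logs usually explain why inserts are failing (for example, a wrong database password or address in `/etc/ulogd.conf`). This optional check assumes the default systemd journal:

```bash
# Review recent ulogd2 log output for connection or insert errors.
sudo journalctl -u ulogd2 --since "10 minutes ago" --no-pager
```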
Clean up
Destroy the infrastructure
From the first Cloud Shell terminal, stop the `ulogd2` daemon and disconnect it from the Cloud SQL for PostgreSQL resource:

sudo systemctl stop ulogd2
sudo systemctl disable ulogd2
Exit from the SSH session to the isolated-VPC gateway:
exit
In Cloud Shell, destroy the configuration and all of the tutorial's components:
terraform destroy
Terraform prompts for confirmation before making the change. Answer `yes` when prompted.

You might see the following Terraform error:
...
* google_compute_network.ivpc (destroy): 1 error(s) occurred:
* google_compute_network.ivpc: Error waiting for Deleting Network: The network resource 'projects/ivpc-pid--1058675427/global/networks/isolated-vpc-net' is already being used by 'projects/ivpc-pid--1058675427/global/firewalls/k8s-05693142c93de80e-node-hc'
...
This error occurs when the command attempts to destroy the isolated-VPC network before destroying the GKE firewall rules.
Fix the error by removing the non-default firewall rules from the isolated VPC:
../allnat/k8-fwr.sh
This script shows which firewall rules will be removed.
Review the list and, when prompted, enter `yes`.

When the script is complete, rerun the `destroy` command. You might see the following Terraform error:
...
Error: Error applying plan:

1 error(s) occurred:

* google_sql_user.users (destroy): 1 error(s) occurred:
* google_sql_user.users: Error, failed to delete user ulog2 in instance master-instance-b2aab5f6: googleapi: Error 400: Invalid request: Failed to delete user ulog2. Detail: pq: role "ulog2" cannot be dropped because some objects depend on it., invalid
...
This error occurs when Terraform tries to destroy the database user before the Cloud SQL for PostgreSQL database has been fully deleted. Wait two minutes after the error appears, and then go to the next step.
From the original Cloud Shell terminal, rerun the `destroy` command:

terraform destroy
Terraform prompts for confirmation before making the change.
Answer `yes` to destroy the configuration.

Remove all the files created during this tutorial:
cd ../..
rm -rf kam
What's next
- Explore reference architectures, diagrams, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.