This documentation is for the current version of Anthos clusters on AWS, released in November 2021. See the Release notes for more information. For documentation on the previous generation of Anthos clusters on AWS, see Previous generation.

Known issues for Anthos clusters on AWS

This page lists selected known issues for Anthos clusters on AWS, and steps you can take to reduce their impact.

Networking

Application timeouts caused by conntrack table insertion failures

Versions affected by this issue are the following:

  • All versions of 1.23 starting from 1.23.8-gke.1700.
  • Versions of 1.24 ranging from 1.24.0-gke.0 up to, but not including, 1.24.13-gke.500.
  • Versions of 1.25 ranging from 1.25.0-gke.0 up to, but not including, 1.25.8-gke.500.
  • Versions of 1.26 ranging from 1.26.0-gke.0 up to, but not including, 1.26.4-gke.2200.
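
The ranges above can be checked mechanically. The following is a minimal sketch, not an official tool: the helper names `version_lt` and `is_affected` are hypothetical, and it assumes that version strings always have the form MAJOR.MINOR.PATCH-gke.BUILD and that the last range covers 1.26 releases below 1.26.4-gke.2200.

```shell
# Return 0 (true) if version $1 is strictly lower than version $2.
# Assumes both look like MAJOR.MINOR.PATCH-gke.BUILD.
version_lt() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    # After splitting on "." and "-": 1=major 2=minor 3=patch 4="gke" 5=build
    split(a, x, "[.-]"); split(b, y, "[.-]")
    n = split("1 2 3 5", idx, " ")
    for (i = 1; i <= n; i++) {
      k = idx[i]
      if (x[k] + 0 < y[k] + 0) exit 0
      if (x[k] + 0 > y[k] + 0) exit 1
    }
    exit 1   # versions are equal, so not lower
  }'
}

# Return 0 (true) if the given cluster version falls in an affected range.
is_affected() {
  case "$1" in
    1.23.*) ! version_lt "$1" "1.23.8-gke.1700" ;;  # 1.23.8-gke.1700 and later
    1.24.*) version_lt "$1" "1.24.13-gke.500" ;;
    1.25.*) version_lt "$1" "1.25.8-gke.500" ;;
    1.26.*) version_lt "$1" "1.26.4-gke.2200" ;;    # assumed 1.26 range
    *) return 1 ;;
  esac
}
```

For example, `is_affected "1.26.2-gke.1001" && echo affected` prints `affected`, while the same check for 1.26.4-gke.2200 prints nothing.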

Clusters running on an Ubuntu OS that uses kernel 5.15 or higher are susceptible to netfilter connection tracking (conntrack) table insertion failures. Insertion failures can occur even when the conntrack table has room for new entries. The failures are caused by changes in kernel 5.15 and higher that restrict table insertions based on chain length.

To see if you are affected by this issue, check the in-kernel connection tracking system statistics with the following command:

sudo conntrack -S

The response looks like this:

cpu=0       found=0 invalid=4 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
cpu=1       found=0 invalid=0 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
cpu=2       found=0 invalid=16 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
cpu=3       found=0 invalid=13 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
cpu=4       found=0 invalid=9 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
cpu=5       found=0 invalid=1 insert=0 insert_failed=0 drop=0 early_drop=0 error=519 search_restart=0 clash_resolve=126 chaintoolong=0

If a chaintoolong value in the response is a non-zero number, you are affected by this issue.
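
On nodes with many CPUs, scanning the output by eye is error-prone. The following is a minimal sketch that totals the chaintoolong counters. The function name `count_chaintoolong` is hypothetical, and the sketch assumes the key=value output format shown above.

```shell
# Sum the chaintoolong counters across all CPUs.
# Reads `conntrack -S` output on stdin and prints the total.
count_chaintoolong() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^chaintoolong=[0-9]+$/) {
        split($i, kv, "=")
        total += kv[2]
      }
    }
  } END { print total + 0 }'
}

# Usage on a node (requires the conntrack tool and root privileges):
#   sudo conntrack -S | count_chaintoolong
```

A printed total greater than 0 means that at least one CPU has recorded a chaintoolong insertion failure.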

Solution

If you are running version 1.26.2-gke.1001, upgrade to version 1.26.4-gke.2200 or later.

Usability

Unreachable clusters detected error in UI

Versions affected by this issue are 1.25.5-gke.1500 and 1.25.4-gke.1300.

Some UI surfaces in the Google Cloud console can't obtain authorization to access the cluster and might display the cluster as unreachable.

Solution

Upgrade your cluster to the latest available patch of version 1.25. This issue was fixed in version 1.25.5-gke.2000.