Learn about troubleshooting steps that you might find helpful if you run into problems using Google Kubernetes Engine (GKE).
Debugging Kubernetes resources
If you are experiencing an issue related to your cluster, refer to Troubleshooting Clusters in the Kubernetes documentation.
If you are having an issue with your application, its Pods, or its controller object, refer to Troubleshooting Applications.
The kubectl command isn't found
Install the kubectl binary by running the following command:
sudo gcloud components update kubectl
Answer "yes" when the installer prompts you to modify your $PATH environment variable. Modifying this variable enables you to use kubectl commands without typing their full file path.
Alternatively, add the following line to ~/.bashrc (or ~/.bash_profile in macOS, or wherever your shell stores environment variables):
export PATH=$PATH:/usr/local/share/google/google-cloud-sdk/bin/
Run the following command to load your updated .bashrc (or .bash_profile) file:
source ~/.bashrc
kubectl commands return "connection refused" error
Set the cluster context with the following command:
gcloud container clusters get-credentials cluster-name
If you are unsure of what to enter for cluster-name, use the following command to list your clusters:
gcloud container clusters list
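For example, assuming a zonal cluster named my-cluster in us-central1-a (both placeholder values), the full command looks like the following:
gcloud container clusters get-credentials my-cluster --zone us-central1-a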
kubectl commands return "failed to negotiate an api version" error
Ensure kubectl has authentication credentials:
gcloud auth application-default login
The kubectl logs, attach, exec, and port-forward commands stop responding
These commands rely on the cluster's control plane (master) being able to talk to the nodes in the cluster. However, because the control plane isn't in the same Compute Engine network as your cluster's nodes, we rely on SSH tunnels to enable secure communication.
GKE saves an SSH public key file in your Compute Engine project metadata. All Compute Engine VMs using Google-provided images regularly check their project's common metadata and their instance's metadata for SSH keys to add to the VM's list of authorized users. GKE also adds a firewall rule to your Compute Engine network allowing SSH access from the control plane's IP address to each node in the cluster.
If any of the above kubectl
commands don't run, it's likely that the API
server is unable to open SSH tunnels with the nodes. Check for these potential
causes:
The cluster doesn't have any nodes.
If you've scaled down the number of nodes in your cluster to zero, SSH tunnels won't work.
To fix it, resize your cluster to have at least one node.
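A sketch of the resize command, assuming a node pool named default-pool and placeholder cluster and zone values:
gcloud container clusters resize cluster-name \
    --node-pool default-pool \
    --num-nodes 1 \
    --zone zone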
Pods in the cluster have gotten stuck in a terminating state and have prevented nodes that no longer exist from being removed from the cluster.
This is an issue that should only affect Kubernetes version 1.1, but could be caused by repeated resizing of the cluster.
To fix it, delete the Pods that have been in a terminating state for more than a few minutes. The old nodes are then removed from the control plane and replaced by the new nodes.
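If a Pod stays in the Terminating state, you can force its deletion; this is a sketch with placeholder names, and note that forcing deletion skips the normal graceful shutdown:
kubectl delete pod pod-name --namespace namespace --grace-period=0 --force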
Your network's firewall rules don't allow for SSH access to the control plane.
All Compute Engine networks are created with a firewall rule called default-allow-ssh that allows SSH access from all IP addresses (requiring a valid private key, of course). GKE also inserts an SSH rule for each public cluster, of the form gke-cluster-name-random-characters-ssh, that allows SSH access specifically from the cluster's control plane to the cluster's nodes. If neither of these rules exists, then the control plane can't open SSH tunnels.
To fix it, re-add a firewall rule that allows access from the control plane's IP address to VMs with the tag that's on all the cluster's nodes.
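A minimal sketch of such a rule, using placeholder values for the rule name, network, control plane IP address, and node tag (the tag on GKE nodes typically looks like gke-cluster-name-random-characters-node):
gcloud compute firewall-rules create gke-cluster-name-ssh \
    --network network-name \
    --source-ranges control-plane-ip/32 \
    --target-tags node-tag \
    --allow tcp:22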
Your project's common metadata entry for "ssh-keys" is full.
If the project's metadata entry named "ssh-keys" is close to its maximum size limit, then GKE isn't able to add its own SSH key to open SSH tunnels. You can see your project's metadata by running the following command:
gcloud compute project-info describe [--project=PROJECT]
Then check the length of the list of SSH keys.
To fix it, delete some of the SSH keys that are no longer needed.
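One way to trim the list, sketched with a placeholder file name: dump the project metadata, copy the value of the ssh-keys entry into a local file, remove the entries you no longer need, and write the trimmed list back:
gcloud compute project-info describe --format json
gcloud compute project-info add-metadata --metadata-from-file ssh-keys=trimmed-ssh-keys.txt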
You have set a metadata field with the key "ssh-keys" on the VMs in the cluster.
The node agent on VMs prefers per-instance SSH keys to project-wide SSH keys, so if you've set any SSH keys specifically on the cluster's nodes, the control plane's SSH key in the project metadata won't be respected by the nodes. To check, run the following command and look for an "ssh-keys" field in the metadata:
gcloud compute instances describe VM-name
To fix it, delete the per-instance SSH keys from the instance metadata.
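For example, to remove the per-instance ssh-keys entry from a node (VM-name and zone are placeholders; the project-wide keys, including GKE's, then apply again):
gcloud compute instances remove-metadata VM-name --keys ssh-keys --zone zone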
It's worth noting that these features (such as kubectl logs, attach, exec, and port-forward) are not required for the correct functioning of the cluster. If you prefer to keep your cluster's network locked down from all outside access, be aware that features like these won't work.
Node version not compatible with control plane version
Check what version of Kubernetes your cluster's control plane is running, and then check what version of Kubernetes your cluster's node pools are running. If any of the cluster's node pools are more than two minor versions older than the control plane, this might be causing issues with your cluster.
Periodically, the GKE team performs upgrades of the cluster control plane on your behalf. Control planes are upgraded to newer stable versions of Kubernetes. By default, a cluster's nodes have auto-upgrade enabled, and it is recommended that you do not disable it.
If auto-upgrade is disabled for a cluster's nodes, and you do not manually upgrade your node pool version to a version that is compatible with the control plane, your control plane will eventually become incompatible with your nodes as the control plane is automatically upgraded over time. Incompatibility between your cluster's control plane and the nodes can cause unexpected issues.
The Kubernetes version and version skew support policy guarantees that control planes are compatible with nodes up to two minor versions older than the control plane. For example, Kubernetes 1.19 control planes are compatible with Kubernetes 1.19, 1.18, and 1.17 nodes. To resolve this issue, manually upgrade the node pool version to a version that is compatible with the control plane.
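A sketch of the manual node pool upgrade, with placeholder values for the cluster, node pool, target version, and zone:
gcloud container clusters upgrade cluster-name \
    --node-pool node-pool-name \
    --cluster-version version \
    --zone zone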
If you are concerned about the upgrade process causing disruption to workloads running on the affected nodes, follow the steps in the Migrating the workloads section of the Migrating workloads to different machine types tutorial. These steps let you migrate gracefully by creating a new node pool and then cordoning and draining the old node pool.
Metrics from your cluster aren't showing up in Cloud Monitoring
Ensure that you have activated the Cloud Monitoring API and the Cloud Logging API on your project, and that you are able to view your project in Cloud Monitoring.
If the issue persists, check the following potential causes:
Ensure that you have enabled monitoring on your cluster.
Monitoring is enabled by default for clusters created from the Google Cloud Console and from the gcloud command-line tool, but you can verify by running the following command or clicking into the cluster's details in the Cloud Console:
gcloud container clusters describe cluster-name
The output from this command should state that the "monitoringService" is "monitoring.googleapis.com", and Cloud Monitoring should be enabled in the Cloud Console.
If monitoring is not enabled, run the following command to enable it:
gcloud container clusters update cluster-name --monitoring-service=monitoring.googleapis.com
How long has it been since your cluster was created or had monitoring enabled?
It can take up to an hour for a new cluster's metrics to start appearing in Cloud Monitoring.
Is a heapster or gke-metrics-agent (the OpenTelemetry Collector) Pod running in your cluster in the "kube-system" namespace?
This Pod might be failing to schedule because your cluster is running low on resources. Check whether Heapster or the OpenTelemetry Collector is running by calling:
kubectl get pods --namespace=kube-system
and checking for Pods with heapster or gke-metrics-agent in the name.
Is your cluster's control plane able to communicate with the nodes?
Cloud Monitoring relies on this communication. You can check whether this is the case by running the following command:
kubectl logs pod-name
If this command returns an error, then the SSH tunnels may be causing the issue. See this section for further information.
If you are having an issue related to the Cloud Logging agent, see its troubleshooting documentation.
For more information, refer to the Logging documentation.
Error 404: Resource "not found" when calling gcloud container commands
Re-authenticate to the gcloud command-line tool:
gcloud auth login
Error 400/403: Missing edit permissions on account
Your Compute Engine default service account or the service account associated with GKE has been deleted or edited manually.
When you enable the Compute Engine or Kubernetes Engine API, a service account is created and given edit permissions on your project. If at any point you edit the permissions, remove the "Kubernetes Engine Service Agent" role, remove the account entirely, or disable the API, cluster creation and all management functionality will fail.
The name of your Google Kubernetes Engine service account is as follows, where project-number is your project number:
service-project-number@container-engine-robot.iam.gserviceaccount.com
To resolve the issue, if you have removed the Kubernetes Engine Service Agent
role from your Google Kubernetes Engine service account, add it back. Otherwise, you
must re-enable the Kubernetes Engine API, which will correctly restore your service
accounts and permissions. You can do this in the gcloud
tool or the
Cloud Console.
Console
Visit the APIs & Services page in the Cloud Console.
Select your project.
Click Enable APIs and Services.
Search for Kubernetes, then select the API from the search results.
Click Enable. If you have previously enabled the API, you must first disable it and then enable it again. It can take several minutes for the API and related services to be enabled.
gcloud
Run the following command in the gcloud
tool:
gcloud services enable container.googleapis.com
Replicating 1.8.x (and earlier) automatic firewall rules on 1.9.x and later
If your cluster is running Kubernetes version 1.9.x or later, the automatic firewall rules have changed to disallow workloads in a GKE cluster from initiating communication with other Compute Engine VMs that are outside the cluster but on the same network.
You can replicate the automatic firewall rules behavior of a 1.8.x (and earlier) cluster by performing the following steps:
Find your cluster's network:
gcloud container clusters describe cluster-name --format=get"(network)"
Get the cluster's IPv4 CIDR used for the containers:
gcloud container clusters describe cluster-name --format=get"(clusterIpv4Cidr)"
Create a firewall rule for the network, with the CIDR as the source range, and allow all protocols:
gcloud compute firewall-rules create "cluster-name-to-all-vms-on-network" \
    --network="network" \
    --source-ranges="cluster-ipv4-cidr" \
    --allow=tcp,udp,icmp,esp,ah,sctp
Restore default service account to your GCP project
GKE's default service account, container-engine-robot
, can
accidentally become unbound from a project. GKE Service Agent
is an Identity and Access Management (IAM) role that
grants the service account the permissions to manage
cluster resources. If you remove this role binding from the service account, the
default service account becomes unbound from the project, which can prevent you
from deploying applications and performing other cluster operations.
You can check to see if the service account has been removed from your project using the gcloud tool or the Cloud Console.
gcloud
Run the following command:
gcloud projects get-iam-policy project-id
Replace project-id with your project ID.
Console
Visit the IAM & Admin page in Cloud Console.
If the command or the dashboard do not display container-engine-robot
among
your service accounts, the service account has become unbound.
If you removed the GKE Service Agent role binding, run the following commands to restore the role binding:
PROJECT_ID=$(gcloud config get-value project)
PROJECT_NUMBER=$(gcloud projects describe "${PROJECT_ID}" --format "value(projectNumber)")
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
--member "serviceAccount:service-${PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com" \
--role roles/container.serviceAgent
To confirm that the role binding was granted:
gcloud projects get-iam-policy $PROJECT_ID
If you see the service account name along with the container.serviceAgent
role, the role binding has been granted. For example:
- members:
- serviceAccount:service-1234567890@container-engine-robot.iam.gserviceaccount.com
role: roles/container.serviceAgent
Cloud KMS key is disabled
GKE's default service account cannot use a disabled Cloud KMS key for application-level secrets encryption.
To re-enable a disabled key, see Enable a disabled key version.
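A sketch of re-enabling a key version with the gcloud tool, using placeholder values for the version, key, keyring, and location:
gcloud kms keys versions enable key-version \
    --key key-name \
    --keyring keyring-name \
    --location location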
Pods stuck in pending state after enabling Node Allocatable
If you are experiencing an issue with Pods stuck in pending state after enabling Node Allocatable, please note the following:
Starting with version 1.7.6, GKE reserves CPU and memory for Kubernetes overhead, including Docker and the operating system. See Cluster architecture for information on how much of each machine type's resources can be scheduled by Pods.
If Pods are pending after an upgrade, we suggest the following:
Ensure CPU and memory requests for your Pods do not exceed their peak usage. Because GKE reserves CPU and memory for overhead, Pods cannot request those reserved resources. Pods that request more CPU or memory than they actually use prevent other Pods from requesting these resources, and might leave the cluster underutilized. For more information, see How Pods with resource requests are scheduled, and see the example after this list for comparing requests against a node's capacity.
Consider resizing your cluster. For instructions, see Resizing a cluster.
Revert this change by downgrading your cluster. For instructions, see Manually upgrading a cluster or node pool.
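To compare what your Pods request against what a node can actually offer, you can inspect the node's allocatable capacity and currently allocated requests (node-name is a placeholder):
kubectl describe node node-name
In the output, compare the Allocatable section with the Allocated resources section.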
Private cluster nodes created but not joining the cluster
If you use custom routing and third-party network appliances on the VPC that your private cluster uses, the default route (0.0.0.0/0) is often redirected to the appliance instead of the default internet gateway. In addition to the control plane connectivity, you need to ensure that the following destinations are reachable:
- *.googleapis.com
- *.gcr.io
- gcr.io
Configure Private Google Access for all three domains. This best practice allows the new nodes to start up and join the cluster while keeping internet-bound traffic restricted.
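Private Google Access is enabled per subnet. A sketch for the subnet your nodes use, with placeholder subnet and region names:
gcloud compute networks subnets update subnet-name \
    --region region \
    --enable-private-ip-google-access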
Troubleshooting issues with deployed workloads
GKE returns an error if there are issues with a workload's Pods.
You can check the status of a Pod using the kubectl
command-line tool or
Cloud Console.
kubectl
To see all Pods running in your cluster, run the following command:
kubectl get pods
Output:
NAME READY STATUS RESTARTS AGE
pod-name 0/1 CrashLoopBackOff 23 8d
To get more detailed information about a specific Pod, run the following command:
kubectl describe pod pod-name
Replace pod-name with the name of the desired Pod.
Console
Perform the following steps:
Visit the GKE Workloads dashboard in Cloud Console.
Select the desired workload. The Overview tab displays the status of the workload.
From the Managed Pods section, click the error status message.
The following sections explain some common errors returned by workloads and how to resolve them.
CrashLoopBackOff
CrashLoopBackOff
indicates that a container is repeatedly crashing after
restarting. A container might crash for many reasons, and checking a Pod's
logs might aid in troubleshooting the root cause.
By default, crashed containers restart with an exponential delay limited to five minutes. You can change this behavior by setting the restartPolicy field in the Deployment's Pod specification under spec: restartPolicy. The field's default value is Always.
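To check the value currently in effect for a Pod, a quick query such as the following works (pod-name is a placeholder):
kubectl get pod pod-name -o jsonpath='{.spec.restartPolicy}'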
You can find out why your Pod's container is crashing using the kubectl
command-line tool or Cloud Console.
kubectl
To see all Pods running in your cluster, run the following command:
kubectl get pods
Look for the Pod with the CrashLoopBackOff
error.
To get the Pod's logs, run the following command:
kubectl logs pod-name
Replace pod-name with the name of the problematic Pod.
You can also pass in the -p
flag to get the logs for the previous
instance of a Pod's container, if it exists.
Console
Perform the following steps:
Visit the GKE Workloads dashboard in Cloud Console.
Select the desired workload. The Overview tab displays the status of the workload.
From the Managed Pods section, click the problematic Pod.
From the Pod's menu, click the Logs tab.
Check the "Exit Code" of the crashed container
You can find the exit code by performing the following tasks:
Run the following command:
kubectl describe pod pod-name
Replace pod-name with the name of the Pod.
Review the value in the containers: container-name: last state: exit code field:
- If the exit code is 1, the container crashed because the application crashed.
- If the exit code is 0, verify how long your app was running.
Containers exit when your application's main process exits. If your app finishes execution very quickly, the container might continue to restart.
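As an alternative to reading the describe output, you can query the last exit code of a container directly with jsonpath; this sketch assumes the first container in the Pod and a placeholder Pod name:
kubectl get pod pod-name -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'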
Connect to a running container
Open a shell to the Pod:
kubectl exec -it pod-name -- /bin/bash
If there is more than one container in your Pod, add -c container-name.
Now, you can run bash commands from the container: you can test the network or check if you have access to files or databases used by your application.
ImagePullBackOff and ErrImagePull
ImagePullBackOff
and ErrImagePull
indicate that the image used
by a container cannot be loaded from the image registry.
You can verify this issue using Cloud Console or the kubectl
command-line tool.
kubectl
To get more information about a Pod's container image, run the following command:
kubectl describe pod pod-name
Console
Perform the following steps:
Visit the GKE Workloads dashboard in Cloud Console.
Select the desired workload. The Overview tab displays the status of the workload.
From the Managed Pods section, click the problematic Pod.
From the Pod's menu, click the Events tab.
If the image is not found
If your image is not found:
- Verify that the image's name is correct.
- Verify that the image's tag is correct. (Try :latest or no tag to pull the latest image.)
- If the image has a full registry path, verify that it exists in the Docker registry you are using. If you provide only the image name, check the Docker Hub registry.
Try to pull the Docker image manually:
SSH into the node. For example, to SSH into example-instance in the us-central1-a zone:
gcloud compute ssh example-instance --zone us-central1-a
Run docker pull image-name.
If this option works, you probably need to specify ImagePullSecrets on a Pod. Pods can only reference image pull secrets in their own namespace, so this process needs to be done one time per namespace.
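A sketch of that process, with placeholder registry, credential, and namespace values: create a docker-registry Secret, then either reference it from the Pod spec (spec: imagePullSecrets) or attach it to the namespace's default service account:
kubectl create secret docker-registry regcred \
    --docker-server=registry.example.com \
    --docker-username=username \
    --docker-password=password \
    --namespace=namespace
kubectl patch serviceaccount default \
    --namespace=namespace \
    -p '{"imagePullSecrets": [{"name": "regcred"}]}'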
If you encounter a "permission denied" or "no pull access" error, verify that you are logged in and/or have access to the image.
If you are using a private registry, it may require keys to read images.
If your image is hosted in Container Registry, the service account associated with your node pool needs read access to the Cloud Storage bucket containing the image. See Container Registry documentation for further details.
Pod unschedulable
PodUnschedulable
indicates that your Pod cannot be scheduled because of
insufficient resources or some configuration error.
Insufficient resources
You might encounter an error indicating a lack of CPU, memory, or another resource. For example: "No nodes are available that match all of the predicates: Insufficient cpu (2)", which indicates that on two nodes there isn't enough CPU available to fulfill a Pod's requests.
The default CPU request is 100m, or 10% of a CPU core.
If you want to request more or fewer resources, specify the value in the Pod specification under spec: containers: resources: requests.
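For example, to adjust the requests on an existing Deployment without editing its manifest, a sketch with placeholder names and values:
kubectl set resources deployment deployment-name \
    --requests=cpu=200m,memory=256Mi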
MatchNodeSelector
MatchNodeSelector
indicates that there are no nodes that match the Pod's
label selector.
To verify this, check the labels specified in the Pod specification's
nodeSelector
field, under spec: nodeSelector
.
To see how nodes in your cluster are labelled, run the following command:
kubectl get nodes --show-labels
To attach a label to a node, run the following command:
kubectl label nodes node-name label-key=label-value
Replace the following:
- node-name with the desired node.
- label-key with the label's key.
- label-value with the label's value.
For more information, refer to Assigning Pods to Nodes.
PodToleratesNodeTaints
PodToleratesNodeTaints indicates that the Pod can't be scheduled to any node because the Pod doesn't tolerate the taints currently set on the nodes.
To verify that this is the case, run the following command:
kubectl describe nodes node-name
In the output, check the Taints
field, which lists key-value pairs and
scheduling effects.
If the effect listed is NoSchedule
, then no Pod can be scheduled on that node
unless it has a matching toleration.
One way to resolve this issue is to remove the taint. For example, to remove a NoSchedule taint, run the following command:
kubectl taint nodes node-name key:NoSchedule-
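Alternatively, if the taint is intentional, give the Pod a matching toleration. A minimal sketch, assuming a hypothetical taint dedicated=experimental:NoSchedule and a placeholder image:
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: toleration-example
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "experimental"
    effect: "NoSchedule"
  containers:
  - name: app
    image: nginx
EOF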
PodFitsHostPorts
PodFitsHostPorts
indicates that a port that a node is attempting to use is
already in use.
To resolve this issue, check the Pod specification's hostPort
value under
spec: containers: ports: hostPort
. You might need to change this value to
another port.
Does not have minimum availability
If a node has adequate resources but you still see the Does not have minimum availability message, check the node's status. If the status is SchedulingDisabled or Cordoned, the node cannot schedule new Pods. You can check the status of a node using the Cloud Console or the kubectl command-line tool.
kubectl
To get statuses of your nodes, run the following command:
kubectl get nodes
To enable scheduling on the Node, run:
kubectl uncordon node-name
Console
Perform the following steps:
Visit the GKE Workloads dashboard in Cloud Console.
Select the desired cluster. The Nodes tab displays the Nodes and their status.
To enable scheduling on the Node, perform the following steps:
From the list, click the desired Node.
From the Node Details page, click the Uncordon button.
Unbound PersistentVolumeClaims
Unbound PersistentVolumeClaims
indicates that the Pod references a
PersistentVolumeClaim that is not bound. This error might happen if your
PersistentVolume failed to provision. You can verify that provisioning failed by
getting the events for your PersistentVolumeClaim and examining them for
failures.
To get events, run the following command:
kubectl describe pvc statefulset-name-pvc-name-0
Replace the following:
- statefulset-name with the name of the StatefulSet object.
- pvc-name with the name of the PersistentVolumeClaim object.
This may also happen if there was a configuration error during your manual pre-provisioning of a PersistentVolume and its binding to a PersistentVolumeClaim. You can try to pre-provision the volume again.
Connectivity issues
As mentioned in the Network Overview discussion, it is important to understand how Pods are wired from their network namespaces to the root namespace on the node in order to troubleshoot effectively. For the following discussion, unless otherwise stated, assume that the cluster uses GKE's native CNI rather than Calico's. That is, no network policy has been applied.
Pods on select nodes have no availability
If Pods on select nodes have no network connectivity, ensure that the Linux bridge is up:
ip address show cbr0
If the Linux bridge is down, raise it:
sudo ip link set cbr0 up
Ensure that the node is learning Pod MAC addresses attached to cbr0:
arp -an
Pods on select nodes have minimal connectivity
If Pods on select nodes have minimal connectivity, you should first confirm
whether there are any lost packets by running tcpdump
in the toolbox container:
sudo toolbox bash
Install tcpdump
in the toolbox if you have not done so already:
apt install -y tcpdump
Run tcpdump
against cbr0:
tcpdump -ni cbr0 host hostname and port port-number and [tcp|udp|icmp]
Should it appear that large packets are being dropped downstream from the bridge (for example, the TCP handshake completes, but no SSL hellos are received), ensure that the Linux bridge MTU is correctly set to the MTU of the cluster's VPC network.
ip address show cbr0
When overlays are used (for example, Weave or Flannel), this MTU must be further reduced to accommodate encapsulation overhead on the overlay.
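If the bridge MTU is larger than the VPC MTU, one way to correct it is to lower the bridge MTU to match; this sketch assumes the default Google Cloud VPC MTU of 1460:
sudo ip link set dev cbr0 mtu 1460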
Intermittent failed connections
Connections to and from the Pods are forwarded by iptables. Flows are tracked as entries in the conntrack table and, where there are many workloads per node, conntrack table exhaustion may manifest as a failure. These can be logged in the serial console of the node, for example:
nf_conntrack: table full, dropping packet
If you are able to determine that intermittent issues are driven by conntrack
exhaustion, you may increase the size of the cluster (thus reducing the number
of workloads and flows per node), or increase nf_conntrack_max
:
new_ct_max=$(awk '$1 == "MemTotal:" { printf "%d\n", $2/32; exit; }' /proc/meminfo)
sysctl -w net.netfilter.nf_conntrack_max="${new_ct_max:?}" \
&& echo "net.netfilter.nf_conntrack_max=${new_ct_max:?}" >> /etc/sysctl.conf
"bind: Address already in use" reported for a container
A container in a Pod is unable to start because, according to the container logs, the port that the application is trying to bind to is already reserved. The container is crash looping. For example, in Cloud Logging:
resource.type="container"
textPayload:"bind: Address already in use"
resource.labels.container_name="redis"
2018-10-16 07:06:47.000 CEST 16 Oct 05:06:47.533 # Creating Server TCP listening socket *:60250: bind: Address already in use
2018-10-16 07:07:35.000 CEST 16 Oct 05:07:35.753 # Creating Server TCP listening socket *:60250: bind: Address already in use
When Docker crashes, sometimes a running container gets left behind and is stale. The process is still running in the network namespace allocated for the Pod, and listening on its port. Because Docker and the kubelet don't know about the stale container, they try to start a new container with a new process, which is unable to bind on the port because it gets added to the network namespace already associated with the Pod.
To diagnose this problem:
Get the UUID of the Pod from the .metadata.uid field:
kubectl get pod -o custom-columns="name:.metadata.name,UUID:.metadata.uid" ubuntu-6948dd5657-4gsgg
name                      UUID
ubuntu-6948dd5657-4gsgg   db9ed086-edba-11e8-bdd6-42010a800164
Get the output of the following commands from the node:
docker ps -a
ps -eo pid,ppid,stat,wchan:20,netns,comm,args:50,cgroup --cumulative -H | grep [Pod UUID]
Check the running processes from this Pod. Because the cgroup names contain the UUID of the Pod, you can grep for the Pod UUID in the ps output. Also grep the line before it, so that you get the docker-containerd-shim processes that have the container ID in their arguments as well. Cut the rest of the cgroup column to get a simpler output:
# ps -eo pid,ppid,stat,wchan:20,netns,comm,args:50,cgroup --cumulative -H | grep -B 1 db9ed086-edba-11e8-bdd6-42010a800164 | sed s/'blkio:.*'/''/
1283089     959 Sl  futex_wait_queue_me  4026531993  docker-co  docker-containerd-shim 276e173b0846e24b704d4       12:
1283107 1283089 Ss  sys_pause            4026532393  pause      /pause                                             12:
1283150     959 Sl  futex_wait_queue_me  4026531993  docker-co  docker-containerd-shim ab4c7762f5abf40951770       12:
1283169 1283150 Ss  do_wait              4026532393  sh         /bin/sh -c echo hello && sleep 6000000             12:
1283185 1283169 S   hrtimer_nanosleep    4026532393  sleep      sleep 6000000                                      12:
1283244     959 Sl  futex_wait_queue_me  4026531993  docker-co  docker-containerd-shim 44e76e50e5ef4156fd5d3       12:
1283263 1283244 Ss  sigsuspend           4026532393  nginx      nginx: master process nginx -g daemon off;         12:
1283282 1283263 S   ep_poll              4026532393  nginx      nginx: worker process
From this list, you can see the container IDs, which should be visible in docker ps as well.
In this case:
- docker-containerd-shim 276e173b0846e24b704d4 for pause
- docker-containerd-shim ab4c7762f5abf40951770 for sh with sleep (sleep-ctr)
- docker-containerd-shim 44e76e50e5ef4156fd5d3 for nginx (echoserver-ctr)
Check those in the docker ps output:
# docker ps --no-trunc | egrep '276e173b0846e24b704d4|ab4c7762f5abf40951770|44e76e50e5ef4156fd5d3'
44e76e50e5ef4156fd5d383744fa6a5f14460582d0b16855177cbed89a3cbd1f   gcr.io/google_containers/echoserver@sha256:3e7b182372b398d97b747bbe6cb7595e5ffaaae9a62506c725656966d36643cc   "nginx -g 'daemon off;'"                     14 hours ago   Up 14 hours   k8s_echoserver-cnt_ubuntu-6948dd5657-4gsgg_default_db9ed086-edba-11e8-bdd6-42010a800164_0
ab4c7762f5abf40951770d3e247fa2559a2d1f8c8834e5412bdcec7df37f8475   ubuntu@sha256:acd85db6e4b18aafa7fcde5480872909bd8e6d5fbd4e5e790ecc09acc06a8b78   "/bin/sh -c 'echo hello && sleep 6000000'"   14 hours ago   Up 14 hours   k8s_sleep-cnt_ubuntu-6948dd5657-4gsgg_default_db9ed086-edba-11e8-bdd6-42010a800164_0
276e173b0846e24b704d41cf4fbb950bfa5d0f59c304827349f4cf5091be3327   k8s.gcr.io/pause-amd64:3.1
In normal cases, you see all the container IDs from ps showing up in docker ps. If there is one you don't see, it's a stale container, and you will probably see a child process of the docker-containerd-shim process listening on the TCP port that is reported as already in use.
To verify this, execute netstat in the container's network namespace. Get the pid of any container process (so NOT docker-containerd-shim) for the Pod.
From the above example:
- 1283107 - pause
- 1283169 - sh
- 1283185 - sleep
- 1283263 - nginx master
- 1283282 - nginx worker
# nsenter -t 1283107 --net netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      1283263/nginx: mast
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  3      [ ]         STREAM     CONNECTED     3097406  1283263/nginx: mast
unix  3      [ ]         STREAM     CONNECTED     3097405  1283263/nginx: mast

gke-zonal-110-default-pool-fe00befa-n2hx ~ # nsenter -t 1283169 --net netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      1283263/nginx: mast
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  3      [ ]         STREAM     CONNECTED     3097406  1283263/nginx: mast
unix  3      [ ]         STREAM     CONNECTED     3097405  1283263/nginx: mast
You can also execute netstat using ip netns, but you need to link the network namespace of the process manually, because Docker is not doing the link:
# ln -s /proc/1283169/ns/net /var/run/netns/1283169
gke-zonal-110-default-pool-fe00befa-n2hx ~ # ip netns list
1283169 (id: 2)
gke-zonal-110-default-pool-fe00befa-n2hx ~ # ip netns exec 1283169 netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      1283263/nginx: mast
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  3      [ ]         STREAM     CONNECTED     3097406  1283263/nginx: mast
unix  3      [ ]         STREAM     CONNECTED     3097405  1283263/nginx: mast
gke-zonal-110-default-pool-fe00befa-n2hx ~ # rm /var/run/netns/1283169
Mitigation:
The short term mitigation is to identify stale processes by the method outlined
above, and end the processes using the kill [PID]
command.
Long term mitigation involves identifying why Docker is crashing and fixing that. Possible reasons include:
- Zombie processes piling up, so running out of PID namespaces
- Bug in docker
- Resource pressure / OOM