Version 1.1. This version is no longer supported as outlined in the Anthos version support policy. For the latest patches and updates for security vulnerabilities, exposures, and issues impacting Anthos clusters on VMware (GKE on-prem), upgrade to a supported version.

Debugging node issues

This page explains how to debug node issues using a suite of preinstalled debugging tools.

Overview

Each GKE On-Prem cluster you create is composed of several nodes. Each GKE On-Prem node includes a distribution of CoreOS' toolbox, a shell script that unpacks and runs a debugging container, debug-toolbox. debug-toolbox is a container image that includes several useful debugging tools.

If you encounter issues with a specific node, you can debug it by connecting to the affected node, running the toolbox script to unpack and start the debug-toolbox container, and then running the tools included in the container.

Tools included in debug-toolbox container

The debug-toolbox container runs a Debian base image that includes the following packages:

  • bash
  • curl
  • dnsutils
  • hping3
  • iperf3
  • lsof
  • netcat
  • mtr
  • procps
  • strace
  • tcpdump
  • traceroute
  • util-linux

Because these tools are preinstalled in the container image, you can use them without an internet connection. To install additional debugging tools, use apt-get, which does require an internet connection.
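Before relying on a particular tool, it can help to confirm that its command is actually on the PATH inside the container. The following is a minimal sketch, not part of debug-toolbox: missing_tools is a hypothetical helper, and the command names passed to it are assumptions about what each package provides (for example, dig for dnsutils, nc for netcat, ps for procps, and dmesg for util-linux).

```shell
# Hypothetical helper (not shipped with debug-toolbox): print every command
# name given as an argument that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed command names for the packages listed above; adjust as needed.
missing_tools bash curl dig hping3 iperf3 lsof nc mtr ps strace tcpdump traceroute dmesg

# To add a tool that is not preinstalled (requires an internet connection),
# the usual Debian commands would be the following, where PACKAGE_NAME is a
# placeholder:
#   apt-get update && apt-get install -y PACKAGE_NAME
```

If the call prints nothing, every listed command was found. Any name it prints is a candidate for installation with apt-get.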

Using toolbox

  1. SSH into the cluster node.

  2. Run the toolbox command:

    sudo toolbox

    This command starts a debug-toolbox container.

  3. While inside the container, run one of the tools, for example tcpdump.

  4. When you're finished, exit the container and close the SSH connection to the node.
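The steps above could look roughly like the following session. This is a sketch under stated assumptions: NODE_USER and NODE_IP are placeholders for your own SSH user and node address, and capture_port is a hypothetical helper that you would paste into the container shell. It is not part of debug-toolbox. It wraps tcpdump so that the capture stops on its own after a fixed number of packets.

```shell
# Hypothetical helper: capture a bounded number of packets for one port.
# Usage: capture_port PORT [COUNT]   (COUNT defaults to 20)
capture_port() {
  if [ -z "${1:-}" ]; then
    echo "usage: capture_port PORT [COUNT]" >&2
    return 2
  fi
  # -n: do not resolve names; -i any: listen on all interfaces (Linux);
  # -c: exit after COUNT packets; "port N" is the capture filter.
  tcpdump -n -i any -c "${2:-20}" port "$1"
}

# The interactive steps are shown as comments because they depend on your
# environment:
#   ssh NODE_USER@NODE_IP     # step 1: connect to the node (placeholders)
#   sudo toolbox              # step 2: start the debug-toolbox container
#   capture_port 53 10        # step 3: capture ten packets of DNS traffic
#   exit                      # step 4: leave the container, then the node
```

Bounding the capture with -c keeps the tool from running indefinitely on a busy node. Port 53 is only an example; choose the filter that matches the traffic you are investigating.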