This document describes the default log throughput of the GKE Logging agent and how to increase it.
When system logging is enabled, GKE automatically deploys and manages a dedicated Logging agent. The agent, which is based on Fluent Bit, runs on all GKE nodes in a cluster to collect logs, adds helpful metadata about the container, Pod, and cluster, and then sends the logs to Cloud Logging.
The dedicated Logging agent provides at least 100 KiB per second log throughput per node for system and workload logs. If a node is underutilized, then depending on the type of log load (for example, text or structured log entries, very few containers on the node or many containers), the dedicated logging agent might provide throughput as much as 500 KiB per second or more. Additionally, in clusters with GKE control plane version 1.23.13-gke.1000 or later, the Logging agent allows for throughput as high as 10 MiB per second on nodes that have at least 2 unused CPU cores. Be aware, however, that at higher throughputs, some logs may be lost.
Identify nodes with higher log throughput
By default, GKE clusters collect system metrics. The system metric kubernetes.io/node/logs/input_bytes provides the number of log bytes generated per second on a node. This metric can help you decide which variant of the Logging agent makes sense to deploy in your cluster or node pools.
To view the historical logging throughput for each node in your cluster, follow these steps:
- In the Google Cloud console, go to the Metrics explorer page. If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- In the Select a metric field, select kubernetes.io/node/logs/input_bytes.
- In the Group by field, select project_id, location, cluster_name, and node_name.
- Click OK.
- Optionally, sort the list of metrics in descending order by clicking the column header Value above the list of metrics.
To understand how much logging volume is due to system components or due to workloads running on the node, you may also group by the type metric label.
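The same per-node throughput can also be read outside the console. The following is a minimal sketch, not an official procedure, that calls the Cloud Monitoring API `projects.timeSeries.list` method with `curl`. It assumes that the metric is cumulative, so that the `ALIGN_RATE` aligner yields bytes per second, and that `gcloud` is already authenticated. The helper name `node_log_rate` and the 60-second alignment period are illustrative choices, not values from this document.

```shell
# node_log_rate is a hypothetical helper name, not a gcloud command.
# Usage: node_log_rate PROJECT_ID START_TIME END_TIME
# START_TIME and END_TIME are RFC 3339 timestamps, such as 2024-01-01T00:00:00Z.
node_log_rate() {
  # --get sends the url-encoded pairs as query parameters of a GET request.
  # ALIGN_RATE is an assumption: it fits only if the metric is cumulative or delta.
  curl --silent --get \
    --header "Authorization: Bearer $(gcloud auth print-access-token)" \
    --data-urlencode 'filter=metric.type="kubernetes.io/node/logs/input_bytes"' \
    --data-urlencode "interval.startTime=$2" \
    --data-urlencode "interval.endTime=$3" \
    --data-urlencode "aggregation.alignmentPeriod=60s" \
    --data-urlencode "aggregation.perSeriesAligner=ALIGN_RATE" \
    "https://monitoring.googleapis.com/v3/projects/$1/timeSeries"
}
```

The response is JSON with one time series per node and per `type` label value, so it can be filtered by `node_name` in the same way as the grouped view in Metrics explorer.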
Enable high-throughput logging
If any GKE nodes require more than 100 KiB per second log throughput and your GKE Standard cluster is using control plane version 1.23.13-gke.1000 or later, you can configure GKE to deploy an alternative configuration of the Logging agent designed to maximize logging throughput. This maximum throughput Logging variant allows for throughput as high as 10 MiB per second per node. You can deploy this high-throughput Logging agent to all nodes in a cluster or to all nodes in a node pool.
This high-throughput configuration will consume additional CPU and memory.
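The thresholds described above can be expressed as a small shell helper. This is only a sketch of the decision rule: the function name `suggest_logging_variant` and the `EXCEEDS_MAX_THROUGHPUT` label are hypothetical, and the helper does not check the control plane version or whether the node has spare CPU cores.

```shell
# Thresholds from this document, converted to bytes per second.
DEFAULT_LIMIT_BPS=$((100 * 1024))      # 100 KiB per second
MAX_LIMIT_BPS=$((10 * 1024 * 1024))    # 10 MiB per second

# suggest_logging_variant is a hypothetical helper, not part of gcloud.
# Argument: a node's peak log throughput in bytes per second (integer).
suggest_logging_variant() {
  if [ "$1" -le "$DEFAULT_LIMIT_BPS" ]; then
    echo "DEFAULT"
  elif [ "$1" -le "$MAX_LIMIT_BPS" ]; then
    echo "MAX_THROUGHPUT"
  else
    # Above the documented ceiling of the high-throughput agent.
    echo "EXCEEDS_MAX_THROUGHPUT"
  fi
}
```

For example, `suggest_logging_variant 512000` prints `MAX_THROUGHPUT`, because 512000 bytes per second is above the 100 KiB per second that the default agent guarantees.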
gcloud CLI
To enable high-throughput logging on all nodes in a new cluster:
gcloud container clusters create CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=MAX_THROUGHPUT \
--machine-type=MACHINE_TYPE
Replace the following:
- CLUSTER_NAME: the name of the new cluster.
- COMPUTE_LOCATION: the Compute Engine location for the new cluster.
- MACHINE_TYPE: a machine type that has enough CPU for the Logging agent, such as e2-standard-8.
All newly created node pools in this cluster, including the default node pool, deploy the high-throughput Logging agent.
To configure high-throughput logging for an existing cluster, use the gcloud container clusters update command:
gcloud container clusters update CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=MAX_THROUGHPUT
Replace the following:
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
To create a new node pool that uses the high-throughput Logging agent, use the gcloud container node-pools create command:
gcloud container node-pools create NODEPOOL_NAME \
--cluster=CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=MAX_THROUGHPUT
Replace the following:
- NODEPOOL_NAME: the name of the new node pool.
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
To update an existing node pool, use the gcloud container node-pools update command:
gcloud container node-pools update NODEPOOL_NAME \
--cluster=CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=MAX_THROUGHPUT
Replace the following:
- NODEPOOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
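After a node pool is created or updated, the configured variant can be read back from the node pool description. The following sketch wraps `gcloud container node-pools describe` in a helper. The name `show_logging_variant` is hypothetical, and the `--format` field path is an assumption based on the GKE API's `NodeConfig.loggingConfig.variantConfig.variant` field, so check it against the output of your gcloud CLI version.

```shell
# show_logging_variant is a hypothetical helper name.
# Usage: show_logging_variant NODEPOOL_NAME CLUSTER_NAME COMPUTE_LOCATION
show_logging_variant() {
  # The field path below is an assumption taken from the GKE API resource shape.
  gcloud container node-pools describe "$1" \
    --cluster="$2" \
    --location="$3" \
    --format="value(config.loggingConfig.variantConfig.variant)"
}
```

For a node pool that uses the high-throughput agent, the command is expected to print `MAX_THROUGHPUT`. An empty result might mean that the variant was never set explicitly on that node pool.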
Terraform
The following code blocks specify how to declare node pools with or without high-throughput logging.
To manage the node pools explicitly, you must specify your cluster without a default node pool.
resource "google_container_cluster" "with_example_logging_variants" {
  provider                 = google
  name                     = "CLUSTER_NAME"
  location                 = "COMPUTE_LOCATION"
  initial_node_count       = 1
  remove_default_node_pool = true # We want to manage our node pools separately.
}
To specify a node pool that uses the high-throughput agent, use the node_config field to specify the Logging agent variant as MAX_THROUGHPUT and an appropriate machine type:
resource "google_container_node_pool" "with_example_logging_variant" {
  provider = google
  name     = "example-node-pool-with-htl"
  cluster  = google_container_cluster.with_example_logging_variants.name
  location = "COMPUTE_LOCATION"
  node_config {
    logging_variant = "MAX_THROUGHPUT"
    # Use a machine type with enough CPU to accommodate the high-throughput agent, such as e2-standard-8.
    machine_type = "e2-standard-8"
  }
  node_count = 1
}
To specify a node pool that uses the default agent, use the node_config field to specify the Logging agent variant as DEFAULT:
resource "google_container_node_pool" "with_default_logging_variant" {
  provider = google
  name     = "example-node-pool-with-default-logging"
  cluster  = google_container_cluster.with_example_logging_variants.name
  location = "COMPUTE_LOCATION"
  node_config {
    logging_variant = "DEFAULT"
  }
  node_count = 1
}
Disable high-throughput logging
If you no longer want to use the high-throughput Logging agent, deploy the default Logging agent to the cluster or node pool.
gcloud CLI
Pass the flag --logging-variant=DEFAULT when you create or update a cluster or node pool.
To use the default logging agent on all nodes in a new cluster:
gcloud container clusters create CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=DEFAULT \
--machine-type=MACHINE_TYPE
Replace the following:
- CLUSTER_NAME: the name of the new cluster.
- COMPUTE_LOCATION: the Compute Engine location for the new cluster.
- MACHINE_TYPE: a machine type that has enough CPU for the Logging agent, such as e2-standard-8.
To use the default logging agent on an existing cluster, use the gcloud container clusters update command:
gcloud container clusters update CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=DEFAULT
Replace the following:
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
To use the default logging agent for a new node pool, use the gcloud container node-pools create command:
gcloud container node-pools create NODEPOOL_NAME \
--cluster=CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=DEFAULT
Replace the following:
- NODEPOOL_NAME: the name of the new node pool.
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
To update an existing node pool, use the gcloud container node-pools update command:
gcloud container node-pools update NODEPOOL_NAME \
--cluster=CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--logging-variant=DEFAULT
Replace the following:
- NODEPOOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
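When a cluster has several node pools, the per-node-pool update can be repeated in a loop. The following is a sketch under stated assumptions rather than a recommended procedure: the helper name `set_variant_for_all_pools` is hypothetical, and it combines `gcloud container node-pools list` with the update command shown above. Because it changes every node pool in the cluster, review the list of node pools before you run something like it.

```shell
# set_variant_for_all_pools is a hypothetical helper name.
# Usage: set_variant_for_all_pools CLUSTER_NAME COMPUTE_LOCATION VARIANT
# VARIANT is DEFAULT or MAX_THROUGHPUT.
set_variant_for_all_pools() {
  gcloud container node-pools list \
    --cluster="$1" --location="$2" --format="value(name)" |
  while IFS= read -r pool; do
    # Skip blank lines in the list output.
    [ -n "$pool" ] || continue
    # Redirect stdin so that the update command cannot consume the pool list.
    gcloud container node-pools update "$pool" \
      --cluster="$1" --location="$2" --logging-variant="$3" < /dev/null
  done
}
```

For example, `set_variant_for_all_pools CLUSTER_NAME COMPUTE_LOCATION DEFAULT` runs one update per node pool, in the order that the list command returns them.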
Terraform
If you no longer want Terraform to create node pools that use the high-throughput Logging agent, set the logging_variant field to DEFAULT.
What's next
- Learn how to control log ingestion.