Configure logging and monitoring for GKE


Google Kubernetes Engine (GKE) includes integration with Cloud Logging and Cloud Monitoring, including Google Cloud Managed Service for Prometheus.

This integration lets you monitor your running GKE clusters, manage your system and debug logs, and analyze your system's performance using advanced profiling and tracing capabilities.

This integration also provides a dashboard for observing your GKE clusters.

Security logs, including basic audit logs, are available for GKE and most other Google Cloud services even when Cloud Logging is not enabled for a GKE cluster. For more information, see Cloud Audit Logs.

This page describes how to do the following:

  • Create a new cluster and configure Cloud Logging, Cloud Monitoring, and Google Cloud Managed Service for Prometheus.

  • Select which logs and metrics to collect.

  • Disable Cloud Logging, Cloud Monitoring, and Google Cloud Managed Service for Prometheus for a cluster.

For GKE Autopilot clusters, you cannot disable the Cloud Logging and Cloud Monitoring integration.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
  • Ensure you are an Owner of the project containing your cluster.

  • Ensure you have enabled the Cloud Logging API. You can check the status of the Cloud Logging API from its Overview page.
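
As a rough sketch, the API prerequisites above could also be handled from the command line. The service names container.googleapis.com and logging.googleapis.com, the function names, the project ID argument, and the overridable GCLOUD variable are assumptions made for illustration and are not part of this page.

```shell
# Sketch only: enable the APIs that this page lists as prerequisites.
# GCLOUD can be overridden (for example, GCLOUD=echo) to preview a command
# without running it.
GCLOUD=${GCLOUD:-gcloud}

# $1 = project ID (placeholder supplied by the caller).
enable_gke_logging_apis() {
  "$GCLOUD" services enable \
    container.googleapis.com \
    logging.googleapis.com \
    --project="$1"
}

# Lists enabled services whose name matches the Cloud Logging API.
# Empty output suggests that the API is not enabled for the project.
check_logging_api() {
  "$GCLOUD" services list --enabled \
    --filter="config.name:logging.googleapis.com" \
    --project="$1"
}
```

For example, running enable_gke_logging_apis PROJECT_ID followed by check_logging_api PROJECT_ID would enable both APIs and then show whether the Cloud Logging API appears in the enabled list.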

Logs and metrics

You can choose whether to send logs and metrics from your GKE cluster to Cloud Logging and Cloud Monitoring. The following sections describe which logs and metrics are available, and which are enabled by default at cluster creation time.

Available logs

If you choose to send logs to Cloud Logging, you must send system logs, and you may optionally send logs from additional sources.

Learn about Cloud Logging pricing.

The following table indicates supported values for the --logging flag for the create and update commands.

Log Source --logging value Logs Collected
None NONE No logs sent to Cloud Logging; no log collection agent installed in the cluster. This value is not supported for GKE Autopilot clusters.
System SYSTEM Collects logs from the following:
  • All Pods running in namespaces kube-system, istio-system, knative-serving, gke-system, and config-management-system.
  • Key services that are not containerized including docker/containerd runtime, kubelet, kubelet-monitor, node-problem-detector, and kube-container-runtime-monitor.
  • The node's serial ports output, if the VM instance metadata serial-port-logging-enable is set to true.

Additionally, collects Kubernetes events. This value is required for all cluster types.

Workloads WORKLOAD All logs generated by non-system containers running on user nodes. This value is on by default but optional for all cluster types.
API server API_SERVER All logs generated by kube-apiserver. This value is optional for all cluster types.
Scheduler SCHEDULER All logs generated by kube-scheduler. This value is optional for all cluster types.
Controller Manager CONTROLLER_MANAGER All logs generated by kube-controller-manager. This value is optional for all cluster types.
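
To make the rules in the preceding table concrete, the following sketch checks a candidate --logging value before it is passed to gcloud. The function name validate_logging is hypothetical, and the rule that NONE cannot be combined with other sources is an assumption; the page only states that SYSTEM is required whenever logs are sent.

```shell
# Sketch only: validate a comma-separated --logging value against the
# sources listed in the table. Prints "ok" on success; prints an error to
# stderr and returns non-zero otherwise.
validate_logging() (
  value=$1
  if [ -z "$value" ]; then
    echo "error: empty value" >&2
    exit 1
  fi
  set -f            # no filename expansion while splitting the value
  IFS=,             # split the value on commas
  has_system=0
  has_none=0
  count=0
  for src in $value; do
    count=$((count + 1))
    case $src in
      NONE) has_none=1 ;;
      SYSTEM) has_system=1 ;;
      WORKLOAD|API_SERVER|SCHEDULER|CONTROLLER_MANAGER) ;;
      *) echo "error: unsupported log source '$src'" >&2; exit 1 ;;
    esac
  done
  if [ "$has_none" = 1 ]; then
    # Assumption: NONE is only meaningful on its own.
    if [ "$count" -gt 1 ]; then
      echo "error: NONE cannot be combined with other sources" >&2
      exit 1
    fi
    echo ok
    exit 0
  fi
  if [ "$has_system" = 0 ]; then
    echo "error: SYSTEM is required when sending logs" >&2
    exit 1
  fi
  echo ok
)
```

With this sketch, validate_logging SYSTEM,WORKLOAD succeeds, while validate_logging WORKLOAD fails because system logs are missing.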

Available metrics

If you choose to send metrics to Cloud Monitoring, you must send system metrics and can optionally send additional metrics.

Learn about Cloud Monitoring pricing, including which metrics are non-chargeable.

The following table indicates supported values for the --monitoring flag for the create and update commands.

Source --monitoring value Metrics Collected
None NONE No metrics sent to Cloud Monitoring; no metric collection agent installed in the cluster. This value is not supported for GKE Autopilot clusters.
System SYSTEM Metrics from essential system components required for Kubernetes. See a complete list of these Kubernetes metrics.
API server API_SERVER Metrics from kube-apiserver. See a complete list of API server metrics.
Scheduler SCHEDULER Metrics from kube-scheduler. See a complete list of Scheduler metrics.
Controller Manager CONTROLLER_MANAGER Metrics from kube-controller-manager. See a complete list of Controller Manager metrics.
Persistent volume (Storage) STORAGE Storage metrics from kube-state-metrics. Includes metrics for Persistent Volume and Persistent Volume Claims. See a complete list of Storage metrics.
Pod POD Pod metrics from kube-state-metrics. See a complete list of Pod metrics.
Deployment DEPLOYMENT Deployment metrics from kube-state-metrics. See a complete list of Deployment metrics.
StatefulSet STATEFULSET StatefulSet metrics from kube-state-metrics. See a complete list of StatefulSet metrics.
DaemonSet DAEMONSET DaemonSet metrics from kube-state-metrics. See a complete list of DaemonSet metrics.
HorizontalPodAutoscaler HPA HPA metrics from kube-state-metrics. See a complete list of HorizontalPodAutoscaler metrics.

Additionally, you can collect Prometheus-style metrics exposed by any GKE workload by using Google Cloud Managed Service for Prometheus. This service lets you monitor and alert on your workloads with Prometheus without having to manually manage and operate Prometheus at scale.
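
Because system metrics are required whenever any metrics are sent, a --monitoring value can be assembled by always starting from SYSTEM and appending the optional sources. The following is a minimal sketch of that idea; the function name monitoring_flag is hypothetical, and the accepted sources are limited to the ones listed in the table above.

```shell
# Sketch only: build a --monitoring flag from optional metric sources.
# SYSTEM is always included first. Unknown sources are rejected.
monitoring_flag() (
  value=SYSTEM
  for src in "$@"; do
    case $src in
      SYSTEM) ;;   # already included
      API_SERVER|SCHEDULER|CONTROLLER_MANAGER|STORAGE|POD|DEPLOYMENT|STATEFULSET|DAEMONSET|HPA)
        value="$value,$src" ;;
      *)
        echo "error: unsupported metric source '$src'" >&2
        exit 1 ;;
    esac
  done
  printf '%s\n' "--monitoring=$value"
)
```

For example, monitoring_flag POD DEPLOYMENT prints --monitoring=SYSTEM,POD,DEPLOYMENT, which can then be appended to a create or update command. The sketch does not remove duplicate sources.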

Logs and metrics enabled by default

When you create a new GKE cluster on Google Cloud, some logs and metrics are enabled by default.

  • System logs and metrics are enabled for all types of clusters and they can't be disabled.
  • Workload logs are enabled by default for all Autopilot clusters but can be disabled. We don't recommend disabling workload logs because of the impact on supportability.
  • For GKE Enterprise edition projects, additional useful logs and metrics are enabled by default if you register the cluster to a fleet while creating it. If you want to enable those logs and metrics after a cluster is created, see Modify your cluster.

In the following tables, a checkmark (✓) indicates which logs and metrics are enabled by default when you create and register a new cluster in a project with GKE Enterprise enabled:

Logs

Log name            Autopilot  Standard
System              ✓          ✓
Workloads           ✓          -
API server          ✓          ✓
Scheduler           ✓          ✓
Controller Manager  ✓          ✓

The control plane logs (API server, Scheduler, and Controller Manager) incur Cloud Logging charges.

Metrics

Metric name                  Autopilot  Standard
System                       ✓          ✓
API server                   ✓          ✓
Scheduler                    ✓          ✓
Controller Manager           ✓          ✓
Persistent volume (Storage)  ✓          ✓
Pods                         ✓          ✓
Deployment                   ✓          ✓
StatefulSet                  ✓          ✓
DaemonSet                    ✓          ✓
HorizontalPodAutoscaler      ✓          ✓

All registered clusters in a project that has GKE Enterprise enabled can use control plane metrics and kube-state-metrics without any additional charges. Otherwise, these metrics incur Cloud Monitoring charges.

You can choose to disable the default logs and metrics during cluster creation or after the cluster is created.

Configure monitoring and logging for a new cluster

The cluster-creation instructions in this section only cover the options relevant to Cloud Logging and Cloud Monitoring. For complete instructions on creating a GKE cluster, see the documentation for creating a Standard or Autopilot cluster.

To manually configure logging and monitoring while creating a GKE cluster, complete the following steps:

Console

For an Autopilot cluster:

  1. On the Autopilot cluster creation page, from the navigation pane, click Advanced settings.

  2. In the Operations list, select which logs and metrics you want collected.

    • In the Components list for Cloud Logging, select the components from which you want to collect logs.

    • In the Components list for Cloud Monitoring, select the components from which you want to collect metrics.

    Autopilot clusters always use Google's best practices for telemetry collection, meaning that system and workload logging is always enabled and system monitoring is always enabled.

  3. Click Create.

For a Standard cluster:

  1. On the Standard cluster creation page, from the navigation pane, under Cluster, click Features.

  2. In the Operations list, select which logs and metrics you want collected.

    • In the Components list for Cloud Logging, select the components from which you want to collect logs.

    • In the Components list for Cloud Monitoring, select the components from which you want to collect metrics.

    • To disable Cloud Logging (except for audit logs), clear the Enable Cloud Logging checkbox.

    • To disable Cloud Monitoring, clear the Enable Cloud Monitoring checkbox.

    • To disable Google Cloud Managed Service for Prometheus, clear the Enable Google Cloud Managed Service for Prometheus checkbox.

gcloud

  1. For new clusters, Cloud Logging and Cloud Monitoring are enabled by default. To create your cluster, run the following command:

    gcloud container clusters create CLUSTER_NAME \
        --location=COMPUTE_LOCATION
    

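    Replace the following:

      • CLUSTER_NAME: the name of the new cluster.
      • COMPUTE_LOCATION: the Compute Engine location for the new cluster.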

    1. Alternatively, you can configure which logs are sent to Cloud Logging by passing a comma-separated list of values to the create command's --logging flag. To collect no logs, pass --logging=NONE. To collect system, API server, Scheduler, and Controller Manager logs, pass --logging=SYSTEM,API_SERVER,SCHEDULER,CONTROLLER_MANAGER. To collect both system and workload logs, pass --logging=SYSTEM,WORKLOAD. For example:

      gcloud container clusters create CLUSTER_NAME \
          --location=COMPUTE_LOCATION \
          --logging=SYSTEM,WORKLOAD
      
    2. Similarly, you can configure which metrics are sent to Cloud Monitoring by passing a comma-separated list of values to the --monitoring flag. To collect no metrics, pass --monitoring=NONE. To collect system metrics, pass --monitoring=SYSTEM. To collect all metrics, pass --monitoring=SYSTEM,API_SERVER,SCHEDULER,CONTROLLER_MANAGER,STORAGE,POD,DEPLOYMENT,STATEFULSET,DAEMONSET,HPA. For example:

      gcloud container clusters create CLUSTER_NAME \
          --location=COMPUTE_LOCATION \
          --monitoring=SYSTEM,API_SERVER,SCHEDULER,CONTROLLER_MANAGER,STORAGE,POD,DEPLOYMENT,STATEFULSET,DAEMONSET,HPA
      
    3. Separately, you can enable Google Cloud Managed Service for Prometheus by using the --enable-managed-prometheus flag. For example:

      gcloud container clusters create CLUSTER_NAME \
          --location=COMPUTE_LOCATION \
          --enable-managed-prometheus
      

      The --enable-managed-prometheus flag enables the managed collector, which must be configured.
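
The three options above are shown separately, but they could plausibly be combined in a single create command. The following sketch assumes that the flags can be passed together; the function name and the overridable GCLOUD variable are hypothetical and used only for illustration.

```shell
# Sketch only: create a cluster with explicit logging and monitoring
# settings and with Google Cloud Managed Service for Prometheus enabled.
# GCLOUD can be overridden (for example, GCLOUD=echo) to preview the command.
GCLOUD=${GCLOUD:-gcloud}

# $1 = cluster name, $2 = location,
# $3 = --logging value, $4 = --monitoring value.
create_cluster_with_telemetry() {
  "$GCLOUD" container clusters create "$1" \
    --location="$2" \
    --logging="$3" \
    --monitoring="$4" \
    --enable-managed-prometheus
}
```

For example, create_cluster_with_telemetry CLUSTER_NAME COMPUTE_LOCATION SYSTEM,WORKLOAD SYSTEM,POD,DEPLOYMENT would request system and workload logs together with system, Pod, and Deployment metrics.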

Terraform

  • To configure the collection of logs and metrics using Terraform, see the logging_config and monitoring_config blocks in the Terraform registry for google_container_cluster. Enabling the collection of logs from the API server, scheduler, and controller manager requires the Terraform Google Cloud provider version 4.44.0 or later.

  • For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.

Configure monitoring and logging for an existing cluster

The following section details how to modify the Cloud Logging and Cloud Monitoring integration for an existing GKE cluster.

Changing your monitoring and logging support and changing the Kubernetes version of your cluster are distinct actions. Changing the Kubernetes version of your cluster doesn't change your configured monitoring and logging support.

Which monitoring and logging support does my cluster use?

To see the Cloud Logging and Cloud Monitoring integration settings for your cluster, follow these steps:

  1. In the navigation panel of the Google Cloud console, select Kubernetes Engine, and then select Clusters:

  2. In the Details panel for your cluster, see the status for Cloud Logging, Cloud Monitoring, and Google Cloud Managed Service for Prometheus.
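
The same settings can likely be inspected from the command line. The following sketch assumes that the cluster description returned by gcloud exposes loggingConfig and monitoringConfig fields, as the GKE API's cluster resource does; the function name and the overridable GCLOUD variable are hypothetical.

```shell
# Sketch only: print the logging and monitoring configuration of a cluster.
# GCLOUD can be overridden (for example, GCLOUD=echo) to preview the command.
GCLOUD=${GCLOUD:-gcloud}

# $1 = cluster name, $2 = location.
show_telemetry_config() {
  "$GCLOUD" container clusters describe "$1" \
    --location="$2" \
    --format="yaml(loggingConfig,monitoringConfig)"
}
```

Running show_telemetry_config CLUSTER_NAME COMPUTE_LOCATION should list the enabled log and metric components, along with the managed Prometheus setting if it is present in the cluster description.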

Modify your cluster

To change the Cloud Logging or Cloud Monitoring integration settings for an existing cluster, follow these steps:

Console

  1. In the navigation panel of the Google Cloud console, select Kubernetes Engine, and then select Clusters:

  2. Click the name of your cluster.

  3. To modify which logs are sent to Cloud Logging, which metrics are sent to Cloud Monitoring, or whether Google Cloud Managed Service for Prometheus is enabled, click Edit next to Cloud Logging, Cloud Monitoring, or Google Cloud Managed Service for Prometheus.

  4. Click Save.

gcloud

The following gcloud instructions cover changing your cluster's monitoring and logging support by using the gcloud container clusters update command. Notice that you use the update command, not the upgrade command.

  • Configure which logs are sent to Cloud Logging by passing a comma-separated list of values to the gcloud container clusters update command's --logging flag. See a full list of available log sources. For example, to collect both system and workload logs, pass --logging=SYSTEM,WORKLOAD. To collect only system logs, pass --logging=SYSTEM. Or, to collect no logs, pass --logging=NONE:

    gcloud container clusters update CLUSTER_NAME \
        --location=COMPUTE_LOCATION \
        --logging=NONE
    
  • Configure which metrics are sent to Cloud Monitoring by passing a comma-separated list of values to the gcloud container clusters update command's --monitoring flag. See a full list of available metric sources. For example, to collect system metrics, pass --monitoring=SYSTEM. Or, to collect no metrics, pass --monitoring=NONE:

    gcloud container clusters update CLUSTER_NAME \
        --location=COMPUTE_LOCATION \
        --monitoring=NONE
    
  • Configure whether Google Cloud Managed Service for Prometheus is enabled by using the --enable-managed-prometheus or --disable-managed-prometheus flags. For example:

    gcloud container clusters update CLUSTER_NAME \
        --location=COMPUTE_LOCATION \
        --enable-managed-prometheus
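
Since enabling and disabling Google Cloud Managed Service for Prometheus differ only in the flag that is passed, the choice can be wrapped in a small helper. This is a minimal sketch; the function name, the on and off keywords, and the overridable GCLOUD variable are hypothetical.

```shell
# Sketch only: turn managed Prometheus on or off for an existing cluster.
# GCLOUD can be overridden (for example, GCLOUD=echo) to preview the command.
GCLOUD=${GCLOUD:-gcloud}

# $1 = cluster name, $2 = location, $3 = "on" or "off".
set_managed_prometheus() {
  case $3 in
    on)  flag=--enable-managed-prometheus ;;
    off) flag=--disable-managed-prometheus ;;
    *)
      echo "usage: set_managed_prometheus CLUSTER LOCATION on|off" >&2
      return 2 ;;
  esac
  "$GCLOUD" container clusters update "$1" --location="$2" "$flag"
}
```

For example, set_managed_prometheus CLUSTER_NAME COMPUTE_LOCATION off would run the update command with --disable-managed-prometheus.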
    

Terraform

  • To configure the collection of logs and metrics using Terraform, see the logging_config and monitoring_config blocks in the Terraform registry for google_container_cluster. Enabling the collection of logs from the API server, scheduler, and controller manager requires the Terraform Google Cloud provider version 4.44.0 or later.

  • For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.

Deprecated configuration parameters

The older configuration parameters for logging and monitoring support on GKE clusters are deprecated. The following list shows, for each old configuration, the deprecated create and update arguments and the new arguments that replace them.

Disabled
  • Old create arguments: --no-enable-stackdriver-kubernetes
  • Old update arguments: --no-enable-stackdriver-kubernetes
  • New create and update arguments: --logging=NONE --monitoring=NONE

System monitoring only (Logging disabled)
  • Old create arguments: --enable-stackdriver-kubernetes --no-enable-cloud-logging
  • Old update arguments: --logging-service=none --monitoring-service=monitoring.googleapis.com/kubernetes
  • New create and update arguments: --logging=NONE --monitoring=SYSTEM

System and workload logging only (Monitoring disabled)
  • Old create arguments: --enable-stackdriver-kubernetes --no-enable-cloud-monitoring
  • Old update arguments: --logging-service=logging.googleapis.com/kubernetes --monitoring-service=none
  • New create and update arguments: --logging=SYSTEM,WORKLOAD --monitoring=NONE

System logging and monitoring only (beta)
  • Old create arguments: --enable-logging-monitoring-system-only
  • Old update arguments: --enable-logging-monitoring-system-only
  • New create and update arguments: --logging=SYSTEM --monitoring=SYSTEM

System and workload logging and monitoring
  • Old create arguments: --enable-stackdriver-kubernetes
  • Old update arguments: --enable-stackdriver-kubernetes
  • New create and update arguments: --logging=SYSTEM,WORKLOAD --monitoring=SYSTEM
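
When migrating scripts, the mapping above can be encoded as a lookup. The following sketch translates a set of deprecated create arguments into the replacement arguments; the function name new_flags_for is hypothetical, and the match depends on the arguments being passed in the order shown above.

```shell
# Sketch only: map deprecated create arguments to the new --logging and
# --monitoring arguments. Arguments must be given in the listed order.
new_flags_for() (
  case "$*" in
    "--no-enable-stackdriver-kubernetes")
      printf '%s\n' "--logging=NONE --monitoring=NONE" ;;
    "--enable-stackdriver-kubernetes --no-enable-cloud-logging")
      printf '%s\n' "--logging=NONE --monitoring=SYSTEM" ;;
    "--enable-stackdriver-kubernetes --no-enable-cloud-monitoring")
      printf '%s\n' "--logging=SYSTEM,WORKLOAD --monitoring=NONE" ;;
    "--enable-logging-monitoring-system-only")
      printf '%s\n' "--logging=SYSTEM --monitoring=SYSTEM" ;;
    "--enable-stackdriver-kubernetes")
      printf '%s\n' "--logging=SYSTEM,WORKLOAD --monitoring=SYSTEM" ;;
    *)
      echo "error: no known replacement for: $*" >&2
      exit 1 ;;
  esac
)
```

For example, new_flags_for --enable-stackdriver-kubernetes --no-enable-cloud-logging prints --logging=NONE --monitoring=SYSTEM.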

What's next

  • Learn about the costs associated with Cloud Logging, Cloud Monitoring, and Google Cloud Managed Service for Prometheus by reading the Pricing page.