Event types

This document describes the event types that you can display on your charts. An event is an activity, such as a reboot or a crash, that affects the operation of a system. Showing events can help you correlate data from different sources when you're troubleshooting an issue.

For each event type, the following information is provided:

  • A query suitable for use with the Logs Explorer or with a log-based alerting policy.
  • References to general information or to troubleshooting documentation.

The following screenshot illustrates a chart that is displaying one annotation, with the tooltip for the annotation activated:

[Screenshot: Chart displaying warning and informational event annotations.]

Each annotation can list multiple events. In the previous screenshot, an event for a GKE deployment is listed.

To learn how to show events on your custom dashboards, see Show events on a dashboard.

Google Kubernetes Engine event types

This section describes the Google Kubernetes Engine event types that can be displayed on a dashboard.

Patched or updated GKE workload

This event type helps you troubleshoot changes to GKE workloads (Deployments, StatefulSets, and DaemonSets), because these changes can correlate with performance regressions or other performance issues. This event type is shown when a workload is created, updated, or deleted.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=k8s_cluster protoPayload.methodName=(
    io.k8s.apps.v1.deployments.create OR io.k8s.apps.v1.deployments.patch OR
    io.k8s.apps.v1.deployments.update OR io.k8s.apps.v1.deployments.delete OR
    io.k8s.apps.v1.deployments.deletecollection OR io.k8s.apps.v1.statefulsets.create OR
    io.k8s.apps.v1.statefulsets.patch OR io.k8s.apps.v1.statefulsets.update OR
    io.k8s.apps.v1.statefulsets.delete OR io.k8s.apps.v1.statefulsets.deletecollection OR
    io.k8s.apps.v1.daemonsets.create OR io.k8s.apps.v1.daemonsets.patch OR
    io.k8s.apps.v1.daemonsets.update OR io.k8s.apps.v1.daemonsets.delete OR
    io.k8s.apps.v1.daemonsets.deletecollection
)
-protoPayload.authenticationInfo.principalEmail="system:addon-manager"
-protoPayload.request.metadata.namespace=(kube-system OR gmp-system OR gmp-public OR gke-gmp-system)

For additional information, see Overview of deploying workloads and View observability metrics.
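
The method names in the previous query are every combination of three workload kinds and five verbs. As a minimal sketch, the following Python helper rebuilds an equivalent filter from those two lists, which can be convenient if you want to alert on a subset (for example, only Deployment patches). The function and constant names are illustrative and aren't part of any Google Cloud library.

```python
from itertools import product

# Kinds, verbs, and namespaces are copied from the query above.
# The helper names below are illustrative, not part of any library.
WORKLOAD_KINDS = ("deployments", "statefulsets", "daemonsets")
VERBS = ("create", "patch", "update", "delete", "deletecollection")
EXCLUDED_NAMESPACES = ("kube-system", "gmp-system", "gmp-public", "gke-gmp-system")


def workload_method_names(kinds=WORKLOAD_KINDS, verbs=VERBS):
    """Return one audit-log method name for every kind/verb pair."""
    return [f"io.k8s.apps.v1.{kind}.{verb}" for kind, verb in product(kinds, verbs)]


def workload_change_filter(kinds=WORKLOAD_KINDS, verbs=VERBS):
    """Assemble a filter with the same clauses as the query above."""
    methods = " OR ".join(workload_method_names(kinds, verbs))
    namespaces = " OR ".join(EXCLUDED_NAMESPACES)
    return "\n".join([
        f"resource.type=k8s_cluster protoPayload.methodName=({methods})",
        '-protoPayload.authenticationInfo.principalEmail="system:addon-manager"',
        f"-protoPayload.request.metadata.namespace=({namespaces})",
    ])


if __name__ == "__main__":
    # Print a narrower filter that only matches Deployment patches.
    print(workload_change_filter(kinds=("deployments",), verbs=("patch",)))
```

You can paste the printed string into the Logs Explorer to check it before you use it in an alerting policy.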

Crash of a GKE pod

This event type helps you identify and troubleshoot GKE pod crashes. Pod crashes can be caused by memory exhaustion or an application error. This event type is shown when any of the following occur:

  • Pod status is CrashLoopBackOff.
  • Pod terminates with a non-zero exit code.
  • Pod terminates with an out-of-memory condition.
  • Pod is evicted.
  • Readiness or liveness probe fails.

If you want to create a log-based alerting policy for this event type, then use the following query:

(
    log_id(events)
    (
        (resource.type=k8s_pod jsonPayload.reason=(BackOff OR Unhealthy OR Killing OR Evicted)) OR
        (resource.type=k8s_node jsonPayload.reason=OOMKilling)
    )
    severity=WARNING
) OR (
    log_id(cloudaudit.googleapis.com%2Factivity) resource.type=k8s_cluster
    (protoPayload.methodName=io.k8s.core.v1.pods.eviction.create OR
        (protoPayload.methodName=io.k8s.core.v1.pods.delete
        protoPayload.response.status.containerStatuses.state.terminated.exitCode:*
        -protoPayload.response.status.containerStatuses.state.terminated.exitCode=0
        )
    )
)

For troubleshooting information, see Troubleshoot: CrashLoopBackOff.
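
If you export log entries and want to triage them offline, the first branch of the previous query can be approximated in a few lines of Python. This is only a sketch: it assumes each entry is a dictionary shaped like a LogEntry in JSON form (`logName`, `severity`, `resource.type`, `jsonPayload.reason`), it uses exact string comparison, and it ignores the audit-log branch of the query. The function name is hypothetical.

```python
# Reasons per resource type, copied from the log_id(events) branch of the query.
CRASH_REASONS = {
    "k8s_pod": {"BackOff", "Unhealthy", "Killing", "Evicted"},
    "k8s_node": {"OOMKilling"},
}


def is_crash_event(entry):
    """Approximate the log_id(events) branch of the query for one entry dict.

    Simplification: exact string matches only; the audit-log branch
    (pod eviction and non-zero exit codes) is not covered.
    """
    if not entry.get("logName", "").endswith("/logs/events"):
        return False
    if entry.get("severity") != "WARNING":
        return False
    resource_type = entry.get("resource", {}).get("type")
    reason = entry.get("jsonPayload", {}).get("reason")
    return reason in CRASH_REASONS.get(resource_type, set())


if __name__ == "__main__":
    # Invented sample entry, for illustration only.
    sample = {
        "logName": "projects/example-project/logs/events",
        "severity": "WARNING",
        "resource": {"type": "k8s_pod"},
        "jsonPayload": {"reason": "BackOff"},
    }
    print(is_crash_event(sample))
```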

Failure to schedule a GKE pod

This event type helps you identify and troubleshoot pods that can't be scheduled on a node. This event type is shown when pod scheduling fails for any of the following reasons:

  • Insufficient node CPU.
  • Insufficient node memory.
  • No nodes for taints or tolerations.
  • Nodes at the maximum pod limit.
  • Node pool at maximum size.

If you want to create a log-based alerting policy for this event type, then use the following query:

(
    log_id(events) resource.type=k8s_pod jsonPayload.reason=(NotTriggerScaleUp OR FailedScheduling)
) OR (
    log_id(container.googleapis.com/cluster-autoscaler-visibility)
    resource.type=k8s_cluster jsonPayload.noDecisionStatus.noScaleUp:*
)

For troubleshooting information, see Troubleshoot: Pod unschedulable.

Failure to create a GKE container

This event type helps you identify and troubleshoot failures to create a GKE container. Container creation might fail due to reasons such as failed volume mounts or failed image pulls.

If you want to create a log-based alerting policy for this event type, then use the following query:

log_id(events) resource.type=k8s_pod jsonPayload.reason=(Failed OR FailedMount) severity=WARNING

For troubleshooting information, see Troubleshoot: ImagePullBackOff and ErrImagePull.
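
To show where a query like the previous one fits in a log-based alerting policy, the following sketch builds a policy body as a Python dictionary and prints it as JSON. The field names (`conditionMatchedLog`, `alertStrategy.notificationRateLimit`) reflect our understanding of the Cloud Monitoring AlertPolicy resource; verify them against the API reference before use. The display name, the 300-second rate limit, and the empty notification channel list are placeholder assumptions.

```python
import json

# Filter copied from the query above.
CONTAINER_CREATE_FAILURE_FILTER = (
    "log_id(events) resource.type=k8s_pod "
    "jsonPayload.reason=(Failed OR FailedMount) severity=WARNING"
)


def log_alert_policy(display_name, log_filter, rate_limit_seconds=300, channels=()):
    """Build a dict shaped like a log-based alerting policy (assumed schema)."""
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"Log match: {display_name}",
                "conditionMatchedLog": {"filter": log_filter},
            }
        ],
        # Limits how often notifications are sent for matching log entries.
        "alertStrategy": {
            "notificationRateLimit": {"period": f"{rate_limit_seconds}s"}
        },
        # Placeholder: add notification channel resource names here.
        "notificationChannels": list(channels),
    }


if __name__ == "__main__":
    policy = log_alert_policy(
        "GKE container creation failure", CONTAINER_CREATE_FAILURE_FILTER
    )
    print(json.dumps(policy, indent=2))
```

The same helper works with any other query in this document, because only the filter string changes between event types.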

Pod autoscaler scale up and down

This event gives you visibility into Horizontal Pod Autoscaler rescales, which increase or decrease the number of running pods for a workload. For more information, see Horizontal Pod autoscaling.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=k8s_cluster log_id(events)
jsonPayload.involvedObject.kind=HorizontalPodAutoscaler jsonPayload.reason=SuccessfulRescale

Cluster autoscaler scale up and down

This event gives you visibility into when the cluster autoscaler increases or decreases the number of nodes in a node pool of your cluster. For more information, see About cluster autoscaling and Viewing cluster autoscaler events.

If you want to create a log-based alerting policy for this event type, then use the following query:

(resource.type=k8s_cluster log_id(container.googleapis.com%2Fcluster-autoscaler-visibility)
jsonPayload.decision:*)

Cluster create and delete

This event tracks GKE cluster creation and deletion actions. For more information, see Create an Autopilot cluster, Creating a zonal cluster, and Deleting a cluster.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gke_cluster log_id(cloudaudit.googleapis.com%2Factivity)
protoPayload.methodName=(
    google.container.v1alpha1.ClusterManager.CreateCluster OR
    google.container.v1beta1.ClusterManager.CreateCluster OR
    google.container.v1.ClusterManager.CreateCluster OR
    google.container.v1alpha1.ClusterManager.DeleteCluster OR
    google.container.v1beta1.ClusterManager.DeleteCluster OR
    google.container.v1.ClusterManager.DeleteCluster
)
operation.first=true

Cluster update

This event tracks GKE cluster updates. Updates include automatic control plane version upgrades as well as manual upgrades and cluster configuration changes. For more information, see Manually upgrading a cluster or node pool and Standard cluster upgrades.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gke_cluster log_id(cloudaudit.googleapis.com%2Factivity)
(
    protoPayload.methodName=(
        google.container.internal.ClusterManagerInternal.PatchCluster OR
        google.container.internal.ClusterManagerInternal.UpdateClusterInternal OR
        google.container.internal.ClusterManagerInternal.UpdateCluster
    )
) OR (
    protoPayload.methodName=(
        google.container.v1beta1.ClusterManager.UpdateCluster OR
        google.container.v1.ClusterManager.UpdateCluster
    )
    operation.first=true
)
protoPayload.metadata.operationType=(UPGRADE_MASTER OR REPAIR_CLUSTER OR UPDATE_CLUSTER)

Node pool update

This event tracks GKE node pool updates. Updates include automatic node pool version upgrades as well as manual upgrades, configuration changes, and resizes. For more information, see Manually upgrading a cluster or node pool and Standard cluster upgrades.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gke_nodepool log_id(cloudaudit.googleapis.com%2Factivity)
(
    protoPayload.methodName=(
        google.container.internal.ClusterManagerInternal.UpdateClusterInternal OR
        google.container.internal.ClusterManagerInternal.RepairNodePool
    )
) OR (
    protoPayload.methodName=(
        google.container.v1beta1.ClusterManager.UpdateNodePool OR
        google.container.v1.ClusterManager.UpdateNodePool OR
        google.container.v1beta1.ClusterManager.SetNodePoolSize OR
        google.container.v1.ClusterManager.SetNodePoolSize OR
        google.container.v1beta1.ClusterManager.SetNodePoolManagement OR
        google.container.v1.ClusterManager.SetNodePoolManagement OR
        google.container.v1beta1.ClusterManager.SetNodePoolAutoscaling OR
        google.container.v1.ClusterManager.SetNodePoolAutoscaling
    )
    operation.first=true
)

Cloud Run event types

This section describes the Cloud Run event types that can be displayed on a dashboard.

Cloud Run deployment

This event type helps you identify and troubleshoot Cloud Run deployment failures. A deployment might fail for reasons such as a deleted service account, incorrect permissions, a failed container import, or a container that failed to start.

If you want to create a log-based alerting policy for this event type, then use the following query:

log_id(cloudaudit.googleapis.com%2Factivity) resource.type=cloud_run_revision
protoPayload.methodName=google.cloud.run.v1.Services.ReplaceService

For troubleshooting information, see Troubleshoot: Cloud Run issues.
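
When you correlate a deployment with a change on a chart, it often helps to restrict the query to a short window around the time of interest. The following sketch appends `timestamp` comparisons to a filter. The helper names and the 15-minute default margin are assumptions made for illustration, and the datetime that you pass must be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

# Filter copied from the query above.
CLOUD_RUN_DEPLOYMENT_FILTER = (
    "log_id(cloudaudit.googleapis.com%2Factivity) resource.type=cloud_run_revision\n"
    "protoPayload.methodName=google.cloud.run.v1.Services.ReplaceService"
)


def _rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def with_time_window(log_filter, center, margin=timedelta(minutes=15)):
    """Restrict a filter to entries within `margin` of `center`.

    The original filter is wrapped in parentheses so that any top-level
    OR in it stays grouped when the timestamp clauses are added.
    """
    start = _rfc3339(center - margin)
    end = _rfc3339(center + margin)
    return f'({log_filter})\ntimestamp>="{start}" timestamp<="{end}"'


if __name__ == "__main__":
    # Hypothetical time of a suspected regression.
    regression_time = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    print(with_time_window(CLOUD_RUN_DEPLOYMENT_FILTER, regression_time))
```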

Cloud SQL event types

This section describes the Cloud SQL event types that can be displayed on a dashboard.

Cloud SQL failover

This event type helps you identify when manual or automatic failovers occur. A failover occurs when there is an instance or zone failure and the standby instance becomes the new primary instance. During a failover, Cloud SQL automatically switches to serving data from the standby instance.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=cloudsql_database
(
    (
        log_id(cloudaudit.googleapis.com%2Factivity)
        protoPayload.methodName=cloudsql.instances.failover
        operation.last=true
    ) OR (
        log_id(cloudaudit.googleapis.com%2Fsystem_event)
        protoPayload.methodName=cloudsql.instances.autoFailover
    )
)

For additional information, see About high availability.
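
The previous query matches two method names, one from the Admin Activity audit log and one from the System Event audit log. The following sketch labels an exported entry as a manual or an automatic failover based on that method name. The mapping of `cloudsql.instances.failover` to "manual" and `cloudsql.instances.autoFailover` to "automatic" is our reading of the query rather than a documented contract, and the function name is hypothetical.

```python
# Assumed mapping, inferred from the two branches of the query above.
FAILOVER_KINDS = {
    "cloudsql.instances.failover": "manual",
    "cloudsql.instances.autoFailover": "automatic",
}


def failover_kind(entry):
    """Return "manual", "automatic", or None for a log entry dict."""
    if entry.get("resource", {}).get("type") != "cloudsql_database":
        return None
    method = entry.get("protoPayload", {}).get("methodName")
    return FAILOVER_KINDS.get(method)


if __name__ == "__main__":
    # Invented sample entry, for illustration only.
    sample = {
        "resource": {"type": "cloudsql_database"},
        "protoPayload": {"methodName": "cloudsql.instances.autoFailover"},
    }
    print(failover_kind(sample))
```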

Cloud SQL start or stop

This event type helps you identify when a Cloud SQL instance has been manually started, stopped, or restarted. When an instance is stopped, all connections, open files, and running operations are also stopped.

If you want to create a log-based alerting policy for this event type, then use the following query:

log_id(cloudaudit.googleapis.com%2Factivity) resource.type=cloudsql_database
protoPayload.methodName=cloudsql.instances.update operation.last=true
protoPayload.metadata.intents.intent=(START_INSTANCE OR STOP_INSTANCE)

For additional information, see About high availability and Start, stop, and restart instances.

Cloud SQL storage

This event type helps you identify events related to Cloud SQL storage, including when database storage is full and when a database is shut down because it reached storage capacity. Databases that are at storage capacity and that don't have automatic storage increases enabled might be shut down to prevent data corruption.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=cloudsql_database
(
    (
        (log_id(cloudsql.googleapis.com%2Fpostgres.log) OR log_id(cloudsql.googleapis.com%2Fmysql.err))
        textPayload=~"No space left on device"
        severity=(ERROR OR EMERGENCY)
    ) OR (
        log_id(cloudaudit.googleapis.com%2Fsystem_event)
        protoPayload.methodName=cloudsql.instances.databaseShutdownOutOfStorage
    )
)

Compute Engine event types

This section describes the Compute Engine event types that can be displayed on a dashboard.

Virtual machine terminations

This event type helps you identify virtual machine (VM) terminations, including manually triggered resets and stops, guest OS terminations, maintenance terminations, and host errors.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gce_instance
(
    (
        log_id(cloudaudit.googleapis.com%2Factivity)
        protoPayload.methodName=(
            beta.compute.instances.reset OR v1.compute.instances.reset OR
            beta.compute.instances.stop OR v1.compute.instances.stop
        )
        operation.first=true
    ) OR (
        log_id(cloudaudit.googleapis.com%2Fsystem_event)
        protoPayload.methodName=(
            compute.instances.hostError OR
            compute.instances.guestTerminate OR
            compute.instances.terminateOnHostMaintenance
        )
    )
)

For additional information, see Stop and start a VM and Troubleshooting VM shutdowns and reboots.

VM instance start failure

This event tracks Compute Engine VM instance start failures. The event displays start failures due to stockouts, IP space exhaustion, quota exceeded, or shielded-VM integrity errors.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gce_instance
(
    (
        log_id(cloudaudit.googleapis.com%2Factivity)
        protoPayload.methodName=(beta.compute.instances.insert OR v1.compute.instances.insert)
        protoPayload.status.message=(ZONE_RESOURCE_POOL_EXHAUSTED OR IP_SPACE_EXHAUSTED OR QUOTA_EXCEEDED)
    ) OR (
        log_id(compute.googleapis.com%2Fshielded_vm_integrity)
        severity="ERROR"
    )
)
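
The previous query combines two sources: audit-log entries for failed insert calls and error entries from the Shielded VM integrity log. As a rough sketch, the following function reports which cause an exported entry corresponds to. It assumes that entries are dictionaries in LogEntry JSON form with a URL-encoded `logName`. The `SHIELDED_VM_INTEGRITY` label and the function name are invented for this example.

```python
# Values copied from the query above; logName suffixes assume the
# URL-encoded form used in exported LogEntry JSON.
AUDIT_ACTIVITY_SUFFIX = "/logs/cloudaudit.googleapis.com%2Factivity"
INTEGRITY_SUFFIX = "/logs/compute.googleapis.com%2Fshielded_vm_integrity"
INSERT_METHODS = {"beta.compute.instances.insert", "v1.compute.instances.insert"}
CAPACITY_MESSAGES = {
    "ZONE_RESOURCE_POOL_EXHAUSTED",
    "IP_SPACE_EXHAUSTED",
    "QUOTA_EXCEEDED",
}


def start_failure_cause(entry):
    """Return the start-failure cause for a log entry dict, or None."""
    if entry.get("resource", {}).get("type") != "gce_instance":
        return None
    log_name = entry.get("logName", "")
    if log_name.endswith(INTEGRITY_SUFFIX):
        # Hypothetical label for the integrity-log branch of the query.
        return "SHIELDED_VM_INTEGRITY" if entry.get("severity") == "ERROR" else None
    if log_name.endswith(AUDIT_ACTIVITY_SUFFIX):
        payload = entry.get("protoPayload", {})
        message = payload.get("status", {}).get("message")
        if payload.get("methodName") in INSERT_METHODS and message in CAPACITY_MESSAGES:
            return message
    return None
```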

VM instance guest OS error

This event tracks specific Compute Engine VM instance guest OS errors as recorded by the serial console logs. The tracked errors are a full disk, a failed file system mount, and boot failures that activate Linux emergency mode.

For these events to be visible, you must enable serial port output logging to Cloud Logging by setting serial-port-logging-enable=true in the VM or in the project metadata. For more information, see Enabling and disabling serial port output logging.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gce_instance
log_id(serialconsole.googleapis.com%2Fserial_port_1_output)
textPayload=~("No space left on device" OR "Failed to mount" OR "You are in emergency mode")
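
The `=~` operator in the previous query performs a regular-expression match against `textPayload`. If you want to check which lines of a saved serial console dump would match, you can approximate the behavior locally with Python's `re` module, as in the following sketch. The sample lines are invented, and Python's regex engine isn't identical to the one used by Cloud Logging, although these three literal patterns behave the same in both.

```python
import re

# Patterns copied from the query above.
GUEST_OS_ERROR_PATTERNS = (
    "No space left on device",
    "Failed to mount",
    "You are in emergency mode",
)
# An unanchored search for any of the patterns, similar to `=~`.
_GUEST_OS_ERROR_RE = re.compile("|".join(GUEST_OS_ERROR_PATTERNS))


def guest_os_errors(serial_lines):
    """Return the lines that contain any of the tracked error patterns."""
    return [line for line in serial_lines if _GUEST_OS_ERROR_RE.search(line)]


if __name__ == "__main__":
    # Invented serial console output, for illustration only.
    sample_lines = [
        "[  OK  ] Started example service.",
        "write error: No space left on device",
        "[FAILED] Failed to mount /data.",
    ]
    for line in guest_os_errors(sample_lines):
        print(line)
```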

Managed instance group update

This event type helps you identify when your managed instance group (MIG) has been updated, for example, when VMs have been added or removed or when the size limit has been updated. For more information, see Automatically apply VM configuration updates in a MIG.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=gce_instance_group_manager
log_id(cloudaudit.googleapis.com%2Factivity) operation.first=true
protoPayload.methodName=(beta.compute.instanceGroupManagers.patch OR v1.compute.instanceGroupManagers.patch)

For additional information, see Work with managed instances and Troubleshooting managed instance groups.

Managed instance group autoscaler

This event tracks scaling decisions made by the autoscaler of a MIG. These decisions could include changes in the recommended size for a MIG, or a change in the status of the autoscaler itself. For more information, see Autoscaling groups of instances.

If you want to create a log-based alerting policy for this event type, then use the following query:

resource.type=autoscaler log_id(cloudaudit.googleapis.com%2Fsystem_event)
protoPayload.methodName=(compute.autoscalers.resize OR compute.autoscalers.changeStatus)

Uptime check event types

This section describes the uptime check event types that can be displayed on a dashboard.

Uptime check failure

This event type helps you identify uptime check failures from configured regions.

If you want to create a log-based alerting policy for this event type, then use the following query:

log_id(monitoring.googleapis.com%2Fuptime_checks)
(
  resource.type=uptime_url OR resource.type=gce_instance OR
  resource.type=gae_app OR resource.type=k8s_service OR
  resource.type=servicedirectory_service OR resource.type=cloud_run_revision OR
  resource.type=aws_ec2_instance OR resource.type=aws_elb_load_balancer
)
labels.uptime_result_type=UptimeCheckResult
severity=NOTICE

For troubleshooting information, see Troubleshoot synthetic monitors and uptime checks.

What's next

To learn how to display events on your dashboards, see Show events on a dashboard.