This document describes the event types that you can display on your charts. An event is an activity, such as a reboot or a crash, that affects the operation of a system. Showing events can help you correlate data from different sources when you're troubleshooting an issue.
For each event type, the following information is provided:
- A query suitable for use with the Logs Explorer or with a log-based alerting policy.
- References to general information or to troubleshooting documentation.
The following screenshot illustrates a chart that is displaying one annotation, with the tooltip for the annotation activated:
Each annotation can list multiple events. In the previous screenshot, an event for a GKE deployment is listed.
To learn how to show events on your custom dashboards, see Show events on a dashboard.
Google Kubernetes Engine event types
This section describes the Google Kubernetes Engine event types that can be displayed on a dashboard.
Patched or updated GKE workload
This event type helps you troubleshoot changes to GKE workloads (Deployments, StatefulSets, and DaemonSets), because these changes can correlate with performance regressions or other performance issues. This event type is shown when a workload is created, updated, or deleted.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=k8s_cluster
protoPayload.methodName=(
  io.k8s.apps.v1.deployments.create OR
  io.k8s.apps.v1.deployments.patch OR
  io.k8s.apps.v1.deployments.update OR
  io.k8s.apps.v1.deployments.delete OR
  io.k8s.apps.v1.deployments.deletecollection OR
  io.k8s.apps.v1.statefulsets.create OR
  io.k8s.apps.v1.statefulsets.patch OR
  io.k8s.apps.v1.statefulsets.update OR
  io.k8s.apps.v1.statefulsets.delete OR
  io.k8s.apps.v1.statefulsets.deletecollection OR
  io.k8s.apps.v1.daemonsets.create OR
  io.k8s.apps.v1.daemonsets.patch OR
  io.k8s.apps.v1.daemonsets.update OR
  io.k8s.apps.v1.daemonsets.delete OR
  io.k8s.apps.v1.daemonsets.deletecollection
)
-protoPayload.authenticationInfo.principalEmail="system:addon-manager"
-protoPayload.request.metadata.namespace=(kube-system OR gmp-system OR gmp-public OR gke-gmp-system)
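The method names in this query follow a regular pattern: one entry for each combination of workload kind and verb. As a rough sketch, the following Python snippet generates the `protoPayload.methodName` clause from two lists, which might be convenient if you only want to alert on a subset. The function names are hypothetical, and the exclusion terms from the full query are intentionally left out.

```python
from itertools import product

# Resource kinds and verbs that appear in the query above.
KINDS = ["deployments", "statefulsets", "daemonsets"]
VERBS = ["create", "patch", "update", "delete", "deletecollection"]


def workload_method_names(kinds=KINDS, verbs=VERBS):
    """List the io.k8s.apps.v1 audit-log method names for each kind and verb."""
    return [f"io.k8s.apps.v1.{kind}.{verb}" for kind, verb in product(kinds, verbs)]


def workload_change_filter(kinds=KINDS, verbs=VERBS):
    """Assemble the methodName clause; exclusions from the full query are omitted."""
    methods = " OR ".join(workload_method_names(kinds, verbs))
    return f"resource.type=k8s_cluster protoPayload.methodName=( {methods} )"


# Example: restrict the clause to Deployment patches and updates.
print(workload_change_filter(kinds=["deployments"], verbs=["patch", "update"]))
```

With the default arguments, the helper produces the same 15 method names as the query in this section.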
For additional information, see Overview of deploying workloads and View observability metrics.
Crash of a GKE pod
This event type helps you identify and troubleshoot GKE pod crashes. Pod crashes can be caused by memory exhaustion or an application error. This event type is shown when any of the following occur:
- Pod status is CrashLoopBackOff.
- Pod terminates with a non-zero exit code.
- Pod terminates with an out-of-memory condition.
- Pod is evicted.
- Readiness or liveness probe fails.
If you want to create a log-based alerting policy for this event type, then use the following query:
(
  log_id(events)
  (
    (resource.type=k8s_pod jsonPayload.reason=(BackOff OR Unhealthy OR Killing OR Evicted))
    OR
    (resource.type=k8s_node jsonPayload.reason=OOMKilling)
  )
  severity=WARNING
)
OR
(
  log_id(cloudaudit.googleapis.com%2Factivity)
  resource.type=k8s_cluster
  (
    protoPayload.methodName=io.k8s.core.v1.pods.eviction.create
    OR
    (
      protoPayload.methodName=io.k8s.core.v1.pods.delete
      protoPayload.response.status.containerStatuses.state.terminated.exitCode:*
      -protoPayload.response.status.containerStatuses.state.terminated.exitCode=0
    )
  )
)
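To make the first clause of this query easier to follow, the following Python sketch applies the same conditions to a log entry represented as a plain dictionary. The dictionary keys (`log_id`, `resource_type`, `reason`, `severity`) are simplified stand-ins chosen for illustration and aren't the actual `LogEntry` field layout. The audit-log clause of the query isn't modeled.

```python
# Event reasons taken from the log_id(events) clause of the query above.
POD_REASONS = {"BackOff", "Unhealthy", "Killing", "Evicted"}
NODE_REASONS = {"OOMKilling"}


def matches_crash_event(entry):
    """Mirror the log_id(events) clause of the crash query on a simplified dict."""
    if entry.get("log_id") != "events" or entry.get("severity") != "WARNING":
        return False
    resource_type = entry.get("resource_type")
    reason = entry.get("reason")
    if resource_type == "k8s_pod":
        return reason in POD_REASONS
    if resource_type == "k8s_node":
        return reason in NODE_REASONS
    return False


# Made-up sample entries for illustration only.
samples = [
    {"log_id": "events", "resource_type": "k8s_pod", "reason": "BackOff", "severity": "WARNING"},
    {"log_id": "events", "resource_type": "k8s_node", "reason": "OOMKilling", "severity": "WARNING"},
    {"log_id": "events", "resource_type": "k8s_pod", "reason": "Pulled", "severity": "INFO"},
]
for sample in samples:
    print(sample["reason"], matches_crash_event(sample))
```

Note that pod-level reasons and the node-level `OOMKilling` reason are checked against different resource types, and both require a severity of `WARNING`.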
For troubleshooting information, see Troubleshoot: CrashLoopBackOff.
Failure to schedule a GKE pod
This event type helps you identify and troubleshoot pods that can't be scheduled on a node. This event type is shown when pod scheduling fails for any of the following reasons:
- Insufficient node CPU.
- Insufficient node memory.
- No nodes for taints or tolerations.
- Nodes at the maximum pod limit.
- Node pool at maximum size.
If you want to create a log-based alerting policy for this event type, then use the following query:
( log_id(events) resource.type=k8s_pod jsonPayload.reason=(NotTriggerScaleUp OR FailedScheduling) ) OR ( log_id(container.googleapis.com/cluster-autoscaler-visibility) resource.type=k8s_cluster jsonPayload.noDecisionStatus.noScaleUp:* )
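When you investigate a specific incident, you might want to narrow a query like this one to a single cluster and time range. The following Python sketch shows one way to do that, assuming you build the query string yourself. It wraps the original query in parentheses, so that its top-level `OR` stays grouped, and then adds restrictions on `resource.labels.cluster_name` and `timestamp`. The function name, the cluster name, and the timestamps are hypothetical.

```python
def scope_to_cluster(query, cluster_name, start=None, end=None):
    """Wrap a query in parentheses and AND extra restrictions onto it.

    start and end are optional RFC 3339 timestamp strings supplied by the caller.
    """
    parts = [f"({query})", f'resource.labels.cluster_name="{cluster_name}"']
    if start:
        parts.append(f'timestamp>="{start}"')
    if end:
        parts.append(f'timestamp<="{end}"')
    return " AND ".join(parts)


# Shortened version of the scheduling-failure query, for illustration only.
base_query = "log_id(events) resource.type=k8s_pod jsonPayload.reason=FailedScheduling"
print(scope_to_cluster(base_query, "example-cluster", start="2024-01-01T00:00:00Z"))
```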
For troubleshooting information, see Troubleshoot: Pod unschedulable.
Failure to create a GKE container
This event type helps you identify and troubleshoot failures to create a GKE container. Container creation might fail due to reasons such as failed volume mounts or failed image pulls.
If you want to create a log-based alerting policy for this event type, then use the following query:
log_id(events) resource.type=k8s_pod jsonPayload.reason=(Failed OR FailedMount) severity=WARNING
For troubleshooting information, see Troubleshoot: ImagePullBackOff and ErrImagePull.
Pod autoscaler scale up and down
This event gives you visibility into Horizontal Pod Autoscaler rescales, which increase or decrease the number of running pods for a workload. For more information, see Horizontal Pod autoscaling.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=k8s_cluster log_id(events) jsonPayload.involvedObject.kind=HorizontalPodAutoscaler jsonPayload.reason=SuccessfulRescale
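As a rough illustration of how a query like this one might be placed in a log-based alerting policy, the following Python sketch assembles a policy body in the general shape of the Cloud Monitoring API `AlertPolicy` resource, with a `conditionMatchedLog` condition and a notification rate limit. The helper name, the display name, and the 300-second rate limit are assumptions made for this example, and the notification channels are left empty.

```python
import json


def build_log_alert_policy(display_name, log_filter, rate_limit_seconds=300):
    """Return a dict shaped like an AlertPolicy for a log-based alert.

    Hypothetical helper: display_name and rate_limit_seconds are placeholder
    choices, not values taken from this document.
    """
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"Log match: {display_name}",
                "conditionMatchedLog": {"filter": log_filter},
            }
        ],
        "alertStrategy": {
            "notificationRateLimit": {"period": f"{rate_limit_seconds}s"}
        },
        "notificationChannels": [],
    }


# The Horizontal Pod Autoscaler query from this section.
hpa_filter = (
    "resource.type=k8s_cluster log_id(events) "
    "jsonPayload.involvedObject.kind=HorizontalPodAutoscaler "
    "jsonPayload.reason=SuccessfulRescale"
)
policy = build_log_alert_policy("HPA rescale", hpa_filter)
print(json.dumps(policy, indent=2))
```

The same helper could be reused with any other query in this document by passing a different filter string.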
Cluster autoscaler scale up and down
This event gives you visibility into when the cluster autoscaler scales up or down the number of nodes in a node pool of your cluster. For more information, see About cluster autoscaling and Viewing cluster autoscaler events.
If you want to create a log-based alerting policy for this event type, then use the following query:
(resource.type=k8s_cluster log_id(container.googleapis.com%2Fcluster-autoscaler-visibility) jsonPayload.decision:*)
Cluster create and delete
This event tracks GKE cluster creation and deletion actions. For more information, see Create an Autopilot cluster, Creating a zonal cluster, and Deleting a cluster.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gke_cluster log_id(cloudaudit.googleapis.com%2Factivity) protoPayload.methodName=( google.container.v1alpha1.ClusterManager.CreateCluster OR google.container.v1beta1.ClusterManager.CreateCluster OR google.container.v1.ClusterManager.CreateCluster OR google.container.v1alpha1.ClusterManager.DeleteCluster OR google.container.v1beta1.ClusterManager.DeleteCluster OR google.container.v1.ClusterManager.DeleteCluster ) operation.first=true
Cluster update
This event tracks GKE cluster updates. Updates include automatic control plane version upgrades as well as manual upgrades and cluster configuration changes. For more information, see Manually upgrading a cluster or node pool and Standard cluster upgrades.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gke_cluster
log_id(cloudaudit.googleapis.com%2Factivity)
(
  protoPayload.methodName=(
    google.container.internal.ClusterManagerInternal.PatchCluster OR
    google.container.internal.ClusterManagerInternal.UpdateClusterInternal OR
    google.container.internal.ClusterManagerInternal.UpdateCluster
  )
)
OR
(
  protoPayload.methodName=(
    google.container.v1beta1.ClusterManager.UpdateCluster OR
    google.container.v1.ClusterManager.UpdateCluster
  )
  operation.first=true
)
protoPayload.metadata.operationType=(UPGRADE_MASTER OR REPAIR_CLUSTER OR UPDATE_CLUSTER)
Node pool update
This event tracks GKE node pool updates. Updates include automatic node pool version upgrades as well as manual upgrades, configuration changes, and resizes. For more information, see Manually upgrading a cluster or node pool and Standard cluster upgrades.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gke_nodepool
log_id(cloudaudit.googleapis.com%2Factivity)
(
  protoPayload.methodName=(
    google.container.internal.ClusterManagerInternal.UpdateClusterInternal OR
    google.container.internal.ClusterManagerInternal.RepairNodePool
  )
)
OR
(
  protoPayload.methodName=(
    google.container.v1beta1.ClusterManager.UpdateNodePool OR
    google.container.v1.ClusterManager.UpdateNodePool OR
    google.container.v1beta1.ClusterManager.SetNodePoolSize OR
    google.container.v1.ClusterManager.SetNodePoolSize OR
    google.container.v1beta1.ClusterManager.SetNodePoolManagement OR
    google.container.v1.ClusterManager.SetNodePoolManagement OR
    google.container.v1beta1.ClusterManager.SetNodePoolAutoscaling OR
    google.container.v1.ClusterManager.SetNodePoolAutoscaling
  )
  operation.first=true
)
Cloud Run event types
This section describes the Cloud Run event types that can be displayed on a dashboard.
Cloud Run deployment
This event type helps you identify and troubleshoot Cloud Run deployment failures. Deployments might fail for reasons such as a deleted service account, incorrect permissions, a failed container import, or a container that fails to start.
If you want to create a log-based alerting policy for this event type, then use the following query:
log_id(cloudaudit.googleapis.com%2Factivity) resource.type=cloud_run_revision protoPayload.methodName=google.cloud.run.v1.Services.ReplaceService
For troubleshooting information, see Troubleshoot: Cloud Run issues.
Cloud SQL event types
This section describes the Cloud SQL event types that can be displayed on a dashboard.
Cloud SQL failover
This event type helps you identify when manual or automatic failovers occur. A failover occurs when there is an instance or zone failure and the standby instance becomes the new primary instance. During a failover, Cloud SQL automatically switches to serving data from the standby instance.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=cloudsql_database ( ( log_id(cloudaudit.googleapis.com%2Factivity) protoPayload.methodName=cloudsql.instances.failover operation.last=true ) OR ( log_id(cloudaudit.googleapis.com%2Fsystem_event) protoPayload.methodName=cloudsql.instances.autoFailover ) )
For additional information, see About high availability.
Cloud SQL start or stop
This event type helps you identify when a Cloud SQL instance has been manually started, stopped, or restarted. When an instance is stopped, all connections, open files, and running operations are also stopped.
If you want to create a log-based alerting policy for this event type, then use the following query:
log_id(cloudaudit.googleapis.com%2Factivity) resource.type=cloudsql_database protoPayload.methodName=cloudsql.instances.update operation.last=true protoPayload.metadata.intents.intent=(START_INSTANCE OR STOP_INSTANCE)
For additional information, see About high availability and Start, stop, and restart instances.
Cloud SQL storage
This event type helps you identify events related to Cloud SQL storage, including when database storage is full and when a database is shut down because it reached storage capacity. Databases that are at storage capacity and that don't have automatic storage increases enabled might be shut down to prevent data corruption.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=cloudsql_database ( ( (log_id(cloudsql.googleapis.com%2Fpostgres.log) OR log_id(cloudsql.googleapis.com%2Fmysql.err)) textPayload=~"No space left on device" severity=(ERROR OR EMERGENCY) ) OR ( log_id(cloudaudit.googleapis.com%2Fsystem_event) protoPayload.methodName=cloudsql.instances.databaseShutdownOutOfStorage ) )
Compute Engine event types
This section describes the Compute Engine event types that can be displayed on a dashboard.
Virtual machine terminations
This event type helps you identify virtual machine (VM) terminations, including manually triggered resets and stops, guest OS terminations, maintenance terminations, and host errors.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gce_instance ( ( log_id(cloudaudit.googleapis.com%2Factivity) protoPayload.methodName=( beta.compute.instances.reset OR v1.compute.instances.reset OR beta.compute.instances.stop OR v1.compute.instances.stop ) operation.first=true ) OR ( log_id(cloudaudit.googleapis.com%2Fsystem_event) protoPayload.methodName=( compute.instances.hostError OR compute.instances.guestTerminate OR compute.instances.terminateOnHostMaintenance ) ) )
For additional information, see Stop and start a VM and Troubleshooting VM shutdowns and reboots.
VM instance start failure
This event tracks Compute Engine VM instance start failures. The event displays start failures due to stockouts, IP space exhaustion, quota exceeded, or shielded-VM integrity errors.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gce_instance ( ( log_id(cloudaudit.googleapis.com%2Factivity) protoPayload.methodName=(beta.compute.instances.insert OR v1.compute.instances.insert) protoPayload.status.message=(ZONE_RESOURCE_POOL_EXHAUSTED OR IP_SPACE_EXHAUSTED OR QUOTA_EXCEEDED) ) OR ( log_id(compute.googleapis.com%2Fshielded_vm_integrity) severity="ERROR" ) )
VM instance guest OS error
This event tracks specific Compute Engine VM instance guest OS errors as recorded by the serial console logs. The errors tracked are full disks, file system mount failures, and boot failures that activate Linux emergency mode.
For these events to be visible, you must enable serial port output logging to Cloud Logging by setting serial-port-logging-enable=true in the VM or in the project metadata. For more information, see Enabling and disabling serial port output logging.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gce_instance log_id(serialconsole.googleapis.com%2Fserial_port_1_output) textPayload=~("No space left on device" OR "Failed to mount" OR "You are in emergency mode")
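The `=~` operator in this query performs a regular-expression match against the text payload. As a minimal sketch of the same idea applied locally, for example to serial console output that you have already exported, the following Python snippet labels a line according to which of the three patterns it contains. The labels, the function name, and the sample lines are illustrative assumptions.

```python
import re

# The three substrings from the query above, labelled with the error each indicates.
PATTERNS = {
    "disk full": re.compile(r"No space left on device"),
    "mount failure": re.compile(r"Failed to mount"),
    "emergency mode": re.compile(r"You are in emergency mode"),
}


def classify_serial_line(line):
    """Return the label of the first pattern found in the line, or None."""
    for label, pattern in PATTERNS.items():
        if pattern.search(line):
            return label
    return None


# Made-up sample lines for illustration only.
print(classify_serial_line("write error: No space left on device"))
print(classify_serial_line("[FAILED] Failed to mount /data."))
print(classify_serial_line("Started Session 1 of user root."))
```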
Managed instance group update
This event type helps you identify when your Managed Instance Group (MIG) has been updated. For example, VMs have been added or removed, or the size limit has been updated. For more information, see Automatically apply VM configuration updates in a MIG.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=gce_instance_group_manager log_id(cloudaudit.googleapis.com%2Factivity) operation.first=true protoPayload.methodName=(beta.compute.instanceGroupManagers.patch OR v1.compute.instanceGroupManagers.patch)
For additional information, see Work with managed instances and Troubleshooting managed instance groups.
Managed instance group autoscaler
This event tracks scaling decisions made by the autoscaler of a MIG. These decisions could include changes in the recommended size for a MIG, or a change in the status of the autoscaler itself. For more information, see Autoscaling groups of instances.
If you want to create a log-based alerting policy for this event type, then use the following query:
resource.type=autoscaler log_id(cloudaudit.googleapis.com%2Fsystem_event) protoPayload.methodName=(compute.autoscalers.resize OR compute.autoscalers.changeStatus)
Uptime check event types
This section describes the uptime check event types that can be displayed on a dashboard.
Uptime check failure
This event type helps you identify uptime check failures from configured regions.
If you want to create a log-based alerting policy for this event type, then use the following query:
log_id(monitoring.googleapis.com%2Fuptime_checks) ( resource.type=uptime_url OR resource.type=gce_instance OR resource.type=gae_app OR resource.type=k8s_service OR resource.type=servicedirectory_service OR resource.type=cloud_run_revision OR resource.type=aws_ec2_instance OR resource.type=aws_elb_load_balancer ) labels.uptime_result_type=UptimeCheckResult severity=NOTICE
For troubleshooting information, see Troubleshoot synthetic monitors and uptime checks.
What's next
To learn how to display events on your dashboards, see Show events on a dashboard.