There are two options for monitoring and logging support in Google Kubernetes Engine (GKE):
Legacy Logging and Monitoring: For documentation, see the Legacy Monitoring and Legacy Logging pages.
Cloud Operations for GKE: For documentation, see Cloud Operations for GKE.
This page explains the differences between these two options and what you must change to migrate from Legacy Logging and Monitoring to Cloud Operations for GKE.
When do I need to migrate?
You can migrate your existing Cloud Monitoring and Cloud Logging configurations from Legacy Logging and Monitoring to Cloud Operations for GKE at any time. However, keep in mind that Legacy Logging and Monitoring is not supported in GKE version 1.20.
The following table summarizes which monitoring and logging options are available in each GKE release version:
GKE version | Legacy Logging and Monitoring | Cloud Operations for GKE |
---|---|---|
1.14 | Available | Default |
1.15 | Available | Default |
1.16 | Available | Default |
1.17 | Available | Default |
1.18 | Available | Default |
1.19 | Available | Default |
1.20 | Not Available | Default |
For information on the deprecation of Legacy Logging and Monitoring, refer to the Legacy support for GKE deprecation guide.
What are the benefits of using Cloud Operations for GKE?
Cloud Operations for GKE provides important benefits, including the following:
- Improved infrastructure monitoring: the GKE dashboard includes more out-of-the-box metrics in the free tier, an increase from 17 legacy metrics to 44 new metrics.
- More resource types to better differentiate between Kubernetes resources, and more metadata for filtering and grouping metrics.
- Service-oriented monitoring support with SLO Monitoring for GKE.
- Consistent resource models across Cloud Logging and Cloud Monitoring.
- Performance improvements for all new GKE metrics.
What is changing?
Cloud Operations for GKE uses a different resource model than Legacy Logging and Monitoring to organize its metrics, logs, and metadata. Here are some specific changes for your clusters using Cloud Operations for GKE:
Navigation change: the Cloud Monitoring dashboard is named GKE. This dashboard appears only if you have clusters using Cloud Operations for GKE.
Monitored resource type name changes: For example, your Kubernetes nodes are listed under the monitored resource type k8s_node (Kubernetes Node) rather than gce_instance (Compute Engine VM instance).
Kubernetes metric name changes: In Cloud Operations for GKE, metric type names start with the prefix kubernetes.io/ rather than container.googleapis.com/.
logEntry metadata changes: Cloud Operations for GKE log entries changed the names of some resource.label and labels fields. For example, the field resource.labels.namespace_id has changed to resource.labels.namespace_name, while the value has not changed.
logName changes: Cloud Operations for GKE log entries use stdout or stderr in their log names, whereas Legacy Logging and Monitoring uses a wider variety of names, including the container name. The container name is still available in Cloud Operations for GKE as a resource label under resource.labels.container_name.
The following table summarizes the preceding changes:
Change | (Old) Legacy Logging and Monitoring | (New) Cloud Operations for GKE |
---|---|---|
Dashboard menus | Dashboards > GKE Clusters | Dashboards > GKE |
Metric prefixes | container.googleapis.com | kubernetes.io |
Metrics resource types | gke_container, gce_instance, (none) | k8s_container, k8s_node, k8s_pod |
Log resource types | container, gke_cluster, gce_instance, gke_nodepool | k8s_container, k8s_cluster, gke_cluster (audit logs only), k8s_node, k8s_pod |
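As a sketch of how the label renames compose in practice, the following hypothetical Python helper (the function and mapping names are illustrative, not part of any Google API) translates the renamed resource labels of an already-parsed legacy log entry into their Cloud Operations for GKE names. It covers only the renames discussed on this page; note that for some fields, such as pod_id versus pod_name, the stored values also differ.

```python
# Sketch: translate renamed resource labels from the legacy (container)
# resource model to the Cloud Operations for GKE (k8s_container) names.
# Illustrative only; covers just the renames discussed on this page.
LABEL_RENAMES = {
    "namespace_id": "namespace_name",
    "pod_id": "pod_name",
    "zone": "location",
}

def migrate_resource_labels(legacy_labels):
    """Return a labels dict with legacy label names translated to new names."""
    return {LABEL_RENAMES.get(key, key): value
            for key, value in legacy_labels.items()}

legacy = {"project_id": "my-test-project", "namespace_id": "default",
          "zone": "us-central1-c"}
print(migrate_resource_labels(legacy))
```

Keys not listed in the mapping, such as project_id and cluster_name, pass through unchanged.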
Resource type changes
Cloud Operations for GKE has new resource type names, new resource type display names, and new names for the labels that identify specific resources. These changes are listed in the following table.
(Old) Legacy Logging and Monitoring resource type | (New) Cloud Operations for GKE resource type |
---|---|
Monitoring only: gke_container (GKE Container). Labels: cluster_name, container_name, instance_id 1, namespace_id, pod_id, project_id, zone 2. Logging only: container (GKE Container). Labels: cluster_name, container_name, instance_id, namespace_id, pod_id, project_id, zone 2 | Monitoring and Logging: k8s_container 3 (Kubernetes Container). Labels: cluster_name, container_name, namespace_name, pod_name, project_id, location 2 |
Logging only: gce_instance (Compute Engine VM Instance) 4. Labels: cluster_name, instance_id, project_id, zone 2 | Monitoring and Logging: k8s_node 4 (Kubernetes Node). Labels: cluster_name, node_name, project_id, location 2 |
(none) | Monitoring and Logging: k8s_pod 5 (Kubernetes Pod). Labels: cluster_name, namespace_name, pod_name, project_id, location 2 |
Logging only: gke_cluster (GKE Cluster). Labels: cluster_name, location, project_id | Monitoring and Logging: k8s_cluster 5 (Kubernetes Cluster). Labels: cluster_name, location, project_id |
Table footnotes:
1 In the new resource type used for monitoring (only), instance_id becomes node_name in metadata.system_labels.
2 zone refers to the location of this container or instance; location refers to the location of the cluster master node.
3 metadata.system_labels.node_name is not available in k8s_container resource types used for logging. You cannot search by node name for logs.
4 The gce_instance resource type can represent Kubernetes nodes as well as non-Kubernetes VM instances. When upgrading to Cloud Operations for GKE, node-related uses are changed to use the new resource type, k8s_node, including node-level logs with the following names: kubelet, docker, kube-proxy, startupscript, and node-problem-detector.
5 The k8s_pod and k8s_cluster resource types might include logs not present in the Legacy Logging and Monitoring support.
What do I need to do?
This section contains more specific information on the resource model changes in Cloud Operations for GKE and their impact on your existing monitoring and logging configurations.
You should perform the following steps to migrate your cluster to Cloud Operations for GKE:
Identify your Logging and Monitoring configurations: Identify any Logging and Monitoring configurations that might be using values that have changed between Legacy Logging and Monitoring and Cloud Operations for GKE.
Update your Logging and Monitoring configurations: Update any Logging and Monitoring configurations to reflect the changes present in Cloud Operations for GKE.
Update your GKE cluster configuration: Update your GKE cluster to use the Cloud Operations for GKE setting.
Since the resource models and logNames have changed between Legacy Logging and Monitoring and Cloud Operations for GKE, any Logging or Monitoring configurations that reference the changes in the resource models must also be updated. The migration might require you to update Logging and Monitoring configurations including, but not limited to:
- custom dashboards
- charts
- group filters
- alerting policies
- log sinks
- log exclusions
- log-based metrics in Cloud Logging and Cloud Monitoring
Identifying clusters using Legacy Logging and Monitoring
Use Cloud Monitoring's GKE Clusters dashboard to identify which clusters within a project are still using Legacy Logging and Monitoring:
- Click on the Cloud Monitoring GKE Clusters dashboard.
- Ensure the "Metrics Scope" selected includes the Google Cloud project that you want to review for clusters running Legacy Logging and Monitoring.
- View the list of clusters in the dashboard. Only clusters using Legacy Logging and Monitoring appear in the dashboard.
For example, in the following screenshot, there are 4 clusters using Legacy Logging and Monitoring.
Migrating your monitoring resources
If you are using Legacy Logging and Monitoring with a GKE cluster whose control plane version is 1.15 or newer, then your cluster's metrics are available in both the Legacy Monitoring and Cloud Operations for GKE resource models. This means that even before you migrate your clusters to Cloud Operations for GKE, your clusters start generating metrics using the new data model at no additional cost.
Starting in January 2021, your custom dashboards and alerts will be updated automatically to reference the new resource model metrics. If you want to migrate your own Cloud Monitoring configurations (charts in custom dashboards, alerts, groups), you need to update each configuration to reflect the new resource model.
You also need to migrate your configurations if you maintain your configuration in Terraform or another deployment manager and automatically sync changes.
Identifying configurations for the old data model
To identify the Cloud Monitoring configurations that you must update as part of the migration to Cloud Operations for GKE, view the Kubernetes Migration Status dashboard:
In the Google Cloud console, go to Monitoring:
In the Monitoring navigation pane, click Settings and then select the tab Kubernetes Migration Status.
The following sample dashboard shows that 1 alerting policy needs to be updated:
Updating Cloud Monitoring configurations
If your cluster is using GKE version 1.15 or later and is using Legacy Monitoring, then it is publishing to both data models. In this case, you have two options for how to migrate your configurations.
Clone the configurations and update the clones. With this option, you create a copy of your existing dashboards, alerting policies, and groups and migrate the copies to the new resource model. That way, you can continue to use Monitoring for your cluster using the old data model and the new data model simultaneously. For example, with this option, you would have 2 dashboards: the original one that continues to use the original resource model and a clone of original dashboard that uses the new resource model.
Upgrade the affected configurations in place. This option switches to the new data model in Cloud Monitoring immediately.
The following sections provide instructions for migrating your configurations for dashboards, alerting policies, and groups.
One consideration for deciding which option to choose is how much monitoring history you want to have available. Currently Cloud Monitoring offers 6 weeks of historical data for your clusters. After the GKE cluster upgrade that starts double writing to the data models, the old data model still has the historical metrics for the cluster, while the new data model only has metrics that begin at the time of the upgrade.
If you don't need the historical data, you can upgrade the configurations in place to the new data model at any time. If the historical data is important, you can clone the configurations and update the clones to use the new resource model types.
Alternatively, you can wait for 6 weeks after your cluster starts double writing to both of the data models. After six weeks, both data models have the same historical data, so you can upgrade the configurations in place and switch to the new data model.
Updating dashboards
To view your dashboards, complete the following steps:
From the Google Cloud console, go to Monitoring:
Select Dashboards.
To clone a dashboard and update the clone, complete the following steps:
Find the dashboard you want to clone.
Click Copy Dashboard and enter a name for the cloned dashboard.
Update the new dashboard's configurations as needed.
To update the chart definitions in the dashboard, complete the following steps:
Click More chart options (⋮) of the chart you want to edit.
Select Edit to open the Edit chart panel.
Change the resource type and metric name to translate to the new data model. You can also update the Filter and Group by fields as necessary.
Updating alerting policies
To view your alerting policies, complete the following steps:
From the Google Cloud console, go to Monitoring:
Select Alerting.
To clone and update an alerting policy, complete the following steps:
Select the policy you want to clone from the Policies table.
Click Copy to begin the creation flow for the copy of the alerting policy.
Edit any conditions that refer to the old data model to update the resource type and metric name.
The last step of the flow lets you enter a name for the cloned policy.
To edit an alerting policy in place, complete the following steps:
Select the policy you want to edit from the Policies table.
Click Edit to update the policy.
Update any conditions that refer to the old data model.
Updating groups
You can't clone a group through the Google Cloud console, so if you want to duplicate a group, you must create a new group with the same filter.
A group filter can reference the old data model in several ways.
- Resource type: A group might define a filter resource.type="gke_container". Because the gke_container type can be used to refer to several different types of GKE entities, you must update the filter to the type of resource that you actually intend to match: k8s_container, k8s_pod, or k8s_node. If you want to match multiple types, then define a filter with multiple clauses combined with the OR operator.
- Label cloud_account: A group might define a filter resource.metadata.cloud_account="CLOUD_ACCOUNT_ID". As part of a separate deprecation, the cloud_account metadata field is no longer available. Consider using the resource.labels.project_id label instead.
- Label region: A group might define a filter resource.metadata.region="REGION_NAME". The region metadata field is no longer available in the new data model. If you want to match GKE entities based on geographic location, consider using the resource.labels.location label instead.
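The resource-type rewrite described in the first bullet can be scripted. The following is a minimal sketch (the helper name is hypothetical, not part of any Google API), assuming you want to match all three new types with OR clauses:

```python
def migrate_group_filter(filter_text,
                         new_types=("k8s_container", "k8s_pod", "k8s_node")):
    """Replace a legacy gke_container clause with an OR of the new types."""
    old_clause = 'resource.type="gke_container"'
    if old_clause not in filter_text:
        return filter_text  # nothing to migrate
    new_clause = " OR ".join(f'resource.type="{t}"' for t in new_types)
    return filter_text.replace(old_clause, f"({new_clause})")

print(migrate_group_filter('resource.type="gke_container"'))
```

If you only intend to match one entity type, pass a single-element tuple instead so the filter stays a single clause.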
Mapping metrics between data models
This section describes how to map metrics from the old data model to the metrics in the new data model. The old data model published 17 different metrics, listed in the tables below. Some of these metrics were published against multiple GKE entity types, which results in more than 17 mappings to translate all metrics.
When mapping metrics, remember the following:
The prefix for the old metrics is container.googleapis.com/. The prefix for the new metrics is kubernetes.io/.
In the old data model, the only resource type is gke_container. Depending on how you defined the resource labels, this resource type might refer to GKE Containers, Pods, System Daemons, or Machines, which correspond to GKE Nodes.
You can query the Monitoring API using combinations of pod_id and container_name that don't match those listed in the following table. The data returned by such queries is undefined, and no mapping from these undefined states is provided.
GKE Entity Type | Filter |
---|---|
Container | pod_id != '' and container_name != '' (pod_id is not the empty string and container_name is not the empty string) |
Pod | pod_id != '' and container_name == '' (pod_id is not the empty string and container_name is the empty string) |
System daemon | pod_id == '' and container_name != 'machine' (pod_id is the empty string and container_name is one of docker-daemon, kubelets, or pods) |
Machine | pod_id == '' and container_name == 'machine' (pod_id is the empty string and container_name is the string machine) |
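The filters in this table amount to a small decision procedure. A sketch in Python (illustrative only; the function name is not part of any API), which follows the parenthetical descriptions above and returns "undefined" for the combinations the table does not cover:

```python
def classify_gke_entity(pod_id, container_name):
    """Classify a legacy gke_container time series per the entity-type table."""
    if pod_id != "" and container_name != "":
        return "Container"
    if pod_id != "" and container_name == "":
        return "Pod"
    if pod_id == "" and container_name == "machine":
        return "Machine"
    if pod_id == "" and container_name in ("docker-daemon", "kubelets", "pods"):
        return "System daemon"
    # Per the note above, data for other combinations is undefined.
    return "undefined"

print(classify_gke_entity("currencyservice-6995d74b95-zjkmj", "server"))
```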
The tables list three types of mappings:
Direct mapping between the old and new data models.
Mappings that require configuration.
Mappings of old metrics that don't have a direct equivalent in the new model.
Direct mapping
The following metrics translate directly between the old and new data models.
Old Metric Name | Old GKE Entity Type | New Metric Name | New GKE Resource Type | Notes |
---|---|---|---|---|
container/accelerator/duty_cycle | Container | container/accelerator/duty_cycle | k8s_container | |
container/accelerator/memory_total | Container | container/accelerator/memory_total | k8s_container | |
container/accelerator/memory_used | Container | container/accelerator/memory_used | k8s_container | |
container/accelerator/request | Container | container/accelerator/request | k8s_container | |
container/cpu/reserved_cores | Container | container/cpu/limit_cores | k8s_container | See Mappings that require configuration for the mapping when the resource is a pod |
container/cpu/usage_time | Container | container/cpu/core_usage_time | k8s_container | See Mappings that require configuration for the mapping when the resource is a pod |
container/cpu/usage_time | System Daemon | node_daemon/cpu/core_usage_time | k8s_node | In the old data model, gke_container.container_name is one of docker-daemon, kubelets, or pods. These filter values match the values in the new data model field metric.component. |
container/cpu/utilization | Container | container/cpu/limit_utilization | k8s_container | |
container/disk/bytes_total | Pod | pod/volume/total_bytes | k8s_pod | gke_container.device_name (Volume:config-volume) is translated to k8s_pod.volume_name (config-volume) by removing the prepended Volume:. |
container/disk/bytes_used | Pod | pod/volume/used_bytes | k8s_pod | gke_container.device_name (Volume:config-volume) is translated to k8s_pod.volume_name (config-volume) by removing the prepended Volume:. |
container/memory/bytes_total | Container | container/memory/limit_bytes | k8s_container | |
container/memory/bytes_used | Container | container/memory/used_bytes | k8s_container | |
container/memory/bytes_used | System Daemon | node_daemon/memory/used_bytes | k8s_node | In the old data model, gke_container.container_name is one of docker-daemon, kubelets, or pods. These filter values match the values in the new data model field metric.component. |
container/disk/inodes_free | Machine | node/ephemeral_storage/inodes_free | k8s_node | The old data model has the instance_id field, a random numeric ID. The new data model has node_name, a human-readable name. |
container/disk/inodes_total | Machine | node/ephemeral_storage/inodes_total | k8s_node | The old data model has the instance_id field, a random numeric ID. The new data model has node_name, a human-readable name. |
container/pid_limit | Machine | node/pid_limit | k8s_node | The old data model has the instance_id field, a random numeric ID. The new data model has node_name, a human-readable name. |
container/pid_used | Machine | node/pid_used | k8s_node | The old data model has the instance_id field, a random numeric ID. The new data model has node_name, a human-readable name. |
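A few of the direct mappings above can be captured in a lookup table. This sketch (names are illustrative, not part of any Google API) covers only a handful of the renamed metric suffixes and assumes any suffix not listed carries over unchanged, as the accelerator metrics do:

```python
# Sketch: translate a legacy metric type string to its new equivalent.
OLD_PREFIX = "container.googleapis.com/"
NEW_PREFIX = "kubernetes.io/"

# Partial mapping of renamed suffixes (Container entity type only).
DIRECT_MAPPINGS = {
    "container/cpu/usage_time": "container/cpu/core_usage_time",
    "container/cpu/reserved_cores": "container/cpu/limit_cores",
    "container/memory/bytes_used": "container/memory/used_bytes",
    "container/memory/bytes_total": "container/memory/limit_bytes",
}

def migrate_metric_type(old_metric_type):
    """Swap the metric prefix and rename the suffix where a mapping exists."""
    suffix = old_metric_type
    if old_metric_type.startswith(OLD_PREFIX):
        suffix = old_metric_type[len(OLD_PREFIX):]
    return NEW_PREFIX + DIRECT_MAPPINGS.get(suffix, suffix)

print(migrate_metric_type("container.googleapis.com/container/cpu/usage_time"))
# kubernetes.io/container/cpu/core_usage_time
```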
Mappings that require configuration
The following metrics translate from the old data model to the new data model with some basic manipulation.
Old Metric Name | Old GKE Entity Type | New Metric Name | New GKE Resource Type | Notes |
---|---|---|---|---|
container/cpu/reserved_cores | Pod | SUM container/cpu/limit_cores GROUP BY pod_name | k8s_container | The old data model has a pod_id field, a UUID. The new data model has pod_name, a human-readable name. |
container/cpu/usage_time | Pod | SUM container/cpu/core_usage_time GROUP BY pod_name | k8s_container | The old data model has a pod_id field, a UUID. The new data model has pod_name, a human-readable name. |
container/disk/bytes_total | Container | node/ephemeral_storage/total_bytes | k8s_container | gke_container.device_name is one of / or logs. Each of these values is equal to the new value. |
container/disk/bytes_used | Container | container/ephemeral_storage/used_bytes | k8s_container | gke_container.device_name is one of / or logs. These two values must be added together to get the new value. In the new data model, you cannot get the value for / and logs separately. |
container/memory/bytes_total | Pod | SUM container/memory/limit_bytes GROUP BY pod_name | k8s_container | The old data model has a pod_id field, a UUID. The new data model has pod_name, a human-readable name. |
container/memory/bytes_used | Pod | SUM container/memory/used_bytes GROUP BY pod_name | k8s_container | The old data model has a pod_id field, a UUID. The new data model has pod_name, a human-readable name. |
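The SUM ... GROUP BY pod_name translations above can also be approximated client-side once per-container data points are fetched. A sketch with hypothetical field names (not the Monitoring API's actual response shape):

```python
from collections import defaultdict

def sum_by_pod(points):
    """Aggregate per-container values (e.g. container/memory/used_bytes)
    into per-pod totals, mirroring SUM ... GROUP BY pod_name."""
    totals = defaultdict(float)
    for point in points:
        totals[point["pod_name"]] += point["value"]
    return dict(totals)

points = [
    {"pod_name": "web-1", "container_name": "app", "value": 200e6},
    {"pod_name": "web-1", "container_name": "sidecar", "value": 50e6},
    {"pod_name": "web-2", "container_name": "app", "value": 180e6},
]
print(sum_by_pod(points))
```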
Mappings that don't have a direct equivalent in the new model
The following metrics don't have an equivalent in the new data model.
- CPU utilization for Pod
- In the old data model, this metric, based on the CPU limit for each container, is a weighted average of CPU utilization across all containers in a pod.
- In the new data model, this value doesn't exist and must be calculated on the client-side based on the limit and utilization of each container.
- Uptime
- In the old data model, this metric is a cumulative metric that represents the fraction of time that a container is available in units ms/s. For a container that is always available, the value is ~1000ms/s.
- In the new data model, this metric is a gauge metric in hours that reports how long each part of the system has been running without interruption.
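For the pod CPU utilization case, the client-side calculation described above is a weighted average across a pod's containers, weighted by each container's CPU limit. A sketch, assuming you already have each container's limit and utilization (the field names are illustrative):

```python
def pod_cpu_utilization(containers):
    """Weighted average of container CPU utilization, weighted by CPU limit."""
    total_limit = sum(c["limit_cores"] for c in containers)
    if total_limit == 0:
        return 0.0  # no limits set; utilization is undefined, report zero
    weighted = sum(c["limit_cores"] * c["utilization"] for c in containers)
    return weighted / total_limit

containers = [
    {"name": "app", "limit_cores": 2.0, "utilization": 0.50},
    {"name": "sidecar", "limit_cores": 0.5, "utilization": 0.20},
]
# (2.0 * 0.50 + 0.5 * 0.20) / 2.5 ≈ 0.44
print(pod_cpu_utilization(containers))
```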
Resource group changes
If you define your own resource groups and use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table, then change those types to be the corresponding Cloud Operations for GKE resource types. If your resource group includes custom charts, you might have to change them.
Migrating your logging resources
To migrate your logging resources, complete the steps in the following sections.
Changes in log entry contents
When you update to Cloud Operations for GKE, you might find that certain information in log entries has moved to differently-named fields. This information can appear in logs queries used in logs-based metrics, log sinks, and log exclusions.
The following table, Log entry changes, lists the new fields and labels. Here's a brief summary:
- Check the logName field in your filters. Cloud Operations for GKE log entries use stdout or stderr in their log names, whereas Legacy Logging and Monitoring used a wider variety of names, including the container name. The container name is still available as a resource label.
- Check the labels field in the log entries. This field might contain information formerly in the metadata log entry fields.
- Check the resource.labels field in the log entries. The new resource types have additional label values.
(Old) Legacy Logging and Monitoring log entries | (New) Cloud Operations for GKE log entries |
---|---|
Log entry resources: resource.labels (resource labels 1) | Log entry resources: resource.labels (resource labels 1) |
Log entry metadata: labels (log entry labels 2). Examples: compute.googleapis.com/resource_name, container.googleapis.com/namespace_name, container.googleapis.com/pod_name, container.googleapis.com/stream | Log entry metadata: labels. Examples: k8s-pod/app, k8s-pod/pod-template-hash |
Table footnotes:
1 Resource labels identify specific resources that yield metrics, such as specific clusters and nodes.
2 The labels field appears in new log entries that are part of Cloud Operations for GKE and occasionally in some Legacy Logging and Monitoring log entries. In Cloud Operations for GKE, it is used to hold some information formerly in the metadata log entry fields.
Example logs:
Container resource type changes:
Compare the resource, labels, and logName fields to see the differences between the Legacy Logging and Monitoring and Cloud Operations for GKE resource models.
Resource model | Example logs |
---|---|
Legacy Logging and Monitoring | { "insertId": "fji4tsf1a8o5h", "jsonPayload": { "pid": 1, "name": "currencyservice-server", "v": 1, "message": "conversion request successful", "hostname": "currencyservice-6995d74b95-zjkmj" }, "resource": { "type": "container", "labels": { "project_id": "my-test-project", "cluster_name": "my-test-cluster", "pod_id": "currencyservice-6995d74b95-zjkmj", "zone": "us-central1-c", "container_name": "server", "namespace_id": "default", "instance_id": "1234567890" } }, "timestamp": "2020-10-02T19:02:47.575434759Z", "severity": "INFO", "labels": { "container.googleapis.com/pod_name": "currencyservice-6995d74b95-zjkmj", "compute.googleapis.com/resource_name": "gke-legacy-cluster-default-pool-c534acb8-hvxk", "container.googleapis.com/stream": "stdout", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/my-test-project/logs/server", "receiveTimestamp": "2020-10-02T19:02:50.972304596Z" } |
Cloud Operations for GKE | { "insertId": "mye361s5zfcl55amj", "jsonPayload": { "v": 1, "name": "currencyservice-server", "pid": 1, "hostname": "currencyservice-5b69f47d-wg4zl", "message": "conversion request successful" }, "resource": { "type": "k8s_container", "labels": { "container_name": "server", "project_id": "my-test-project", "pod_name": "currencyservice-5b69f47d-wg4zl", "namespace_name": "onlineboutique", "location": "us-central1-c", "cluster_name": "my-prod-cluster" } }, "timestamp": "2020-10-02T18:41:55.359669767Z", "severity": "INFO", "labels": { "k8s-pod/app": "currencyservice", "k8s-pod/pod-template-hash": "5b69f47d", "compute.googleapis.com/resource_name": "gke-legacy-cluster-default-pool-c534acb8-hvxk" }, "logName": "projects/my-test-project/logs/stdout", "receiveTimestamp": "2020-10-02T18:41:57.930654427Z" } |
Cluster resource type changes:
Compare the resource and logName fields to see the differences between the Legacy Logging and Monitoring and Cloud Operations for GKE resource models.
Resource model | Example logs |
---|---|
Legacy Logging and Monitoring | { "insertId": "962szqg9uiyalt", "jsonPayload": { "type": "Normal", "involvedObject": { "apiVersion": "policy/v1beta1", "uid": "a1bc2345-12ab-12ab-1234-123456a123456", "resourceVersion": "50968", "kind": "PodDisruptionBudget", "namespace": "knative-serving", "name": "activator-pdb" }, "apiVersion": "v1", "reason": "NoPods", "source": { "component": "controllermanager" }, "message": "No matching pods found", "kind": "Event", "metadata": { "selfLink": "/api/v1/namespaces/knative-serving/events/activator-pdb.163a42fcb707c1fe", "namespace": "knative-serving", "name": "activator-pdb.163a42fcb707c1fe", "uid": "a1bc2345-12ab-12ab-1234-123456a123456", "creationTimestamp": "2020-10-02T19:17:50Z", "resourceVersion": "1917" } }, "resource": { "type": "gke_cluster", "labels": { "project_id": "my-test-project", "location": "us-central1-c", "cluster_name": "my-prod-cluster" } }, "timestamp": "2020-10-02T21:33:20Z", "severity": "INFO", "logName": "projects/my-test-project/logs/events", "receiveTimestamp": "2020-10-02T21:33:25.510671123Z" } |
Cloud Operations for GKE | { "insertId": "1qzipokg6ydoesp", "jsonPayload": { "involvedObject": { "uid": "a1bc2345-12ab-12ab-1234-123456a123456", "name": "istio-telemetry", "apiVersion": "autoscaling/v2beta2", "resourceVersion": "90505937", "kind": "HorizontalPodAutoscaler", "namespace": "istio-system" }, "source": { "component": "horizontal-pod-autoscaler" }, "kind": "Event", "type": "Warning", "message": "missing request for cpu", "metadata": { "resourceVersion": "3071416", "creationTimestamp": "2020-08-22T14:18:59Z", "name": "istio-telemetry.162d9ce2894d6642", "selfLink": "/api/v1/namespaces/istio-system/events/istio-telemetry.162d9ce2894d6642", "namespace": "istio-system", "uid": "a1bc2345-12ab-12ab-1234-123456a123456" }, "apiVersion": "v1", "reason": "FailedGetResourceMetric" }, "resource": { "type": "k8s_cluster", "labels": { "project_id": "my-test-project", "location": "us-central1-a", "cluster_name": "my-prod-cluster1" } }, "timestamp": "2020-10-02T21:39:07Z", "severity": "WARNING", "logName": "projects/my-test-project/logs/events", "receiveTimestamp": "2020-10-02T21:39:12.182820672Z" } |
Node resource type changes:
Compare the resource and labels fields to see the differences between the Legacy Logging and Monitoring and Cloud Operations for GKE resource models.
Resource model | Example logs |
---|---|
Legacy Logging and Monitoring | { "insertId": "16qdegyg9t3n2u5", "jsonPayload": { "SYSLOG_IDENTIFIER": "kubelet", [...] "PRIORITY": "6", "_COMM": "kubelet", "_GID": "0", "_MACHINE_ID": "9565f7c82afd94ca22612c765ceb1042", "_SYSTEMD_UNIT": "kubelet.service", "_EXE": "/home/kubernetes/bin/kubelet" }, "resource": { "type": "gce_instance", "labels": { "instance_id": "1234567890", "zone": "us-central1-a", "project_id": "my-test-project" } }, "timestamp": "2020-10-02T21:43:14.390150Z", "labels": { "compute.googleapis.com/resource_name": "gke-legacy-monitoring-default-pool-b58ff790-29rr" }, "logName": "projects/my-test-project/logs/kubelet", "receiveTimestamp": "2020-10-02T21:43:20.433270911Z" } |
Cloud Operations for GKE | { "insertId": "kkbgd6e5tmkpmvjji", "jsonPayload": { "SYSLOG_IDENTIFIER": "kubelet", [...] "_CAP_EFFECTIVE": "3fffffffff", "_HOSTNAME": "gke-standard-cluster-1-default-pool-f3929440-f4dy", "PRIORITY": "6", "_COMM": "kubelet", "_TRANSPORT": "stdout", "_GID": "0", "MESSAGE": "E1002 21:43:14.870346 1294 pod_workers.go:190] Error syncing pod 99ba1919-d633-11ea-a5ea-42010a800113 (\"stackdriver-metadata-agent-cluster-level-65655bdbbf-v5vjv_kube-system(99ba1919-d633-11ea-a5ea-42010a800113)\"), skipping: failed to \"StartContainer\" for \"metadata-agent\" with CrashLoopBackOff: \"Back-off 5m0s restarting failed container=metadata-agent pod=stackdriver-metadata-agent-cluster-level-65655bdbbf-v5vjv_kube-system(99ba1919-d633-11ea-a5ea-42010a800113)\"" }, "resource": { "type": "k8s_node", "labels": { "cluster_name": "my-prod-cluster-1", "location": "us-central1-a", "node_name": "gke-standard-cluster-1-default-pool-f3929440-f4dy", "project_id": "my-test-project" } }, "timestamp": "2020-10-02T21:43:14.870426Z", "logName": "projects/my-test-project/logs/kubelet", "receiveTimestamp": "2020-10-02T21:43:20.788933199Z" } |
Logging configuration updates
This section describes changes you might need to make to your Cloud Logging configuration as part of a migration to Cloud Operations for GKE. You also need to migrate your configurations if you maintain your configuration in Terraform or another deployment manager and automatically sync changes.
Logging queries
If you use queries to find and filter your logs in Cloud Logging, and you use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table, then change those types to the corresponding Cloud Operations for GKE types.
For example, in Legacy Logging and Monitoring, you query for container logs using the container resource type, while in Cloud Operations for GKE you use the k8s_container resource type to query container logs:
resource.type="k8s_container"
As another example, in Legacy Logging and Monitoring, you query for specific log names for containers using the name of the container, while in Cloud Operations for GKE you use the stdout and stderr log names to query container logs:
resource.type="k8s_container"
log_name="projects/YOUR_PROJECT_NAME/logs/stdout"
resource.labels.container_name="CONTAINER_NAME"
Logs-based metrics
If you define your own logs-based metrics and use Legacy Logging and Monitoring metrics or resource types shown in the previous Metric name changes or Resource type changes tables, then change those metrics and resource types to the corresponding Cloud Operations for GKE ones.
You can use the following gcloud CLI commands to find your logs-based metrics:
gcloud logging metrics list --filter='filter~resource.type=\"container\" OR filter~resource.type=container'
gcloud logging metrics list --filter='filter~resource.labels.namespace_id'
gcloud logging metrics list --filter='filter~resource.labels.pod_id'
gcloud logging metrics list --filter='filter~resource.labels.zone'
You can use the following gcloud CLI commands to update your logs-based metrics:
gcloud logging metrics update YOUR_LOGS_BASED_METRIC_NAME --log-filter='resource.type=\"container\" OR resource.type=\"k8s_container\"'
gcloud logging metrics update YOUR_LOGS_BASED_METRIC_NAME --log-filter='resource.labels.namespace_id=\"YOUR_NAMESPACE\" OR resource.labels.namespace_name=\"YOUR_NAMESPACE\"'
gcloud logging metrics update YOUR_LOGS_BASED_METRIC_NAME --log-filter='resource.labels.pod_id=\"YOUR_POD_NAME\" OR resource.labels.pod_name=\"YOUR_NAME\"'
gcloud logging metrics update YOUR_LOGS_BASED_METRIC_NAME --log-filter='resource.labels.zone=\"YOUR_ZONE\" OR resource.labels.location=\"YOUR_ZONE\"'
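If you prefer to rewrite filters outright rather than OR the old and new names together (appropriate only after all your clusters have migrated), a hypothetical helper for generating the new filter string before passing it to gcloud logging metrics update:

```python
# Sketch: rewrite the renamed resource type and labels inside a logs filter.
# Illustrative only; a blunt string replacement, not a full filter parser.
REWRITES = [
    ('resource.type="container"', 'resource.type="k8s_container"'),
    ("resource.labels.namespace_id", "resource.labels.namespace_name"),
    ("resource.labels.pod_id", "resource.labels.pod_name"),
    ("resource.labels.zone", "resource.labels.location"),
]

def migrate_log_filter(filter_text):
    """Apply the label and resource-type renames to a logs filter string."""
    for old, new in REWRITES:
        filter_text = filter_text.replace(old, new)
    return filter_text

print(migrate_log_filter(
    'resource.type="container" AND resource.labels.namespace_id="prod"'))
```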
Alternatively, you can update your logs-based metrics in the Google Cloud console.
Logs exports
If you export any of your logs, and your export uses Legacy Logging and Monitoring resource types shown in the previous Resource type changes table, then change your export to use the corresponding Cloud Operations for GKE resource types. Cloud Operations for GKE log entries use stdout or stderr in their log names, whereas Legacy Logging and Monitoring uses the container name.
There are two important considerations for the log name change:
- Changes to export destination file locations and tables – The log name values in Cloud Operations for GKE include stdout or stderr rather than the container name. The container name is still available as a resource label. Processing of the log name in Cloud Storage exports or queries against the BigQuery tables needs to be changed to use the stdout and stderr log names.
- logName values – The log name values determine the exported file structure in Cloud Storage and the table structure in BigQuery. Usage of the Cloud Storage files and BigQuery tables should be adjusted to account for the new folder structure in Cloud Storage and table structures in BigQuery.
You can use the following Google Cloud CLI commands to find your affected Logging sinks:
gcloud logging sinks list --filter='filter~resource.type=\"container\" OR filter~resource.type=container'
gcloud logging sinks list --filter='filter~resource.labels.namespace_id'
gcloud logging sinks list --filter='filter~resource.labels.pod_id'
gcloud logging sinks list --filter='filter~resource.labels.zone'
You can use the following gcloud CLI commands to update your Logging sinks:
gcloud logging sinks update YOUR_SINK_NAME --log-filter='resource.type=\"container\" OR resource.type=\"k8s_container\"'
gcloud logging sinks update YOUR_SINK_NAME --log-filter='resource.labels.namespace_id=\"YOUR_NAMESPACE\" OR resource.labels.namespace_name=\"YOUR_NAMESPACE\"'
gcloud logging sinks update YOUR_SINK_NAME --log-filter='resource.labels.pod_id=\"YOUR_POD_NAME\" OR resource.labels.pod_name=\"YOUR_POD_NAME\"'
gcloud logging sinks update YOUR_SINK_NAME --log-filter='resource.labels.zone=\"YOUR_ZONE\" OR resource.labels.location=\"YOUR_ZONE\"'
Alternatively, you can update your Logging sinks in the Google Cloud console.
Logs exclusions
If you exclude any of your logs, and if your exclusion filters use Legacy Logging and Monitoring resource types shown in the previous Resource type changes table, then change your exclusion filters to use the corresponding Cloud Operations for GKE resource types.
For information on viewing your logs exclusions, refer to the Viewing exclusion filters guide.
Changes in log locations
In Cloud Logging, your logs are stored with the resource type that generated
them. Since these types have changed in Cloud Operations for GKE, be sure to look for your logs
in the new resource types like Kubernetes Container
, not in the Legacy Logging and Monitoring types
such as GKE Container
.
Update your cluster's configuration
After you have migrated any logging and monitoring resources to use the Cloud Operations for GKE data format, the last step is to update your GKE cluster to use Cloud Operations for GKE.
To update your GKE cluster's logging and monitoring configuration, follow these steps:
CONSOLE
Go to the GKE Clusters page for your project.
Click on the cluster you want to update to use Cloud Operations for GKE.
In the row labeled Cloud Operations for GKE, click the Edit icon.
In the dialog box that appears, confirm Enable Cloud Operations for GKE is selected.
In the dropdown menu within that dialog box, select which logs and metrics you want collected. The default (recommended) setting for Cloud Operations for GKE is System and workload logging and monitoring. Selecting any value in this dropdown other than "Legacy Logging and Monitoring" will update the cluster to start using Cloud Operations for GKE rather than Legacy Logging and Monitoring.
Click Save Changes.
GCLOUD
Run this command:
gcloud container clusters update [CLUSTER_NAME] \
    --zone=[ZONE] \
    --project=[PROJECT_ID] \
    --logging=SYSTEM,WORKLOAD \
    --monitoring=SYSTEM
What's next
- To learn about the new Cloud Operations for GKE dashboard, see Observing your system.
- For information on viewing your logs, see Viewing your GKE logs.