Migrating to Cloud Operations for GKE

There are two options for monitoring and logging support in Google Kubernetes Engine (GKE), and both are provided by all GKE versions available for new clusters and for updates to existing clusters:

  • Legacy Logging and Monitoring
  • Cloud Operations for GKE

This page explains the differences between these two options and what you must change to migrate from Legacy Logging and Monitoring to Cloud Operations for GKE.

When do I need to migrate?

You can migrate your existing Cloud Monitoring and Cloud Logging configurations from Legacy Logging and Monitoring to Cloud Operations for GKE at any time. However, keep in mind that a future release of GKE might withdraw support for Legacy Logging and Monitoring. If that happens, you must migrate to Cloud Operations for GKE before support is removed in order to keep using Cloud Monitoring and Cloud Logging.

The following table summarizes the expected upcoming GKE release versions with their support options:

GKE version      Legacy Logging and Monitoring    Cloud Operations for GKE
1.10 – 1.12.5    Default                          Opt-in (Beta)
1.12.7           Default                          Optional
1.13             Default                          Optional
1.14             Optional                         Default
1.15             Not Available                    Default

For information on the deprecation of Legacy Logging and Monitoring, refer to the Legacy support for GKE deprecation guide.

What is changing?

Cloud Operations for GKE uses a different data model to organize its metrics, logs, and metadata. Here are some specific changes for your clusters using Cloud Operations for GKE:

  • Navigation change: The Cloud Monitoring dashboard for your clusters is named GKE rather than GKE Clusters. This dashboard doesn't appear if you don't have any clusters using Cloud Operations for GKE.

  • Monitored resource type name changes: For example, your Kubernetes nodes are listed under the monitored resource type k8s_node (Kubernetes Node) rather than gce_instance (Compute Engine VM instance).

  • Kubernetes metric name changes: In Cloud Operations for GKE, metric type names start with the prefix kubernetes.io/ rather than container.googleapis.com/.

The following table summarizes the preceding changes:

Change            (Old) Legacy Logging and Monitoring    (New) Cloud Operations for GKE
Dashboard menus   Dashboards > GKE Clusters              Dashboards > GKE
Metric prefixes   container.googleapis.com               kubernetes.io
Resource types    gke_container (Metrics)                k8s_container
                  container (Logs)                       k8s_container
                  gce_instance                           k8s_node
                  (none)                                 k8s_pod
                  gke_cluster                            k8s_cluster

What do I need to do?

This section contains more specific information on the data model changes in Cloud Operations for GKE and their impact on your existing monitoring and logging configurations.

Using the migration status dashboard

For information about the changes that you need to make to your Logging configuration as part of the migration to Cloud Operations for GKE, see Logging configuration updates.

To identify the Cloud Monitoring configurations that you must update as part of the migration to Cloud Operations for GKE, do the following:

  1. In the Cloud Console, go to Monitoring:

    Go to Monitoring

  2. In the Monitoring navigation pane, click Settings and then select the tab Kubernetes Migration Status.

The following sample dashboard shows that 1 alerting policy needs to be updated:

Display of the migration dashboard.

Resource type changes

Cloud Operations for GKE has new resource type names, new resource type display names, and new names for the labels that identify specific resources. These changes are listed in the following table.

Resource type changes

(Old) Legacy Logging and Monitoring resource types → (New) Cloud Operations for GKE resource types

Old (Monitoring only): gke_container (GKE Container)
  Labels: cluster_name, container_name, instance_id¹, namespace_id, pod_id, project_id, zone²
Old (Logging only): container (GKE Container)
  Labels: cluster_name, container_name, instance_id¹, namespace_id, pod_id, project_id, zone²
New (Monitoring and Logging): k8s_container (Kubernetes Container)
  Labels: cluster_name, container_name, metadata.system_labels.node_name³, namespace_name, pod_name, project_id, location²

Old (Logging only): gce_instance (Compute Engine VM Instance)⁴
  Labels: cluster_name, instance_id, project_id, zone²
New (Monitoring and Logging): k8s_node⁴ (Kubernetes Node)
  Labels: cluster_name, node_name, project_id, location²

Old: (none)
New (Monitoring and Logging): k8s_pod⁵ (Kubernetes Pod)
  Labels: cluster_name, namespace_name, pod_name, project_id, location²

Old (Logging only): gke_cluster (GKE Cluster)
  Labels: cluster_name, project_id, location
New (Monitoring and Logging): k8s_cluster⁵ (Kubernetes Cluster)
  Labels: cluster_name, project_id, location

Table footnotes:
¹ In the new resource type used for monitoring (only), instance_id becomes node_name in metadata.system_labels.
² zone refers to the location of this container or instance. location refers to the location of the cluster master node.
³ metadata.system_labels.node_name is not available in k8s_container resource types used for logging, so you cannot search logs by node name.
⁴ The gce_instance resource type can represent Kubernetes nodes as well as non-Kubernetes VM instances. When you upgrade to Cloud Operations for GKE, node-related uses change to the new resource type, k8s_node, including node-level logs with the following names: kubelet, docker, kube-proxy, startupscript, and node-problem-detector.
⁵ The k8s_pod and k8s_cluster resource types might include logs not present in Legacy Logging and Monitoring.
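
For example, a Monitoring query that selected data by the legacy gke_container resource type must instead select k8s_container, and label names such as namespace_id and pod_id become namespace_name and pod_name. The following Python sketch illustrates the shape of such a query; it assumes the google-cloud-monitoring client library, a hypothetical project ID, and a metric chosen only for illustration. It is a sketch of the filter change, not an official migration tool.

```python
# Sketch: query time series against the new k8s_container resource type.
# Assumes the google-cloud-monitoring client library (monitoring_v3); the
# project ID, namespace, and metric are placeholders.
import time

from google.cloud import monitoring_v3

project_id = "my-project"  # hypothetical
client = monitoring_v3.MetricServiceClient()

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
)

# Old (Legacy Logging and Monitoring) filters looked like:
#   resource.type = "gke_container" AND resource.labels.namespace_id = "default"
# New (Cloud Operations for GKE): new resource type and renamed labels.
new_filter = (
    'resource.type = "k8s_container" AND '
    'resource.labels.namespace_name = "default" AND '
    'metric.type = "kubernetes.io/container/cpu/request_utilization"'
)

for series in client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": new_filter,
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.HEADERS,
    }
):
    labels = series.resource.labels
    print(labels["pod_name"], labels["container_name"])
```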

Metric name changes

The following table shows some samples of the different metric names. You must change every use of a metric whose name begins with container.googleapis.com/ to a new metric whose name begins with kubernetes.io/.

The new metric names might differ in other ways besides the new prefix. Look for the replacement metrics under the kubernetes.io/ prefix.

Metric name changes

(Old) Legacy Logging and Monitoring metrics: Legacy GKE metrics, prefix container.googleapis.com/
  Examples:
    .../container/cpu/utilization
    .../container/uptime
    .../container/memory/bytes_total

(New) Cloud Operations for GKE metrics: Kubernetes Engine Monitoring metrics, prefix kubernetes.io/
  Examples:
    .../container/cpu/request_utilization
    .../container/uptime
    .../node/memory/total_bytes
    .../node/cpu/total_cores
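
One way to find the replacement names is to list the metric descriptors under the kubernetes.io/ prefix for your project. The following is a minimal sketch, assuming the google-cloud-monitoring client library and a hypothetical project ID:

```python
# Sketch: list the metric descriptors available under the new kubernetes.io
# prefix so you can map each container.googleapis.com/ metric to its
# replacement. Assumes google-cloud-monitoring; the project ID is a placeholder.
from google.cloud import monitoring_v3

project_id = "my-project"  # hypothetical
client = monitoring_v3.MetricServiceClient()

descriptors = client.list_metric_descriptors(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = starts_with("kubernetes.io/")',
    }
)
for descriptor in descriptors:
    # e.g. kubernetes.io/container/cpu/request_utilization
    print(descriptor.type)
```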

Resource group changes

If you define your own resource groups and they use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table, or any Legacy Logging and Monitoring metrics shown in the preceding Metric name changes table, then change those types and metrics to the corresponding Cloud Operations for GKE resource types and metrics. If your resource group includes custom charts, you might also have to change the charts.

Custom chart and dashboard changes

If you define your own custom charts and dashboards, and use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table or any Legacy Logging and Monitoring metrics shown in the preceding Metric name changes table, then change those types and metrics to the corresponding Cloud Operations for GKE types and metrics.

For your custom charts and dashboards, you can get help by viewing the GKE migration status dashboard:

  1. In the Cloud Console, select the Google Cloud project that contains a GKE cluster to update to Cloud Operations for GKE:

    Go to Cloud Console

  2. Select Monitoring.

  3. To access the migration status, in the Monitoring navigation pane, click Settings and then select the tab Kubernetes Migration Status.

Alerting and uptime policy changes

If you define alerting policies or uptime checks, and use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table or any Legacy Logging and Monitoring metrics shown in the preceding Metric name changes table, then change those types and metrics to the corresponding Cloud Operations for GKE types and metrics.

Upgrading alerting policies and uptime checks can be the most difficult changes to perform and verify. One question to consider is when to make these changes: do you change your policy configuration before upgrading your cluster, or afterwards? Policies that reference the old resource types and metrics stop working after you update your cluster, and policies that reference the new ones don't work until the cluster is updated.

Instead of changing policies in place, consider leaving the existing policies unchanged and creating new policies with the updated changes. This might make it easier to keep track of which policies you expect to fail and which you don't at different times during the update.
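
For example, rather than editing an existing policy in place, you might create a parallel policy whose condition filter uses the new resource type and metric, and retire the legacy policy once the cluster upgrade is verified. The following Python sketch assumes the google-cloud-monitoring client library; the project ID, display names, threshold, and duration are placeholders chosen only for illustration, not recommendations.

```python
# Sketch: create a parallel alerting policy that uses the new resource type
# and metric prefix. Assumes google-cloud-monitoring; values are placeholders.
from google.cloud import monitoring_v3
from google.protobuf import duration_pb2

project_id = "my-project"  # hypothetical
client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="Container CPU request utilization (Cloud Operations for GKE)",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="High CPU request utilization",
            condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                # The legacy policy would have filtered on gke_container and a
                # container.googleapis.com/... metric here.
                filter=(
                    'resource.type = "k8s_container" AND '
                    'metric.type = "kubernetes.io/container/cpu/request_utilization"'
                ),
                comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
                threshold_value=0.9,
                duration=duration_pb2.Duration(seconds=300),
            ),
        )
    ],
)

created = client.create_alert_policy(
    name=f"projects/{project_id}", alert_policy=policy
)
print("Created:", created.name)
```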

Here are some other tips:

  • Examine your policies and estimate how long your updated cluster will have to run before you've accumulated enough data for it to be operating in its steady state.

  • Have some idea of how your policies or individual metrics are performing before you update, so you can compare that behavior with the post-update behavior.

Logging configuration updates

This section describes changes you might need to make to your Cloud Logging configuration as part of a migration to Cloud Operations for GKE.

Logging queries

If you use queries to find and filter your logs in Cloud Logging, and you use any of the Legacy Logging and Monitoring resource types shown in the preceding Resource type changes table, then change those types to the corresponding Cloud Operations for GKE types.
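
For example, a query that selected container logs with resource.type="container" must select resource.type="k8s_container" instead, and refer to namespace_name and pod_name rather than namespace_id and pod_id. A minimal sketch, assuming the google-cloud-logging client library and placeholder cluster and namespace values:

```python
# Sketch: read container logs using the new k8s_container resource type.
# Assumes the google-cloud-logging client library; cluster and namespace
# names are placeholders.
from google.cloud import logging

client = logging.Client()  # uses the default project

# Old (Legacy Logging and Monitoring):
#   resource.type="container" AND resource.labels.namespace_id="default"
# New (Cloud Operations for GKE):
new_filter = (
    'resource.type="k8s_container" '
    'AND resource.labels.cluster_name="my-cluster" '
    'AND resource.labels.namespace_name="default"'
)

for entry in client.list_entries(filter_=new_filter):
    print(entry.timestamp, entry.resource.labels.get("pod_name"), entry.payload)
    break  # just show the first matching entry in this sketch
```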

Logs-based metrics

If you define your own logs-based metrics and use Legacy Logging and Monitoring metrics or resource types shown in the previous Metric name changes or Resource type changes tables, then change those metrics and resource types to the corresponding Cloud Operations for GKE ones.
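
As a sketch of what that update might look like with the google-cloud-logging client library (the metric name and filter below are hypothetical examples, not ones defined by this guide):

```python
# Sketch: point an existing logs-based metric at the new resource type.
# Assumes the google-cloud-logging client library; the metric name and
# filter are hypothetical.
from google.cloud import logging

client = logging.Client()

metric = client.metric("container-error-count")  # existing logs-based metric
metric.reload()  # fetch the current definition

# Replace the legacy resource type ("container") in the metric's filter.
metric.filter_ = 'resource.type="k8s_container" AND severity>=ERROR'
metric.update()
```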

Logs exports and exclusions

If you export or exclude any of your logs, and if your export or exclusion filters use Legacy Logging and Monitoring resource types shown in the previous Resource type changes table, then change your export and exclusion filters to use the corresponding Cloud Operations for GKE resource types.
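
A minimal sketch of updating a sink's filter with the google-cloud-logging client library follows; the sink name and filters are hypothetical, and exclusion filters need the same resource type substitution:

```python
# Sketch: update an existing log sink so its filter uses the new resource
# types. Assumes the google-cloud-logging client library; the sink name is
# a hypothetical example.
from google.cloud import logging

client = logging.Client()

sink = client.sink("gke-container-logs-to-bigquery")  # existing sink
sink.reload()  # fetch the current destination and filter

# An old filter might have been:
#   resource.type="container" OR resource.type="gce_instance"
sink.filter_ = 'resource.type="k8s_container" OR resource.type="k8s_node"'
sink.update()
```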

Changes in log entry contents

When you update to Cloud Operations for GKE, you might find that certain information in log entries has moved to differently named fields. This affects any logs queries that reference those fields, including queries used in logs-based metrics, log sinks, and log exclusions.

The following table, Log entry changes, lists the new fields and labels. Here's a brief summary:

  • The logName field might change. Cloud Operations for GKE log entries use stdout or stderr in their log names whereas Legacy Logging and Monitoring used a wider variety of names, including the container name. The container name is still available as a resource label.
  • Check the labels field in the log entries. This field might contain information formerly in the metadata log entry fields.
  • Check the resource.labels field in the log entries. The new resource types have additional label values.
Log entry changes

(Old) Legacy Logging and Monitoring log entries:
  Log entry resources: resource.labels (Resource labels¹)
  Log entry metadata fields

(New) Cloud Operations for GKE log entries:
  Log entry resources: resource.labels (Resource labels¹)
  Log entry labels: labels (Log entry labels²), for example:
    compute.googleapis.com/resource_name: "fluentd-gcp-v3.2.0-d4d9p"
    container.googleapis.com/namespace_name: "kube-system"
    container.googleapis.com/pod_name: "fluentd-gcp-scaler-8b674f786-d4pq2"
    container.googleapis.com/stream: "stdout"

Table footnotes:
¹ Resource labels identify specific resources that yield metrics, such as specific clusters and nodes.
² The labels field appears in new log entries that are part of Cloud Operations for GKE and occasionally in some Legacy Logging and Monitoring log entries. In Cloud Operations for GKE, it is used to hold some information formerly in the metadata log entry fields.
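
If a query previously matched information in the metadata fields, you might now match the labels field instead. The following Python sketch assumes the google-cloud-logging client library; the label key is taken from the examples above, and the label value is a placeholder. Note that label keys containing slashes or dots must be quoted in the query.

```python
# Sketch: query log entries by a key in the new labels field. Assumes the
# google-cloud-logging client library; the label value is a placeholder.
from google.cloud import logging

client = logging.Client()

labels_filter = (
    'resource.type="k8s_container" AND '
    'labels."compute.googleapis.com/resource_name"="example-resource-name"'
)

for entry in client.list_entries(filter_=labels_filter):
    print(entry.timestamp, entry.labels)
    break  # just show the first matching entry in this sketch
```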

Changes in log locations

In Cloud Logging, your logs are stored with the resource type that generated them. Since these types have changed in Cloud Operations for GKE, be sure to look for your logs in the new resource types like Kubernetes Container, not in the Legacy Logging and Monitoring types such as GKE Container.

What's next