Logs and metrics

This page explains how Distributed Cloud Edge logs various types of information about its operation and how to view that information.

Configure logging and monitoring

Before you can start gathering logs and metrics, you must do the following:

  1. Enable the required logging and monitoring APIs using the following commands:

    gcloud services enable opsconfigmonitoring.googleapis.com --project PROJECT_ID
    gcloud services enable logging.googleapis.com --project PROJECT_ID
    gcloud services enable monitoring.googleapis.com --project PROJECT_ID
    

    Replace PROJECT_ID with the ID of the target Google Cloud project.

  2. Grant the roles required to write logs and metrics:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/opsconfigmonitoring.resourceMetadata.writer \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/metadata-agent]"
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/logging.logWriter \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/stackdriver-log-forwarder]"
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/monitoring.metricWriter \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/gke-metrics-agent]"
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/monitoring.metricWriter \
        --member "serviceAccount:cloud-cre-tsx-frontend@system.gserviceaccount.com"
    

    Replace PROJECT_ID with the ID of the target Google Cloud project.
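
The three Workload Identity bindings in step 2 follow one pattern, so they can be scripted. The following is a minimal shell sketch, not an official tool: the function names wi_member and grant_agent_roles are hypothetical, and the role and service account pairs are copied from the commands above. The fourth binding (cloud-cre-tsx-frontend) uses a different member format and is intentionally left out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of gcloud.

# Build the Workload Identity member string for a Kubernetes service account.
# Arguments: project ID, namespace, Kubernetes service account name.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Grant the three agent roles from step 2 to their kube-system accounts.
# Argument: project ID.
grant_agent_roles() {
  local project="$1" pair role ksa
  for pair in \
    "roles/opsconfigmonitoring.resourceMetadata.writer metadata-agent" \
    "roles/logging.logWriter stackdriver-log-forwarder" \
    "roles/monitoring.metricWriter gke-metrics-agent"; do
    role="${pair% *}"   # text before the space
    ksa="${pair#* }"    # text after the space
    gcloud projects add-iam-policy-binding "$project" \
      --role "$role" \
      --member "$(wi_member "$project" kube-system "$ksa")"
  done
}
```

After the functions are defined, running grant_agent_roles PROJECT_ID issues the same three add-iam-policy-binding commands shown above.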

Logs

This section lists the Cloud Logging resource types supported by Distributed Cloud Edge. To view Distributed Cloud Edge logs, use the Logs Explorer in the Google Cloud console. Distributed Cloud Edge logging is always enabled.

The Distributed Cloud Edge logged resource types are the following standard Kubernetes resources:

  • k8s_container
  • k8s_node

You can also capture and retrieve Distributed Cloud Edge logs using the Cloud Logging API. See the documentation for Cloud Logging client libraries for information on how to configure this logging mechanism.
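
As a command-line alternative to the Logs Explorer, the two resource types above can be queried with gcloud logging read. The sketch below assumes a POSIX-style shell; log_filter is a hypothetical helper name, and the filter only restricts results by resource type and project.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Cloud Logging filter for one of the two
# resource types listed above. Arguments: resource type, project ID.
log_filter() {
  local type="$1" project="$2"
  case "$type" in
    k8s_container|k8s_node) ;;
    *) echo "unsupported resource type: $type" >&2; return 1 ;;
  esac
  printf 'resource.type="%s" AND resource.labels.project_id="%s"\n' "$type" "$project"
}
```

For example, gcloud logging read "$(log_filter k8s_node PROJECT_ID)" --limit 10 would return up to ten node-level entries.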

Metrics

This section lists the Cloud Monitoring metrics supported by Distributed Cloud Edge. To view Distributed Cloud Edge metrics, use the Metrics Explorer in the Google Cloud console.

Distributed Cloud Edge Cluster metrics

For Distributed Cloud Edge Clusters, Distributed Cloud Edge provides the following types of metrics generated by Distributed Cloud Edge Nodes:

  • Resource metrics provide information on Distributed Cloud Edge Node and Pod performance, such as CPU load and memory usage.
  • System application metrics provide information on Distributed Cloud Edge system workloads, such as coredns.

For a list of these metrics, see Anthos on-prem and Anthos bare metal metrics.

Distributed Cloud Edge does not provide metrics generated by the Kubernetes control planes associated with Distributed Cloud Edge Clusters.

Distributed Cloud Edge hardware metrics

Distributed Cloud Edge provides metrics for Distributed Cloud Edge hardware using the Global metrics resource.

Distributed Cloud Edge writes the following Cloud Monitoring API metrics to the custom.googleapis.com/tsx namespace:

Each entry lists the metric path, its kind, value type, and unit (where specified), followed by a description.
/machine/cpu/total_cores
  • Kind: GAUGE
  • Type: INT
Total count of physical processor cores present in the machine.
/machine/cpu/usage_time
  • Kind: CUMULATIVE
  • Type: DOUBLE
  • Unit: Seconds
Cumulative CPU usage time for all cores in the machine. Type can be workload (customer workloads) or system (everything else).
/machine/cpu/utilization
  • Kind: GAUGE
  • Type: DOUBLE
CPU utilization on the machine, expressed as a fraction from 0 to 1. Type can be workload (customer workloads) or system (everything else).
/machine/memory/total_bytes
  • Kind: GAUGE
  • Type: INT64
Byte count of total memory in the machine.
/machine/memory/used_bytes
  • Kind: GAUGE
  • Type: INT64
Byte count of used memory in the machine. memory_type is either evictable (reclaimable by the kernel) or non-evictable (not reclaimable).
/machine/memory/utilization
  • Kind: GAUGE
  • Type: DOUBLE
Memory utilization on the machine, expressed as a fraction from 0 to 1. memory_type is either evictable (reclaimable by the kernel) or non-evictable (not reclaimable).
/machine/network/up
  • Kind: GAUGE
  • Type: BOOL
Indicates whether the network interface is up and running. Includes primary cards, secondary cards, and ports.
/machine/network/link_speed
  • Kind: GAUGE
  • Type: DOUBLE
  • Unit: Bytes per second
Link speed of the primary network interface card.
/machine/network/received_bytes_count
  • Kind: CUMULATIVE
  • Type: DOUBLE
Received byte count for the primary network interface card.
/machine/network/sent_bytes_count
  • Kind: CUMULATIVE
  • Type: DOUBLE
Sent byte count for the primary network interface card.
/machine/network/connectivity
  • Kind: GAUGE
  • Type: BOOL
Indicates whether the primary network interface card has internet connectivity.
/machine/disk/total_bytes
  • Kind: GAUGE
  • Type: INT64
Byte count of total disk space in the machine.
/machine/disk/used_bytes
  • Kind: GAUGE
  • Type: INT64
Byte count of used disk space in the machine.
/machine/disk/utilization
  • Kind: GAUGE
  • Type: DOUBLE
Disk space utilization on the machine, expressed as a fraction from 0 to 1.
/machine/restart_count
  • Kind: CUMULATIVE
  • Type: INT
Number of restarts the machine has undergone.
/machine/uptime
  • Kind: GAUGE
  • Type: INT
  • Unit: Seconds
Machine uptime since last restart.
/machine/connected
  • Kind: GAUGE
  • Type: INT64
Indicates whether the machine is connected to Google Cloud.
/router/connected
  • Kind: GAUGE
  • Type: BOOL
Indicates whether the BGP router is connected to Google Cloud. router_id identifies the specific router (up to 2 per rack).
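
Besides the Metrics Explorer, these metrics can be read through the Cloud Monitoring API timeSeries.list method. The following sketch makes two assumptions: that the full metric type is the custom.googleapis.com/tsx namespace followed directly by the path listed above, and that curl and an authenticated gcloud are available. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative.

# Expand a metric path from the list above into a full metric type.
# Assumption: the type is the namespace followed directly by the path.
tsx_metric_type() {
  printf 'custom.googleapis.com/tsx%s\n' "$1"
}

# Build a Cloud Monitoring filter that selects one hardware metric.
tsx_metric_filter() {
  printf 'metric.type="%s"\n' "$(tsx_metric_type "$1")"
}

# Call timeSeries.list for one metric over a time interval.
# Arguments: project ID, metric path, start time, end time (RFC 3339).
read_tsx_metric() {
  local project="$1" path="$2" start="$3" end="$4"
  curl -s -G \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    --data-urlencode "filter=$(tsx_metric_filter "$path")" \
    --data-urlencode "interval.startTime=${start}" \
    --data-urlencode "interval.endTime=${end}" \
    "https://monitoring.googleapis.com/v3/projects/${project}/timeSeries"
}
```

For example, read_tsx_metric PROJECT_ID /machine/cpu/utilization 2024-01-01T00:00:00Z 2024-01-01T01:00:00Z requests one hour of CPU utilization points; the timestamps are placeholder values.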

Export custom application logs and metrics

Distributed Cloud Edge automatically exports logs for applications running on Distributed Cloud Edge workloads. To export metrics for an application running on Distributed Cloud Edge workloads, you must annotate it as described in the next section.

Annotate the workload to enable metrics export

To enable the collection of custom metrics from an application, add the following annotations to the application's Service or Deployment manifest:

  • prometheus.io/scrape: "true"
  • prometheus.io/path: "ENDPOINT_PATH", where ENDPOINT_PATH is the full path to the target application's metric endpoint.
  • prometheus.io/port: "PORT_NUMBER", where PORT_NUMBER is the port on which the application's metric endpoint listens for connections.
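
If the Service already exists, the same annotations could also be applied with kubectl annotate instead of editing the manifest. This is a sketch under stated assumptions: scrape_annotations is a hypothetical helper, SERVICE_NAME is a placeholder, and /metrics and 9090 are example values rather than defaults confirmed by this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the three Prometheus annotations as
# key=value arguments for kubectl annotate.
# Arguments: metric endpoint path, metric endpoint port.
scrape_annotations() {
  local path="$1" port="$2"
  printf 'prometheus.io/scrape=true prometheus.io/path=%s prometheus.io/port=%s\n' "$path" "$port"
}
```

In bash or sh, kubectl annotate service SERVICE_NAME $(scrape_annotations /metrics 9090) --overwrite passes the three pairs as separate arguments; the command substitution is left unquoted on purpose so that the shell splits it on spaces.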

Run an example application

In this section, you create an application that writes custom logs and exposes a custom metric endpoint.

  1. Save the following Service and Deployment manifests to a file named my-app.yaml. Notice that the Service has the annotation prometheus.io/scrape: "true":

    kind: Service
    apiVersion: v1
    metadata:
      name: "monitoring-example"
      namespace: "default"
      annotations:
        prometheus.io/scrape: "true"
    spec:
      selector:
        app: "monitoring-example"
      ports:
        - name: http
          port: 9090
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: "monitoring-example"
      namespace: "default"
      labels:
        app: "monitoring-example"
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: "monitoring-example"
      template:
        metadata:
          labels:
            app: "monitoring-example"
        spec:
          containers:
          - image: gcr.io/google-samples/prometheus-dummy-exporter:latest
            name: prometheus-example-exporter
            imagePullPolicy: Always
            command:
            - /bin/sh
            - -c
            - ./prometheus-dummy-exporter --metric-name=example_monitoring_up --metric-value=1 --port=9090
            resources:
              requests:
                cpu: 100m
    
  2. Create the Deployment and the Service:

    kubectl --kubeconfig KUBECONFIG_PATH apply -f my-app.yaml

    Replace KUBECONFIG_PATH with the path to your cluster's kubeconfig file.
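
Before looking for the metric in Cloud Monitoring, it can help to confirm that the application serves it locally. The sketch below parses the Prometheus text format; metric_value is a hypothetical helper, and it handles only metrics without labels. The /metrics path used in the usage note is an assumption about the example exporter, not something this page states.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus text exposition format on stdin and
# print the value of the named label-free metric. Comment lines such as
# "# HELP" and "# TYPE" never match because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}
```

With kubectl port-forward service/monitoring-example 9090:9090 running in another terminal (add --kubeconfig with the path to your cluster's kubeconfig file if needed), curl -s http://localhost:9090/metrics | metric_value example_monitoring_up prints the current value of the metric.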
    

View application logs

Console

  1. Go to the Logs Explorer in the Google Cloud console.


  2. Click Resource. Under ALL_RESOURCE_TYPES, select Kubernetes Container.

  3. Under CLUSTER_NAME, select the name of your user cluster.

  4. Under NAMESPACE_NAME, select default.

  5. Click Add and then click Run Query.

  6. Under Query results, you can see log entries from the monitoring-example Deployment. For example:

    {
      "textPayload": "2020/11/14 01:24:24 Starting to listen on :9090\n",
      "insertId": "1oa4vhg3qfxidt",
      "resource": {
        "type": "k8s_container",
        "labels": {
          "pod_name": "monitoring-example-7685d96496-xqfsf",
          "cluster_name": ...,
          "namespace_name": "default",
          "project_id": ...,
          "location": "us-west1",
          "container_name": "prometheus-example-exporter"
        }
      },
      "timestamp": "2020-11-14T01:24:24.358600252Z",
      "labels": {
        "k8s-pod/pod-template-hash": "7685d96496",
        "k8s-pod/app": "monitoring-example"
      },
      "logName": "projects/.../logs/stdout",
      "receiveTimestamp": "2020-11-14T01:24:39.562864735Z"
    }
    

gcloud

  1. Run this command:

    gcloud logging read 'resource.labels.project_id="PROJECT_ID" AND resource.type="k8s_container" AND resource.labels.namespace_name="default"'
    

    Replace PROJECT_ID with the ID of your project.

  2. In the output, you can see log entries from the monitoring-example Deployment. For example:

    insertId: 1oa4vhg3qfxidt
    labels:
      k8s-pod/app: monitoring-example
      k8s-pod/pod-template-hash: 7685d96496
    logName: projects/.../logs/stdout
    receiveTimestamp: '2020-11-14T01:24:39.562864735Z'
    resource:
      labels:
        cluster_name: ...
        container_name: prometheus-example-exporter
        location: us-west1
        namespace_name: default
        pod_name: monitoring-example-7685d96496-xqfsf
        project_id: ...
      type: k8s_container
    textPayload: |
      2020/11/14 01:24:24 Starting to listen on :9090
    timestamp: '2020-11-14T01:24:24.358600252Z'
    

View application metrics in the Google Cloud console

Your example application exposes a custom metric named example_monitoring_up. You can view the values of that metric in the Google Cloud console.

  1. Go to the Metrics Explorer in the Google Cloud console.


  2. For Resource type, select Kubernetes Pod.

  3. For metric, select external/prometheus/example_monitoring_up.

  4. In the chart, you can see that example_monitoring_up reports a constant value of 1.