Version 1.2. This version is no longer supported. For more information see the version support policy.

Using logging and monitoring

This page explains how to use Stackdriver, Prometheus, and Grafana for logging and monitoring. Refer to Logging and Monitoring Overview for a summary of the available configuration options.

Using Stackdriver

The following sections explain how to use Stackdriver with GKE on-prem clusters.

Monitored resources

Monitored resources are how Google represents resources such as clusters, nodes, Pods, and containers. To learn more, refer to Cloud Monitoring's Monitored Resource Types documentation.

To query for logs and metrics, you'll need to know at least these resource labels:

project_id: Project ID for the project associated with the GKE on-prem cluster.
location: A GCP region where you want to store Stackdriver logs and metrics. It is a good idea to choose a region that is near your on-prem data center. You provided this value during installation in the stackdriver.clusterlocation field of your GKE on-prem configuration file.
cluster_name: Cluster name you chose when you created the cluster.

You can retrieve the cluster_name value for either the admin or the user cluster by inspecting the Stackdriver custom resource:
```
  kubectl -n kube-system get stackdrivers stackdriver -o yaml | grep 'clusterName:'
```
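If you want to capture the value in a script rather than read it from the terminal, the following sketch pulls the clusterName field out of the YAML text that the command above prints. It is only an illustration: the function name extract_cluster_name and the sample YAML (including the value my-user-cluster) are hypothetical, and it assumes the field appears on a single line as clusterName: VALUE.

```python
import re
from typing import Optional


def extract_cluster_name(stackdriver_yaml: str) -> Optional[str]:
    """Return the first clusterName value found in the YAML text, or None."""
    # Match a line such as "  clusterName: my-user-cluster" (quotes optional).
    match = re.search(
        r"^\s*clusterName:\s*[\"']?([^\"'\s]+)[\"']?\s*$",
        stackdriver_yaml,
        flags=re.MULTILINE,
    )
    return match.group(1) if match else None


# Hypothetical excerpt of the Stackdriver custom resource; real output has more fields.
sample_yaml = """\
spec:
  clusterName: my-user-cluster
"""

print(extract_cluster_name(sample_yaml))
```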

Accessing log data

You can access logs using the Logs Explorer in the Google Cloud console. For example, to access a container's logs:

Open the Logs Explorer in the Google Cloud console for your project.
Find logs for a container by:
1. Clicking on the top-left log catalog drop-down box and selecting Kubernetes Container.
2. Selecting the cluster name, then the namespace, and then a container from the hierarchy.
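
The same selection can also be typed into the Logs Explorer query box. The sketch below assembles such a query from the resource labels described earlier, assuming the k8s_container monitored resource type and its namespace_name and container_name labels. The function name container_log_filter and all sample values are hypothetical.

```python
def _quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def container_log_filter(project_id: str, location: str, cluster_name: str,
                         namespace: str, container: str) -> str:
    """Build a Logs Explorer query that selects one container's logs."""
    clauses = [
        ("resource.type", "k8s_container"),
        ("resource.labels.project_id", project_id),
        ("resource.labels.location", location),
        ("resource.labels.cluster_name", cluster_name),
        ("resource.labels.namespace_name", namespace),
        ("resource.labels.container_name", container),
    ]
    # Join the comparisons with an explicit AND so the intent is unambiguous.
    return " AND ".join(f"{key}={_quote(value)}" for key, value in clauses)


# Hypothetical values; substitute the labels of your own cluster.
print(container_log_filter("my-project", "us-central1", "my-user-cluster",
                           "default", "my-container"))
```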

Accessing metrics data

You can choose from over 1,500 metrics by using Metrics Explorer. To access Metrics Explorer, do the following:

In the Google Cloud console, select Monitoring.
Select Resources > Metrics Explorer.

Accessing Stackdriver metadata

Metadata is used indirectly via metrics. When you filter for metrics in Stackdriver Metrics Explorer, you see options to filter metrics by metadata.systemLabels and metadata.userLabels. System labels are labels such as node name and Service name for Pods. User labels are labels assigned to Pods in the Kubernetes YAML files in the "metadata" section of the Pod specification.

Prometheus and Grafana

The following sections explain how to use Prometheus and Grafana with GKE on-prem clusters.

Enabling Prometheus and Grafana

Starting in GKE on-prem version 1.2, you can choose whether to enable or disable Prometheus and Grafana. In new user clusters, Prometheus and Grafana are disabled by default.

Your user cluster has a Monitoring object named monitoring-sample. Open the object for editing:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] edit \
   monitoring monitoring-sample --namespace kube-system
```
where [USER_CLUSTER_KUBECONFIG] is the kubeconfig file for your user cluster.

To enable Prometheus and Grafana, set enablePrometheus to true. To disable Prometheus and Grafana, set enablePrometheus to false:

```
apiVersion: addons.k8s.io/v1alpha1
kind: Monitoring
metadata:
 labels:
   k8s-app: monitoring-operator
 name: monitoring-sample
 namespace: kube-system
spec:
 channel: stable
 ...
 enablePrometheus: true
```

Save your changes by closing the editing session.
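
If you prefer a non-interactive change, for example in an automation script, one option is to patch the field instead of opening an editor. The sketch below only builds the kubectl argument list. It assumes that the Monitoring custom resource accepts a merge patch, which this page does not confirm, and the function name and kubeconfig path are hypothetical.

```python
import json
import shlex
from typing import List


def enable_prometheus_patch_command(kubeconfig: str, enabled: bool) -> List[str]:
    """Return kubectl arguments that set spec.enablePrometheus on monitoring-sample."""
    # JSON merge patch that touches only the enablePrometheus field.
    patch = json.dumps({"spec": {"enablePrometheus": enabled}})
    return [
        "kubectl", "--kubeconfig", kubeconfig,
        "--namespace", "kube-system",
        "patch", "monitoring", "monitoring-sample",
        "--type", "merge",
        "--patch", patch,
    ]


# Print a shell-safe command line for review before running it yourself.
print(shlex.join(enable_prometheus_patch_command("user-cluster-kubeconfig", True)))
```

Printing the command rather than executing it keeps the sketch free of side effects; pass the list to subprocess.run once you have checked it against your cluster.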

Known issue

In user clusters, Prometheus and Grafana get automatically disabled during upgrade. However, the configuration and metrics data are not lost.

To work around this issue, after the upgrade, open monitoring-sample for editing and set enablePrometheus to true.

Accessing monitoring metrics from Grafana dashboards

Grafana displays metrics gathered from your clusters. To view these metrics, you need to access Grafana's dashboards:

Get the name of the Grafana Pod running in a user cluster's kube-system namespace:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system get pods
```
where [USER_CLUSTER_KUBECONFIG] is the user cluster's kubeconfig file.
The container in the Grafana Pod listens on TCP port 3000. Forward a local port to port 3000 in the Pod, so that you can view Grafana's dashboards from a web browser.

For example, suppose the name of the Pod is grafana-0. To forward local port 50000 to port 3000 in the Pod, enter this command:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system port-forward grafana-0 50000:3000
```
From a web browser, navigate to http://localhost:50000. The user cluster's Grafana Home Dashboard should load.
To access other dashboards, click the Home drop-down menu in the top-left corner of the page.
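
The first step above asks you to pick the Grafana Pod out of the Pod listing. As a rough illustration, the following sketch filters the tabular output of kubectl get pods by name prefix. The function name pods_with_prefix and the sample listing are hypothetical; real output from your cluster will contain more Pods and different values.

```python
from typing import List


def pods_with_prefix(get_pods_output: str, prefix: str) -> List[str]:
    """Return Pod names from `kubectl get pods` table output that start with prefix."""
    names = []
    for line in get_pods_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # The Pod name is the first column of each row.
        if fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return names


# Hypothetical listing used only to exercise the function.
sample_listing = """\
NAME             READY   STATUS    RESTARTS   AGE
alertmanager-0   1/1     Running   0          5d
grafana-0        1/1     Running   0          5d
prometheus-0     1/1     Running   0          5d
"""

print(pods_with_prefix(sample_listing, "grafana"))
```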

For an example of using Grafana, see Create a Grafana dashboard.

Accessing alerts

Prometheus Alertmanager collects alerts from the Prometheus server. You can view these alerts in a Grafana dashboard. To view the alerts, you need to access the dashboard:

The container in the alertmanager-0 Pod listens on TCP port 9093. Forward a local port to port 9093 in the Pod:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] port-forward \
   -n kube-system alertmanager-0 50001:9093
```
From a web browser, navigate to http://localhost:50001.

Changing Prometheus Alertmanager configuration

You can change Prometheus Alertmanager's default configuration by editing your user cluster's monitoring.yaml file. You should do this if you want to direct alerts to a specific destination, rather than keep them in the dashboard. You can learn how to configure Alertmanager in Prometheus' Configuration documentation.

To change the Alertmanager configuration, perform the following steps:

Make a copy of the user cluster's monitoring.yaml manifest file:

```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system \
   get monitoring monitoring-sample -o yaml > monitoring.yaml
```

To configure Alertmanager, make changes to the fields under spec.alertmanager.yml. When you're finished, save the changed manifest.

Apply the manifest to your cluster:

```
kubectl apply --kubeconfig [USER_CLUSTER_KUBECONFIG] -f monitoring.yaml
```
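
A common mistake when editing an Alertmanager configuration is to route alerts to a receiver that is never defined. The sketch below checks for that on a configuration that has already been parsed into a Python dictionary. It uses the standard Alertmanager fields route, routes, receiver, and receivers. How the configuration is stored under spec.alertmanager.yml is not described here, so extracting and parsing it is left out. The function name, receiver names, and webhook URL are hypothetical.

```python
from typing import Any, Dict, List


def undefined_receivers(config: Dict[str, Any]) -> List[str]:
    """Return receiver names that routes reference but `receivers` does not define."""
    defined = {receiver["name"] for receiver in config.get("receivers", [])}
    referenced = set()

    def walk(route: Dict[str, Any]) -> None:
        # Each route may name a receiver and may contain nested child routes.
        if "receiver" in route:
            referenced.add(route["receiver"])
        for child in route.get("routes", []):
            walk(child)

    walk(config.get("route", {}))
    return sorted(referenced - defined)


# Hypothetical configuration: the "oncall" receiver is referenced but not defined.
sample_config = {
    "route": {
        "receiver": "team-webhook",
        "routes": [{"match": {"severity": "critical"}, "receiver": "oncall"}],
    },
    "receivers": [
        {"name": "team-webhook",
         "webhook_configs": [{"url": "http://example.internal/alerts"}]},
    ],
}

print(undefined_receivers(sample_config))
```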

Scaling Prometheus resources

The default monitoring configuration supports up to five nodes. For larger clusters, you can adjust the Prometheus Server resources. The recommendation is 50m of CPU (50 millicores) and 500Mi of memory per cluster node. Make sure that your cluster contains two nodes, each with sufficient resources to fit Prometheus. For more information, refer to Resizing a user cluster.

To change Prometheus Server resources, perform the following steps:

Make a copy of the user cluster's monitoring.yaml manifest file:

```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system get monitoring monitoring-sample -o yaml > monitoring.yaml
```

To override resources, make changes to the fields under spec.resourceOverride. When you're finished, save the changed manifest. Example:

```
spec:
  resourceOverride:
  - component: Prometheus
    resources:
      requests:
        cpu: 300m
        memory: 3000Mi
      limits:
        cpu: 300m
        memory: 3000Mi
```

Apply the manifest to your cluster:

```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] apply -f monitoring.yaml
```
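
To pick values for resourceOverride, you can multiply the per-node recommendation by the number of nodes in the cluster. The following sketch does that arithmetic; the function name prometheus_resources is hypothetical. For six nodes it yields 300m of CPU and 3000Mi of memory, the same figures that appear in the example manifest above.

```python
from typing import Dict


def prometheus_resources(node_count: int,
                         cpu_millicores_per_node: int = 50,
                         memory_mib_per_node: int = 500) -> Dict[str, str]:
    """Scale the per-node recommendation (50m CPU, 500Mi memory) by cluster size."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "cpu": f"{node_count * cpu_millicores_per_node}m",
        "memory": f"{node_count * memory_mib_per_node}Mi",
    }


# Six nodes: 6 * 50m CPU and 6 * 500Mi memory.
print(prometheus_resources(6))
```

The example manifest sets requests and limits to the same values, so the result can be used for both.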

Create a Grafana dashboard

You've deployed an application that exposes a metric, verified that the metric is exposed, and verified that Prometheus scrapes the metric. Now you can add the application-level metric to a custom Grafana dashboard.

To create a Grafana dashboard, perform the following steps:

If necessary, gain access to Grafana.
From the Home Dashboard, click the Home drop-down menu in the top-left corner of the page.
From the right-side menu, click New dashboard.
From the New panel section, click Graph. An empty graph dashboard appears.
Click Panel title, then click Edit. The bottom Graph panel opens to the Metrics tab.
From the Data Source drop-down menu, select user. Click Add query, and enter foo in the search field.
Click the Back to dashboard button in the top-right corner of the screen. Your dashboard is displayed.
To save the dashboard, click Save dashboard in the top-right corner of the screen. Choose a name for the dashboard, then click Save.

Disabling in-cluster monitoring

To disable in-cluster monitoring, enter the following command:

```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system delete monitoring monitoring-sample
```

Example: Adding application-level metrics to a Grafana dashboard

The following sections walk you through adding metrics for an application. In this section, you complete the following tasks:

Deploy an example application that exposes a metric called foo.
Verify that the application exposes the metric and that Prometheus scrapes it.
Create a custom Grafana dashboard.

Deploy the example application

The example application runs in a single Pod. The Pod's container exposes a metric, foo, with a constant value of 40.

Create the following Pod manifest, pro-pod.yaml:

```
apiVersion: v1
kind: Pod
metadata:
  name: prometheus-example
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8080'
    prometheus.io/path: '/metrics'
spec:
  containers:
  - image: registry.k8s.io/prometheus-dummy-exporter:v0.1.0
    name: prometheus-example
    command:
    - /bin/sh
    - -c
    - ./prometheus_dummy_exporter --metric-name=foo --metric-value=40 --port=8080
```

Then apply the Pod manifest to your user cluster:

```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] apply -f pro-pod.yaml
```

Verify that the metric is exposed and scraped

The container in the prometheus-example Pod listens on TCP port 8080. Forward a local port to port 8080 in the Pod:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] port-forward prometheus-example 50002:8080
```
To verify that the application exposes the metric, run the following command:
```
curl localhost:50002/metrics | grep foo
```
The command returns the following output:
```
# HELP foo Custom metric
# TYPE foo gauge
foo 40
```
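If you want to check the value in a script instead of reading the grep output, the sketch below parses the text shown above and returns the sample value for a given metric. It is a simplified reader that only handles unlabeled samples such as foo 40, and the function name metric_value is hypothetical.

```python
from typing import Optional


def metric_value(exposition_text: str, metric_name: str) -> Optional[float]:
    """Return the value of an unlabeled metric from Prometheus text output, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == metric_name:
            return float(parts[1])
    return None


# The output shown above for the example application.
metrics_text = """\
# HELP foo Custom metric
# TYPE foo gauge
foo 40
"""

print(metric_value(metrics_text, "foo"))
```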
The container in the prometheus-0 Pod listens on TCP port 9090. Forward a local port to port 9090 in the Pod:
```
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] -n kube-system port-forward prometheus-0 50003:9090
```
To verify that Prometheus is scraping the metric, navigate to http://localhost:50003/targets, which should take you to the prometheus-0 Pod under the prometheus-io-pods target group.
To view metrics in Prometheus, navigate to http://localhost:50003/graph. From the search field, enter foo, then click Execute. The page should display the metric.
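
The same check can be made without the web UI through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below builds the query URL for the forwarded port and extracts sample values from a response body. It does not perform the HTTP request itself. The function names and the sample response (including its timestamp) are hypothetical, and the parser assumes a vector result.

```python
import json
from typing import List
from urllib.parse import urlencode


def instant_query_url(base_url: str, expression: str) -> str:
    """Build the URL of a Prometheus instant query for the given expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expression})}"


def sample_values(response_body: str) -> List[float]:
    """Extract the sample values from an instant-query response with a vector result."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    # Each vector element carries a [timestamp, "value"] pair; the value is a string.
    return [float(item["value"][1]) for item in payload["data"]["result"]]


# Hypothetical response body for the query "foo".
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"__name__": "foo"}, "value": [1600000000.0, "40"]}],
    },
})

print(instant_query_url("http://localhost:50003", "foo"))
print(sample_values(sample_response))
```

To run the query for real, fetch the printed URL with curl or urllib.request while the port-forward to prometheus-0 is active, then pass the response body to sample_values.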