Logging and monitoring for Anthos attached clusters

This page shows you how to export logs and metrics from an Anthos attached cluster to Cloud Logging and Cloud Monitoring.

How it works

Google Cloud's operations suite is the built-in observability solution for Google Cloud. To export cluster-level telemetry from an attached cluster into Google Cloud, you need to deploy the following open source export agents into your cluster:

  • Stackdriver Log Aggregator (stackdriver-log-aggregator-*). A Fluentd StatefulSet that sends logs to the Cloud Logging (formerly Stackdriver Logging) API.
  • Stackdriver Log Forwarder (stackdriver-log-forwarder-*). A Fluent Bit DaemonSet that forwards logs from each Kubernetes node to the Stackdriver Log Aggregator.
  • Stackdriver Metrics Collector (stackdriver-prometheus-k8s-*). A Prometheus StatefulSet, configured with a Stackdriver export sidecar container, that sends Prometheus metrics to the Cloud Monitoring (formerly Stackdriver Monitoring) API. The sidecar is a second container in the same pod; it reads the metrics that the Prometheus server stores on disk and forwards them to the Cloud Monitoring API.
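
As a rough illustration of how the three agents map onto pod names, the following shell sketch classifies a pod name by the prefixes listed above. The function name agent_for_pod and the labels it prints are hypothetical and are not part of the sample repository.

```shell
#!/bin/sh
# Hypothetical helper: map a pod name to the export agent it belongs to,
# using the pod name prefixes listed above.
agent_for_pod() {
  case "$1" in
    stackdriver-log-aggregator-*) echo "log-aggregator" ;;
    stackdriver-log-forwarder-*)  echo "log-forwarder" ;;
    stackdriver-prometheus-k8s-*) echo "metrics-collector" ;;
    *)                            echo "unknown" ;;
  esac
}

# Example call:
#   agent_for_pod stackdriver-log-forwarder-2vlxb
```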

Prerequisites

  1. A Google Cloud project with billing enabled. See our pricing guide to learn about Cloud Operations costs.

  2. One Anthos attached cluster, registered using this guide. Run the following command to verify that your cluster is registered.

    gcloud container hub memberships list
    

    Example output:

    NAME  EXTERNAL_ID
    eks   ae7b76b8-7922-42e9-89cd-e46bb8c4ffe4
    

  3. A local environment from which you can access your cluster and run kubectl commands. See the GKE quickstart for instructions on how to install kubectl through gcloud. Run the following command to verify that you can reach your attached cluster using kubectl.

    kubectl cluster-info
    

    Example output:

    Kubernetes master is running at https://[redacted].gr7.us-east-2.eks.amazonaws.com
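
Before starting, it can help to confirm that the command-line tools used in this guide are installed. The following is a minimal sketch, assuming a POSIX shell; the function name require_tools is hypothetical.

```shell
#!/bin/sh
# Hypothetical preflight helper: succeed only if every named tool is on PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage before starting the guide:
#   require_tools gcloud kubectl git || exit 1
```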
    

Setup

  1. Create a Cloud Monitoring workspace for your project by following the instructions here.

  2. Clone the sample repository and navigate into the directory for this guide.

    git clone https://github.com/GoogleCloudPlatform/anthos-samples
    cd anthos-samples/attached-logging-monitoring
    
  3. Set the project ID variable to the project where you've registered your cluster.

    PROJECT_ID="your-project-id"
    
  4. Create a Google Cloud service account with permissions to write metrics and logs to the Cloud Monitoring and Cloud Logging APIs. You'll add this service account's key to the workloads deployed in the next section.

    gcloud iam service-accounts create anthos-lm-forwarder
    
    gcloud projects add-iam-policy-binding $PROJECT_ID \
      --member="serviceAccount:anthos-lm-forwarder@${PROJECT_ID}.iam.gserviceaccount.com" \
      --role=roles/logging.logWriter
    
    gcloud projects add-iam-policy-binding $PROJECT_ID \
      --member="serviceAccount:anthos-lm-forwarder@${PROJECT_ID}.iam.gserviceaccount.com" \
      --role=roles/monitoring.metricWriter
    
  5. Create and download a JSON key for the service account you just created, then create a Kubernetes secret in your cluster using that key.

    gcloud iam service-accounts keys create credentials.json \
      --iam-account anthos-lm-forwarder@${PROJECT_ID}.iam.gserviceaccount.com
    
    kubectl create secret generic google-cloud-credentials -n kube-system --from-file credentials.json
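
The role bindings and the key creation above all reference the same service account email. The following sketch derives that email once and fails early if the project ID is empty. The function name forwarder_sa_email and the SA_EMAIL variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the service account email used in steps 4 and 5.
forwarder_sa_email() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "PROJECT_ID is empty" >&2
    return 1
  fi
  printf '%s\n' "anthos-lm-forwarder@${project_id}.iam.gserviceaccount.com"
}

# Usage with the commands above:
#   SA_EMAIL="$(forwarder_sa_email "$PROJECT_ID")" || exit 1
#   gcloud projects add-iam-policy-binding "$PROJECT_ID" \
#     --member="serviceAccount:${SA_EMAIL}" --role=roles/logging.logWriter
```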
    

Installing the logging agent

  1. Change into the logging/ directory.

    cd logging/
    
  2. Open aggregator.yaml. At the bottom of the file, set the following variables to the values that correspond to your project and cluster:

    project_id [PROJECT_ID]
    k8s_cluster_name [CLUSTER_NAME]
    k8s_cluster_location [CLUSTER_LOCATION]
    

    You can find your cluster location by running the following command with your attached cluster's membership name. The location is the segment that follows /locations/ in the output.

    gcloud container hub memberships describe eks | grep name
    

    Output:

    name: projects/my-project/locations/global/memberships/eks
    

  3. In aggregator.yaml, under volumeClaimTemplates/spec, specify the PersistentVolumeClaim storageClassName for your cluster. The file includes default values for EKS and AKS that you can uncomment as appropriate: gp2 for EKS and default for AKS.

    If you have configured a custom Kubernetes Storage Class in AWS or Azure, want to use a non-default storage class, or are using another conformant cluster type, you can add your own storageClassName. The appropriate storageClassName is based on the type of PersistentVolume (PV) that has been provisioned by an administrator for the cluster using StorageClass. You can find out more about storage classes and the default storage classes for other major Kubernetes providers in the Kubernetes documentation.

    # storageClassName: standard #Google Cloud
    # storageClassName: gp2 #AWS EKS
    # storageClassName: default #Azure AKS
    
  4. Deploy the log aggregator and forwarder to the cluster.

    kubectl apply -f aggregator.yaml
    kubectl apply -f forwarder.yaml
    
  5. Verify that the pods have started up. You should see two aggregator pods and one forwarder pod per Kubernetes worker node. For instance, in a 4-node cluster, you should expect to see 4 forwarder pods.

    kubectl get pods -n kube-system | grep stackdriver-log
    

    Output:

    stackdriver-log-aggregator-0                 1/1     Running   0          139m
    stackdriver-log-aggregator-1                 1/1     Running   0          139m
    stackdriver-log-forwarder-2vlxb              1/1     Running   0          139m
    stackdriver-log-forwarder-dwgb7              1/1     Running   0          139m
    stackdriver-log-forwarder-rfrdk              1/1     Running   0          139m
    stackdriver-log-forwarder-sqz7b              1/1     Running   0          139m
    

  6. Get the aggregator logs and verify that logs are being sent to Google Cloud.

    kubectl logs stackdriver-log-aggregator-0 -n kube-system
    

    Output:

    2020-10-12 14:35:40 +0000 [info]: #3 [google_cloud] Successfully sent gRPC to Stackdriver Logging API.
    

  7. Deploy a test application to your cluster. This is a basic HTTP web server with a loadgenerator.

    kubectl apply -f  https://raw.githubusercontent.com/GoogleCloudPlatform/istio-samples/master/sample-apps/helloserver/server/server.yaml
    
    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/istio-samples/master/sample-apps/helloserver/loadgen/loadgen.yaml
    
  8. Verify that you can view logs from your attached cluster in the Cloud Logging dashboard. Go to the Logs Explorer in the Google Cloud Console:

    Go to Logs Explorer

  9. In the Logs Explorer, copy the sample query below into the Query builder field, replacing ${your-cluster-name} with your cluster name. Click Run query. You should see recent cluster logs appear under Query results.

    resource.type="k8s_container" resource.labels.cluster_name="${your-cluster-name}"
    

Logs for attached cluster
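
The cluster location in step 2 and the query in step 9 can both be derived with a few lines of shell. The following is a minimal sketch; the function names membership_location and cluster_log_query are hypothetical, and the membership name is assumed to have the form shown in the example output above.

```shell
#!/bin/sh
# Hypothetical helper: print the <location> segment of a membership name
# of the form projects/<project>/locations/<location>/memberships/<name>.
membership_location() {
  name="$1"
  case "$name" in
    */locations/*/memberships/*) ;;
    *) echo "unexpected membership name: $name" >&2; return 1 ;;
  esac
  rest="${name#*/locations/}"   # drop everything up to and including /locations/
  printf '%s\n' "${rest%%/*}"   # keep the text before the next slash
}

# Hypothetical helper: build the Logs Explorer query from step 9.
cluster_log_query() {
  printf 'resource.type="k8s_container" resource.labels.cluster_name="%s"\n' "$1"
}

# Usage:
#   membership_location "$(gcloud container hub memberships describe eks --format='value(name)')"
#   cluster_log_query eks
```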

Installing the monitoring agent

  1. Navigate out of the logging/ directory and into the monitoring/ directory.

    cd ../monitoring
    
  2. Open prometheus.yaml. Under stackdriver-prometheus-sidecar/args, set the following variables to match your environment.

    "--stackdriver.project-id=[PROJECT_ID]"
    "--stackdriver.kubernetes.location=[CLUSTER_LOCATION]"
    "--stackdriver.generic.location=[CLUSTER_LOCATION]"
    "--stackdriver.kubernetes.cluster-name=[CLUSTER_NAME]"
    
  3. In prometheus.yaml, under volumeClaimTemplates, uncomment the storageClassName that matches your cloud provider, as described in Installing the logging agent.

    # storageClassName: standard #Google Cloud
    # storageClassName: gp2 #AWS EKS
    # storageClassName: default #Azure AKS
    
  4. Deploy the stackdriver-prometheus StatefulSet, configured with the exporter sidecar, to your cluster.

    kubectl apply -f server-configmap.yaml
    kubectl apply -f sidecar-configmap.yaml
    kubectl apply -f prometheus.yaml
    
  5. Verify that the stackdriver-prometheus pod is running.

    watch "kubectl get pods -n kube-system | grep stackdriver-prometheus"
    
    Output:
    stackdriver-prometheus-k8s-0         2/2     Running   0          5h24m
    
  6. Get the Stackdriver Prometheus sidecar container logs to verify that the pod has started up.

    kubectl logs stackdriver-prometheus-k8s-0 -n kube-system -c stackdriver-prometheus-sidecar
    
    Output:
    level=info ts=2020-11-18T21:37:24.819Z caller=main.go:598 msg="Web server started"
    level=info ts=2020-11-18T21:37:24.819Z caller=main.go:579 msg="Stackdriver client started"
    
  7. Verify that cluster metrics are exporting successfully to Cloud Monitoring. Go to the Metrics Explorer in the Google Cloud Console:

    Go to Metrics Explorer

  8. Click Query editor, then paste in the following query, replacing ${your-project-id} and ${your-cluster-name} with your own project and cluster information. Then click Run query. You should see a value of 1.0.

    fetch k8s_container
    | metric 'kubernetes.io/anthos/up'
    | filter
        resource.project_id == '${your-project-id}'
        && (resource.cluster_name =='${your-cluster-name}')
    | group_by 1m, [value_up_mean: mean(value.up)]
    | every 1m
    

Monitoring for attached cluster
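
The pod check in step 5 can also be scripted rather than read by eye. The following sketch parses kubectl get pods output and succeeds only when every matching pod reports all containers ready and a Running status. The function name pods_ready is hypothetical, and the column layout is assumed to match the example output above.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` lines on stdin and succeed only
# if at least one pod name starts with the given prefix and every such pod
# shows READY as n/n and STATUS as Running.
pods_ready() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 {
      seen++
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") bad++
    }
    END {
      if (seen > 0 && bad == 0) exit 0
      exit 1
    }
  '
}

# Usage:
#   kubectl get pods -n kube-system | pods_ready stackdriver-prometheus \
#     && echo "metrics collector is ready"
```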

Clean up

  1. To remove all the resources created in this guide, run the following commands from the anthos-samples/attached-logging-monitoring directory:

    kubectl delete -f logging
    kubectl delete -f monitoring
    kubectl delete secret google-cloud-credentials -n kube-system
    kubectl delete -f https://raw.githubusercontent.com/GoogleCloudPlatform/istio-samples/master/sample-apps/helloserver/loadgen/loadgen.yaml
    kubectl delete -f https://raw.githubusercontent.com/GoogleCloudPlatform/istio-samples/master/sample-apps/helloserver/server/server.yaml
    rm credentials.json
    gcloud iam service-accounts delete anthos-lm-forwarder@${PROJECT_ID}.iam.gserviceaccount.com
    

What's next?

Learn about Cloud Logging:

Learn about Cloud Monitoring: