Customizing Stackdriver logs for Google Kubernetes Engine with Fluentd

This tutorial describes how to customize Fluentd logging for a Google Kubernetes Engine (GKE) cluster. You'll learn how to host your own configurable Fluentd daemonset that sends logs to Stackdriver, instead of selecting the cloud logging option when you create the cluster, which does not allow configuration of the Fluentd daemon.

Objectives

  • Deploy your own Fluentd daemonset on a Google Kubernetes Engine cluster, configured to log data to Stackdriver. We assume that you are already familiar with Kubernetes.
  • Customize GKE logging to remove sensitive data from the Stackdriver logs.
  • Customize GKE logging to add node-level events to the Stackdriver logs.

Costs

This tutorial uses billable components of Google Cloud Platform, including:

  • Google Kubernetes Engine
  • Compute Engine

The Pricing Calculator estimates the cost of this environment at around $1.14 for 8 hours.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the Project selector page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. Enable the Google Kubernetes Engine and Compute Engine APIs.

    Enable the APIs

Initializing common variables

You must define several variables that control where elements of the infrastructure are deployed.

  1. Using a text editor, edit the following script, substituting your project ID for [YOUR_PROJECT_ID]. The script sets the region to us-east1. If you make any changes to the script, make sure that the zone values reference the region that you specify.

    region=us-east1
    zone=${region}-b
    project_id=[YOUR_PROJECT_ID]
    
  2. Go to Cloud Shell.

    Open Cloud Shell

  3. Copy the script into your Cloud Shell window and run it.

  4. Run the following commands to set the default zone and project ID so you don't have to specify these values in every subsequent command:

    gcloud config set compute/zone ${zone}
    gcloud config set project ${project_id}
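
Optionally, you can confirm that the default zone and project are set by listing the active gcloud configuration:

    gcloud config list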
    

Creating the GKE cluster

Unless otherwise noted, you enter all the commands for this tutorial at the command line of your computer or in Cloud Shell.

  1. Clone the sample repository. The sample repository includes the Kubernetes manifests for the Fluentd daemonset and a test logging program that you will deploy:

    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-customize-fluentd
    
  2. Change your working directory to the cloned repository:

    cd kubernetes-engine-customize-fluentd
    
  3. Create the GKE cluster without cloud logging turned on:

    gcloud beta container clusters create gke-with-custom-fluentd \
       --zone us-east1-b \
       --no-enable-cloud-logging \
       --tags=gke-cluster-with-customized-fluentd \
       --scopes=logging-write
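
The create command normally configures kubectl to point at the new cluster. If your kubectl context doesn't point at the cluster (for example, because you created it from a different machine), you can fetch the credentials and check the nodes yourself:

    gcloud container clusters get-credentials gke-with-custom-fluentd --zone us-east1-b
    kubectl get nodes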
    

Deploying the test logger application

By default, the sample application that you deploy continuously emits random logging statements. The Docker container it uses is available at gcr.io/cloud-solutions-images/test-logger, and its source code is included in the test-logger subdirectory.

  1. Deploy the test-logger application to the GKE cluster:

    kubectl apply -f kubernetes/test-logger.yaml
    
  2. View the status of the test-logger pods:

    kubectl get pods
    
  3. Repeat this command until the output looks like the following, with all three test-logger pods running:

    Command output showing three pods running
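
To see the kind of log lines that the application emits, you can read the logs of the deployment directly. The command below assumes that the Deployment defined in kubernetes/test-logger.yaml is named test-logger; adjust the name if your manifest differs:

    kubectl logs deployment/test-logger --tail=10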

Deploying the Fluentd daemonset to your cluster

Next you will configure and deploy the Fluentd daemonset.

  1. Deploy the Fluentd configuration:

    kubectl apply -f kubernetes/fluentd-configmap.yaml
    
  2. Deploy the Fluentd daemonset:

    kubectl apply -f kubernetes/fluentd-daemonset.yaml
    
  3. Check that the Fluentd pods have started:

    kubectl get pods --namespace=kube-system
    

    If they're running, you see output like the following:

    Command output showing three pods running

  4. Verify that you're seeing logs in Stackdriver. In the GCP Console, in the left-hand menu, click Stackdriver > Logging > Logs, and then select Kubernetes Container in the list.

    Stackdriver listing showing unfiltered data
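
You can also spot-check the entries from Cloud Shell with gcloud logging read. The filter below assumes that the container logs are written under the container resource type (the legacy GKE logging resource); if no entries match, check the resource type shown in the Stackdriver UI and adjust the filter:

    gcloud logging read 'resource.type="container" AND resource.labels.cluster_name="gke-with-custom-fluentd"' --limit 5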

Filtering information from the logfile

The next step is to specify that Fluentd should filter out certain data so that it is not logged. For this tutorial, you filter out Social Security numbers, credit card numbers, and email addresses. To make this update, you change the daemonset to use a different ConfigMap that contains these filters. You use the Kubernetes rolling update feature and preserve the old version of the ConfigMap.

  1. Open the kubernetes/fluentd-configmap.yaml file in an editor.

  2. Uncomment the lines between, but not including, the ### sample log scrubbing filters and ### end sample log scrubbing filters lines (the uncommented form of the first filter is shown after this list for reference):

    ############################################################################################################
    #  ### sample log scrubbing filters
    #  #replace social security numbers
    # <filter reform.**>
    #   @type record_transformer
    #   enable_ruby true
    #   <record>
    #     log ${record["log"].gsub(/[0-9]{3}-*[0-9]{2}-*[0-9]{4}/,"xxx-xx-xxxx")}
    #   </record>
    # </filter>
    # # replace credit card numbers that appear in the logs
    # <filter reform.**>
    #   @type record_transformer
    #   enable_ruby true
    #   <record>
    #      log ${record["log"].gsub(/[0-9]{4} *[0-9]{4} *[0-9]{4} *[0-9]{4}/,"xxxx xxxx xxxx xxxx")}
    #   </record>
    # </filter>
    # # replace email addresses that appear in the logs
    # <filter reform.**>
    #   @type record_transformer
    #   enable_ruby true
    #   <record>
    #     log ${record["log"].gsub(/[\w+\-]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+/i,"user@email.tld")}
    #   </record>
    # </filter>
    # ### end sample log scrubbing filters
    #############################################################################################################
  3. Change the name of the ConfigMap from fluentd-gcp-config to fluentd-gcp-config-filtered by editing the metadata.name field, so that it looks like the following:

     name: fluentd-gcp-config-filtered
     namespace: kube-system
     labels:
       k8s-app: fluentd-gcp-custom
  4. Save and close the file.
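
For reference, after you uncomment them, the first of the three filters (the one that scrubs Social Security numbers) looks like this:

    <filter reform.**>
      @type record_transformer
      enable_ruby true
      <record>
        log ${record["log"].gsub(/[0-9]{3}-*[0-9]{2}-*[0-9]{4}/,"xxx-xx-xxxx")}
      </record>
    </filter>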

Updating the Fluentd daemonset to use the new configuration

Now you change kubernetes/fluentd-daemonset.yaml to mount the ConfigMap fluentd-gcp-config-filtered instead of fluentd-gcp-config.

  1. Open the kubernetes/fluentd-daemonset.yaml file in an editor.

  2. Change the name of the ConfigMap from fluentd-gcp-config to fluentd-gcp-config-filtered by editing the configMap.name field, so that it looks like the following:

     - configMap:
         defaultMode: 420
         name: fluentd-gcp-config-filtered
       name: config-volume
  3. Deploy the new version of the ConfigMap to your cluster:

    kubectl apply -f kubernetes/fluentd-configmap.yaml
    
  4. Roll out the new version of the daemonset:

    kubectl apply -f kubernetes/fluentd-daemonset.yaml
  5. Check the status of the rollout and wait for it to complete:

    kubectl rollout status ds/fluentd-gcp-v3.2.0 --namespace=kube-system
    

    Command output showing 'Waiting' messages for 3 pods, then success

  6. When the rollout is complete, refresh the Stackdriver logs and make sure that the Social Security number, credit card number, and email address data has been filtered out.

    Stackdriver listing showing the same data but filtered
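
You can also confirm from the command line that the rollout created a new daemonset revision and that the Fluentd pods were recreated:

    kubectl rollout history ds/fluentd-gcp-v3.2.0 --namespace=kube-system
    kubectl get pods --namespace=kube-system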

Logging node-level events

If you want events that happen on your GKE nodes, such as SSH logins and sudo invocations, to show up in Stackdriver as well, add the following lines to your ConfigMap, and then follow the steps in the preceding sections to deploy the updated ConfigMap and roll out the new daemonset configuration:

<source>
  type systemd
  filters [{ "SYSLOG_IDENTIFIER": "sshd" }]
  pos_file /var/log/journal/gcp-journald-ssh.pos
  read_from_head true
  tag sshd
</source>

<source>
  type systemd
  filters [{ "SYSLOG_IDENTIFIER": "sudo" }]
  pos_file /var/log/journal/gcp-journald-sudo.pos
  read_from_head true
  tag sudo
</source>
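
After you roll out the updated configuration, you can generate a matching event to verify that it works, for example by running a command with sudo on one of the cluster nodes. The snippet below is only an illustrative sketch; it assumes that the nodes are reachable with gcloud compute ssh in the zone you chose earlier:

    # Pick the first node in the cluster and run a harmless sudo command on it.
    NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
    gcloud compute ssh "${NODE}" --zone us-east1-b --command="sudo ls /var/log > /dev/null"

A corresponding entry tagged sudo should then appear in the Stackdriver logs.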

Cleaning up

After you've finished the tutorial, you can clean up the resources you created on GCP so you won't be billed for them in the future.

Deleting the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project that you want to delete and click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting the GKE cluster

If you don't want to delete the whole project, run the following command to delete the GKE cluster:

gcloud container clusters delete gke-with-custom-fluentd --zone us-east1-b

What's next
