Scenarios for exporting Cloud Logging data: Splunk

This scenario shows how to export selected logs from Cloud Logging to Pub/Sub for ingestion into Splunk. Splunk is a security information and event management (SIEM) solution that supports several ways of ingesting data, such as receiving streaming data out of Google Cloud through Splunk HTTP Event Collector (HEC) or by fetching data from Google Cloud APIs through Splunk Add-on for Google Cloud.

Using the Pub/Sub to Splunk Dataflow template, you can natively forward logs and events from a Pub/Sub topic into Splunk HEC. If Splunk HEC is not available in your Splunk deployment, you can use the Add-on to collect the logs and events from the Pub/Sub topic.

If you are using Splunk solutions that are already deployed, Cloud Logging lets you export logs from Google Cloud into your Splunk solution. This ability helps you take advantage of the native logging, monitoring, and diagnostics capabilities while still enabling you to include these logs in your existing systems.

This scenario is part of the series Design patterns for exporting Cloud Logging.

Set up the logging export

The following diagram shows the steps for enabling logging export to Splunk through Pub/Sub.

Enabling logging export to Pub/Sub.

Set up a Pub/Sub topic and subscription

Follow the instructions to set up a Pub/Sub topic that will receive your exported logs and add a subscription to the topic.

This document uses the following Pub/Sub topic and subscription names:

projects/compliance-logging-export/topics/logs-export-topic

projects/compliance-logging-export/subscriptions/logs-export-subscription
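
If you haven't created them yet, you can create the topic and subscription with the gcloud command-line tool. The following is a minimal sketch that uses the names above:

gcloud pubsub topics create logs-export-topic \
    --project=compliance-logging-export

gcloud pubsub subscriptions create logs-export-subscription \
    --topic=logs-export-topic \
    --project=compliance-logging-export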

Turn on audit logging for all services

Admin Activity audit logs are always written and cannot be disabled. However, Data Access audit logs—except for BigQuery—are disabled by default. In order to enable all audit logs, follow the instructions to update the Cloud IAM policy with the configuration listed in the audit policy documentation. The steps include the following:

  • Downloading the current IAM policy as a file.
  • Adding the audit log policy JSON or YAML object to the current policy file.
  • Updating the Google Cloud project (or organization) with the changed policy file.

The following is an example JSON object that enables all audit logs for all services.

"auditConfigs": [
    {
        "service": "allServices",
        "auditLogConfigs": [
            { "logType": "ADMIN_READ" },
            { "logType": "DATA_READ"  },
            { "logType": "DATA_WRITE" },
        ]
    },
]
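
If you manage the policy with the gcloud command-line tool, the download-and-update cycle might look like the following sketch. It is shown at the project level; use gcloud organizations get-iam-policy and gcloud organizations set-iam-policy for an organization-level policy.

# Download the current IAM policy to a file.
gcloud projects get-iam-policy compliance-logging-export --format=json > policy.json

# Edit policy.json to add the auditConfigs object shown above, then upload the changed policy.
gcloud projects set-iam-policy compliance-logging-export policy.json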

Configure the logging export

After you set up aggregated exports or logs export, you need to refine the logging filters to export audit logs, virtual machine–related logs, storage logs, and database logs. The following logging filter includes the Admin Activity and Data Access audit logs and the logs for specific resource types.

logName:"/logs/cloudaudit.googleapis.com" OR
resource.type:gce OR
resource.type=gcs_bucket OR
resource.type=bigquery_resource
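
Before you create the sink, you can optionally preview which log entries the filter matches by using gcloud logging read. The following spot check assumes you run it against a single project (here, compliance-logging-export) that you expect to produce matching entries:

gcloud logging read \
    'logName:"/logs/cloudaudit.googleapis.com" OR resource.type:gce OR resource.type=gcs_bucket OR resource.type=bigquery_resource' \
    --project=compliance-logging-export \
    --limit=10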

Use the gcloud logging sinks create command in the gcloud command-line tool, or the organizations.sinks.create API method, to create a sink with the appropriate filters. The following example gcloud command creates a sink called gcp_logging_sink_pubsub for the organization, with the previously created Pub/Sub topic logs-export-topic as the destination. The sink includes all child projects and specifies filtering to select specific audit logs.

gcloud logging sinks create gcp_logging_sink_pubsub \
    pubsub.googleapis.com/projects/compliance-logging-export/topics/logs-export-topic \
    --log-filter='logName:"/logs/cloudaudit.googleapis.com" OR resource.type:gce OR resource.type=gcs_bucket OR resource.type=bigquery_resource' \
    --include-children \
    --organization=your-organization

The command output is similar to the following:

Created [https://logging.googleapis.com/v2/organizations/your-organization/sinks/gcp_logging_sink_pubsub].
Please remember to grant `serviceAccount:gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com` the Pub/Sub Publisher role on the topic.
More information about sinks can be found at https://cloud.google.com/logging/docs/export/configure_export

The command output includes the service account identity gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com. This identity is a Google Cloud service account that was created for the export. Until you grant this identity publish access to the destination topic, log entry exports from this sink will fail. For more information, see the next section or the documentation for Granting access for a resource.
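
If you need to look up the sink's writer identity again later, you can describe the sink; the writerIdentity field in the output contains this service account:

gcloud logging sinks describe gcp_logging_sink_pubsub \
    --organization=your-organization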

Set IAM policy permissions for the Pub/Sub topic

Adding the service account gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com to the pubsub.googleapis.com/projects/compliance-logging-export/topics/logs-export-topic topic with the Pub/Sub Publisher role grants the service account permission to publish to the topic. Until you add these permissions, the sink export will fail.

To add the permissions to the service account, follow these steps (an equivalent gcloud command is shown after the steps):

  1. In the Cloud Console, open the Pub/Sub Topics page.

  2. Select the topic name.

  3. Click Show info panel, add the logging export service account as a member, and grant it the Pub/Sub Publisher role.

    IAM policy permissions - Pub/Sub Publisher.
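
The equivalent gcloud command binds the Pub/Sub Publisher role to the export service account on the topic:

gcloud pubsub topics add-iam-policy-binding logs-export-topic \
    --project=compliance-logging-export \
    --member="serviceAccount:gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com" \
    --role="roles/pubsub.publisher"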

After you create the logging export by using this filter, log entries begin to populate the Pub/Sub topic in the configured project. You can confirm that the topic is receiving messages by using the Metrics Explorer in Cloud Monitoring. Using the following resource type and metric, observe the number of message-send operations over a brief period. If you have configured the export properly, you see activity above 0 on the graph, as in this screenshot.

  • Resource type: pubsub_topic
  • Metric: pubsub/topic/send_message_operation_count

Activity graph.
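
You can also spot-check the export by pulling a few messages from the subscription without acknowledging them; the messages remain in the subscription and are redelivered to the eventual consumer:

gcloud pubsub subscriptions pull logs-export-subscription \
    --project=compliance-logging-export \
    --limit=5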

Set up the Splunk data ingest

Option A: Stream logs using Pub/Sub to Splunk Dataflow

You use the Pub/Sub to Splunk Dataflow template to create a Dataflow job that pulls messages from the previously created Pub/Sub subscription, converts the payloads into the Splunk HEC event format, and forwards them to Splunk HEC. The template therefore requires a Splunk HEC endpoint URL and an HEC token, and the URL must be reachable from the Dataflow job's network. Follow the instructions to configure Splunk HEC if you haven't already done so.

In addition to parallelizing and distributing the workload among multiple workers, the Dataflow service manages these resources and autoscales the number of workers based on the existing message backlog from Pub/Sub and the current workers' resource utilization.

For fault tolerance, the Pub/Sub to Splunk Dataflow template retries delivery to Splunk HEC with exponential backoff if the downstream endpoint is down or there is a network connectivity issue. To ensure that no data is lost, messages that fail processing are forwarded to a separate Pub/Sub dead-letter topic, which you must also create before running this Dataflow template.
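
For example, you might create the dead-letter topic before launching the pipeline. The topic name below matches the outputDeadletterTopic parameter in the job example that follows; the subscription name splunk-pubsub-deadletter-sub is an illustrative choice for retaining dead-lettered messages for inspection or replay:

gcloud pubsub topics create splunk-pubsub-deadletter \
    --project=compliance-logging-export

gcloud pubsub subscriptions create splunk-pubsub-deadletter-sub \
    --topic=splunk-pubsub-deadletter \
    --project=compliance-logging-export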

After the Splunk HEC endpoint is configured to receive data, you can run the Pub/Sub to Splunk Dataflow pipeline through the Cloud Console, the gcloud command-line tool, or the Dataflow API. You can optionally apply a JavaScript user-defined function to transform the message payload before it is forwarded to Splunk.

The following example gcloud command runs the latest version of the Pub/Sub to Splunk Dataflow template on the input Pub/Sub subscription logs-export-subscription:

JOB_NAME=pubsub-to-splunk-$USER-`date +"%Y%m%d-%H%M%S%z"`
gcloud dataflow jobs run $JOB_NAME \
    --gcs-location gs://dataflow-templates/latest/Cloud_PubSub_to_Splunk \
    --max-workers 2 \
    --parameters \
  inputSubscription=projects/compliance-logging-export/subscriptions/logs-export-subscription,\
  token=your-splunk-hec-token,\
  url=your-splunk-hec-url,\
  outputDeadletterTopic=projects/compliance-logging-export/topics/splunk-pubsub-deadletter,\
  batchCount=10,\
  parallelism=8

You can confirm that the Dataflow pipeline is successfully sending messages to Splunk by using the Metrics Explorer in Cloud Monitoring. Using the following resource type and metrics, observe the number of successful outbound events and any failed events over a brief period.

  • Resource type: dataflow_job
  • Metric 1: custom.googleapis.com/dataflow/outbound-successful-events
  • Metric 2: custom.googleapis.com/dataflow/outbound-failed-events

If you have configured the template properly, you see activity above 0 on the graph, as in this screenshot.

Graph of activity for outbound events.

Option B: Pull logs using Splunk Add-on for Google Cloud

The Splunk Add-on for Google Cloud uses the Pub/Sub topic and a service account in Google Cloud. The service account is used to generate a private key that the add-on uses to establish a Pub/Sub subscription and ingest messages from the logging export topic. The service account needs IAM permissions that allow it to create the subscription and to list the Pub/Sub components in the project that contains the subscription.
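
As a rough sketch, you might create the service account and key as follows. The exact roles to grant are defined in the Splunk Add-on documentation; the service account name splunk-addon-sa, the key file name, and the roles/pubsub.editor role choice here are illustrative assumptions.

# Illustrative service account for the Splunk Add-on (name is an assumption).
gcloud iam service-accounts create splunk-addon-sa \
    --project=compliance-logging-export \
    --display-name="Splunk Add-on for Google Cloud"

# roles/pubsub.editor is one way to let the add-on create and consume a subscription;
# check the Splunk Add-on documentation for the minimal set of roles.
gcloud projects add-iam-policy-binding compliance-logging-export \
    --member="serviceAccount:splunk-addon-sa@compliance-logging-export.iam.gserviceaccount.com" \
    --role="roles/pubsub.editor"

# Create the private key file that you provide in the add-on configuration.
gcloud iam service-accounts keys create splunk-addon-key.json \
    --iam-account=splunk-addon-sa@compliance-logging-export.iam.gserviceaccount.com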

Follow the instructions to set up the Splunk Add-on for Google Cloud. After you configure the add-on, the Pub/Sub messages from the logging export appear in Splunk.

By using Metrics Explorer in Cloud Monitoring, you can confirm that the subscription that the Splunk add-on is using is pulling messages. Using the following resource type and metric, observe the number of message-pull operations over a brief period.

  • Resource type: pubsub_subscription
  • Metric: pubsub/subscription/pull_message_operation_count

If you have configured the export properly, you see activity above 0 on the graph, as in this screenshot.

Graph of activity for pull operations.

Using the exported logs

After the exported logs have been ingested into Splunk, you can use Splunk as you would with any other data source to do the following tasks:

  • Search the logs.
  • Correlate complex events.
  • Visualize results by using dashboards.

What's next