This scenario shows you how to export selected logs from Cloud Logging into Splunk Enterprise or Splunk Cloud in real time. Splunk enables you to search, analyze, and visualize logs, events, and metrics gathered from your on-premises and cloud deployments for IT and security monitoring. By integrating logs from Cloud Logging, you can continue to use existing partner services like Splunk as a unified log analytics solution, with the option to deploy Splunk on-premises, in Google Cloud as SaaS, or through a hybrid approach.
This document gives you an overview of the two supported methods for exporting logs to Splunk: pushing logs from Google Cloud or pulling them from Google Cloud. As explained in the following section, the cloud-native, push-based approach is recommended in most cases. For more in-depth information about how to deploy the push-based log export solution, see Deploying production-ready log exports to Splunk using Dataflow.
Introduction
There are two methods for ingesting Google Cloud data supported by Splunk:
- Push-based method: data is sent to Splunk HTTP Event Collector (HEC) through a Pub/Sub to Splunk Dataflow job.
- Pull-based method: data is fetched from Google Cloud APIs through the Splunk Add-on for Google Cloud Platform.
We recommend that you use the push-based method to ingest Google Cloud data in Splunk. This method has the following advantages:
- Managed service. The Dataflow managed service manages the required resources in Google Cloud for data processing tasks such as log export.
- Distributed workload. The workload is distributed over multiple workers for parallel processing. So, unlike when pulling data with a Splunk heavy forwarder, there’s no single point of failure.
- Security. Data is pushed to Splunk HEC, so there’s none of the maintenance and security burden associated with creating and managing service account keys. A pull-based method still requires you to create and manage service account keys.
- Autoscaling. The Dataflow service autoscales the number of workers in response to variations in incoming log volume and backlog.
- Fault-tolerance. Retries to Splunk HEC are handled automatically if there are transient server or network issues. This method also supports unprocessed topics (also known as dead-letter topics) for any undeliverable log messages to avoid data loss.
- Simplicity. You avoid management overhead and the cost of running one or more Splunk heavy forwarders.
In some circumstances, you might use the pull-based method to ingest Google Cloud data into Splunk instead. These circumstances include the following:
- Your Splunk deployment does not offer a Splunk HEC endpoint.
- Your log volume is low.
- You want to pull Cloud Monitoring metrics, Cloud Storage objects, or low-volume logs.
- You are already managing one or more Splunk heavy forwarders (or are using a hosted Inputs Data Manager for Splunk Cloud).
Set up the logging export
The following sections describe the steps for enabling logging export to Splunk through Pub/Sub.
Set up a Pub/Sub topic and subscription
Follow the instructions to set up a Pub/Sub topic that will receive your exported logs and add a subscription to the topic.
This document uses the following Pub/Sub topic and subscription names:
projects/compliance-logging-export/topics/logs-export-topic
projects/compliance-logging-export/subscriptions/logs-export-subscription
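For example, a minimal sketch of creating that topic and subscription with the gcloud CLI, assuming the compliance-logging-export project already exists and you have sufficient Pub/Sub permissions:

# Create the topic that receives the exported logs.
gcloud pubsub topics create logs-export-topic \
--project=compliance-logging-export

# Create the subscription that downstream consumers read from.
gcloud pubsub subscriptions create logs-export-subscription \
--topic=logs-export-topic \
--project=compliance-logging-export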
Turn on audit logging for all services
Admin Activity audit logs are always written and cannot be disabled. However, Data Access audit logs (except for BigQuery) are disabled by default. To enable all audit logs, follow the instructions to update the IAM policy with the configuration listed in the audit policy documentation. The steps include the following:
- Downloading the current IAM policy as a file.
- Adding the audit log policy JSON or YAML object to the current policy file.
- Updating the Google Cloud project (or organization) with the changed policy file.
The following is an example JSON object that enables all audit logs for all services.
"auditConfigs": [ { "service": "allServices", "auditLogConfigs": [ { "logType": "ADMIN_READ" }, { "logType": "DATA_READ" }, { "logType": "DATA_WRITE" }, ] }, ]
Configure the logging export
After you set up aggregated exports or logs export, you need to refine the logging filters to export audit logs, virtual machine–related logs, storage logs, and database logs. The following logging filter includes the Admin Activity and Data Access audit logs and the logs for specific resource types.
logName:"/logs/cloudaudit.googleapis.com" OR resource.type:gce OR resource.type=gcs_bucket OR resource.type=bigquery_resource
From the Google Cloud CLI, use the gcloud logging sinks create command or the organizations.sinks.create API call to create a sink with the appropriate filters. The following example gcloud command creates a sink called gcp_logging_sink_pubsub for the organization, with the destination being the previously created Pub/Sub topic logs-export-topic. The sink includes all child projects and specifies filtering to select specific audit logs.
gcloud logging sinks create gcp_logging_sink_pubsub \
pubsub.googleapis.com/projects/compliance-logging-export/topics/logs-export-topic \
--log-filter='logName:"/logs/cloudaudit.googleapis.com" OR resource.type:"gce" OR resource.type="gcs_bucket" OR resource.type="bigquery_resource"' \
--include-children \
--organization=your-organization
The command output is similar to the following:
Created [https://logging.googleapis.com/v2/organizations/your-organization/sinks/gcp_logging_sink_pubsub].
Please remember to grant `serviceAccount:gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com` the Pub/Sub Publisher role on the topic.
More information about sinks can be found at https://cloud.google.com/logging/docs/export/configure_export
The serviceAccount entry returned from the API call includes the identity gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com. This identity represents a Google Cloud service account that has been created for the export. Until you grant this identity publish access to the destination topic, log entry exports from this sink will fail. For more information, see the next section or the documentation for granting access for a resource.
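If you need to look up this writer identity again later, one way (a sketch, assuming the sink and organization names used above) is to describe the sink:

# Prints the sink configuration, including the writerIdentity field.
gcloud logging sinks describe gcp_logging_sink_pubsub \
--organization=your-organization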
Set IAM policy permissions for the Pub/Sub topic
By adding the service account
gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com
to
the
pubsub.googleapis.com/projects/compliance-logging-export/topics/logs-export-topic
topic with the Pub/Sub Publisher permissions, you grant the service account
permission to publish to the topic. Until you add these permissions, the sink
export will fail.
To add the permissions to the service account, follow these steps:
- In the Google Cloud console, open the Pub/Sub Topics page.
- Select the topic name.
- Click Show info panel, and then grant the Pub/Sub Publisher role to the export service account.
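If you prefer the command line, a sketch of the equivalent grant with the gcloud CLI, reusing the topic and service account names from this scenario:

# Grant the sink's writer identity permission to publish to the export topic.
gcloud pubsub topics add-iam-policy-binding logs-export-topic \
--project=compliance-logging-export \
--member="serviceAccount:gcp-logging-export-pubsub-si@logging-oyour-organization.iam.gserviceaccount.com" \
--role="roles/pubsub.publisher"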
After you create the logging export by using this filter, log entries begin to populate the Pub/Sub topic in the configured project. You can confirm that the topic is receiving messages by using Metrics Explorer in Cloud Monitoring. Using the following resource type and metric, observe the number of message-send operations over a brief period. If you have configured the export properly, you see activity above 0 on the graph.
- Resource type: pubsub_topic
- Metric: pubsub/topic/send_message_operation_count
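As an additional spot check, you can peek at a few messages on the subscription with the gcloud CLI; a sketch, assuming the subscription created earlier (without --auto-ack, the pulled messages are redelivered after the ack deadline, so this does not permanently consume them):

# Pull a few messages without acknowledging them.
gcloud pubsub subscriptions pull logs-export-subscription \
--project=compliance-logging-export \
--limit=3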
Set up the Splunk data ingest
Option A: Stream logs using Pub/Sub to Splunk Dataflow
You use the Pub/Sub to Splunk Dataflow template to create a Dataflow job that pulls messages from the previously created Pub/Sub subscription, converts payloads into the Splunk HEC event format, and forwards them to Splunk HEC. This method therefore requires a Splunk HEC endpoint URL and an HEC token, and the URL must be accessible from the Dataflow job's network. Follow the instructions to configure Splunk HEC if you haven't already done so.
In addition to parallelizing and distributing the workload among multiple workers, the Dataflow service manages those resources and autoscales the number of workers based on the existing Pub/Sub message backlog and the current workers' resource utilization.
For fault tolerance, the Pub/Sub to Splunk Dataflow template retries delivery to Splunk HEC with exponential backoff if the downstream endpoint is down or there's a network connectivity issue. To ensure no data loss, messages that fail processing are forwarded to a separate Pub/Sub dead-letter topic, which you must also create before running this Dataflow template.
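For example, a sketch of creating that dead-letter topic, reusing the splunk-pubsub-deadletter name that the pipeline command below references (the splunk-pubsub-deadletter-sub subscription name is just an example, added so that undeliverable messages are retained for inspection and replay):

# Dead-letter topic for messages that Dataflow cannot deliver to Splunk HEC.
gcloud pubsub topics create splunk-pubsub-deadletter \
--project=compliance-logging-export

# Subscription on the dead-letter topic so failed messages are retained.
gcloud pubsub subscriptions create splunk-pubsub-deadletter-sub \
--topic=splunk-pubsub-deadletter \
--project=compliance-logging-export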
After the Splunk HEC endpoint is configured to receive data, you can execute the Pub/Sub to Splunk Dataflow pipeline through the Google Cloud console, the Google Cloud CLI, or the Dataflow API. You can optionally apply a JavaScript user-defined function to transform the message payload before it's forwarded to Splunk.
The following example gcloud command runs the latest version of the Pub/Sub to Splunk Dataflow template on the input Pub/Sub subscription logs-export-subscription. Replace your-splunk-hec-token and your-splunk-hec-url with your Splunk HEC token and endpoint, for example https://splunk-hec-host:8088:
JOB_NAME=pubsub-to-splunk-$USER-`date +"%Y%m%d-%H%M%S%z"`
gcloud dataflow jobs run $JOB_NAME \
--gcs-location gs://dataflow-templates/latest/Cloud_PubSub_to_Splunk \
--max-workers 2 \
--parameters \
inputSubscription=projects/compliance-logging-export/subscriptions/logs-export-subscription,\
token=your-splunk-hec-token,\
url=your-splunk-hec-url,\
outputDeadletterTopic=projects/compliance-logging-export/topics/splunk-pubsub-deadletter,\
batchCount=10,\
parallelism=8
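After you run the command, you can quickly confirm that the job launched; a sketch using the gcloud CLI (the job name is the one generated by the JOB_NAME variable above):

# List active Dataflow jobs; the pubsub-to-splunk job should appear with a Running state.
gcloud dataflow jobs list --status=active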
You can confirm that the Dataflow pipeline is successfully sending messages to Splunk by using Metrics Explorer in Cloud Monitoring. Using the following resource type and metrics, observe the number of successful outbound events and any failed events over a brief period.
- Resource type: dataflow_job
- Metric 1: custom.googleapis.com/dataflow/outbound-successful-events
- Metric 2: custom.googleapis.com/dataflow/outbound-failed-events
If you have configured the template properly, you see activity above 0 on the graph.
Option B: Pull logs using Splunk Add-on for Google Cloud Platform
The Splunk Add-on for Google Cloud Platform uses the Pub/Sub topic and a service account in Google Cloud. The service account is used to generate a private key that the add-on uses to establish a Pub/Sub subscription and ingest messages from the logging export topic. The appropriate IAM permissions are required to allow the service account to create the subscription and list the components in the Pub/Sub project that contains the subscription.
Follow the instructions to set up the Splunk Add-on for Google Cloud Platform. Grant the roles/pubsub.viewer and roles/pubsub.subscriber Pub/Sub IAM roles to the service account created as part of the add-on instructions.
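For example, a sketch of granting those roles with the gcloud CLI; the splunk-addon-sa service account name below is a placeholder for whatever service account you created while following the add-on instructions:

# The service account name is a placeholder; substitute the one you created.
gcloud projects add-iam-policy-binding compliance-logging-export \
--member="serviceAccount:splunk-addon-sa@compliance-logging-export.iam.gserviceaccount.com" \
--role="roles/pubsub.viewer"

gcloud projects add-iam-policy-binding compliance-logging-export \
--member="serviceAccount:splunk-addon-sa@compliance-logging-export.iam.gserviceaccount.com" \
--role="roles/pubsub.subscriber"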
After you configure the add-on, the Pub/Sub messages from the logging export appear in Splunk.
By using Metrics Explorer in Cloud Monitoring, you can confirm that the subscription that the add-on is using is pulling messages. Using the following resource type and metric, observe the number of message-pull operations over a brief period.
- Resource type: pubsub_subscription
- Metric: pubsub/subscription/pull_message_operation_count
If you have configured the export properly, you see activity above 0 on the graph.
Using the exported logs
After the exported logs have been ingested into Splunk, you can use Splunk as you would with any other data source to do the following tasks:
- Search the logs.
- Correlate complex events.
- Visualize results by using dashboards.
What's next
- Deploying production-ready log exports to Splunk using Dataflow
- Collect logs on GKE Enterprise with Splunk Connect
- Look at the other export scenarios.
- Explore reference architectures, diagrams, and best practices about Google Cloud in the Cloud Architecture Center.