Detective controls

Last reviewed 2023-12-20 UTC

The blueprint provides threat detection and monitoring capabilities by combining built-in security controls from Security Command Center with custom solutions that let you detect and respond to security events.

Centralized logging for security and audit

The blueprint configures logging capabilities to track and analyze changes to your Google Cloud resources with logs that are aggregated to a single project.

The following diagram shows how the blueprint aggregates logs from multiple sources in multiple projects into a centralized log sink.

Logging structure for example.com.

The diagram describes the following:

  • Log sinks are configured at the organization node to aggregate logs from all projects in the resource hierarchy.
  • Multiple log sinks are configured to send logs that match a filter to different destinations for storage and analytics.
  • The prj-c-logging project contains all the resources for log storage and analytics.
  • Optionally, you can configure additional tooling to export logs to a SIEM.

The blueprint uses different log sources and includes these logs in the log sink filter so that the logs can be exported to a centralized destination. The following list describes the log sources:

  • Admin Activity audit logs: You cannot configure, disable, or exclude Admin Activity audit logs.
  • System Event audit logs: You cannot configure, disable, or exclude System Event audit logs.
  • Policy Denied audit logs: You cannot configure or disable Policy Denied audit logs, but you can optionally exclude them with exclusion filters.
  • Data Access audit logs: By default, the blueprint doesn't enable Data Access audit logs because the volume and cost of these logs can be high. To decide whether to enable them, evaluate where your workloads handle sensitive data and consider whether you are required to enable Data Access audit logs for each service and environment that works with sensitive data.
  • VPC Flow Logs: The blueprint enables VPC Flow Logs for every subnet and configures log sampling at 50% to reduce cost. If you create additional subnets, you must ensure that VPC Flow Logs are enabled for each subnet.
  • Firewall Rules Logging: The blueprint enables Firewall Rules Logging for every firewall policy rule. If you create additional firewall policy rules for workloads, you must ensure that Firewall Rules Logging is enabled for each new rule.
  • Cloud DNS logging: The blueprint enables Cloud DNS logging for managed zones. If you create additional managed zones, you must enable DNS logging for those zones.
  • Google Workspace audit logging: Requires a one-time enablement step that is not automated by the blueprint. For more information, see Share data with Google Cloud services.
  • Access Transparency logs: Requires a one-time enablement step that is not automated by the blueprint. For more information, see Enable Access Transparency.
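If you decide to enable Data Access audit logs for services that handle sensitive data, you can do so with an IAM audit config. The following Terraform sketch (the project ID and choice of service are hypothetical) enables DATA_READ and DATA_WRITE logs for Cloud Storage in a single project:

```hcl
# Enable Data Access audit logs for Cloud Storage in one project.
# The project ID is a placeholder; scope this to the services and
# environments where your workloads handle sensitive data.
resource "google_project_iam_audit_config" "storage_data_access" {
  project = "prj-example-workload"
  service = "storage.googleapis.com"

  audit_log_config {
    log_type = "DATA_READ"
  }

  audit_log_config {
    log_type = "DATA_WRITE"
  }
}
```

Because these logs can be high-volume, consider enabling them per service and per project rather than with an organization-wide allServices config.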
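When you add subnets, firewall policy rules, or managed zones beyond what the blueprint creates, you are responsible for enabling the corresponding logs. The following Terraform sketch shows one way to do that (all names, ranges, and referenced resources are illustrative, not the blueprint's own modules):

```hcl
# Subnet with VPC Flow Logs at 50% sampling, matching the blueprint default.
resource "google_compute_subnetwork" "example" {
  name          = "sb-example-usc1"
  region        = "us-central1"
  network       = google_compute_network.example.id # assumed to exist
  ip_cidr_range = "10.10.0.0/24"

  log_config {
    aggregation_interval = "INTERVAL_5_SEC"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

# Firewall policy rule with Firewall Rules Logging enabled.
resource "google_compute_network_firewall_policy_rule" "example" {
  firewall_policy = google_compute_network_firewall_policy.example.name # assumed
  priority        = 1000
  direction       = "INGRESS"
  action          = "deny"
  enable_logging  = true

  match {
    src_ip_ranges = ["0.0.0.0/0"]
    layer4_configs {
      ip_protocol = "tcp"
      ports       = ["22"]
    }
  }
}

# Managed zone with Cloud DNS logging enabled.
resource "google_dns_managed_zone" "example" {
  name     = "example-zone"
  dns_name = "example.com."

  cloud_logging_config {
    enable_logging = true
  }
}
```

Enforcing these settings in your own Terraform modules, rather than configuring them by hand, helps keep new resources covered by the centralized log sinks.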

The following list describes the log sinks, their destinations, and how they are used in the blueprint:

  • sk-c-logging-la: Logs routed to Cloud Logging buckets with Log Analytics and a linked BigQuery dataset enabled. Use this sink to actively analyze logs: run ad hoc investigations by using Logs Explorer in the console, or write SQL queries, reports, and views against the linked BigQuery dataset.
  • sk-c-logging-bkt: Logs routed to Cloud Storage. Use this sink to store logs long-term for compliance, audit, and incident-tracking purposes. Optionally, if you have compliance requirements for mandatory data retention, we recommend that you also configure Bucket Lock.
  • sk-c-logging-pub: Logs routed to Pub/Sub. Use this sink to export logs to an external platform such as your existing SIEM. This requires additional work to integrate the Pub/Sub subscription with your SIEM.
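As a sketch of how a sink like sk-c-logging-bkt can be defined, the following Terraform (the organization ID, bucket name, and retention period are placeholders) routes audit logs from the whole organization to a Cloud Storage bucket with Bucket Lock enabled:

```hcl
# Log bucket with a locked retention policy (Bucket Lock). Once locked,
# the policy cannot be shortened or removed, so choose the retention
# period to match your compliance requirements.
resource "google_storage_bucket" "log_export" {
  name     = "bkt-example-log-export" # placeholder name
  location = "US"

  retention_policy {
    retention_period = 365 * 24 * 60 * 60 # 1 year, in seconds
    is_locked        = true
  }
}

# Organization-level sink that routes matching logs from all child
# projects to the bucket.
resource "google_logging_organization_sink" "storage" {
  name             = "sk-c-logging-bkt"
  org_id           = "123456789012" # placeholder org ID
  destination      = "storage.googleapis.com/${google_storage_bucket.log_export.name}"
  include_children = true
  filter           = "logName:\"cloudaudit.googleapis.com\""
}

# The sink's writer identity needs permission to write to the bucket.
resource "google_storage_bucket_iam_member" "sink_writer" {
  bucket = google_storage_bucket.log_export.name
  role   = "roles/storage.objectCreator"
  member = google_logging_organization_sink.storage.writer_identity
}
```

The include_children argument is what makes an organization-level sink aggregate logs from every project in the resource hierarchy.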

For guidance on enabling additional log types and writing log sink filters, see the log scoping tool.

Threat monitoring with Security Command Center

We recommend that you activate Security Command Center Premium for your organization to automatically detect threats, vulnerabilities, and misconfigurations in your Google Cloud resources. Security Command Center creates security findings from multiple sources including the following:

  • Security Health Analytics: detects common vulnerabilities and misconfigurations across Google Cloud resources.
  • Attack path exposure: shows a simulated path of how an attacker could exploit your high-value resources, based on the vulnerabilities and misconfigurations that are detected by other Security Command Center sources.
  • Event Threat Detection: applies detection logic and proprietary threat intelligence against your logs to identify threats in near-real time.
  • Container Threat Detection: detects common container runtime attacks.
  • Virtual Machine Threat Detection: detects potentially malicious applications that are running on virtual machines.
  • Web Security Scanner: scans for OWASP Top Ten vulnerabilities in your web-facing applications on Compute Engine, App Engine, or Google Kubernetes Engine.

For more information on the vulnerabilities and threats addressed by Security Command Center, see Security Command Center sources.

You must activate Security Command Center after you deploy the blueprint. For instructions, see Activate Security Command Center for an organization.

After you activate Security Command Center, we recommend that you export the findings that it produces to your existing tools or processes for triaging and responding to threats. The blueprint creates the prj-c-scc project with a Pub/Sub topic to be used for this integration. Depending on your existing tools, export findings by using a mechanism such as continuous exports to Pub/Sub or the Security Command Center API.
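One option is a continuous export that publishes findings to the Pub/Sub topic in prj-c-scc. The following Terraform sketch shows the idea (the organization ID, topic reference, and filter are illustrative):

```hcl
# Continuous export of Security Command Center findings to Pub/Sub,
# for ingestion by an external tool such as a SIEM.
resource "google_scc_notification_config" "findings_export" {
  config_id    = "scc-findings-export"
  organization = "123456789012" # placeholder org ID
  description  = "Export active findings to Pub/Sub for the SIEM"
  pubsub_topic = google_pubsub_topic.scc_findings.id # assumed to exist

  streaming_config {
    # Only forward active findings; adjust the filter to your triage needs.
    filter = "state = \"ACTIVE\""
  }
}
```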

Alerting on log-based metrics and performance metrics

When you begin to deploy workloads on top of your foundation, we recommend that you use Cloud Monitoring to measure performance metrics.

The blueprint creates a monitoring project such as prj-p-monitoring for each environment. This project is configured as a scoping project to gather aggregated performance metrics across multiple projects. The blueprint deploys an example with log-based metrics and an alerting policy to generate email notifications if there are any changes to the IAM policy that is applied to Cloud Storage buckets. This helps monitor for suspicious activities on sensitive resources such as the bucket in the prj-b-seed project that contains the Terraform state.
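A sketch of that pattern in Terraform (the project ID, notification address, and filters are illustrative, not the blueprint's exact resources):

```hcl
# Log-based metric that counts IAM policy changes on Cloud Storage buckets.
resource "google_logging_metric" "bucket_iam_change" {
  project = "prj-p-monitoring" # placeholder project ID
  name    = "bucket-iam-change"
  filter  = "resource.type=\"gcs_bucket\" AND protoPayload.methodName=\"storage.setIamPermissions\""
}

# Email notification channel for the alert.
resource "google_monitoring_notification_channel" "email" {
  project      = "prj-p-monitoring"
  display_name = "Security notifications"
  type         = "email"
  labels = {
    email_address = "security-team@example.com" # placeholder address
  }
}

# Alert whenever the metric records at least one change.
resource "google_monitoring_alert_policy" "bucket_iam_change" {
  project      = "prj-p-monitoring"
  display_name = "IAM change on Cloud Storage bucket"
  combiner     = "OR"

  conditions {
    display_name = "Bucket IAM policy changed"
    condition_threshold {
      filter          = "metric.type=\"logging.googleapis.com/user/bucket-iam-change\" AND resource.type=\"gcs_bucket\""
      comparison      = "COMPARISON_GT"
      threshold_value = 0
      duration        = "0s"
      aggregations {
        alignment_period   = "300s"
        per_series_aligner = "ALIGN_COUNT"
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.id]
}
```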

More generally, you can also use Cloud Monitoring to measure the performance metrics and health of your workload applications. Depending on the operational responsibility for supporting and monitoring applications in your organization, you might make more granular monitoring projects for different teams. Use these monitoring projects to view performance metrics, create dashboards of application health, and trigger alerts when your expected SLO is not met.

The following diagram shows a high-level view of how Cloud Monitoring aggregates performance metrics.

Monitoring of performance.

For guidance on how to monitor workloads effectively for reliability and availability, see the Site Reliability Engineering book by Google, particularly the chapter on monitoring distributed systems.

Custom solution for automated log analysis

You might have requirements to create alerts for security events that are based on custom queries against logs. Custom queries can help supplement the capabilities of your SIEM by analyzing logs on Google Cloud and exporting only the events that merit investigation, especially if you don't have the capacity to export all cloud logs to your SIEM.

The blueprint helps enable this log analysis by setting up a centralized source of logs that you can query using a linked BigQuery dataset. To automate this capability, you must implement the code sample at bq-log-alerting and extend the foundation capabilities. The sample code lets you regularly query a log source and send a custom finding to Security Command Center.

The following diagram introduces the high-level flow of the automated log analysis.

Automated logging analysis.

The diagram shows the following concepts of automated log analysis:

  • Logs from various sources are aggregated into a centralized logs bucket with log analytics and a linked BigQuery dataset.
  • BigQuery views are configured to query logs for the security event that you want to monitor.
  • Cloud Scheduler pushes an event to a Pub/Sub topic every 15 minutes and triggers Cloud Functions.
  • Cloud Functions queries the views for new events. If it finds events, it pushes them to Security Command Center as custom findings.
  • Security Command Center publishes notifications about new findings to another Pub/Sub topic.
  • An external tool such as a SIEM subscribes to the Pub/Sub topic to ingest new findings.

The sample includes several use cases that query for potentially suspicious behavior, such as logins by super admin or other highly privileged accounts that you specify, changes to logging settings, or changes to network routes. You can extend the use cases by writing new query views for your requirements. Write your own queries, or see security log analytics for a library of SQL queries to help you analyze Google Cloud logs.

Custom solution to respond to asset changes

To respond to events in real time, we recommend that you use Cloud Asset Inventory to monitor asset changes. In this custom solution, an asset feed is configured to trigger notifications to Pub/Sub about changes to resources in real time, and then Cloud Functions runs custom code to enforce your own business logic based on whether the change should be allowed.
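A hedged Terraform sketch of such a feed (the organization ID, quota project, topic reference, and asset scope are illustrative):

```hcl
# Organization-wide feed that publishes IAM policy changes to Pub/Sub
# in real time.
resource "google_cloud_asset_organization_feed" "iam_changes" {
  billing_project = "prj-c-scc"    # placeholder quota project
  org_id          = "123456789012" # placeholder org ID
  feed_id         = "feed-iam-policy-changes"
  content_type    = "IAM_POLICY"

  # Watch every asset type; narrow this regex list to reduce noise.
  asset_types = [".*"]

  feed_output_config {
    pubsub_destination {
      topic = google_pubsub_topic.asset_changes.id # assumed to exist
    }
  }
}
```

A Cloud Functions subscriber on the topic then applies your business logic, as described in the steps that follow.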

The blueprint has an example of this custom governance solution that monitors for IAM changes that add highly sensitive roles including Organization Admin, Owner, and Editor. The following diagram describes this solution.

Automatically reverting an IAM policy change and sending a notification.

The previous diagram shows these concepts:

  • Changes are made to an allow policy.
  • The Cloud Asset Inventory feed sends a real-time notification about the allow policy change to Pub/Sub.
  • Pub/Sub triggers a function.
  • Cloud Functions runs custom code to enforce your policy. The example function has logic to assess if the change has added the Organization Admin, Owner, or Editor roles to an allow policy. If so, the function creates a custom security finding and sends it to Security Command Center.
  • Optionally, you can use this model to automate remediation efforts. Write additional business logic in Cloud Functions to automatically take action on the finding, such as reverting the allow policy to its previous state.

In addition, you can extend the infrastructure and logic used by this sample solution to add custom responses to other events that are important to your business.

What's next