VPC Flow Logs

VPC Flow Logs records a sample of network flows sent from and received by VM instances, including instances used as Google Kubernetes Engine nodes. These logs can be used for network monitoring, forensics, real-time security analysis, and expense optimization.

You can view flow logs in Cloud Logging, and you can export logs to any destination that Cloud Logging export supports.

Flow logs are aggregated by connection from Compute Engine VMs and exported in real time. By subscribing to Pub/Sub, you can analyze flow logs using real-time streaming APIs.

Use cases

Network monitoring

VPC Flow Logs provides you with real-time visibility into network throughput and performance. You can:

  • Monitor the VPC network
  • Perform network diagnosis
  • Filter the flow logs by VMs and by applications to understand traffic changes
  • Understand traffic growth for capacity forecasting

Understanding network usage and optimizing network traffic expenses

You can use VPC Flow Logs to analyze network usage. For example, you can analyze network flows for the following:

  • Traffic between regions and zones
  • Traffic to specific countries on the internet
  • Top talkers

Based on the analysis, you can optimize network traffic expenses.
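
As an illustration, a top-talkers analysis can be sketched as a per-source sum of bytes. This is a minimal sketch, not an official tool: it assumes that exported entries are available as dictionaries with a `connection.src_ip` field and a `bytes_sent` field (see the Record format section), the `top_talkers` helper is a hypothetical name, and the sample entries are invented for the example.

```python
from collections import Counter

def top_talkers(entries, n=3):
    """Return the n source IPs that sent the most bytes, largest first."""
    totals = Counter()
    for entry in entries:
        src_ip = entry["connection"]["src_ip"]
        # Exported integer fields may arrive as strings, so convert defensively.
        totals[src_ip] += int(entry["bytes_sent"])
    return totals.most_common(n)

# Illustrative entries only; real entries carry many more fields.
sample_entries = [
    {"connection": {"src_ip": "10.0.0.2", "dest_ip": "10.0.1.5"}, "bytes_sent": "1200"},
    {"connection": {"src_ip": "10.0.0.3", "dest_ip": "10.0.1.5"}, "bytes_sent": "300"},
    {"connection": {"src_ip": "10.0.0.2", "dest_ip": "10.0.1.9"}, "bytes_sent": "800"},
]

print(top_talkers(sample_entries, n=2))
```

Because the logs are sampled, totals computed this way are best treated as relative rankings rather than exact byte counts.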

Network forensics

You can use VPC Flow Logs for network forensics. For example, if an incident occurs, you can examine the following:

  • Which IP addresses communicated with each other, and when
  • Any compromised IP addresses, identified by analyzing all incoming and outgoing network flows

Real-time security analysis

You can use the real-time streaming APIs (through Pub/Sub) and integrate with SIEM (Security Information and Event Management) systems. This can provide real-time monitoring, correlation of events, analysis, and security alerts.

Specifications

  • VPC Flow Logs is part of Andromeda, the software that powers VPC networks. VPC Flow Logs introduces no delay or performance penalty when enabled.
  • VPC Flow Logs works with VPC networks, not legacy networks. You enable or disable VPC Flow Logs per subnet. If enabled for a subnet, VPC Flow Logs collects data from all VM instances in that subnet.
  • VPC Flow Logs samples each VM's TCP, UDP, ICMP, ESP, and GRE flows. Both inbound and outbound flows are sampled. These flows can be within Google Cloud or between Google Cloud and other networks. If a flow is captured by sampling, VPC Flow Logs generates a log for the flow. Each flow record includes the information described in the Record format section.
  • VPC Flow Logs interacts with firewall rules in the following ways:
    • Egress packets are sampled before egress firewall rules. Even if an egress firewall rule denies outbound packets, those packets can be sampled by VPC Flow Logs.
    • Ingress packets are sampled after ingress firewall rules. If an ingress firewall rule denies inbound packets, those packets are not sampled by VPC Flow Logs.
  • You can use filters in VPC Flow Logs to generate only certain logs.
  • VPC Flow Logs supports VMs that have multiple network interfaces. You need to enable VPC Flow Logs for each subnet, in each VPC, that contains a network interface.
  • To log flows between Pods on the same Google Kubernetes Engine (GKE) node, you must enable Intranode visibility for the cluster.
  • VPC Flow Logs are not reported from non-VM resources like Cloud Run or on-premises endpoints.
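
The interaction with firewall rules described above can be summarized as a small decision function. This is only a sketch of the documented ordering: the function name and its parameters are hypothetical, and eligibility does not mean that a packet is actually sampled, because only a fraction of packets are captured.

```python
def can_be_sampled(direction, firewall_allows):
    """Return True if a packet is eligible for sampling by VPC Flow Logs.

    direction: "egress" or "ingress"
    firewall_allows: True if the applicable firewall rules allow the packet
    """
    if direction == "egress":
        # Egress packets are sampled before egress firewall rules apply,
        # so the firewall verdict does not affect eligibility.
        return True
    if direction == "ingress":
        # Ingress packets are sampled after ingress firewall rules apply,
        # so denied packets never reach the sampler.
        return firewall_allows
    raise ValueError(f"unknown direction: {direction!r}")

print(can_be_sampled("egress", firewall_allows=False))   # True
print(can_be_sampled("ingress", firewall_allows=False))  # False
```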

Logs collection

Flow logs are collected for each VM connection at specific intervals. All packets sampled for a given connection during one aggregation interval are combined into a single flow log entry. This entry is then sent to Logging.
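
The aggregation step can be pictured as grouping sampled packets by connection and by time window. The following is a minimal sketch under stated assumptions, not the actual implementation: the packet dictionaries, their field names, and the `aggregate` function are hypothetical, and a connection is assumed to be identified by its source and destination addresses, ports, and protocol.

```python
from collections import defaultdict

def aggregate(packets, interval_sec=5):
    """Group sampled packets by (connection, window) and sum their counters."""
    entries = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        # Assumption: these five fields together identify a connection.
        connection = (pkt["src_ip"], pkt["src_port"],
                      pkt["dest_ip"], pkt["dest_port"], pkt["protocol"])
        # Floor division assigns each timestamp to a fixed-width window.
        window_start = int(pkt["timestamp"] // interval_sec) * interval_sec
        entry = entries[(connection, window_start)]
        entry["packets"] += 1
        entry["bytes"] += pkt["size"]
    return dict(entries)

base = {"src_ip": "10.0.0.2", "src_port": 40000,
        "dest_ip": "10.0.1.5", "dest_port": 443, "protocol": "TCP"}
sampled_packets = [
    {**base, "timestamp": 0.5, "size": 100},
    {**base, "timestamp": 3.0, "size": 200},
    {**base, "timestamp": 6.2, "size": 50},
]

for (connection, window_start), counters in aggregate(sampled_packets).items():
    print(window_start, counters)
```

With a 5-second interval, the first two packets fall into the window that starts at 0 and the third into the window that starts at 5, so one connection produces two entries.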

Logs are stored in Logging for 30 days by default. If you want to keep logs longer than that, you can either set a custom retention period or export them to a supported destination.

Log sampling and processing

Google Cloud samples packets that leave and enter a VM to generate flow logs. Not every packet is captured into its own log record. About 1 out of every 30 packets is captured, but this sampling rate might be lower depending on the VM's load. You cannot adjust this rate.

After the flow logs are generated, Google Cloud processes them according to the following procedure:

  1. Filtering: You can specify that only logs that match specified criteria are generated. For example, you can filter so that only logs for a particular VM or only logs with a particular metadata value are generated and the rest are discarded. For more information, see Log filtering.
  2. Aggregation: Information for sampled packets is aggregated over a configurable aggregation interval to produce a flow log entry.
  3. Flow log sampling: This is a second sampling process. Flow log entries are further sampled according to a configurable sample rate parameter.
  4. Metadata: If metadata annotations are disabled, all of them are discarded. If you want to keep metadata, you can specify that all fields or a specified set of fields are retained. For more information, see Metadata annotations.
  5. Write to Logging: The final log entries are written to Cloud Logging.
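
The order of the five steps above can be sketched as a small pipeline. This is an illustrative model only: the `process` function, its parameters, and the record layout are hypothetical, aggregation is simplified to a single interval, and returning a list stands in for writing to Cloud Logging.

```python
import random

def process(packets, keep, sample_rate=0.5, metadata="all", seed=0):
    """Apply the five stages in order and return the entries 'written'."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible

    # 1. Filtering: discard records that do not match the criteria.
    matched = [p for p in packets if keep(p)]

    # 2. Aggregation: sum bytes per connection (one interval is assumed).
    entries = {}
    for p in matched:
        entry = entries.setdefault(p["connection"], {
            "connection": p["connection"],
            "bytes": 0,
            "metadata": dict(p["metadata"]),
        })
        entry["bytes"] += p["bytes"]

    # 3. Flow log sampling: keep each entry with probability sample_rate.
    sampled = [e for e in entries.values() if rng.random() < sample_rate]

    # 4. Metadata: None drops all annotations, "all" keeps them,
    #    and a set of names keeps only those fields.
    for e in sampled:
        if metadata is None:
            e["metadata"] = {}
        elif metadata != "all":
            e["metadata"] = {k: v for k, v in e["metadata"].items() if k in metadata}

    # 5. Write to Logging: returning the list stands in for the write.
    return sampled

packets = [
    {"connection": "10.0.0.2->10.0.1.5", "bytes": 100,
     "metadata": {"src_vm": "vm-a", "src_region": "us-east1"}},
    {"connection": "10.0.0.2->10.0.1.5", "bytes": 50,
     "metadata": {"src_vm": "vm-a", "src_region": "us-east1"}},
    {"connection": "10.0.0.3->10.0.1.5", "bytes": 70,
     "metadata": {"src_vm": "vm-b", "src_region": "us-east1"}},
]

written = process(packets, keep=lambda p: p["metadata"]["src_vm"] == "vm-a",
                  sample_rate=1.0, metadata={"src_vm"})
print(written)
```

In this run the filter keeps only records for the hypothetical VM `vm-a`, the two matching records collapse into one entry, a sample rate of 1.0 keeps that entry, and only the `src_vm` annotation survives.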

Because VPC Flow Logs does not capture every packet, it compensates for missed packets by interpolating from the captured packets. This compensation applies to packets missed because of both the initial packet sampling and the user-configurable sampling settings.

Even though Google Cloud doesn't capture every packet, the volume of log records can be quite large. You can balance your traffic visibility and storage cost needs by adjusting the following aspects of logs collection:

  • Aggregation interval: Sampled packets for a time interval are aggregated into a single log entry. This time interval can be 5 seconds (default), 30 seconds, 1 minute, 5 minutes, 10 minutes, or 15 minutes.
  • Sample rate: Before log entries are written to Logging, they can be sampled to reduce their number. By default, the log entry volume is scaled by 0.5 (50%), which means that half of the entries are kept. You can set this from 1.0 (100%, all log entries are kept) to 0.0 (0%, no logs are kept).
  • Metadata annotations: By default, flow log entries are annotated with metadata information, such as the names of the source and destination VMs or the geographic region of external sources and destinations. Metadata annotations can be turned off, or you can specify only certain annotations, to save storage space.
  • Filtering: By default, logs are generated for every flow in the subnet. You can set filters so that only logs that match certain criteria are generated.
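
To see how the aggregation interval and sample rate affect log volume, consider a rough estimate. This sketch assumes that every flow stays active for the whole hour and has at least one sampled packet in every interval, so it gives an approximate upper bound rather than a prediction. The function name and parameters are hypothetical.

```python
# Aggregation intervals listed above, in seconds.
ALLOWED_INTERVALS_SEC = (5, 30, 60, 300, 600, 900)

def estimated_entries_per_hour(active_flows, interval_sec=5, sample_rate=0.5):
    """Estimate how many log entries are written per hour."""
    if interval_sec not in ALLOWED_INTERVALS_SEC:
        raise ValueError(f"unsupported aggregation interval: {interval_sec}")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # Each continuously active flow yields at most one entry per interval.
    intervals_per_hour = 3600 // interval_sec
    return active_flows * intervals_per_hour * sample_rate

print(estimated_entries_per_hour(100))                   # 36000.0
print(estimated_entries_per_hour(100, interval_sec=60))  # 3000.0
```

Under these assumptions, moving 100 long-lived flows from the default 5-second interval to a 1-minute interval reduces the estimate from 36,000 to 3,000 entries per hour at the default sample rate.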

Pricing

Standard pricing for Logging, BigQuery, or Pub/Sub applies. VPC Flow Logs pricing is described in Network Telemetry pricing.

What's next