VPC Flow Logs

VPC Flow Logs records a sample of network flows sent from and received by VM instances, including instances used as Google Kubernetes Engine nodes. These logs can be used for network monitoring, forensics, real-time security analysis, and expense optimization.

You can view flow logs in Cloud Logging, and you can export logs to any destination that Cloud Logging export supports.

Flow logs are aggregated by connection from Compute Engine VMs and exported in real time. By subscribing to Pub/Sub, you can analyze flow logs using real-time streaming APIs.

Key properties

  • VPC Flow Logs is part of Andromeda, the software that powers VPC networks. VPC Flow Logs introduces no delay or performance penalty when enabled.
  • VPC Flow Logs works with VPC networks, not legacy networks. You enable or disable VPC Flow Logs per subnet. If enabled for a subnet, VPC Flow Logs collects data from all VM instances in that subnet.
  • VPC Flow Logs samples each VM's TCP, UDP, ICMP, ESP, and GRE flows. Both inbound and outbound flows are sampled. These flows can be between the VM and another VM, a host in your on-premises data center, a Google service, or a host on the internet. If a flow is captured by sampling, VPC Flow Logs generates a log for the flow. Each flow record includes the information described in the Record format section.
  • VPC Flow Logs interacts with firewall rules in the following ways:
    • Egress packets are sampled before egress firewall rules. Even if an egress firewall rule denies outbound packets, those packets can be sampled by VPC Flow Logs.
    • Ingress packets are sampled after ingress firewall rules. If an ingress firewall rule denies inbound packets, those packets are not sampled by VPC Flow Logs.
  • You can use filters in VPC Flow Logs to generate only certain logs.
  • VPC Flow Logs supports VMs that have multiple network interfaces. You need to enable VPC Flow Logs for each subnet, in each VPC, that contains a network interface.
  • To log flows between Pods on the same Google Kubernetes Engine (GKE) node, you must enable Intranode visibility for the cluster.
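The firewall interaction described in the list above can be summarized as a small decision function. This is a minimal sketch, not an actual API: the function name and the "egress"/"ingress" labels are hypothetical and exist only for this illustration.

```python
def can_be_sampled(direction: str, denied_by_firewall: bool) -> bool:
    """Return True if a packet is eligible for VPC Flow Logs sampling.

    direction is "egress" or "ingress" (hypothetical labels for this sketch).
    """
    if direction == "egress":
        # Egress packets are sampled before egress firewall rules apply,
        # so a deny rule does not prevent sampling.
        return True
    if direction == "ingress":
        # Ingress packets are sampled after ingress firewall rules apply,
        # so denied packets never reach the sampling point.
        return not denied_by_firewall
    raise ValueError(f"unknown direction: {direction!r}")
```

Eligibility does not guarantee a log entry; an eligible packet still has to be picked up by sampling.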

Use cases

Network monitoring

VPC Flow Logs provides you with real-time visibility into network throughput and performance. You can:

  • Monitor the VPC network
  • Perform network diagnosis
  • Filter the flow logs by VMs and by applications to understand traffic changes
  • Understand traffic growth for capacity forecasting

Understanding network usage and optimizing network traffic expenses

You can analyze network usage with VPC Flow Logs. You can analyze the network flows for the following:

  • Traffic between regions and zones
  • Traffic to specific countries on the internet
  • Top talkers

Based on the analysis, you can optimize network traffic expenses.
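As an illustration of a top-talkers analysis, the following sketch sums bytes_sent per source IP address over a list of exported flow log records. It assumes the records have already been loaded as Python dictionaries shaped like the Record format, and that the input is restricted to one reporter side so that a flow is not counted twice. The function name and sample data are hypothetical.

```python
from collections import Counter


def top_talkers(records, n=3):
    """Sum bytes_sent per source IP and return the n largest senders."""
    totals = Counter()
    for rec in records:
        # int() tolerates bytes_sent values that arrive as strings,
        # which is an assumption about the export format.
        totals[rec["connection"]["src_ip"]] += int(rec["bytes_sent"])
    return totals.most_common(n)


# Hypothetical sample data for illustration only.
sample_records = [
    {"connection": {"src_ip": "10.10.0.2"}, "bytes_sent": "1224"},
    {"connection": {"src_ip": "10.50.0.2"}, "bytes_sent": "5342"},
    {"connection": {"src_ip": "10.10.0.2"}, "bytes_sent": "800"},
    {"connection": {"src_ip": "10.10.0.3"}, "bytes_sent": "100"},
]
print(top_talkers(sample_records, n=2))
```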

Network forensics

You can utilize VPC Flow Logs for network forensics. For example, if an incident occurs, you can examine the following:

  • Which IPs talked with whom and when
  • Whether any IPs are compromised, by analyzing all incoming and outgoing network flows

Real-time security analysis

You can leverage the real-time streaming APIs (through Pub/Sub), and integrate with SIEM (Security Information and Event Management) systems. This can provide real-time monitoring, correlation of events, analysis, and security alerts.

Logs collection

Flow logs are collected for each VM connection at specific intervals. All packets collected for a given interval for a given connection are aggregated for a period of time (aggregation interval) into a single flow log entry. This data is then sent to Logging.

Logs are stored in Logging for 30 days by default. If you wish to keep logs longer than that, you can either set a custom retention period or export them to a supported destination.

Log sampling and processing

Google Cloud samples packets that leave and enter a VM to generate flow logs. Not every packet is captured into its own log record. About 1 out of every 30 packets is captured, but this sampling rate might be lower depending on the VM's load. You cannot adjust this rate.

After the flow logs are generated, Google Cloud processes them according to the following procedure:

  1. Filtering: You can specify that only logs that match specified criteria are generated. For example, you can filter so that only logs for a particular VM or only logs with a particular metadata value are generated and the rest are discarded. For more information, see Log filtering.
  2. Aggregation: Information for sampled packets is aggregated over a configurable aggregation interval to produce a flow log entry.
  3. Flow log sampling: This is a second sampling process. Flow log entries are further sampled according to a configurable sample rate parameter.
  4. Metadata: If disabled, all Metadata annotations are discarded. If you want to keep metadata, you can specify that all fields or a specified set of fields are retained. For more information, see Metadata annotations.
  5. Write to Logging: The final log entries are written to Cloud Logging.
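The five steps above can be sketched as a toy model for a single aggregation interval. This is only an illustration of the ordering, not Google Cloud's implementation: the function, the packet dictionary shape, and the keep predicate (standing in for the CEL filter) are all assumptions.

```python
import random


def process_flow_logs(sampled_packets, keep, sample_rate, include_metadata, rng):
    """Toy model of the five processing steps for one aggregation interval.

    sampled_packets: dicts with "flow" (a hashable key), "bytes", "metadata".
    keep: predicate standing in for the generation filter.
    """
    # 1. Filtering: discard anything that does not match the filter.
    matching = [p for p in sampled_packets if keep(p)]

    # 2. Aggregation: one entry per flow for the interval.
    entries = {}
    for p in matching:
        entry = entries.setdefault(
            p["flow"],
            {"flow": p["flow"], "bytes_sent": 0, "packets_sent": 0,
             "metadata": p["metadata"]},
        )
        entry["bytes_sent"] += p["bytes"]
        entry["packets_sent"] += 1

    # 3. Flow log sampling: keep each entry with probability sample_rate.
    kept = [e for e in entries.values() if rng.random() < sample_rate]

    # 4. Metadata: drop annotations when metadata is disabled.
    if not include_metadata:
        for e in kept:
            e.pop("metadata", None)

    # 5. Write to Logging: here, simply return the final entries.
    return kept
```

Passing a seeded random.Random instance as rng keeps the second sampling step reproducible when experimenting with different sample rates.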

Because VPC Flow Logs does not capture every packet, it compensates for missed packets by interpolating from the captured packets. This happens for packets missed because of initial and user-configurable sampling settings.

Even though Google Cloud doesn't capture every packet, log record captures can be quite large. You can balance your traffic visibility and storage cost needs by adjusting the following aspects of logs collection:

  • Aggregation interval: Sampled packets for a time interval are aggregated into a single log entry. This time interval can be 5 seconds (default), 30 seconds, 1 minute, 5 minutes, 10 minutes, or 15 minutes.
  • Sample rate: Before log entries are written to Logging, they can be sampled to reduce their number. By default, the log entry volume is scaled by 0.5 (50%), which means that half of the entries are kept. You can set this from 1.0 (100%, all log entries are kept) to 0.0 (0%, no logs are kept).
  • Metadata annotations: By default, flow log entries are annotated with metadata information, such as the names of the source and destination VMs or the geographic region of external sources and destinations. Metadata annotations can be turned off, or you can specify only certain annotations, to save storage space.
  • Filtering: By default, logs are generated for every flow in the subnet. You can set filters so that only logs that match certain criteria are generated.
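To get a feel for how the aggregation interval and sample rate affect log volume, the following back-of-the-envelope sketch estimates the expected number of entries for one flow as seen by one reporter. It assumes the flow has at least one sampled packet in every aggregation interval, so it is an upper estimate; the function name is hypothetical.

```python
import math


def expected_entries(flow_duration_s, aggregation_interval_s, sample_rate):
    """Rough estimate of log entries written for one continuously active flow."""
    # One candidate entry per aggregation interval that the flow spans.
    intervals = math.ceil(flow_duration_s / aggregation_interval_s)
    # The flow log sample rate scales how many of those entries are kept.
    return intervals * sample_rate


# A 10-minute flow with the default 5-second interval and 0.5 sample rate.
print(expected_entries(600, 5, 0.5))   # 60.0
# The same flow with a 1-minute aggregation interval.
print(expected_entries(600, 60, 0.5))  # 5.0
```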

Metadata annotations

Log records contain base fields and metadata fields. The Record format section lists which fields are type metadata and which are type base. All base fields are always included. You can customize which metadata fields you keep.

  • If you select all metadata, all metadata fields in the VPC Flow Logs record format are included in the flow logs. When new metadata fields are added to the record format, the flow logs automatically include the new fields.

  • If you select no metadata, this omits all metadata fields.

  • If you select custom metadata, you can specify the metadata fields that you want to include by the parent field, such as src_vpc, or by their full names, such as src_vpc.project_id.

    When new metadata fields are added to the record format, the flow logs will not include these fields, unless they are a new field within a parent field that you have specified to include.

    • If you specify custom metadata using parent fields, when new metadata fields are added to the record format within that parent field, the flow logs will automatically include the new fields.

    • If you specify custom metadata using the full name of the field, when new metadata fields are added to the parent field, the flow logs will not include the new fields.

For information on customizing metadata fields, see the gcloud CLI or API instructions for enabling VPC flow logging when you create a subnet.
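The difference between selecting a parent field and selecting a full field name can be illustrated with a short sketch. It mimics the selection rules described above on a plain dictionary; the function and its arguments are hypothetical and are not part of any Google Cloud API.

```python
def select_metadata(record, base_fields, custom_fields):
    """Keep base fields plus the requested metadata fields.

    custom_fields may name a parent field ("src_vpc") or a full
    field name ("src_vpc.project_id").
    """
    # Base fields are always included.
    out = {k: v for k, v in record.items() if k in base_fields}
    for name in custom_fields:
        parent, _, child = name.partition(".")
        if parent not in record or parent in base_fields:
            continue
        if not child:
            # Parent selected: every child present in the record is kept,
            # including children added to the format later.
            out[parent] = dict(record[parent])
        elif child in record[parent]:
            # Full name selected: only this one child is kept.
            out.setdefault(parent, {})[child] = record[parent][child]
    return out
```

With custom_fields set to ["src_vpc", "dest_vpc.project_id"], the result keeps all of src_vpc but only the project_id of dest_vpc.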

GKE annotations

Flows that have an endpoint in a GKE Cluster can be annotated with GKE annotations, which can include details of the Cluster, Pod, and Service of the endpoint.

GKE Service annotations

Traffic sent to a ClusterIP, NodePort, or LoadBalancer can receive Service annotations. If sent to a NodePort or LoadBalancer, the flow receives the Service annotation on both hops of the connection.

Traffic sent directly to a Pod's Service port is annotated with a Service annotation on the destination endpoint.

Traffic sent to a Pod's Service port where the Pod is backing more than one Service on the same Service port is annotated with multiple Services on the destination endpoint. This is limited to two Services. If there are more than that, the endpoint will be annotated with a special MANY_SERVICES marker.

Pod annotations on internet traffic

Traffic between a Pod and the internet does not receive Pod annotations by default. For packets to the internet, the masquerade agent translates the Pod IP address to the node IP address before VPC Flow Logs sees the packet, so VPC Flow Logs doesn't know anything about the Pod and cannot add Pod annotations.

Because of the masquerade, Pod annotations are only visible if the destinations are within either the default non-masquerade destinations or in a custom nonMasqueradeCIDRs list. If you include internet destinations in a custom nonMasqueradeCIDRs list, you need to provide a way for the internal Pod IP addresses to be translated before they are delivered to the internet. For both private and non-private clusters, you can use Cloud NAT. See GKE interaction for more details.

Record format

Log records contain base fields, which are the core fields of every log record, and metadata fields that add additional information. Metadata fields may be omitted to save storage costs.

Some log fields are in a multi-field format, with more than one piece of data in a given field. For example, the connection field is of the IpConnection format, which contains the source and destination IP address and port, plus the protocol, in a single field. These multi-field fields are described below the record format table.

Field Field format Field type (base or optional metadata) Description
connection IpConnection Base 5-Tuple describing this connection
reporter string Base The side which reported the flow. Can be either SRC or DEST.
rtt_msec int64 Base Latency as measured during the time interval, for TCP flows only. The measured latency is the time elapsed between sending a SEQ and receiving a corresponding ACK. The latency result is the sum of the network RTT and any time consumed by the application.
bytes_sent int64 Base Amount of bytes sent from the source to the destination
packets_sent int64 Base Number of packets sent from the source to the destination
start_time string Base Timestamp (RFC 3339 date string format) of the first observed packet during the aggregated time interval
end_time string Base Timestamp (RFC 3339 date string format) of the last observed packet during the aggregated time interval
src_gke_details GkeDetails Metadata GKE metadata for source endpoints. Only available if the endpoint is GKE.
dest_gke_details GkeDetails Metadata GKE metadata for destination endpoints. Only available if the endpoint is GKE.
src_instance InstanceDetails Metadata If the source of the connection was a VM located on the same VPC, this field is populated with VM instance details. In a Shared VPC configuration, project_id corresponds to the project that owns the instance, usually the service project.
dest_instance InstanceDetails Metadata If the destination of the connection was a VM located on the same VPC, this field is populated with VM instance details. In a Shared VPC configuration, project_id corresponds to the project that owns the instance, usually the service project.
src_location GeographicDetails Metadata If the source of the connection was external to the VPC, this field is populated with available location metadata.
dest_location GeographicDetails Metadata If the destination of the connection was external to the VPC, this field is populated with available location metadata.
src_vpc VpcDetails Metadata If the source of the connection was a VM located on the same VPC, this field is populated with VPC network details. In a Shared VPC configuration, project_id corresponds to that of the host project.
dest_vpc VpcDetails Metadata If the destination of the connection was a VM located on the same VPC, this field is populated with VPC network details. In a Shared VPC configuration, project_id corresponds to that of the host project.

IpConnection field format

Field Type Description
protocol int32 The IANA protocol number
src_ip string Source IP address
dest_ip string Destination IP address
src_port int32 Source port
dest_port int32 Destination port
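When working with exported records in code, the IpConnection fields map naturally onto a small typed structure. The sketch below is one possible representation, not an official client type. The protocol numbers are the IANA values for the protocols that VPC Flow Logs samples; defaulting missing ports to 0 (for example, for ICMP) is an assumption of this sketch.

```python
from dataclasses import dataclass

# IANA protocol numbers for the protocols that VPC Flow Logs samples.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP", 47: "GRE", 50: "ESP"}


@dataclass(frozen=True)
class IpConnection:
    protocol: int
    src_ip: str
    dest_ip: str
    src_port: int
    dest_port: int

    @property
    def protocol_name(self) -> str:
        # Fall back to a generic label for unlisted protocol numbers.
        return PROTOCOL_NAMES.get(self.protocol, f"protocol-{self.protocol}")


def parse_connection(raw: dict) -> IpConnection:
    """Build an IpConnection from a dict such as a record's connection field."""
    return IpConnection(
        protocol=int(raw["protocol"]),
        src_ip=raw["src_ip"],
        dest_ip=raw["dest_ip"],
        # Assumption: ports may be absent for portless protocols.
        src_port=int(raw.get("src_port", 0)),
        dest_port=int(raw.get("dest_port", 0)),
    )
```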

GkeDetails field format

Field Type Description
cluster ClusterDetails GKE cluster metadata
pod PodDetails GKE Pod metadata, populated when the source or destination of the traffic is a Pod
service ServiceDetails GKE Service metadata, populated in Service endpoints only. The record contains up to two Services. If there are more than two relevant Services, this field contains a single Service with a special MANY_SERVICES marker.

ClusterDetails field format

Field Type Description
cluster_location string Location of the cluster. This can be a zone or a region, depending on whether the cluster is zonal or regional.
cluster_name string GKE cluster name.

PodDetails field format

Field Type Description
pod_name string Name of the Pod
pod_namespace string Namespace of the Pod

ServiceDetails field format

Field Type Description
service_name string Name of the Service. If there are more than two relevant Services, the field is set to a special MANY_SERVICES marker.
service_namespace string Namespace of the Service

Example:

If there are two services, the Service field looks like this:

service: [
 0: {
  service_name: "my-lb-service"
  service_namespace: "default"
 }
 1: {
  service_name: "my-lb-service2"
  service_namespace: "default"
 }
]

If there are more than two services, the Service field looks like this:

service: [
 0: {
  service_name: "MANY_SERVICES"
 }
]
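The two-Service limit and the MANY_SERVICES marker shown in these examples can be expressed as a short rule. This is a sketch of the documented behavior only; the function name and input shape are hypothetical.

```python
MAX_SERVICES = 2  # The record contains up to two Services.


def service_annotation(services):
    """Build the service field from (service_name, service_namespace) tuples."""
    if len(services) > MAX_SERVICES:
        # More than two relevant Services collapse into a single marker entry.
        return [{"service_name": "MANY_SERVICES"}]
    return [
        {"service_name": name, "service_namespace": namespace}
        for name, namespace in services
    ]
```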

InstanceDetails field format

Field Type Description
project_id string ID of the project containing the VM
region string Region of the VM
vm_name string Instance name of the VM
zone string Zone of the VM

GeographicDetails field format

Field Type Description
asn int32 The autonomous system number (ASN) of the external network to which this endpoint belongs.
city string City for external endpoints
continent string Continent for external endpoints
country string Country for external endpoints, represented as ISO 3166-1 Alpha-3 country codes
region string Region for external endpoints

VpcDetails field format

Field Type Description
project_id string ID of the project containing the VPC
subnetwork_name string Subnetwork on which the VM is operating
vpc_name string VPC on which the VM is operating

Log filtering

When you enable VPC Flow Logs, you can set a filter based on both base and metadata fields that only preserves logs that match the filter. All other logs are discarded before being written to Logging, which saves you money and reduces the time needed to find the information you are looking for.

You can filter on any subset of fields from the Record format.

VPC Flow Logs filtering uses CEL, an embedded expression language for attribute-based logic expressions. Filter expressions for VPC Flow Logs have a limit of 2,048 characters. For more information, see Supported CEL logic operators.

For more information about CEL, see the CEL introduction and the language definition. The generation filter feature supports a limited subset of CEL syntax.

For information about creating a subnet that uses log filtering, see the gcloud CLI or API instructions for Enabling VPC Flow Logs when you create a subnet.

For information about configuring log filtering, see the gcloud CLI or API instructions for Updating VPC Flow Logs parameters.

Example 1: Limit logs collection to a specific VM named my-vm. In this case, only logs where the src_instance field as reported by the source of the traffic is my-vm or the dest_instance field as reported by the destination of the traffic is my-vm are recorded.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr="(src_instance.vm_name == 'my-vm' && reporter=='SRC') || (dest_instance.vm_name == 'my-vm' && reporter=='DEST')"

Example 2: Limit logs collection to packets whose source IP addresses are in the 10.0.0.0/8 subnet.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr="inIpRange(connection.src_ip, '10.0.0.0/8')"

Example 3: Limit logs collection to traffic that is external to a VPC.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr '!(has(src_vpc.vpc_name) && has(dest_vpc.vpc_name))'
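To check what a filter would match before applying it to a subnet, you can mirror its logic against exported records. The following sketch re-expresses Examples 1 and 3 as Python predicates over record dictionaries. It is an approximation for local experimentation, not a CEL evaluator, and the helper names are hypothetical.

```python
def has(record, path):
    """Return True if the dotted field path is present in the record."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def matches_example_1(record, vm_name="my-vm"):
    """Mirror of Example 1: flows of one VM, as reported by that VM."""
    src = record.get("src_instance", {}).get("vm_name")
    dest = record.get("dest_instance", {}).get("vm_name")
    reporter = record.get("reporter")
    return (src == vm_name and reporter == "SRC") or (
        dest == vm_name and reporter == "DEST"
    )


def matches_example_3(record):
    """Mirror of Example 3: at least one endpoint has no VPC annotation."""
    return not (has(record, "src_vpc.vpc_name") and has(record, "dest_vpc.vpc_name"))
```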

Supported CEL logic operators

Expression Supported types Description
true, false Boolean Boolean constants

x == y, x != y Boolean, Int, String Comparison operators. Example: connection.protocol == 6
x && y, x || y Boolean Boolean logic operators. Example: connection.protocol == 6 && src_instance.vm_name == "vm_1"

!x Boolean Negation
1, 2.0, 0, ... Int Constant numeric literals
x + y String String concatenation
"foo", 'foo', ... String Constant string literal
x.lower() String Returns the lowercase value of the string
x.upper() String Returns the uppercase value of the string
x.contains(y) String Returns true if the string contains the specified substring
x.startsWith(y) String Returns true if the string begins with the specified substring
x.endsWith(y) String Returns true if the string ends with the specified substring
inIpRange(X, Y) String Returns true if X is an IP address and Y is an IP range that contains X. Example: inIpRange("1.2.3.1", "1.2.3.0/24")
x.containsFieldValue(y) x: list, y: map(string, string) Returns true if the list contains an object with fields that match the specified key-value pairs. Example: dest_gke_details.service.containsFieldValue({'service_name': 'service1', 'service_namespace': 'namespace1'})
has(x) String Returns true if the field is present.
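
For local testing of IP range conditions, the behavior of inIpRange can be approximated with Python's standard ipaddress module. This is a sketch of equivalent logic under the description in the table above, not the CEL implementation itself; the function name is hypothetical.

```python
import ipaddress


def in_ip_range(ip: str, cidr: str) -> bool:
    """Return True if ip falls inside the CIDR range, similar to inIpRange."""
    # strict=False accepts ranges written with host bits set, such as 10.1.2.3/8.
    network = ipaddress.ip_network(cidr, strict=False)
    return ipaddress.ip_address(ip) in network


print(in_ip_range("1.2.3.1", "1.2.3.0/24"))     # True
print(in_ip_range("192.168.1.2", "10.0.0.0/8"))  # False
```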

Traffic pattern examples

This section demonstrates how VPC Flow Logs works for various use cases.

VM-to-VM flows in the same VPC network

VM flows within a VPC network.

For VM-to-VM flows in the same VPC network, flow logs are reported from both requesting and responding VMs, as long as both VMs are in subnets that have VPC Flow Logs enabled. In this example, VM 10.10.0.2 sends a request with 1,224 bytes to VM 10.50.0.2, which is also in a subnet that has logging enabled. In turn, 10.50.0.2 responds to the request with a reply containing 5,342 bytes. Both the request and reply are recorded from both the requesting and responding VMs.

As reported by requesting VM (10.10.0.2)

request/reply connection.src_ip connection.dest_ip bytes_sent Annotations
request 10.10.0.2 10.50.0.2 1,224 src_instance.*, dest_instance.*, src_vpc.*, dest_vpc.*
reply 10.50.0.2 10.10.0.2 5,342 src_instance.*, dest_instance.*, src_vpc.*, dest_vpc.*

As reported by responding VM (10.50.0.2)

request/reply connection.src_ip connection.dest_ip bytes_sent Annotations
request 10.10.0.2 10.50.0.2 1,224 src_instance.*, dest_instance.*, src_vpc.*, dest_vpc.*
reply 10.50.0.2 10.10.0.2 5,342 src_instance.*, dest_instance.*, src_vpc.*, dest_vpc.*
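
Because both VMs report the same request and reply, summing bytes_sent over all four records counts each direction twice. The sketch below shows one way to avoid that when aggregating: keep only records from a single reporter side. The reporter values are inferred from the reporter field definition (the side that reported the flow), and the helper name is hypothetical.

```python
records = [
    # As reported by the requesting VM (10.10.0.2)
    {"reporter": "SRC",
     "connection": {"src_ip": "10.10.0.2", "dest_ip": "10.50.0.2"},
     "bytes_sent": 1224},
    {"reporter": "DEST",
     "connection": {"src_ip": "10.50.0.2", "dest_ip": "10.10.0.2"},
     "bytes_sent": 5342},
    # As reported by the responding VM (10.50.0.2)
    {"reporter": "DEST",
     "connection": {"src_ip": "10.10.0.2", "dest_ip": "10.50.0.2"},
     "bytes_sent": 1224},
    {"reporter": "SRC",
     "connection": {"src_ip": "10.50.0.2", "dest_ip": "10.10.0.2"},
     "bytes_sent": 5342},
]


def total_bytes(records, reporter=None):
    """Sum bytes_sent, optionally restricted to one reporter side."""
    return sum(
        r["bytes_sent"]
        for r in records
        if reporter is None or r["reporter"] == reporter
    )


naive_total = total_bytes(records)                         # 13132, double counted
deduplicated_total = total_bytes(records, reporter="SRC")  # 6566
```

Restricting to SRC works here because both subnets have logging enabled; for flows where only one endpoint reports, a different deduplication key would be needed.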

VM-to-external-IP-address flows

VM-to-external-IP-address flows.

For flows that traverse the internet between a VM that's in a VPC network and an endpoint with an external IP address, flow logs are reported from the VM that's in the VPC network only:

  • For egress flows, the logs are reported from the VPC network VM that is the source of the traffic.
  • For ingress flows, the logs are reported from the VPC network VM that is the destination of the traffic.

In this example, VM 10.10.0.2 exchanges packets over the internet with an endpoint that has the external IP address 203.0.113.5. The outbound traffic of 1,224 bytes sent from 10.10.0.2 to 203.0.113.5 is reported from the source VM, 10.10.0.2. The inbound traffic of 5,342 bytes sent from 203.0.113.5 to 10.10.0.2 is reported from the destination of the traffic, VM 10.10.0.2.

request/reply connection.src_ip connection.dest_ip bytes_sent Annotations
request 10.10.0.2 203.0.113.5 1,224 src_instance.*, src_vpc.*, dest_location.*
reply 203.0.113.5 10.10.0.2 5,342 dest_instance.*, dest_vpc.*, src_location.*

VM-to-on-premises flows

VM-to-on-premises flows.

For flows between a VM that's in a VPC network and an on-premises endpoint with an internal IP address, flow logs are reported from the VM that's in the VPC network only:

  • For egress flows, the logs are reported from the VPC network VM that is the source of the traffic.
  • For ingress flows, the logs are reported from the VPC network VM that is the destination of the traffic.

In this example, VM 10.10.0.2 and on-premises endpoint 10.30.0.2 are connected through a VPN gateway or Cloud Interconnect. The outbound traffic of 1,224 bytes sent from 10.10.0.2 to 10.30.0.2 is reported from the source VM, 10.10.0.2. The inbound traffic of 5,342 bytes sent from 10.30.0.2 to 10.10.0.2 is reported from the destination of the traffic, VM 10.10.0.2.

request/reply connection.src_ip connection.dest_ip bytes_sent Annotations
request 10.10.0.2 10.30.0.2 1,224 src_instance.*, src_vpc.*
reply 10.30.0.2 10.10.0.2 5,342 dest_instance.*, dest_vpc.*

VM-to-VM flows for Shared VPC

Shared VPC flows.

For VM-to-VM flows for Shared VPC, you can enable VPC Flow Logs for the subnet in the host project. For example, subnet 10.10.0.0/20 belongs to a Shared VPC network defined in a host project. You can see flow logs from VMs belonging to this subnet, including ones created by service projects. In this example, the service projects are called "webserver", "recommendation", "database".

For VM-to-VM flows, if both VMs are in the same project, or in the case of a shared network, the same host project, annotations for project ID and the like are provided for the other endpoint in the connection. If the other VM is in a different project, then annotations for the other VM are not provided.

The following table shows a flow as reported by either 10.10.0.10 or 10.10.0.20.

  • src_vpc.project_id and dest_vpc.project_id are for the host project because the VPC subnet belongs to the host project.
  • src_instance.project_id and dest_instance.project_id are for the service projects because the instances belong to the service projects.

connection.src_ip src_instance.project_id src_vpc.project_id connection.dest_ip dest_instance.project_id dest_vpc.project_id
10.10.0.10 webserver host_project 10.10.0.20 recommendation host_project

Service projects do not own the Shared VPC network and do not have access to the flow logs of the Shared VPC network.

VM-to-VM flows for VPC Network Peering

VPC Network Peering flows.

Unless both VMs are in the same Google Cloud project, VM-to-VM flows for peered VPC networks are reported in the same way as for external endpoints—project and other annotation information for the other VM are not provided. If both VMs are in the same project, even if in different networks, then project and other annotation information is provided for the other VM as well.

In this example, the subnets of VM 10.10.0.2 in project analytics-prod and VM 10.50.0.2 in project webserver-test are connected through VPC Network Peering. If VPC Flow Logs is enabled in project analytics-prod, the traffic (1,224 bytes) sent from 10.10.0.2 to 10.50.0.2 is reported from VM 10.10.0.2, which is the source of the flow. The traffic (5,342 bytes) sent from 10.50.0.2 to 10.10.0.2 is also reported from VM 10.10.0.2, which is the destination of the flow.

In this example, VPC Flow Logs is not turned on in project webserver-test, so no logs are recorded by VM 10.50.0.2.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
SRC 10.10.0.2 10.50.0.2 1,224 src_instance.*, src_vpc.*
DEST 10.50.0.2 10.10.0.2 5,342 dest_instance.*, dest_vpc.*

VM-to-VM flows for internal passthrough Network Load Balancers

Internal passthrough Network Load Balancer flows.

When you add a VM to the backend service for an internal passthrough Network Load Balancer, the Linux or Windows Guest Environment adds the IP address of the load balancer to the local routing table of the VM. This allows the VM to accept request packets with destinations set to the IP address of the load balancer. When the VM replies, it sends its response directly; however, the source IP address for the response packets is set to the IP address of the load balancer, not the VM being load balanced.

VM-to-VM flows sent through an internal passthrough Network Load Balancer are reported from both source and destination. For an example HTTP request / response pair, the following table explains the fields of the flow log entries observed. For the purpose of this illustration, consider the following network configuration:

  • Browser instance at 192.168.1.2
  • Internal passthrough Network Load Balancer at 10.240.0.200
  • Webserver instance at 10.240.0.3

In the following table, a hyphen (-) marks a field that the reporter cannot populate.

Traffic direction reporter connection.src_ip connection.dest_ip src_instance dest_instance
Request SRC 192.168.1.2 10.240.0.200 Browser instance -
Request DEST 192.168.1.2 10.240.0.200 Browser instance Webserver instance
Response SRC 10.240.0.3 192.168.1.2 Webserver instance Browser instance
Response DEST 10.240.0.200 192.168.1.2 - Browser instance

The requesting VM does not know which VM will respond to the request. In addition, because the responding VM sends its response with the internal load balancer IP address as the source address, the requesting VM does not know which VM has responded. For these reasons, the requesting VM cannot add dest_instance information to its report, only src_instance information. Because the responding VM does know the IP address of the other VM, it can supply both src_instance and dest_instance information.

Pod to ClusterIP flow

Pod to cluster IP flow.

In this example, traffic is sent from client Pod (10.4.0.2) to cluster-service (10.0.32.2:80). The destination is resolved to the selected server Pod IP address (10.4.0.3) on the target port (8080).

On node edges, the flow is sampled twice with the translated IP address and port. For both sampling points, we will identify that the destination Pod is backing service cluster-service on port 8080, and annotate the record with the Service details as well as the Pod details. In case the traffic is routed to a Pod on the same node, the traffic doesn't leave the node and is not sampled at all.

In this example, the following records are found.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
SRC 10.4.0.2 10.4.0.3 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, src_gke_details.pod.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*
DEST 10.4.0.2 10.4.0.3 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, src_gke_details.pod.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*

GKE external LoadBalancer flows

External load balancer flows.

Traffic from an external IP address to a GKE Service (35.35.35.35) is first delivered to a node in the cluster (10.0.12.2 in this example), which then routes it onward. By default, external passthrough Network Load Balancers distribute traffic across all nodes in the cluster, even those not running a relevant Pod. Traffic might take extra hops to get to the relevant Pod. For more information, see Networking outside the cluster.

The traffic is then routed from the node (10.0.12.2) to the selected server Pod (10.4.0.2). Both hops are logged because all node edges are sampled. The second hop is logged by both nodes' sampling points. If the traffic is routed to a Pod on the same node (10.4.0.3 in this example), the second hop is not logged because it doesn't leave the node. For the first hop, we identify the Service based on the load balancer IP and Service port (80). For the second hop, we identify that the destination Pod is backing the Service on the target port (8080).

In this example, the following records are found.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
DEST 203.0.113.1 35.35.35.35 1,224 src_location.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.service.*
SRC 10.0.12.2 10.4.0.2 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*
DEST 10.0.12.2 10.4.0.2 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*

GKE Ingress flows

Ingress flows.

A connection from a public IP address to an Ingress destination is terminated at the Load Balancer Service. The connection is mapped to a NodePort Service according to the URL. To serve the request, the load balancer (130.211.0.1) connects to one of the cluster nodes (10.0.12.2) for routing using the Service's NodePort. By default, when creating an Ingress, the GKE Ingress controller configures an HTTP(S) load balancer that distributes traffic across all nodes in the cluster, even those not running a relevant Pod. Traffic might take extra hops to get to the relevant Pod. For more information, see Networking outside the cluster. The traffic is then routed from the node (10.0.12.2) to the selected server Pod (10.4.0.2).

Both hops are logged because all node edges are sampled. For the first hop, we identify the Service based on the Service's NodePort (60000). For the second hop, we identify that the destination Pod is backing the Service on the target port (8080). The second hop is logged by both nodes' sampling points. However, in a case where the traffic is routed to a Pod on the same node (10.4.0.3), the second hop is not logged because the traffic didn't leave the node.

In this example, the following records are found.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
DEST 130.211.0.1 10.0.12.2 1,224 dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.service.*
SRC 10.0.12.2 10.4.0.2 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*
DEST 10.0.12.2 10.4.0.2 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*

GKE Ingress flows using container-native load balancing

Ingress flows using container-native load balancing.

Requests from a public IP address to an Ingress that is using container-native load balancing are terminated at the load balancer. In this type of Ingress, Pods are core objects for load balancing. A request is then sent from the load balancer (130.211.0.1) directly to a selected Pod (10.4.0.2). We identify that the destination Pod is backing the Service on the target port (8080).

In this example, the following record is found.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
DEST 130.211.0.1 10.4.0.2 1,224 dest_instance.*, dest_vpc.*, dest_gke_details.cluster.*, dest_gke_details.pod.*, dest_gke_details.service.*

Pod to external flows

Pod to external flow.

Traffic from a Pod (10.4.0.3) to an external IP (203.0.113.1) is modified by IP masquerading so that the packets are sent from the node IP (10.0.12.2) instead of the Pod IP. By default, the GKE cluster is configured to masquerade traffic to external destinations. For more information, see IP masquerade agent.

In order to view Pod annotations for this traffic, you can configure the masquerade agent not to masquerade pod IPs. In such a case, to allow traffic to the internet, you can configure Cloud NAT, which processes the Pod IP addresses. For more information about Cloud NAT with GKE, review GKE interaction.

In this example, the following record is found.

reporter connection.src_ip connection.dest_ip bytes_sent Annotations
SRC 10.0.12.2 203.0.113.1 1,224 src_instance.*, src_vpc.*, src_gke_details.cluster.*, dest_location.*

Pricing

Standard pricing for Logging, BigQuery, or Pub/Sub applies. VPC Flow Logs pricing is described in Network Telemetry pricing.

FAQ

  • Does VPC Flow Logs include both allowed and denied traffic based on firewall rules?

    • VPC Flow Logs covers traffic from the perspective of a VM. All egress (outgoing) traffic from a VM is logged, even if it is blocked by an egress deny firewall rule. Ingress (incoming) traffic is logged if it is permitted by an ingress allow firewall rule. Ingress traffic blocked by an ingress deny firewall rule is not logged.
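  • Does VPC Flow Logs work with VM instances with multiple interfaces?

    • Yes. You need to enable VPC Flow Logs for each subnet, in each VPC, that contains a network interface.
  • Does VPC Flow Logs work with legacy networks?

    • No. VPC Flow Logs works with VPC networks, not legacy networks.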

What's next