VPC Flow Logs

VPC Flow Logs records a sample of network flows sent from and received by VM instances, including instances used as Google Kubernetes Engine nodes. These logs can be used for network monitoring, forensics, real-time security analysis, and expense optimization.

You can view flow logs in Cloud Logging, and you can export logs to any destination that Cloud Logging export supports.

Flow logs are aggregated by connection from Compute Engine VMs and exported in real time. By subscribing to Pub/Sub, you can analyze flow logs using real-time streaming APIs.

Key properties

VPC Flow Logs is part of Andromeda, the software that powers VPC networks. VPC Flow Logs introduces no delay or performance penalty when enabled.
VPC Flow Logs works with VPC networks, not legacy networks. You enable or disable VPC Flow Logs per subnet. If enabled for a subnet, VPC Flow Logs collects data from all VM instances in that subnet.
VPC Flow Logs samples each VM's TCP, UDP, ICMP, ESP, and GRE flows. Both inbound and outbound flows are sampled. These flows can be between the VM and another VM, a host in your on-premises data center, a Google service, or a host on the internet. If a flow is captured by sampling, VPC Flow Logs generates a log for the flow. Each flow record includes the information described in the Record format section.
VPC Flow Logs interacts with firewall rules in the following ways:
- Egress packets are sampled before egress firewall rules. Even if an egress firewall rule denies outbound packets, those packets can be sampled by VPC Flow Logs.
- Ingress packets are sampled after ingress firewall rules. If an ingress firewall rule denies inbound packets, those packets are not sampled by VPC Flow Logs.
You can use filters in VPC Flow Logs to generate only certain logs.
VPC Flow Logs supports VMs that have multiple network interfaces. You need to enable VPC Flow Logs for each subnet, in each VPC, that contains a network interface.
To log flows between Pods on the same Google Kubernetes Engine (GKE) node, you must enable Intranode visibility for the cluster.

Use cases

Network monitoring

VPC Flow Logs provides you with real-time visibility into network throughput and performance. You can:

Monitor the VPC network
Perform network diagnosis
Filter the flow logs by VMs and by applications to understand traffic changes
Understand traffic growth for capacity forecasting

Understanding network usage and optimizing network traffic expenses

You can analyze network usage with VPC Flow Logs. You can analyze the network flows for the following:

Traffic between regions and zones
Traffic to specific countries on the internet
Top talkers

Based on the analysis, you can optimize network traffic expenses.

Network forensics

You can utilize VPC Flow Logs for network forensics. For example, if an incident occurs, you can examine the following:

Which IPs talked with whom and when
Any compromised IPs by analyzing all the incoming and outgoing network flows

Real-time security analysis

You can leverage the real-time streaming APIs (through Pub/Sub), and integrate with SIEM (Security Information and Event Management) systems. This can provide real-time monitoring, correlation of events, analysis, and security alerts.

Logs collection

Flow logs are collected for each VM connection at specific intervals. All packets collected for a given interval for a given connection are aggregated for a period of time (aggregation interval) into a single flow log entry. This data is then sent to Logging.

Logs are stored in Logging for 30 days by default. If you wish to keep logs longer than that, you can either set a custom retention period or export them to a supported destination.

Log sampling and processing

Google Cloud samples packets that leave and enter a VM to generate flow logs. Not every packet is captured into its own log record. About 1 out of every 30 packets is captured, but this sampling rate might be lower depending on the VM's load. You cannot adjust this rate.

After the flow logs are generated, Google Cloud processes them according to the following procedure:

Filtering: You can specify that only logs that match specified criteria are generated. For example, you can filter so that only logs for a particular VM or only logs with a particular metadata value are generated and the rest are discarded. For more information, see Log filtering.
Aggregation: Information for sampled packets is aggregated over a configurable aggregation interval to produce a flow log entry.
Flow log sampling: This is a second sampling process. Flow log entries are further sampled according to a configurable sample rate parameter.
Metadata: If disabled, all Metadata annotations are discarded. If you want to keep metadata, you can specify that all fields or a specified set of fields are retained. For more information, see Metadata annotations.
Write to Logging: The final log entries are written to Cloud Logging.

Because VPC Flow Logs does not capture every packet, it compensates for missed packets by interpolating from the captured packets. This happens for packets missed because of initial and user-configurable sampling settings.

Even though Google Cloud doesn't capture every packet, log record captures can be quite large. You can balance your traffic visibility and storage cost needs by adjusting the following aspects of logs collection:

Aggregation interval: Sampled packets for a time interval are aggregated into a single log entry. This time interval can be 5 seconds (default), 30 seconds, 1 minute, 5 minutes, 10 minutes, or 15 minutes.
Sample rate: Before being written to Logging, the number of logs can be sampled to reduce their number. By default, the log entry volume is scaled by 0.5 (50%), which means that half of entries are kept. You can set this from 1.0 (100%, all log entries are kept) to 0.0 (0%, no logs are kept).
Metadata annotations: By default, flow log entries are annotated with metadata information, such as the names of the source and destination VMs or the geographic region of external sources and destinations. Metadata annotations can be turned off, or you can specify only certain annotations, to save storage space.
Filtering: By default, logs are generated for every flow in the subnet. You can set filters so that only logs that match certain criteria are generated.

Metadata annotations

Log records contain base fields and metadata fields. The Record format section lists which fields are type metadata and which are type base. All base fields are always included. You can customize which metadata fields you keep.

If you select all metadata, all metadata fields in the VPC Flow Logs record format are included in the flow logs. When new metadata fields are added to the record format, the flow logs automatically include the new fields.
If you select no metadata, this omits all metadata fields.
If you select custom metadata, you can specify the metadata fields that you want to include by the parent field, such as src_vpc, or by their full names, such as src_vpc.project_id

When new metadata fields are added to the record format, the flow logs will not include these fields, unless they are a new field within a parent field that you have specified to include.
- If you specify custom metadata using parent fields, when new metadata fields are added to the record format within that parent field, the flow logs will automatically include the new fields.
- If you specify custom metadata using the full name of the field, when new metadata fields are added to the parent field, the flow logs will not include the new fields.

For information on customizing metadata fields, see the gcloud CLI or API instructions for enabling VPC flow logging when you create a subnet.

GKE annotations

Flows that have an endpoint in a GKE Cluster can be annotated with GKE annotations, which can include details of the Cluster, Pod, and Service of the endpoint.

GKE Service annotations

Traffic sent to a ClusterIP, NodePort, or LoadBalancer can receive Service annotations. If sent to a NodePort or LoadBalancer, the flow receives the Service annotation on both hops of the connection.

Traffic sent directly to a Pod's Service port is annotated with a Service annotation on the destination endpoint.

Traffic sent to a Pod's Service port where the Pod is backing more than one Service on the same Service port is annotated with multiple Services on the destination endpoint. This is limited to two Services. If there are more than that, the endpoint will be annotated with a special MANY_SERVICES marker.

Pod annotations on internet traffic

Traffic between a Pod and the internet does not receive Pod annotations by default. For packets to the internet, the masquerade agent translates the Pod IP address to the node IP address before VPC Flow Logs sees the packet,so VPC Flow Logs doesn't know anything about the Pod and cannot add Pod annotations.

Because of the masquerade, Pod annotations are only visible if the destinations are within either the default non-masquerade destinations or in a custom nonMasqueradeCIDRs list. If you include internet destinations in a custom nonMasqueradeCIDRs list, you need to provide a way for the internal Pod IP addresses to be translated before they are delivered to the internet. For both private and non-private clusters, you can use Cloud NAT. See GKE interaction for more details.

Record format

Log records contain base fields, which are the core fields of every log record, and metadata fields that add additional information. Metadata fields may be omitted to save storage costs.

Some log fields are in a multi-field format, with more than one piece of data in a given field. For example, the connection field is of the IpConnection format, which contains the source and destination IP address and port, plus the protocol, in a single field. These multi-field fields are described below the record format table.

Field	Field format	Field type: Base or optional metadata
connection	IpConnection 5-Tuple describing this connection	Base
reporter	string The side which reported the flow. Can be either `SRC` or `DEST`.	Base
rtt_msec	int64 Latency as measured during the time interval, for TCP flows only. The measured latency is the time elapsed between sending a SEQ and receiving a corresponding ACK. The latency result is the sum of the network RTT and any time consumed by the application.	Base
bytes_sent	int64 Amount of bytes sent from the source to the destination	Base
packets_sent	int64 Number of packets sent from the source to the destination	Base
start_time	string Timestamp (RFC 3339 date string format) of the first observed packet during the aggregated time interval.	Base
end_time	string Timestamp (RFC 3339 date string format) of the last observed packet during the aggregated time interval	Base
src_gke_details	GkeDetails GKE metadata for source endpoints. Only available if the endpoint is GKE.	Metadata
dest_gke_details	GkeDetails GKE metadata for destination endpoints. Only available if the endpoint is GKE.	Metadata
src_instance	InstanceDetails If the source of the connection was a VM located on the same VPC, this field is populated with VM instance details. In a Shared VPC configuration, `project_id` corresponds to the project that owns the instance, usually the service project.	Metadata
dest_instance	InstanceDetails If the destination of the connection was a VM located on the same VPC, this field is populated with VM instance details. In a Shared VPC configuration, `project_id` corresponds to the project that owns the instance, usually the service project.	Metadata
src_location	GeographicDetails If the source of the connection was external to the VPC, this field is populated with available location metadata.	Metadata
dest_location	GeographicDetails If the destination of the connection was external to the VPC, this field is populated with available location metadata.	Metadata
src_vpc	VpcDetails If the source of the connection was a VM located on the same VPC, this field is populated with VPC network details. In a Shared VPC configuration, `project_id` corresponds to that of the host project.	Metadata
dest_vpc	VpcDetails If the destination of the connection was a VM located on the same VPC, this field is populated with VPC network details. In a Shared VPC configuration, `project_id` corresponds to that of the host project.	Metadata

IpConnection field format

Field	Type	Description
protocol	int32	The IANA protocol number
src_ip	string	Source IP address
dest_ip	string	Destination IP address
src_port	int32	Source port
dest_port	int32	Destination port

GkeDetails field format

Field	Type	Description
cluster	ClusterDetails	GKE cluster metadata
pod	PodDetails	GKE Pod metadata, populated when the source or destination of the traffic is a Pod
service	ServiceDetails	GKE Service metadata, populated in Service endpoints only. The record contains up to two Services. If there are more than two relevant Services, this field contains a single Service with a special `MANY_SERVICES` marker.

ClusterDetails field format

Field	Type	Description
cluster_location	string	Location of the cluster. This can be a zone or a region depending if the cluster is zonal or regional.
cluster_name	string	GKE cluster name.

PodDetails field format

Field	Type	Description
pod_name	string	Name of the Pod
pod_namespace	string	Namespace of the Pod

ServiceDetails field format

Field	Type	Description
service_name	string	Name of the Service. If there are more than two relevant Services, the field is set to a special `MANY_SERVICES` marker.
service_namespace	string	Namespace of the Service

Example:

If there are two services, the Service field looks like this:

service: [
 0: {
  service_name: "my-lb-service"
  service_namespace: "default"
 }
 1: {
  service_name: "my-lb-service2"
  service_namespace: "default"
 }
]

If there are more than two services, the Service field looks like this:

service: [
 0: {
  service_name: "MANY_SERVICES"
 }
]

InstanceDetails field format

Field	Type	Description
project_id	string	ID of the project containing the VM
region	string	Region of the VM
vm_name	string	Instance name of the VM
zone	string	Zone of the VM

GeographicDetails field format

Field	Type	Description
asn	int32	The autonomous system number (ASN) of the external network to which this endpoint belongs.
city	string	City for external endpoints
continent	string	Continent for external endpoints
country	string	Country for external endpoints, represented as ISO 3166-1 Alpha-3 country codes
region	string	Region for external endpoints

VpcDetails field format

Field	Type	Description
project_id	string	ID of the project containing the VPC
subnetwork_name	string	Subnetwork on which the VM is operating
vpc_name	string	VPC on which the VM is operating

Log filtering

When you enable VPC Flow Logs, you can set a filter based on both base and metadata fields that only preserves logs that match the filter. All other logs are discarded before being written to Logging, which saves you money and reduces the time needed to find the information you are looking for.

You can filter on any subset of fields from the Record format.

VPC Flow Logs filtering uses CEL, an embedded expression language for attribute-based logic expressions. Filter expressions for VPC Flow Logs have a limit of 2,048 characters. For more information, see Supported CEL logic operators.

For more information about CEL, see the CEL introduction and the language definition. The generation filter feature supports a limited subset of CEL syntax.

For information about creating a subnet that uses log filtering, see the gcloud CLI or API instructions for Enabling VPC Flow Logs when you create a subnet.

For information about configuring log filtering, see the gcloud CLI or API instructions for Updating VPC Flow Logs parameters.

Example 1: Limit logs collection to a specific VM named my-vm. In this case, only logs where the src_instance field as reported by the source of the traffic is my-vm or the dst_instance field as reported by the destination of the traffic is my-vm are recorded.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr="(src_instance.vm_name == 'my-vm' && reporter=='SRC') || (dest_instance.vm_name == 'my-vm' && reporter=='DEST')"

Example 2: Limit logs collection to packets whose source IP addresses are in the 10.0.0.0/8 subnet.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr="inIpRange(connection.src_ip, '10.0.0.0/8')"

Example 3: Limit logs collection to traffic that is external to a VPC.

gcloud compute networks subnets update my-subnet \
    --logging-filter-expr '!(has(src_vpc.vpc_name) && has(dest_vpc.vpc_name))'

Supported CEL logic operators

Expression	Supported types	Description
true, false	Boolean	Boolean constants
x == y x != y	Boolean, Int, String	Comparison operators Example: connection.protocol == 6
x && y x \|\| y	Boolean	Boolean logic operators Example: connection.protocol == 6 && src_instance.vm_name == "vm_1"
!x	Boolean	Negation
1, 2.0, 0, ...	Int	Constant numeric literals
x + y	String	String concatenation
"foo", 'foo', ...	String	Constant string literal
x.lower()	String	Returns the lowercase value of the string
x.upper()	String	Returns the uppercase value of the string
x.contains(y)	String	Returns true if the string contains the specified substring
x.startsWith(y)	String	Returns true if the string begins with the specified substring
x.endsWith(y)	String	Returns true if the string ends with the specified substring
inIpRange(X, Y)	String	Returns true if X is an IP and Y is an IP range that contains X Example: inIpRange("1.2.3.1", "1.2.3.0/24")
x.containsFieldValue(y)	x: list y: map(string, string)	Returns true if the list contains an object with fields that match the specified key-value pairs Example: dest_gke_details.service.containsFieldValue({'service_name': 'service1', 'service_namespace': 'namespace1'})
has(x)	String	Returns true if the field is present.

Traffic pattern examples

This section demonstrates how VPC Flow Logs works for various use cases.

VM-to-VM flows in the same VPC network

VM flows within a VPC network. (click to enlarge).

For VM-to-VM flows in the same VPC network, flow logs are reported from both requesting and responding VMs, as long as both VMs are in subnets that have VPC Flow Logs enabled. In this example, VM 10.10.0.2 sends a request with 1,224 bytes to VM 10.50.0.2, which is also in a subnet that has logging enabled. In turn, 10.50.0.2 responds to the request with a reply containing 5,342 bytes. Both the request and reply are recorded from both the requesting and responding VMs.

As reported by requesting VM (10.10.0.2)
request/reply	connection.src_ip	connection.dest_ip	bytes_sent	Annotations
request	10.10.0.2	10.50.0.2	1,224	src_instance.* dest_instance.* src_vpc.* dest_vpc.*
reply	10.50.0.2	10.10.0.2	5,342	src_instance.* dest_instance.* src_vpc.* dest_vpc.*

As reported by responding VM (10.50.0.2)
request/reply	connection.src_ip	connection.dest_ip	bytes	Annotations
request	10.10.0.2	10.50.0.2	1,224	src_instance.* dest_instance.* src_vpc.* dest_vpc.*
reply	10.50.0.2	10.10.0.2	5,342	src_instance.* dest_instance.* src_vpc.* dest_vpc.*

VM-to-external-IP-address flows

For flows that traverse the internet between a VM that's in a VPC network and an endpoint with an external IP address, flow logs are reported from the VM that's in the VPC network only:

For egress flows, the logs are reported from the VPC network VM that is the source of the traffic.
For ingress flows, the logs are reported from the VPC network VM that is the destination of the traffic.

In this example, VM 10.10.0.2 exchanges packets over the internet with an endpoint that has the external IP address 203.0.113.5. The outbound traffic of 1,224 bytes sent from 10.10.0.2 to 203.0.113.5 is reported from the source VM, 10.10.0.2. The inbound traffic of 5,342 bytes sent from 203.0.113.5 to 10.10.0.2 is reported from the destination of the traffic, VM 10.10.0.2.

request/reply	connection.src_ip	connection.dest_ip	bytes_sent	Annotations
request	10.10.0.2	203.0.113.5	1,224	src_instance.* src_vpc.* dest_location.*
reply	203.0.113.5	10.10.0.2	5,342	dest_instance.* dest_vpc.* src_location.*

VM-to-on-premises flows

For flows between a VM that's in a VPC network and an on-premises endpoint with an internal IP address, flow logs are reported from the VM that's in the VPC network only:

For egress flows, the logs are reported from the VPC network VM that is the source of the traffic.
For ingress flows, the logs are reported from the VPC network VM that is the destination of the traffic.

In this example, VM 10.10.0.2 and on-premises endpoint 10.30.0.2 are connected through a VPN gateway or Cloud Interconnect. The outbound traffic of 1,224 bytes sent from 10.10.0.2 to 10.30.0.2 is reported from the source VM, 10.10.0.2. The inbound traffic of 5,342 bytes sent from 10.30.0.2 to 10.10.0.2 is reported from the destination of the traffic, VM 10.10.0.2.

request/reply	connection.src_ip	connection.dest_ip	bytes_sent	Annotations
request	10.10.0.2	10.30.0.2	1,224	src_instance.* src_vpc.*
reply	10.30.0.2	10.10.0.2	5,342	dest_instance.* dest_vpc.*

VM-to-VM flows for Shared VPC

Shared VPC flows. — Shared VPC flows (click to enlarge).

For VM-to-VM flows for Shared VPC, you can enable VPC Flow Logs for the subnet in the host project. For example, subnet 10.10.0.0/20 belongs to a Shared VPC network defined in a host project. You can see flow logs from VMs belonging to this subnet, including ones created by service projects. In this example, the service projects are called "webserver", "recommendation", "database".

For VM-to-VM flows, if both VMs are in the same project, or in the case of a shared network, the same host project, annotations for project ID and the like are provided for the other endpoint in the connection. If the other VM is in a different project, then annotations for the other VM are not provided.

The following table shows a flow as reported by either 10.10.0.10 or 10.10.0.20.

src_vpc.project_id and dest_vpc.project_id are for the host project because the VPC subnet belongs to the host project.
src_instance.project_id and dest_instance.project_id are for the service projects because the instances belong to the service projects.

connection .src_ip	src_instance .project_id	src_vpc .project_id	connection .dest_ip	dest_instance .project_id	dest_vpc .project_id
10.10.0.10	webserver	host_project	10.10.0.20	recommendation	host_project

Service projects do not own the Shared VPC network and do not have access to the flow logs of the Shared VPC network.

VM-to-VM flows for VPC Network Peering

VPC Network Peering flows (click to enlarge).

Unless both VMs are in the same Google Cloud project, VM-to-VM flows for peered VPC networks are reported in the same way as for external endpoints—project and other annotation information for the other VM are not provided. If both VMs are in the same project, even if in different networks, then project and other annotation information is provided for the other VM as well.

In this example, the subnets of VM 10.10.0.2 in project analytics-prod and VM 10.50.0.2 in project webserver-test are connected through VPC Network Peering. If VPC Flow Logs is enabled in project analytics-prod, the traffic (1224 bytes) sent from 10.10.0.2 to 10.50.0.2 is reported from VM 10.10.0.2, which is the source of the flow. The traffic (5342 bytes) sent from 10.50.0.2 to 10.10.0.2 is also reported from VM 10.10.0.2, which is the destination of the flow.

In this example, VPC Flow Logs is not turned on in project webserver-test, so no logs are recorded by VM 10.50.0.2.

reporter	connection.src_ip	connection.dest_ip	bytes_sent	Annotations
source	10.10.0.2	10.50.0.2	1,224	src_instance.* src_vpc.*
destination	10.50.0.2	10.10.0.2	5,342	dest_instance.* dest_vpc.*

VM-to-VM flows for internal passthrough Network Load Balancers

Internal passthrough Network Load Balancer flows (click to enlarge).

When you add a VM to the backend service for an internal passthrough Network Load Balancer, the Linux or Windows Guest Environment adds the IP address of the load balancer to the local routing table of the VM. This allows the VM to accept request packets with destinations set to the IP address of the load balancer. When the VM replies, it sends its response directly; however, the source IP address for the response packets is set to the IP address of the load balancer, not the VM being load balanced.

VM-to-VM flows sent through an internal passthrough Network Load Balancer are reported from both source and destination. For an example HTTP request / response pair, the following table explains the fields of the flow log entries observed. For the purpose of this illustration, consider the following network configuration:

Browser instance at 192.168.1.2
Internal passthrough Network Load Balancer at 10.240.0.200
Webserver instance at 10.240.0.3

Traffic Direction	reporter	connection.src_ip	connection.dest_ip	connection.src_instance	connection.dest_instance
Request	SRC	192.168.1.2	10.240.0.200	Browser instance
Request	DEST	192.168.1.2	10.240.0.200	Browser instance	Webserver instance
Response	SRC	10.240.0.3	192.168.1.2	Webserver instance	Browser instance
Response	DEST	10.240.0.200	192.168.1.2		Browser instance

The requesting VM does not know which VM will respond to the request. In addition, because the other VM sends a response with the internal load balancer IP as the source address, it does not know which VM has responded. For these reasons, the requesting VM cannot add dest_instance information to its report, only src_instance information. Because the responding VM does know the IP address of the other VM, it can supply both src_instance and dest_instance information.

Pod to ClusterIP flow

In this example, traffic is sent from client Pod (10.4.0.2) to cluster-service (10.0.32.2:80). The destination is resolved to the selected server Pod IP address (10.4.0.3) on the target port (8080).

On node edges, the flow is sampled twice with the translated IP address and port. For both sampling points, we will identify that the destination Pod is backing service cluster-service on port 8080, and annotate the record with the Service details as well as the Pod details. In case the traffic is routed to a Pod on the same node, the traffic doesn't leave the node and is not sampled at all.

In this example, the following records are found.

reporter	connection.src_ip	connection.dst_ip	bytes_sent	Annotations
SRC	10.4.0.2	10.4.0.3	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* src_gke_details.pod.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*
DEST	10.4.0.2	10.4.0.3	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* src_gke_details.pod.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*

GKE external LoadBalancer flows

External load balancer flows (click to enlarge).

Traffic from an external IP address to a GKE service (35.35.35.35) is routed to a node in the cluster, 10.0.12.2 in this example, for routing. By default, external passthrough Network Load Balancers distribute traffic across all nodes in the cluster, even those not running a relevant Pod. Traffic might take extra hops to get to the relevant Pod. For more information, see Networking outside the cluster.

The traffic is then routed from the node (10.0.12.2) to the selected server Pod (10.4.0.2). Both hops are logged because all node edges are sampled. In case the traffic is routed to a Pod on the same node, 10.4.0.3 in this example, the second hop would not be logged, as it doesn't leave the node. The second hop is logged by both nodes' sampling points. The second hop is logged by both nodes' sampling points. For the first hop, we identify the Service based on the load balancer IP and Service port (80). For the second hop, we identify that the destination Pod is backing the Service on the target port (8080).

In this example, the following records are found.

reporter	connection.src_ip	connection.dst_ip	bytes_sent	Annotations
DEST	203.0.113.1	35.35.35.35	1,224	src_location.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.service.*
SRC	10.0.12.2	10.4.0.2	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*
DEST	10.0.12.2	10.4.0.2	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*

GKE Ingress flows

A connection from a public IP address to an Ingress destination is terminated at the Load Balancer Service. The connection is mapped to a NodePort Service according to the URL. To serve the request, the load balancer (130.211.0.1) connects to one of the cluster nodes (10.0.12.2) for routing using the Service's NodePort. By default, when creating an Ingress, the GKE Ingress controller configures an HTTP(S) load balancer that distributes traffic across all nodes in the cluster, even those not running a relevant Pod. Traffic might take extra hops to get to the relevant Pod. For more information, see Networking outside the cluster. The traffic is then routed from the node (10.0.12.2) to the selected server Pod (10.4.0.2).

Both hops are logged because all node edges are sampled. For the first hop, we identify the Service based on the Service's NodePort (60000). For the second hop, we identify that the destination Pod is backing the Service on the target port (8080). The second hop is logged by both nodes' sampling points. However, in a case where the traffic is routed to a Pod on the same node (10.4.0.3), the second hop is not logged because the traffic didn't leave the node.

In this example, the following records are found.

reporter	connection.src_ip	connection.dst_ip	bytes_sent	Annotations
DEST	130.211.0.1	10.0.12.2	1,224	dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.service.*
SRC	10.0.12.2	10.4.0.2	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*
DEST	10.0.12.2	10.4.0.2	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*

GKE Ingress flows using container-native load balancing

Ingress flows using container-native load balancing (click to enlarge).

Requests from a public IP address to an Ingress that is using container-native load balancing are terminated at the load balancer. In this type of Ingress, Pods are core objects for load balancing. A request is then sent from the load balancer (130.211.0.1) directly to a selected Pod (10.4.0.2). We identify that the destination Pod is backing the Service on the target port (8080).

In this example, the following record is found.

reporter	connection.src_ip	connection.dst_ip	bytes_sent	Annotations
DEST	130.211.0.1	10.4.0.2	1,224	dest_instance.* dest_vpc.* dest_gke_details.cluster.* dest_gke_details.pod.* dest_gke_details.service.*

Pod to external flows

Pod to external flow (click to enlarge).

Traffic from a Pod (10.4.0.3) to an external IP (203.0.113.1) is modified by IP masquerading so that the packets are sent from the node IP (10.0.12.2) instead of the Pod IP. By default, the GKE cluster is configured to masquerade traffic to external destinations. For more information, see IP masquerade agent.

In order to view Pod annotations for this traffic, you can configure the masquerade agent not to masquerade pod IPs. In such a case, to allow traffic to the internet, you can configure Cloud NAT, which processes the Pod IP addresses. For more information about Cloud NAT with GKE, review GKE interaction.

In this example, the following record is found.

reporter	connection.src_ip	connection.dst_ip	bytes_sent	Annotations
SRC	10.0.12.2	203.0.113.1	1,224	src_instance.* src_vpc.* src_gke_details.cluster.* dest_location.*

Pricing

Standard pricing for Logging, BigQuery, or Pub/Sub apply. VPC Flow Logs pricing is described in Network Telemetry pricing.

FAQ

Does VPC Flow Logs include both allowed and denied traffic based on firewall rules?
- VPC Flow Logs covers traffic from the perspective of a VM. All egress (outgoing) traffic from a VM is logged, even if it is blocked by an egress deny firewall rule. Ingress (incoming) traffic is logged if it is permitted by an ingress allow firewall rule. Ingress traffic blocked by an ingress deny firewall rule is not logged.
Does VPC Flow Logs work with VM instances with multiple interfaces?
- Yes, you can enable VPC Flow Logs for all interfaces on a multiple interface VM.
Does VPC Flow Logs work with legacy networks?
- No, VPC Flow Logs are not supported on legacy networks.

What's next

Use VPC Flow Logs