Control Pod egress traffic using FQDN network policies


This page explains how to control egress communication between Pods and resources outside of the Google Kubernetes Engine (GKE) cluster using fully qualified domain names (FQDN). The custom resource that you use to configure FQDNs is the FQDNNetworkPolicy resource.

Pricing

FQDN network policy is a paid feature. However, you are not charged for it until FQDN network policy becomes generally available.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.

Requirements and limitations

FQDNNetworkPolicy resources have the following requirements and limitations:

  • You must have a GKE cluster running one of the following versions:
    • 1.26.4-gke.500 or later
    • 1.27.1-gke.400 or later
  • Your cluster must use GKE Dataplane V2.
  • Your cluster must use one of the supported DNS providers, kube-dns or Cloud DNS. Custom kube-dns or CoreDNS deployments are not supported.
  • You must use Google Cloud CLI version 434.0.0 or later.
  • Windows node pools are not supported.
  • Anthos Service Mesh is not supported.
  • Network Policy Logging for FQDNNetworkPolicy resources is not supported.
  • If you have hard-coded IP addresses in your application, use the IPBlock field of Kubernetes Network Policy instead of a FQDNNetworkPolicy.
  • Results returned by non-cluster DNS nameservers, such as alternate nameservers in resolv.conf, are not considered valid and are not programmed into the allowlist in the GKE data plane.
  • The maximum number of IPv4 and IPv6 IP addresses that a FQDNNetworkPolicy can resolve to is 50.
  • You cannot allow traffic to a ClusterIP or Headless Service as an egress destination in a FQDNNetworkPolicy because GKE translates the Service virtual IP address (VIP) to backend Pod IP addresses before evaluating Network Policy rules. Instead, use a Kubernetes label-based Network Policy.
  • There is a maximum quota of 100 IP addresses per hostname.

Enable FQDN Network Policy

You can enable FQDN Network Policy on a new or an existing cluster.

Enable FQDN Network Policy in a new cluster

Create your cluster using the --enable-fqdn-network-policy flag:

gcloud beta container clusters create CLUSTER_NAME \
    --enable-fqdn-network-policy

Replace CLUSTER_NAME with the name of your cluster.

Enable FQDN Network Policy in an existing cluster

  1. For both Autopilot and Standard clusters, update the cluster using the --enable-fqdn-network-policy flag:

    gcloud beta container clusters update CLUSTER_NAME \
        --enable-fqdn-network-policy
    

    Replace CLUSTER_NAME with the name of your cluster.

  2. For Standard clusters only, restart the GKE Dataplane V2 anetd DaemonSet:

    kubectl rollout restart ds -n kube-system anetd
    

Create a FQDNNetworkPolicy

  1. Save the following manifest as fqdn-network-policy.yaml:

    apiVersion: networking.gke.io/v1alpha1
    kind: FQDNNetworkPolicy
    metadata:
      name: allow-out-fqdnnp
    spec:
      podSelector:
        matchLabels:
          app: curl-client
      egress:
      - matches:
        - pattern: "*.yourdomain.com"
        - name: "www.google.com"
        ports:
        - protocol: "TCP"
          port: 443
    

    This manifest has the following properties:

    • name: www.google.com: the fully qualified domain name. IP addresses provided by the nameserver associated with www.google.com are allowed. You must specify either name or pattern, or both.
    • pattern: "*.yourdomain.com": IP addresses provided by nameservers for domains matching this pattern are allowed. The value of the pattern key must match the following regular expression: ^([a-zA-Z0-9*]([-a-zA-Z0-9_*]*[a-zA-Z0-9*])*\.?)*$. Match criteria are additive, and you can use multiple pattern fields. You must specify either name or pattern, or both.
    • protocol: "TCP" and port: 443: specifies a protocol and port. If a Pod tries to establish a connection to an allowed IP address using a different protocol and port combination, name resolution still works, but the data plane blocks the outbound connection. This field is optional.
  2. Verify that the network policy is selecting your workloads:

    kubectl describe fqdnnp
    

    The output is similar to the following:

    Name:         allow-out-fqdnnp
    Labels:       <none>
    Annotations:  <none>
    API Version:  networking.gke.io/v1alpha1
    Kind:         FQDNNetworkPolicy
    Metadata:
    ...
    Spec:
      Egress:
        Matches:
          Pattern:  *.yourdomain.com
          Name:     www.google.com
        Ports:
          Port:      443
          Protocol:  TCP
      Pod Selector:
        Match Labels:
          App: curl-client
    Events:     <none>
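
The regular expression given for the pattern key can be checked locally before you apply a manifest. The following is a minimal sketch in Python, not part of GKE: the helper name is_valid_pattern is hypothetical, and the check only tells you whether a string has the documented shape, not how GKE interprets the wildcard.

```python
import re

# Regular expression quoted for the pattern key in the manifest properties.
PATTERN_RE = re.compile(r"^([a-zA-Z0-9*]([-a-zA-Z0-9_*]*[a-zA-Z0-9*])*\.?)*$")


def is_valid_pattern(value: str) -> bool:
    """Return True if value is accepted by the documented pattern regex.

    Hypothetical helper for illustration; fullmatch is used so that a
    trailing newline is not silently accepted.
    """
    return PATTERN_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["*.yourdomain.com", "www.google.com",
                      "-bad.example.com", "exa mple.com"]:
        print(candidate, is_valid_pattern(candidate))
```

Because the expression nests repetition, it is best suited to short, trusted inputs such as the hostnames in a manifest.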
    

Delete a FQDNNetworkPolicy

You can delete a FQDNNetworkPolicy using the kubectl delete fqdnnp command:

kubectl delete fqdnnp FQDN_POLICY_NAME

Replace FQDN_POLICY_NAME with the name of your FQDNNetworkPolicy.

GKE deletes the rules from policy enforcement, but existing connections remain active until they close following the conntrack standard protocol guidelines.

How FQDN network policies work

Subsequent requests

An active FQDNNetworkPolicy that selects workloads does not affect the ability of those workloads to make DNS requests. Commands such as nslookup or dig work for any domain without being affected by the policy. However, subsequent requests to IP addresses backing domains that are not in the allowlist are dropped.

For example, if a FQDNNetworkPolicy allows egress to www.github.com, then DNS requests for all domains are allowed but traffic sent to an IP address backing twitter.com is dropped.
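
The behavior described above can be sketched as a small Python model. This is an illustration only, not how GKE Dataplane V2 is implemented: the class EgressAllowlist and its methods are hypothetical, the IP addresses are documentation placeholders, and treating * as "any run of characters" through fnmatch is an assumption about wildcard semantics.

```python
from fnmatch import fnmatchcase


class EgressAllowlist:
    """Toy model: DNS always answers, egress is limited to learned IPs."""

    def __init__(self, names=(), patterns=()):
        self.names = {n.lower().rstrip(".") for n in names}
        self.patterns = [p.lower().rstrip(".") for p in patterns]
        self.allowed_ips = set()

    def _selected(self, domain):
        d = domain.lower().rstrip(".")
        # Assumption: "*" in a pattern matches any run of characters.
        return d in self.names or any(fnmatchcase(d, p) for p in self.patterns)

    def observe_dns_answer(self, domain, ips):
        """The DNS answer is always returned to the Pod; only answers for
        names in the policy add IP addresses to the allowlist."""
        if self._selected(domain):
            self.allowed_ips.update(ips)
        return list(ips)

    def allows_connection(self, ip):
        """Egress is allowed only to IPs learned from an allowed name."""
        return ip in self.allowed_ips


policy = EgressAllowlist(names=["www.github.com"])
github_ips = policy.observe_dns_answer("www.github.com", ["192.0.2.10"])
twitter_ips = policy.observe_dns_answer("twitter.com", ["198.51.100.7"])
```

In this sketch both lookups return an answer, but only the address learned for www.github.com passes allows_connection.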

TTL expiration

FQDNNetworkPolicy honors the TTL provided by a DNS record. If a Pod attempts to contact an expired IP address after the TTL of the DNS record has elapsed, new connections are rejected. Long-lived connections whose duration exceeds the TTL of the DNS record should not experience traffic disruption as long as conntrack considers the connection still active.
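
To make the distinction between new and established connections concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the GKE data plane: the class FqdnDataPlaneModel is hypothetical, time is passed in explicitly as seconds, and connections are tracked per destination IP address for simplicity.

```python
class FqdnDataPlaneModel:
    """Toy model: TTL-bound allowlist plus a conntrack-like table."""

    def __init__(self):
        self._expires_at = {}      # ip -> time at which the record's TTL elapses
        self._established = set()  # ips with a tracked, still-open connection

    def learn(self, ip, ttl_seconds, now):
        """Record an IP address from a DNS answer for an allowed FQDN."""
        self._expires_at[ip] = now + ttl_seconds

    def connect(self, ip, now):
        """New connections succeed only while the record's TTL is valid."""
        if now < self._expires_at.get(ip, float("-inf")):
            self._established.add(ip)
            return True
        return False

    def send(self, ip):
        """Traffic on an already tracked connection keeps flowing."""
        return ip in self._established

    def close(self, ip):
        """Closing the connection removes it from the tracked set."""
        self._established.discard(ip)


dp = FqdnDataPlaneModel()
dp.learn("192.0.2.10", ttl_seconds=30, now=0)
dp.learn("192.0.2.11", ttl_seconds=30, now=0)
opened_in_time = dp.connect("192.0.2.10", now=10)   # within the TTL
opened_too_late = dp.connect("192.0.2.11", now=100)  # after the TTL
```

In the model, the connection opened at second 10 keeps working at second 100, while the first attempt to reach the second address after the TTL has elapsed is refused.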

FQDNNetworkPolicy and NetworkPolicy

When both a FQDNNetworkPolicy and a NetworkPolicy apply to the same Pod, meaning the Pod's labels match the selectors configured in both policies, egress traffic is allowed as long as it matches one of the policies. There is no hierarchy between egress NetworkPolicies that specify IP addresses or label selectors and FQDNNetworkPolicies.
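
The additive evaluation can be sketched in a few lines of Python. This is an illustration under assumptions, not GKE code: the function egress_allowed is hypothetical, the NetworkPolicy is reduced to a list of CIDR blocks (as with ipBlock), and the FQDNNetworkPolicy is reduced to the set of IP addresses currently resolved for its names.

```python
import ipaddress


def egress_allowed(dest_ip, network_policy_cidrs, fqdn_allowed_ips):
    """Return True if either policy type allows traffic to dest_ip.

    Hypothetical model: neither policy takes precedence, so a match in
    one of them is sufficient.
    """
    ip = ipaddress.ip_address(dest_ip)
    by_network_policy = any(
        ip in ipaddress.ip_network(cidr) for cidr in network_policy_cidrs
    )
    by_fqdn_policy = ip in {ipaddress.ip_address(a) for a in fqdn_allowed_ips}
    return by_network_policy or by_fqdn_policy


cidrs = ["10.0.0.0/8"]        # placeholder ipBlock from a NetworkPolicy
fqdn_ips = {"192.0.2.10"}     # placeholder IP resolved for an allowed FQDN
```

With these placeholder values, 10.0.0.5 is allowed by the NetworkPolicy alone, 192.0.2.10 by the FQDNNetworkPolicy alone, and an address that matches neither is denied.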

Known issues

Specifying protocol: ALL causes policy to be ignored

If you create a FQDNNetworkPolicy that specifies protocol: ALL in the ports section, GKE does not enforce the policy. This behavior is caused by a problem with parsing the policy. Specifying TCP or UDP does not cause this issue.

As a workaround, omit the protocol from the ports entry; a rule without a protocol matches all protocols by default. Removing protocol: ALL avoids the parsing issue, and GKE enforces the FQDNNetworkPolicy.
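
If you generate or lint manifests programmatically, the workaround can be applied before the policy reaches the cluster. The following Python sketch assumes the manifest has already been loaded into a dictionary (for example by a YAML parser); the function strip_protocol_all is hypothetical and is not part of any GKE tooling.

```python
import copy


def strip_protocol_all(policy):
    """Return a copy of the policy with every protocol: ALL entry removed.

    Hypothetical helper: a ports entry without a protocol matches all
    protocols by default, so dropping the key keeps the intent.
    """
    fixed = copy.deepcopy(policy)
    for rule in fixed.get("spec", {}).get("egress", []):
        for port in rule.get("ports", []):
            if str(port.get("protocol", "")).upper() == "ALL":
                del port["protocol"]
    return fixed


manifest = {
    "apiVersion": "networking.gke.io/v1alpha1",
    "kind": "FQDNNetworkPolicy",
    "metadata": {"name": "allow-out-fqdnnp"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "curl-client"}},
        "egress": [{
            "matches": [{"name": "www.google.com"}],
            "ports": [{"protocol": "ALL", "port": 443},
                      {"protocol": "TCP", "port": 80}],
        }],
    },
}
fixed_manifest = strip_protocol_all(manifest)
```

The helper works on a deep copy, so the original dictionary is left unchanged and the two versions can be compared before applying.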

CNAME chasing

If the FQDN object in the FQDN Network Policy includes a domain that has CNAMEs in its DNS record, you must configure your FQDN Network Policy with all domain names that your Pod can query directly, including all potential aliases, to ensure reliable FQDN Network Policy behavior.

If your Pod queries example.com, then example.com is what you should write in the rule. Even if you get back a chain of aliases from your upstream DNS servers (e.g. example.com to example.cdn.com to 1.2.3.4), the FQDN Network Policy will still allow your traffic through.

What's next