Dataplane V2


This page gives an overview of what Dataplane V2 does and how it works.

Introduction

Dataplane V2 is a dataplane for GKE and Anthos clusters that is optimized for Kubernetes networking. Dataplane V2 provides:

  • A consistent user experience for networking in GKE and all Anthos clusters environments. See Availability of Dataplane V2 for information about the environments that support Dataplane V2.
  • Real-time visibility of network activity.
  • Simpler architecture that makes it easier to manage and troubleshoot clusters.

Dataplane V2 is based on eBPF and uses Linux nodes to process network packets in the kernel flexibly and performantly, using Kubernetes-specific metadata.

Advantages of Dataplane V2

Security

Kubernetes network policy enforcement is always on in clusters with Dataplane V2. You don't have to install and manage third-party software add-ons such as Calico to enforce network policy.
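
For example, because enforcement is built in, you can apply a standard Kubernetes NetworkPolicy as soon as the cluster is created, with no add-on to install. The following manifest is a minimal illustrative sketch; the policy name, namespace, and app: web label are hypothetical:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-same-namespace        # hypothetical policy name
      namespace: default
    spec:
      podSelector:
        matchLabels:
          app: web                      # hypothetical label on the Pods to protect
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector: {}           # allow ingress only from Pods in the same namespace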

Scalability

Dataplane V2 is implemented without kube-proxy and does not rely on iptables for service routing. This removes a major bottleneck for scaling Kubernetes services in very large clusters.

Operations

When you create a cluster with Dataplane V2, network policy logging is built in. Configure the logging CRD on your cluster to see when connections are allowed and denied for your Pods.
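
As a sketch of what that configuration looks like, the cluster-wide logging object below enables logging for both allowed and denied connections. It assumes the networking.gke.io/v1alpha1 NetworkLogging schema; verify the exact fields against the network policy logging documentation for your GKE version.

    kind: NetworkLogging
    apiVersion: networking.gke.io/v1alpha1   # assumed API version; may differ by release
    metadata:
      name: default                          # logging is configured through a single cluster-wide object
    spec:
      cluster:
        allow:
          log: true                          # record connections allowed by network policy
          delegate: false
        deny:
          log: true                          # record connections denied by network policy
          delegate: false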

Consistency

Dataplane V2 is available and provides the same features on GKE and on other Anthos clusters environments. See Availability of Dataplane V2 for more details.

How Dataplane V2 works

Dataplane V2 is implemented with eBPF (extended Berkeley Packet Filter). As packets arrive at a GKE node, eBPF programs installed in the kernel decide how to route and process the packets. Unlike packet processing with iptables, eBPF programs can use Kubernetes-specific metadata in the packet. This lets Dataplane V2 performantly process network packets in the kernel and report annotated actions back to user space for logging. The following diagram shows the path of a packet through a node using Dataplane V2:

A packet arriving at a node is processed in-kernel by eBPF. eBPF programs perform policy enforcement, service resolution, and connection tracking. This activity is reported to userspace for logging. The packet payload is then delivered to a Pod.

The Dataplane V2 controller on the node is called anetd. anetd is deployed as a DaemonSet to each node in the cluster and is responsible for interpreting Kubernetes objects and programming the desired network topologies in eBPF. anetd replaces the service routing and network policy enforcement otherwise performed in the kube-system namespace by kube-proxy and Calico, respectively.

Technical specifications

Dataplane V2 supports clusters with the following specifications:

Specification                                  GKE       Anthos clusters on VMware   Anthos on bare metal
Number of nodes per cluster                    500*      500                         250
Number of Pods per cluster                     50,000    15,000                      27,500
Number of LoadBalancer services per cluster    750       500                         1,000

Dataplane V2 maintains a service map to keep track of which Services refer to which Pods as their backends. The number of Pod backends for each Service, summed across all Services, must fit into the service map, which can contain up to 64,000 entries. If this limit is exceeded, your cluster might not work as intended.
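
As an illustrative calculation (not a documented configuration): a cluster running 200 Services with 300 backend Pods each consumes 200 × 300 = 60,000 service map entries and stays under the 64,000-entry limit, while 220 such Services would exceed it.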

* You can request a quota increase of up to 1,000 nodes per cluster with Dataplane V2 on GKE. Please file a support ticket and provide the cluster name.

The number of LoadBalancer services supported in Anthos clusters on VMware depends on the load balancer mode being used. 500 LoadBalancer services are supported on Anthos clusters on VMware when using bundled load balancing mode (Seesaw) and 250 are supported when using integrated load balancing mode with F5. See Scalability for more information.

Limitations

The following limitations apply in GKE, Anthos clusters on VMware, and all other environments:

  • Dataplane V2 can only be enabled when creating a new cluster. Existing clusters cannot be upgraded to use Dataplane V2.
  • Windows nodes are not supported with Dataplane V2.
  • On clusters using Dataplane V2 and NodeLocal DNSCache, Pods running on the host network can experience DNS timeouts if their dnsPolicy is set to ClusterFirstWithHostNet (see the example after this list).
  • If your cluster has Services of type LoadBalancer with externalTrafficPolicy: Cluster (the default), the Services might experience packet loss when nodes start up, shut down, or become unhealthy, because load balancers configured this way always fail health checks. In this situation the load balancer distributes new connections among all backends irrespective of their health.

    If the node is otherwise healthy, it redirects the traffic to a Pod backing the Service. If a node is still starting up or is in the process of shutting down, it can't redirect the traffic, and the packet is dropped.
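
To illustrate the NodeLocal DNSCache limitation above, the following minimal Pod sketch shows the configuration that can trigger DNS timeouts: a host-network Pod whose dnsPolicy is ClusterFirstWithHostNet. The Pod name and image are placeholders.

    apiVersion: v1
    kind: Pod
    metadata:
      name: host-network-example               # hypothetical Pod name
    spec:
      hostNetwork: true                         # Pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet        # with NodeLocal DNSCache and Dataplane V2, this can hit DNS timeouts
      containers:
        - name: app
          image: registry.example/app:latest    # placeholder image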

Network policy enforcement without Dataplane V2

See Using network policy enforcement for instructions to enable network policy enforcement in clusters that don't use Dataplane V2.

What's next