Kubernetes allows you to declaratively define how your applications are deployed, how applications communicate with each other and with the Kubernetes control plane, and how clients can reach your applications. This page also provides information about how GKE configures Google Cloud services, where it is relevant to networking.
When you use Kubernetes to orchestrate your applications, it's important to change how you think about the network design of your applications and their hosts. With Kubernetes, you think about how Pods, Services, and external clients communicate, rather than thinking about how your hosts or VMs are connected.
Kubernetes' advanced software-defined networking (SDN) enables packet routing and forwarding for Pods, Services, and nodes across different zones in the same regional cluster. Kubernetes and Google Cloud also dynamically configure IP filtering rules, routing tables, and firewall rules on each node, depending on the declarative model of your Kubernetes deployments and your cluster configuration on Google Cloud.
You may find this content easier to understand if you have a basic understanding of Linux network management concepts and utilities, such as iptables rules and routing.
Terminology related to IP addresses in Kubernetes
The Kubernetes networking model relies heavily on IP addresses. Services, Pods, containers, and nodes communicate using IP addresses and ports. Kubernetes provides different types of load balancing to direct traffic to the correct Pods. All of these mechanisms are described in more detail later in this topic. Keep the following terms in mind as you read:
- ClusterIP: The IP address assigned to a Service. In other documents, it may be called the "Cluster IP". This address is stable for the lifetime of the Service, as discussed in the Services section in this topic.
- Pod IP: The IP address assigned to a given Pod. This is ephemeral, as discussed in the Pods section in this topic.
- Node IP: The IP address assigned to a given node.
Networking inside the cluster
Kubernetes uses various IP ranges to assign IP addresses to nodes, Pods, and Services.
- Each node has an IP address assigned from the cluster's Virtual Private Cloud (VPC) network. This node IP provides connectivity from control components such as the kubelet to the Kubernetes API server. This IP is the node's connection to the rest of the cluster.
- Each node has a pool of IP addresses that GKE assigns to Pods running on that node (a /24 CIDR block by default). You can optionally specify the range of IPs when you create the cluster. The Flexible Pod CIDR range feature allows you to reduce the size of the range for Pod IPs for nodes in a given node pool.
- Each Pod has a single IP address assigned from the Pod CIDR range of its node. This IP address is shared by all containers running within the Pod, and connects them to other Pods running in the cluster.
- Each Service has an IP address, called the ClusterIP, assigned from the cluster's VPC network. You can optionally customize the VPC network when you create the cluster.
For more information, visit Creating VPC-native clusters using Alias IPs.
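The relationship between the cluster's Pod range and the per-node /24 blocks can be checked with Python's standard ipaddress module. This is a minimal sketch of the arithmetic only, not GKE's allocator: the cluster Pod range 10.4.0.0/14 is a hypothetical value chosen for illustration, and only the /24 per-node default comes from the text above.

```python
import ipaddress

# Hypothetical cluster-wide Pod range (an assumption for this sketch).
cluster_pod_range = ipaddress.ip_network("10.4.0.0/14")

# Default per-node Pod range size described above.
NODE_PREFIX_LENGTH = 24


def node_pod_ranges(cluster_range, prefix_length):
    """Return the per-node blocks that fit inside the cluster Pod range."""
    return list(cluster_range.subnets(new_prefix=prefix_length))


ranges = node_pod_ranges(cluster_pod_range, NODE_PREFIX_LENGTH)
node_capacity = len(ranges)                    # node ranges the cluster range can hold
addresses_per_node = ranges[0].num_addresses   # addresses in one per-node block

print(f"{cluster_pod_range} holds {node_capacity} node ranges "
      f"of {addresses_per_node} addresses each")
print("first node range:", ranges[0])
```

Choosing a smaller per-node block (a longer prefix) increases the number of nodes a given cluster range can serve, at the cost of fewer Pod addresses per node.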
Pods
A Pod is the most basic deployable unit within a Kubernetes cluster. A Pod runs one or more containers. Zero or more Pods run on a node. Each node in the cluster is part of a node pool. In GKE, these nodes are virtual machines, each running as an instance in Compute Engine.
Pods can also attach to external storage volumes and other custom resources. This diagram shows a single node running two Pods, each attached to two volumes.
When Kubernetes schedules a Pod to run on a node, it creates a network namespace for the Pod in the node's Linux kernel. This network namespace connects the node's physical network interface, such as eth0, with the Pod using a virtual network interface, so that packets can flow to and from the Pod. The associated virtual network interface in the node's root network namespace connects to a Linux bridge that allows communication among Pods on the same node. A Pod can also send packets outside of the node using the same virtual interface.
Kubernetes assigns an IP address (the Pod IP) to the virtual network interface in the Pod's network namespace from a range of addresses reserved for Pods on the node. This address range is a subset of the IP address range assigned to the cluster for Pods, which you can configure when you create a cluster.
A container running in a Pod uses the Pod's network namespace. From the container's point of view, the Pod appears to be a physical machine with one network interface. All containers in the Pod see this same network interface. Each container's localhost is connected, through the Pod, to the node's physical network interface, such as eth0.
Note that this connectivity differs drastically depending on whether you use GKE's native Container Network Interface (CNI) or choose to use Calico's implementation by enabling Network policy upon creating the cluster.
If you use GKE's CNI, one end of the Virtual Ethernet Device (veth) pair is attached to the Pod in its namespace, and the other is connected to the Linux bridge device cbr0. In this case, listing the node's ARP table (for example, with arp -n) shows the various Pods' MAC addresses attached to cbr0.
Additionally, invoking the following in the toolbox container shows the root namespace end of each veth pair attached to cbr0:
brctl show cbr0
If Network Policy is enabled, one end of the veth pair is attached to the Pod and the other to eth0. In this case, the node's ARP table shows the various Pods' MAC addresses attached to different veth devices.
Additionally, invoking brctl show in the toolbox container shows that there is no Linux bridge device named cbr0.
The iptables rules that facilitate forwarding within the cluster differ from one scenario to the other. It is important to have this distinction in mind during detailed troubleshooting of connectivity issues.
By default, each Pod has unfiltered access to all the other Pods running on all nodes of the cluster, but you can limit access among Pods. Kubernetes regularly tears down and recreates Pods. This happens when a node pool is upgraded, when you change a Pod's declarative configuration or a container's image, or when a node becomes unavailable. Therefore, a Pod's IP address is an implementation detail, and you should not rely on it. Kubernetes provides stable IP addresses using Services.
Services
In Kubernetes, you can assign arbitrary key-value pairs called labels to any Kubernetes resource. Kubernetes uses labels to group multiple related Pods into a logical unit called a Service. A Service has a stable IP address and ports, and provides load balancing among the set of Pods whose labels match all the labels you define in the label selector when you create the Service.
The following diagram shows two separate Services, each of which is composed of multiple Pods. Each of the Pods in the diagram has the label app=demo, but their other labels differ. Service "frontend" matches all Pods with both app=demo and component=frontend, while Service "users" matches all Pods with app=demo and component=users. The Client Pod does not match either Service selector exactly, so it is not a part of either Service. However, the Client Pod can communicate with either of the Services because it runs in the same cluster.
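The selector rule described here (a Pod belongs to a Service only if it carries every label in the Service's selector) can be sketched in a few lines of Python. This is an illustrative model, not Kubernetes code: the Pod names are hypothetical, and the app and component label values are assumed from the Service manifest shown elsewhere on this page.

```python
def matches(selector, labels):
    """Return True when every key-value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical Pods, modeled as name -> labels.
pods = {
    "frontend-1": {"app": "demo", "component": "frontend"},
    "frontend-2": {"app": "demo", "component": "frontend"},
    "users-1": {"app": "demo", "component": "users"},
    "client": {"app": "demo"},
}

# Service name -> label selector.
services = {
    "frontend": {"app": "demo", "component": "frontend"},
    "users": {"app": "demo", "component": "users"},
}

# For each Service, collect the Pods whose labels satisfy the whole selector.
membership = {
    name: sorted(pod for pod, labels in pods.items() if matches(selector, labels))
    for name, selector in services.items()
}
print(membership)
```

The client Pod carries only one of the two required labels, so it appears in neither membership list, which mirrors the diagram's description.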
Kubernetes assigns a stable, reliable IP address to each newly created Service (the ClusterIP) from the cluster's pool of available Service IP addresses. Kubernetes also assigns a hostname to the ClusterIP, by adding a DNS entry. The ClusterIP and hostname are unique within the cluster and do not change throughout the lifecycle of the Service. Kubernetes only releases the ClusterIP and hostname if the Service is deleted from the cluster's configuration. You can reach a healthy Pod running your application using either the ClusterIP or the hostname of the Service.
At first glance, a Service may seem to be a single point of failure for your applications. However, Kubernetes spreads traffic as evenly as possible across the full set of Pods, running on many nodes, so a cluster can withstand an outage affecting one or more (but not all) nodes.
Kubernetes manages connectivity among Pods and Services using the kube-proxy component. This is deployed as a static Pod on each node by default. Any GKE cluster running 1.16 or later has a kube-proxy DaemonSet. This DaemonSet selects only nodes that are running a GKE version between 1.16.0 and 1.16.8-gke.13. If the cluster does not have any nodes running these versions, the DaemonSet is expected to show 0 Pods.
kube-proxy, which is not an in-line proxy but an egress-based load-balancing controller, watches the Kubernetes API server and continually maps the ClusterIP to healthy Pods by adding and removing destination NAT (DNAT) rules in the node's iptables subsystem. When a container running in a Pod sends traffic to a Service's ClusterIP, the node selects a Pod at random and routes the traffic to that Pod.
When you configure a Service, you can optionally remap its listening port by defining values for port and targetPort.
- port is where clients reach the application.
- targetPort is the port where the application is actually listening for traffic within the Pod.
kube-proxy manages this port remapping by adding and removing iptables rules on the node.
This diagram illustrates the flow of traffic from a client Pod to a server Pod on a different node. The client connects to the Service at its ClusterIP and port. The Kubernetes API server maintains a list of Pods running the application. The kube-proxy process on each node uses this list to create an iptables rule to direct traffic to an appropriate Pod (such as 10.255.255.202:8080). The client Pod does not need to be aware of the topology of the cluster or any details about individual Pods or containers within them.
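The combined effect of the DNAT rules and the port remapping can be pictured as a rewrite of the packet's destination. The following Python sketch is a simplified model under stated assumptions, not how kube-proxy or iptables is implemented: the ClusterIP 10.0.0.10 and all Pod IPs except 10.255.255.202 are hypothetical values.

```python
import random

# Hypothetical Service: the ClusterIP and port that clients connect to,
# the targetPort the application listens on, and the healthy Pod IPs.
service = {
    "cluster_ip": "10.0.0.10",
    "port": 80,
    "target_port": 8080,
    "healthy_pod_ips": ["10.255.255.201", "10.255.255.202", "10.255.255.203"],
}


def dnat(destination, service, rng=random):
    """Rewrite a (ClusterIP, port) destination to a (Pod IP, targetPort) pair.

    Destinations that do not match the Service are returned unchanged.
    """
    if destination != (service["cluster_ip"], service["port"]):
        return destination
    pod_ip = rng.choice(service["healthy_pod_ips"])  # random healthy Pod
    return (pod_ip, service["target_port"])


new_destination = dnat(("10.0.0.10", 80), service)
print(new_destination)
```

The client only ever addresses the ClusterIP and port; which Pod answers, and on which port it listens, stays hidden behind the rewrite.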
Networking outside the cluster
This section explains how traffic from outside the cluster reaches applications running within a Kubernetes cluster. This information is important when designing your cluster's applications and workloads.
You've already read about how Kubernetes uses Services to provide stable IP addresses for applications running within Pods. By default, Pods do not expose an external IP address, because kube-proxy manages all traffic on each node. Pods and their containers can communicate freely, but connections from outside the cluster cannot access the Service. For instance, in the previous illustration, clients outside the cluster cannot access the frontend Service via its ClusterIP.
GKE provides three different types of load balancers to control access and to spread incoming traffic across your cluster as evenly as possible. You can configure one Service to use multiple types of load balancers simultaneously.
- External load balancers manage traffic coming from outside the cluster and outside your Google Cloud Virtual Private Cloud (VPC) network. They use forwarding rules associated with the Google Cloud network to route traffic to a Kubernetes node.
- Internal load balancers manage traffic coming from within the same VPC network. Like external load balancers, they use forwarding rules associated with the Google Cloud network to route traffic to a Kubernetes node.
- HTTP(S) load balancers are specialized external load balancers used for HTTP(S) traffic. They use an Ingress resource rather than a forwarding rule to route traffic to a Kubernetes node.
When traffic reaches a Kubernetes node, it is handled the same way, regardless of the type of load balancer. The load balancer is not aware of which nodes in the cluster are running Pods for its Service. Instead, it balances traffic across all nodes in the cluster, even those not running a relevant Pod. On a regional cluster, the load is spread across all nodes in all zones for the cluster's region. When traffic is routed to a node, the node routes the traffic to a Pod, which may be running on the same node or a different node. The node forwards the traffic to a randomly chosen Pod by using the iptables rules that kube-proxy manages on the node.
In the following diagram, the network load balancer directs traffic to the middle node, and the traffic is redirected to a Pod on the first node.
When a load balancer sends traffic to a node, the traffic might get forwarded to a Pod on a different node. This requires extra network hops. If you want to avoid the extra hops, you can specify that traffic must go to a Pod that is on the same node that initially receives the traffic.
To specify that traffic must go to a Pod on the same node, set externalTrafficPolicy to Local in your Service manifest:
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: demo
    component: users
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
When you set externalTrafficPolicy to Local, the load balancer sends traffic only to nodes that have a healthy Pod that belongs to the Service. The load balancer uses a health check to determine which nodes have the appropriate Pods.
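The difference between the two traffic policies comes down to which nodes the load balancer considers eligible. The sketch below models that choice in Python; the node names and Pod counts are hypothetical, and the function is an illustration rather than the load balancer's actual health-check logic. "Cluster" is the default value of externalTrafficPolicy.

```python
def eligible_nodes(nodes, policy):
    """Return the nodes a load balancer may send traffic to.

    nodes maps a node name to the number of healthy Pods for the Service
    on that node. policy is "Cluster" (the default) or "Local".
    """
    if policy == "Local":
        # Only nodes that run at least one healthy Pod for the Service.
        return sorted(name for name, pods in nodes.items() if pods > 0)
    if policy == "Cluster":
        # Every node, even those with no relevant Pod.
        return sorted(nodes)
    raise ValueError(f"unknown externalTrafficPolicy: {policy}")


# Hypothetical three-node cluster; node-b runs no Pod for the Service.
nodes = {"node-a": 2, "node-b": 0, "node-c": 1}

print(eligible_nodes(nodes, "Cluster"))
print(eligible_nodes(nodes, "Local"))
```

With Local, node-b drops out of rotation, so no request needs a second hop to reach a Pod.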
External load balancer
If your Service needs to be reachable from outside the cluster and outside your VPC network, you can configure your Service as a LoadBalancer by setting the Service's type field to LoadBalancer when defining the Service. GKE then provisions a Network load balancer in front of the Service.
The Network load balancer is aware of all nodes in your cluster and configures
your VPC network's firewall rules to allow connections to the Service from
outside the VPC network, using the Service's external IP address. You can assign
a static external IP address to the Service. Visit
Configuring domain names with static IP addresses
for more information.
When using the external load balancer, arriving traffic is initially routed to
a node using a forwarding rule associated with the Google Cloud network.
After the traffic reaches the node, the node uses its
iptables NAT table to
choose a Pod.
kube-proxy manages the
iptables rules on the node.
Internal load balancer
For traffic that needs to reach your cluster from within the same VPC network, you can configure your Service to provision an Internal load balancer. The Internal load balancer chooses an IP address from your cluster's VPC subnet instead of an external IP address. Applications or services within the VPC network can use this IP address to communicate with Services inside the cluster.
Internal load balancing functionality is provided by Google Cloud. When
the traffic reaches a given node, that node uses its
iptables NAT table to
choose a Pod, even if the Pod is on a different node.
kube-proxy manages the
iptables rules on the node.
For more information about internal load balancers, visit the Internal load balancer documentation.
HTTP(S) load balancer
Many applications, such as RESTful web service APIs, communicate using HTTP(S). You can allow clients external to your VPC network to access this type of application using a Kubernetes Ingress resource. An Ingress resource allows you to map hostnames and URL paths to Services within the cluster. When using an HTTP(S) load balancer, you must configure the Service to use a NodePort, as well as a ClusterIP. When traffic accesses the Service on a node's IP at the NodePort, GKE routes traffic to a healthy Pod for the Service. You can specify a NodePort or allow GKE to assign a random unused port.
When you create the Ingress resource, GKE provisions an HTTP(S) load balancer in the Google Cloud project. The load balancer sends a request to a node's IP address at the NodePort. After the request reaches the node, the node uses its iptables NAT table to choose a Pod. kube-proxy manages the iptables rules on the node.
This Ingress definition routes traffic for
demo.example.com to a Service named
frontend on port 80, and
demo-backend.example.com to a Service named
users on port 8080.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: demo
spec:
  rules:
  - host: demo.example.com
    http:
      paths:
      - backend:
          serviceName: frontend
          servicePort: 80
  - host: demo-backend.example.com
    http:
      paths:
      - backend:
          serviceName: users
          servicePort: 8080
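The host-based routing in this Ingress definition can be pictured as a lookup table from the request's hostname to a Service and port. The Python sketch below is a simplified illustration, assuming exact hostname matches only; it does not model URL paths, and returning None for an unmatched host is a simplification chosen for this example.

```python
# Host rules taken from the Ingress definition: host -> (Service name, port).
rules = {
    "demo.example.com": ("frontend", 80),
    "demo-backend.example.com": ("users", 8080),
}


def route(host, rules):
    """Return the (Service name, port) backend for a request's hostname.

    Hostnames are compared case-insensitively. Returns None when no rule
    matches the host.
    """
    return rules.get(host.lower())


print(route("demo.example.com", rules))
print(route("demo-backend.example.com", rules))
```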
Visit GKE Ingress for HTTP(S) load balancing for more information.
When you create an Ingress object, the GKE Ingress controller configures a Google Cloud HTTP(S) load balancer according to the rules in the Ingress manifest and the associated Service manifests. The client sends a request to the HTTP(S) load balancer. The load balancer is an actual proxy; it chooses a node and forwards the request to that node's IP address and NodePort combination. The node uses its iptables NAT table to choose a Pod. kube-proxy manages the iptables rules on the node.
Limiting connectivity between nodes
Creating ingress or egress firewall rules targeting nodes in your cluster may have adverse effects. For example, applying egress deny rules to nodes in your cluster could break cluster functionality.
Limiting connectivity to Pods and Services
By default, all Pods running within the same cluster can communicate freely. However, you can limit connectivity within a cluster in different ways, depending on your needs.
Limiting access among Pods
You can limit access among Pods using a network policy. Network policy definitions allow you to restrict the ingress and egress of Pods based on an arbitrary combination of labels, IP ranges, and port numbers. By default, there is no network policy, so all traffic among Pods in the cluster is allowed. Once a network policy in a namespace selects a Pod, any traffic to or from that Pod, in the directions the policy covers, is denied unless a policy explicitly allows it. Pods that are not selected by any network policy continue to allow all traffic.
Visit Network policies for more details about how to specify the policy itself.
After creating a network policy, you must explicitly enable network policy enforcement for the cluster. Visit Configuring network policies for applications for more information.
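In the upstream Kubernetes model, a Pod accepts all traffic until at least one network policy selects it; after that, only traffic that some selecting policy allows gets through. The Python sketch below models that rule for ingress and label selectors only. It is a deliberately reduced illustration: the policy dictionaries and their field names (pod_selector, allow_from) are hypothetical, and IP ranges, ports, namespaces, and egress are left out.

```python
def selects(selector, labels):
    """An empty selector matches every Pod; otherwise all pairs must match."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(source_labels, target_labels, policies):
    """Decide whether traffic from the source Pod to the target Pod is allowed.

    Each policy is a dict with a "pod_selector" (the Pods it applies to) and
    "allow_from" (a list of label selectors for permitted source Pods).
    """
    applicable = [p for p in policies if selects(p["pod_selector"], target_labels)]
    if not applicable:
        return True  # no policy selects the target, so it accepts all traffic
    return any(
        selects(source_selector, source_labels)
        for policy in applicable
        for source_selector in policy["allow_from"]
    )


# Hypothetical policy: only frontend Pods may reach users Pods.
policies = [
    {
        "pod_selector": {"component": "users"},
        "allow_from": [{"component": "frontend"}],
    }
]

frontend = {"app": "demo", "component": "frontend"}
users = {"app": "demo", "component": "users"}
client = {"app": "demo"}

print(ingress_allowed(frontend, users, policies))
print(ingress_allowed(client, users, policies))
print(ingress_allowed(client, frontend, policies))
```

Because no policy selects the frontend Pods in this sketch, they remain reachable from any Pod, while the users Pods accept traffic only from frontend Pods.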
Limiting access to an external load balancer
If your Service uses an external load balancer, traffic from any external IP address can access your Service by default. You can restrict which IP address ranges can access endpoints within your cluster by specifying the loadBalancerSourceRanges option when configuring the Service. You can specify multiple ranges, and you can update the configuration of a running Service at any time. The kube-proxy instance running on each node configures that node's iptables rules to deny all traffic that does not match the specified loadBalancerSourceRanges. No VPC firewall rule is created.
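The effect of loadBalancerSourceRanges is a membership test: a client is admitted only if its address falls inside one of the listed CIDR ranges. The Python sketch below shows that check with the standard ipaddress module. The ranges are hypothetical values taken from address blocks reserved for documentation, and the function illustrates the rule rather than the iptables rules that kube-proxy writes.

```python
import ipaddress


def source_allowed(client_ip, source_ranges):
    """Return True when client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in source_ranges)


# Hypothetical allowed ranges (documentation-reserved address blocks).
load_balancer_source_ranges = ["203.0.113.0/24", "198.51.100.0/25"]

print(source_allowed("203.0.113.7", load_balancer_source_ranges))
print(source_allowed("198.51.100.200", load_balancer_source_ranges))
```

The second address lies in the upper half of 198.51.100.0/24, outside the /25 that was listed, so it is rejected.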
Limiting access to an HTTP(S) load balancer
If your Service uses the HTTP(S) load balancer, you can use a Google Cloud Armor security policy to limit which external IP addresses can access your Service and which responses to return when access is denied because of the security policy. You can configure Cloud Logging to log information about these interactions.
If a Google Cloud Armor security policy is not fine-grained enough, you can enable the Identity-Aware Proxy on your endpoints to implement user-based authentication and authorization for your application. Visit the detailed tutorial for configuring IAP for more information.
- Learn about Services.
- Learn about Pods.
- Set up a cluster with shared VPC.
- Watch an in-depth presentation about The ins and outs of networking in Google Container Engine and Kubernetes (Google Cloud Next '17).