Configuring multi-cluster Services


This page shows you how to enable and use multi-cluster Services (MCS). To learn more about how MCS works and its benefits, see Multi-cluster Services.

The Google Kubernetes Engine (GKE) MCS feature extends the reach of the Kubernetes Service beyond the cluster boundary and lets you discover and invoke Services across multiple GKE clusters. You can export a subset of existing Services or new Services.

When you export a Service with MCS, that Service is then available across all of the clusters in your fleet.

Google Cloud resources managed by MCS

MCS manages the following components of Google Cloud:

  • Cloud DNS: MCS configures Cloud DNS zones and records for each exported Service in your fleet clusters. This lets you connect to Services that are running in other clusters. These zones and records are created, read, updated, and deleted based on the Services that you choose to export across clusters.

  • Firewall rules: MCS configures firewall rules that let Pods communicate with each other across clusters within your fleet. Firewall rules are created, read, updated, and deleted based on the clusters that you add to your fleet. These rules are similar to the rules that GKE creates to enable communication between Pods within a GKE cluster.

  • Traffic Director: MCS uses Traffic Director as a control plane to keep track of endpoints and their health across clusters.

Requirements

MCS has the following requirements:

  • MCS only supports exporting services from VPC-native GKE clusters on Google Cloud. For more information, see Creating a VPC-native cluster. You cannot use non VPC-native GKE Standard clusters.

  • Clusters can connect to each other only if they run within the same VPC network, or in peered or Shared VPC networks. Otherwise, calls to Services in other clusters cannot cross the network boundary.

  • While a fleet might span multiple Google Cloud projects and VPC networks, a single multi-cluster service must be exported from a single project and a single VPC network.

  • MCS is not supported with network policies.

  • Clusters must have the HttpLoadBalancing add-on enabled. The add-on is enabled by default and shouldn't be disabled.

Pricing

Multi-cluster Services is included as part of the GKE cluster management fee and has no extra cost for usage. You must enable the Traffic Director API, but MCS does not incur any Traffic Director endpoint charges. GKE Enterprise licensing is not required to use MCS.

Before you begin

Before you start, make sure you have performed the following tasks:

  1. Install the Google Cloud SDK.

  2. Enable the Google Kubernetes Engine API.

  3. Enable the MCS, fleet (hub), Resource Manager, Traffic Director, and Cloud DNS APIs:

    gcloud services enable \
        multiclusterservicediscovery.googleapis.com \
        gkehub.googleapis.com \
        cloudresourcemanager.googleapis.com \
        trafficdirector.googleapis.com \
        dns.googleapis.com \
        --project=PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you plan to register your clusters to a fleet.
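Before you continue, it can help to confirm that all of the required APIs are enabled. The following is a minimal sketch, not part of the official procedure: missing_apis is a hypothetical helper that reads a list of enabled API names on standard input and prints each required API that is absent from it.

```shell
#!/usr/bin/env bash
# Required APIs, copied from the enable command above.
REQUIRED_APIS="multiclusterservicediscovery.googleapis.com
gkehub.googleapis.com
cloudresourcemanager.googleapis.com
trafficdirector.googleapis.com
dns.googleapis.com"

# missing_apis (hypothetical helper): reads enabled API names on stdin,
# one per line, and prints each required API that is not in that list.
missing_apis() {
  local enabled api
  enabled="$(cat)"
  for api in $REQUIRED_APIS; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$enabled" | grep -qxF "$api" || printf '%s\n' "$api"
  done
}
```

One way to feed the helper is to pipe in the output of gcloud services list --enabled --project=PROJECT_ID --format="value(config.name)". Empty output means that nothing is missing.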

Enabling MCS in your project

MCS requires that participating GKE clusters be registered to the same fleet. After the MCS feature is enabled for a fleet, any cluster in that fleet can export Services to the other clusters in the fleet.

While MCS does require registration to a fleet, it does not require you to enable the GKE Enterprise platform.

GKE Enterprise

If the GKE Enterprise API is enabled in your fleet host project as a prerequisite for using other GKE Enterprise components, then any clusters registered to the project's fleet are charged according to GKE Enterprise pricing. This pricing model lets you use all GKE Enterprise features on registered clusters for a single per-vCPU charge. You can confirm if the GKE Enterprise API is enabled using the following command:

gcloud services list --project=PROJECT_ID | grep anthos.googleapis.com

If the output is similar to the following, the full GKE Enterprise platform is enabled and any clusters registered to the fleet will incur GKE Enterprise charges:

anthos.googleapis.com                        Anthos API

If this is not expected, contact your project administrator.

An empty output indicates that GKE Enterprise is not enabled.

Enabling MCS on your GKE cluster

  1. Enable the MCS feature for your project's fleet:

    gcloud container fleet multi-cluster-services enable \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you plan to register your clusters to a fleet. This is your fleet host project.

  2. Register your GKE clusters to the fleet. We strongly recommend that you register your cluster with workload identity federation for GKE enabled. If you do not enable workload identity federation for GKE, you need to register the cluster with a Google Cloud service account for authentication, and complete the additional steps in Authenticating service accounts.

    To register your cluster with workload identity federation for GKE, run the following command:

    gcloud container fleet memberships register MEMBERSHIP_NAME \
       --gke-cluster CLUSTER_LOCATION/CLUSTER_NAME \
       --enable-workload-identity \
       --project PROJECT_ID
    

    Replace the following:

    • MEMBERSHIP_NAME: the membership name that you choose to uniquely represent the cluster in the fleet. Typically, a cluster's fleet membership name is the cluster's name, but you might need to specify a new name if another cluster with the original name already exists in the fleet.
    • CLUSTER_LOCATION: the zone or region where the cluster is located.
    • CLUSTER_NAME: the name of the cluster.
  3. Grant the required Identity and Access Management (IAM) permissions for MCS Importer:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[gke-mcs/gke-mcs-importer]" \
        --role "roles/compute.networkViewer"
    

    Replace PROJECT_ID with the project ID from the fleet host project.

  4. Ensure that each cluster in the fleet has a namespace to share Services in. If needed, create a namespace by using the following command:

    kubectl create ns NAMESPACE
    

    Replace NAMESPACE with a name for the namespace.

  5. To verify that MCS is enabled, run the following command:

    gcloud container fleet multi-cluster-services describe \
        --project PROJECT_ID
    

    The output is similar to the following:

    createTime: '2021-08-10T13:05:23.512044937Z'
    membershipStates:
      projects/PROJECT_ID/locations/global/memberships/MCS_NAME:
        state:
          code: OK
          description: Firewall successfully updated
          updateTime: '2021-08-10T13:14:45.173444870Z'
    name: projects/PROJECT_NAME/locations/global/features/multiclusterservicediscovery
    resourceState:
      state: ACTIVE
    spec: {}
    

    If the value of state is not ACTIVE, see the troubleshooting section.
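If you want to check the feature state from a script, the verification step above could be wrapped as follows. This is a sketch under assumptions: mcs_feature_state and mcs_is_active are hypothetical helper names, and the field path resourceState.state is taken from the sample output shown above.

```shell
#!/usr/bin/env bash
# mcs_feature_state (hypothetical helper): prints the fleet-level feature
# state, for example ACTIVE, by projecting one field of the describe output.
mcs_feature_state() {
  gcloud container fleet multi-cluster-services describe \
      --project "$1" \
      --format="value(resourceState.state)"
}

# mcs_is_active (hypothetical helper): succeeds only when the state is ACTIVE.
mcs_is_active() {
  [ "$(mcs_feature_state "$1")" = "ACTIVE" ]
}
```

For example, mcs_is_active PROJECT_ID || echo "MCS is not active yet" prints a message while the feature is still being enabled.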

Authenticating service accounts

If you registered your GKE clusters to a fleet using a service account, you need to take additional steps to authenticate the service account. MCS deploys a component called gke-mcs-importer. This component receives endpoint updates from Traffic Director, so as part of enabling MCS you need to grant your service account permission to read information from Traffic Director.

When you use a service account, you can use the Compute Engine default service account or your own node service account.

Using MCS

The following sections show you how to use MCS. MCS uses the Kubernetes multi-cluster services API.

Registering a Service for export

To register a Service for export to other clusters within your fleet, complete the following steps:

  1. Create a file named export.yaml that defines a ServiceExport object:

    # export.yaml
    kind: ServiceExport
    apiVersion: net.gke.io/v1
    metadata:
     namespace: NAMESPACE
     name: SERVICE_EXPORT_NAME
    

    Replace the following:

    • NAMESPACE: the namespace of the ServiceExport object. This namespace must match the namespace of the Service that you are exporting.
    • SERVICE_EXPORT_NAME: the name of a Service in your cluster that you want to export to other clusters within your fleet.
  2. Create the ServiceExport resource by running the following command:

    kubectl apply -f export.yaml
    

The initial export of your Service takes approximately five minutes to sync to clusters registered in your fleet. After a Service is exported, subsequent endpoint syncs happen immediately.
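If you export several Services, you might prefer to generate the manifest instead of editing export.yaml by hand. The following sketch assumes a hypothetical helper named render_service_export that prints the same manifest as the steps above for a given namespace and Service name.

```shell
#!/usr/bin/env bash
# render_service_export (hypothetical helper): prints a ServiceExport manifest
# for the given namespace ($1) and Service name ($2), matching export.yaml.
render_service_export() {
  local namespace="$1" name="$2"
  cat <<EOF
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
 namespace: ${namespace}
 name: ${name}
EOF
}
```

The output can be applied directly, for example with render_service_export NAMESPACE SERVICE_EXPORT_NAME | kubectl apply -f -, which is equivalent to writing the file and running kubectl apply -f export.yaml.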

You can export the same Service from multiple clusters to create a single highly available multi-cluster service endpoint with traffic distribution across clusters. Before you export Services that have the same name and namespace, ensure that you want them to be grouped in this manner. We recommend against exporting services in the default and kube-system namespaces because of the high probability of unintended name conflicts and the resulting unintended grouping. If you are exporting more than five services with the same name and namespace, traffic distribution on imported services might be limited to five exported services.

Consuming cross-cluster Services

MCS only supports ClusterSetIP and headless Services. Only DNS "A" records are available.

After you create a ServiceExport object, the following domain name resolves to your exported Service from any Pod in any fleet cluster:

 SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local

The domain name includes the following values:

  • SERVICE_EXPORT_NAME and NAMESPACE: the values you define in your ServiceExport object.

For ClusterSetIP Services, the domain resolves to the ClusterSetIP. You can find this value by locating the ServiceImport object in a cluster in the namespace that the ServiceExport object was created in. The ServiceImport object is automatically created.

For example:

kind: ServiceImport
apiVersion: net.gke.io/v1
metadata:
 namespace: EXPORTED-SERVICE-NAMESPACE
 name: external-svc-SERVICE-EXPORT-TARGET
status:
 ports:
 - name: https
   port: 443
   protocol: TCP
   targetPort: 443
 ips: CLUSTER_SET_IP

MCS creates an Endpoints object as part of importing a Service into a cluster. By investigating this object you can monitor the progress of a Service import. To find the name of the Endpoints object, look up the value of the annotation net.gke.io/derived-service on a ServiceImport object corresponding to your imported Service. For example:

kind: ServiceImport
apiVersion: net.gke.io/v1
metadata:
 namespace: EXPORTED-SERVICE-NAMESPACE
 name: external-svc-SERVICE-EXPORT-TARGET
 annotations:
  net.gke.io/derived-service: DERIVED_SERVICE_NAME

Next, look up the Endpoints object to check if MCS has already propagated the endpoints to the importing cluster. The Endpoints object is created in the same namespace as the ServiceImport object, under the name stored in the net.gke.io/derived-service annotation. For example:

kubectl get endpoints DERIVED_SERVICE_NAME -n NAMESPACE

Replace the following:

  • DERIVED_SERVICE_NAME: the value of the annotation net.gke.io/derived-service on the ServiceImport object.
  • NAMESPACE: the namespace of the ServiceExport object.
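The annotation lookup and the Endpoints lookup described above can be chained. The following is a sketch rather than an official tool: derived_service_name and imported_endpoints are hypothetical helper names, and the sketch assumes that kubectl resolves the serviceimport resource name to the ServiceImport objects that MCS creates in your cluster.

```shell
#!/usr/bin/env bash
# derived_service_name (hypothetical helper): reads the
# net.gke.io/derived-service annotation from a ServiceImport. Dots inside the
# annotation key are escaped so that JSONPath treats the key as one field.
derived_service_name() {
  local import_name="$1" namespace="$2"
  kubectl get serviceimport "$import_name" -n "$namespace" \
      -o jsonpath='{.metadata.annotations.net\.gke\.io/derived-service}'
}

# imported_endpoints (hypothetical helper): shows the Endpoints object named
# by the annotation, or fails if the annotation is not set yet.
imported_endpoints() {
  local derived
  derived="$(derived_service_name "$1" "$2")"
  if [ -z "$derived" ]; then
    echo "no derived Service recorded on ServiceImport $1 yet" >&2
    return 1
  fi
  kubectl get endpoints "$derived" -n "$2"
}
```

Running imported_endpoints SERVICE_IMPORT_NAME NAMESPACE in an importing cluster then shows whether endpoints have been propagated.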

You can find out more about the health status of the endpoints by using the Traffic Director dashboard in the Google Cloud console.

For headless Services, the domain resolves to the list of IP addresses of the endpoints in the exporting clusters. Each backend Pod with a hostname is also independently addressable with a domain name of the following form:

 HOSTNAME.MEMBERSHIP_NAME.LOCATION.SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local

The domain name includes the following values:

  • SERVICE_EXPORT_NAME and NAMESPACE: the values you define in your ServiceExport object.
  • MEMBERSHIP_NAME: the unique identifier in the fleet for the cluster that the Pod is in.
  • LOCATION: the location of the membership. Memberships are either global, or their location is one of the regions or zones that the Pod is in, such as us-central1.
  • HOSTNAME: the hostname of the Pod.

You can also address a backend Pod with a hostname exported from a cluster registered with a global Membership, using a domain name of the following format:

HOSTNAME.MEMBERSHIP_NAME.SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local
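
The domain name forms in this section can be assembled mechanically. The following sketch uses two hypothetical helpers, mcs_service_fqdn and mcs_pod_fqdn, which only build strings and do not query DNS. Passing an empty LOCATION produces the shorter form that is documented for global memberships.

```shell
#!/usr/bin/env bash
# mcs_service_fqdn SERVICE_EXPORT_NAME NAMESPACE
# Prints the domain name of an exported Service.
mcs_service_fqdn() {
  printf '%s.%s.svc.clusterset.local\n' "$1" "$2"
}

# mcs_pod_fqdn HOSTNAME MEMBERSHIP_NAME LOCATION SERVICE_EXPORT_NAME NAMESPACE
# Prints the per-Pod domain name for a headless Service. An empty LOCATION
# omits the location label (the form for global memberships).
mcs_pod_fqdn() {
  local host="$1" membership="$2" location="$3" service="$4" namespace="$5"
  if [ -n "$location" ]; then
    printf '%s.%s.%s.%s\n' "$host" "$membership" "$location" \
        "$(mcs_service_fqdn "$service" "$namespace")"
  else
    printf '%s.%s.%s\n' "$host" "$membership" \
        "$(mcs_service_fqdn "$service" "$namespace")"
  fi
}
```

For example, mcs_pod_fqdn db-0 cluster-a us-central1 store prod prints db-0.cluster-a.us-central1.store.prod.svc.clusterset.local. The names db-0, cluster-a, store, and prod are placeholders.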

Disabling MCS

To disable MCS, complete the following steps:

  1. For each cluster in your fleet, delete each ServiceExport object that you created:

    kubectl delete serviceexport SERVICE_EXPORT_NAME \
        -n NAMESPACE
    

    Replace the following:

    • SERVICE_EXPORT_NAME: the name of your ServiceExport object.
    • NAMESPACE: the namespace of the ServiceExport object.
  2. Unregister your clusters from the fleet if they don't need to be registered for another purpose.

  3. Disable the multiclusterservicediscovery feature:

    gcloud container fleet multi-cluster-services disable \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you registered clusters.

  4. Disable the API for MCS:

    gcloud services disable multiclusterservicediscovery.googleapis.com \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you registered clusters.
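
Step 1 above has to be repeated for every cluster. The following sketch automates it under stated assumptions: each cluster is reachable through a kubectl context whose name you pass as an argument, and you want to delete every ServiceExport in those clusters, not only a subset. The helper names list_service_exports and delete_service_exports are hypothetical.

```shell
#!/usr/bin/env bash
# list_service_exports CONTEXT (hypothetical helper): prints one
# "NAMESPACE NAME" line for each ServiceExport in the cluster.
list_service_exports() {
  kubectl --context "$1" get serviceexport --all-namespaces \
      -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}'
}

# delete_service_exports CONTEXT... (hypothetical helper): deletes every
# ServiceExport found in each of the given kubectl contexts.
delete_service_exports() {
  local ctx ns name
  for ctx in "$@"; do
    list_service_exports "$ctx" | while read -r ns name; do
      # Skip blank lines, for example when a cluster has no ServiceExports.
      [ -n "$name" ] || continue
      # Redirect stdin so the delete command cannot consume the loop input.
      kubectl --context "$ctx" delete serviceexport "$name" -n "$ns" </dev/null
    done
  done
}
```

Review the output of list_service_exports for each context before you run delete_service_exports, because the deletion stops those Services from being exported.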

Limitations

The following limits are not enforced, and in some cases you can exceed these limits depending on the load in your clusters or project and the rate of endpoint churn. However, you might experience performance issues when these limitations are exceeded.

  • Exporting clusters: A single Service, identified by a namespaced name, can be safely exported from up to 5 clusters simultaneously. Beyond that limit, it's possible that only a subset of endpoints can be imported to consuming clusters. You can export different Services from different subsets of clusters.

  • The number of Pods behind a single Service: We recommend keeping fewer than 250 Pods behind a single Service. This is the same limitation that single-cluster Services have. With relatively static workloads and a small number of multi-cluster Services, it might be possible to significantly exceed this number and reach thousands of endpoints per Service. As with single-cluster Services, all endpoints are watched by kube-proxy on every node. When going beyond this limit, especially when exporting from multiple clusters simultaneously, larger nodes might be required.

  • The number of multi-cluster Services simultaneously exported: We recommend that you simultaneously export no more than 50 unique Service ports, identified by a Service's namespaced name and declared ports. For example, exporting a Service that exposes ports 80 and 443 would count against 2 of the 50 unique Service port limit. Services with the same namespaced name exported from multiple clusters count as a single unique Service. The previously mentioned 2 port service would still only count against 2 ports if it were exported from 5 clusters simultaneously. Each multi-cluster Service counts toward your Backend Services quota, and each exporting cluster or zone creates a network endpoint group (NEG).

  • Service types: MCS only supports ClusterSetIP and headless Services. NodePort and LoadBalancer Services are not supported and might lead to unexpected behavior.

  • Using IPmasq Agent with MCS: MCS operates as expected when you use a default or other non-masqueraded Pod IP range.

    If you use a custom Pod IP range or a custom IPmasq agent ConfigMap, MCS traffic can be masqueraded. This prevents MCS from working because the firewall rules only allow traffic from Pod IPs.

    To avoid this issue, you should either use the default Pod IP range or specify all Pod IP ranges in the nonMasqueradeCIDRs field of the IPmasq agent ConfigMap. If you use Autopilot or you must use a non-default Pod IP range and cannot specify all Pod IP ranges in the ConfigMap, you should use Egress NAT Policy to configure IP masquerade.
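
To see how the recommendation of 50 unique Service ports applies to your fleet, you can count distinct namespaced-name and port pairs. The following sketch assumes that you have already collected one line per exported Service port per cluster in the form "NAMESPACE/NAME PORT". Both that input format and the helper name count_unique_service_ports are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# count_unique_service_ports (hypothetical helper): reads lines of the form
# "NAMESPACE/NAME PORT" on stdin and prints the number of distinct pairs.
# The same pair exported from several clusters is counted once.
count_unique_service_ports() {
  awk '
    NF >= 2 { seen[$1 " " $2] = 1 }
    END { n = 0; for (k in seen) n++; print n }
  '
}
```

A Service that exposes ports 80 and 443 and is exported from several clusters produces repeated lines but still counts as 2, which matches the example in the limits above.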

MCS with clusters in multiple projects

You cannot export a service if that service is already being exported by other clusters in a different project in the fleet with the same name and namespace. You can access the service in other clusters in the fleet in other projects, but those clusters cannot export the same service in the same namespace.

Troubleshooting

The following sections provide you with troubleshooting tips for MCS.

Viewing the featureState

Viewing the feature state can help you confirm if MCS was configured successfully. You can view the MCS feature state by using the following command:

gcloud container fleet multi-cluster-services describe

The output is similar to the following:

createTime: '2021-08-10T13:05:23.512044937Z'
membershipStates:
 projects/PROJECT_ID/locations/global/memberships/MCS_NAME:
   state:
     code: OK
     description: Firewall successfully updated
     updateTime: '2021-08-10T13:14:45.173444870Z'
name: projects/PROJECT_NAME/locations/global/features/multiclusterservicediscovery
resourceState:
 state: ACTIVE
spec: {}

The most helpful fields for troubleshooting are code and description.

Codes in the featureState

A code indicates the member's general state in relation to MCS. You can find the code in the state.code field. There are four possible codes:

  • OK: The membership was successfully added to MCS and is ready to use.

  • WARNING: MCS is in the process of reconciling membership setup. The description field can provide more information about what caused this code.

  • FAILED: This membership was not added to MCS. Other memberships in the fleet with an OK code are not affected by this FAILED membership. The description field can provide more information about what caused this code.

  • ERROR: This membership is missing resources. Other memberships in the fleet with an OK code are not affected by this ERROR membership. The description field can provide more information about what caused this code.
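
In a fleet with many memberships, it can be easier to list only the memberships whose code is not OK. The following sketch filters the YAML that the describe command prints. It relies on line-oriented matching of the output layout shown above and is not a general YAML parser. The helper name non_ok_memberships is hypothetical.

```shell
#!/usr/bin/env bash
# non_ok_memberships (hypothetical helper): reads the describe output on stdin
# and prints "MEMBERSHIP CODE" for each membership whose code is not OK.
non_ok_memberships() {
  awk '
    # Remember the most recent membership resource name.
    /\/memberships\/[^:]+:[ \t]*$/ {
      member = $1
      sub(/:$/, "", member)
    }
    # A "code:" line belongs to the membership seen most recently.
    $1 == "code:" && member != "" && $2 != "OK" {
      print member, $2
    }
  '
}
```

For example, gcloud container fleet multi-cluster-services describe --project PROJECT_ID | non_ok_memberships prints nothing when every membership reports OK.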

Descriptions in the featureState

A description gives you further information about the membership's state in MCS. You can find these descriptions in the state.description field and you can see the following descriptions:

  • Firewall successfully created: This message indicates that the member's firewall rule was successfully created or updated. The membership's code is OK.

  • Firewall creation pending: This message indicates that the member's firewall rule is pending creation or update. The membership's code is WARNING. This membership can experience issues updating and connecting to new multi-cluster Services and memberships added while the firewall rule is pending.

  • GKE Cluster missing: This message indicates that the registered GKE cluster is unavailable or deleted. The membership's code is ERROR. After a GKE cluster is deleted, you must manually unregister this membership from the fleet.

  • Project that member lives in is missing required permissions and/or has not enabled all required APIs - additional setup steps are required: This message indicates there are internal StatusForbidden (403) errors, and the membership's code is FAILED. This error occurs in the following scenarios:

    • You have not enabled the necessary APIs in the member's project.

      If the member cluster is in a different project from the fleet, see cross-project setup to ensure that you have completed all necessary steps. If you have completed all steps, enable the following APIs in the registration project by running the following commands:

      gcloud services enable multiclusterservicediscovery.googleapis.com --project PROJECT_ID
      gcloud services enable dns.googleapis.com --project PROJECT_ID
      gcloud services enable trafficdirector.googleapis.com --project PROJECT_ID
      gcloud services enable cloudresourcemanager.googleapis.com --project PROJECT_ID
      

      Replace PROJECT_ID with the project ID from the project where you registered clusters.

    • The mcsd or gkehub service account requires more permissions in the member's project.

      The mcsd and gkehub service accounts should automatically have been created in the fleet host project with all the required permissions. To verify the service accounts exist, run the following commands:

      gcloud projects get-iam-policy PROJECT_ID | grep gcp-sa-mcsd
      gcloud projects get-iam-policy PROJECT_ID | grep gcp-sa-gkehub
      

      Replace PROJECT_ID with the project ID from the fleet host project.

      These commands should show you the full name of the mcsd and gkehub service accounts.

  • Multiple VPCs detected in the hub - VPC must be peered with other VPCs for successful connectivity: This message occurs when clusters hosted in different VPCs are registered to the same fleet. Membership status is OK. A cluster's VPC network is defined by its NetworkConfig's network. Multi-cluster Services require a flat network, and these VPCs must be actively peered for multi-cluster Services to properly connect to each other. To learn more, see Example VPC Network Peering setup.

  • Member does not exist in the same project as hub - additional setup steps are required, errors may occur if not completed.: This message reminds you that cross-project clusters require additional setup steps. Membership status is OK. A cross-project membership is a member cluster that is not in the same project as the fleet. For more information, see cross-project setup.

  • Non-GKE clusters are currently not supported: This message reminds you that MCS only supports GKE clusters. Non-GKE clusters cannot be added to MCS. Membership status is FAILED.

Known issues

MCS Services with multiple ports

There is a known issue with multi-cluster Services with multiple (TCP/UDP) ports on GKE Dataplane V2 where some endpoints are not programmed in the dataplane. This issue impacts GKE versions earlier than 1.26.3-gke.400.

As a workaround, when using GKE Dataplane V2, use multiple multi-cluster Services that each have a single port instead of one multi-cluster Service with multiple ports.

MCS with Shared VPC

With the current implementation of MCS, if you deploy more than one fleet in the same Shared VPC, metadata is shared between fleets. When a Service is created in one fleet, its metadata is exported to or imported into all other fleets that are part of the same Shared VPC, and is visible to users there.

This behavior will be fixed in an upcoming release of MCS.

Health check uses default port instead of containerPort

When you deploy a Service with a targetPort field referencing a named port in a Deployment, MCS configures the default port for the health check instead of the specified containerPort.

To avoid this issue, use numerical values in the Service field ports.targetPort and the Deployment field readinessProbe.httpGet.port instead of named values.

This behavior will be fixed in an upcoming release of MCS.

What's next