Configuring multi-cluster Services

This page shows you how to enable and use multi-cluster Services (MCS). To learn more about how MCS works and its benefits, see Multi-cluster Services.

The Google Kubernetes Engine (GKE) MCS feature extends the reach of the Kubernetes Service beyond the cluster boundary and lets you discover and invoke Services across multiple GKE clusters. You can export a subset of existing Services or new Services.

Google Cloud resources managed by MCS

MCS manages the following components of Google Cloud:

  • Cloud DNS: MCS configures Cloud DNS zones and records for each exported Service in your fleet clusters. This lets you connect to Services that are running in other clusters. These zones and records are created, read, updated, and deleted based on the Services that you choose to export across clusters.

  • Firewall rules: MCS configures firewall rules that let Pods communicate with each other across clusters within your fleet. Firewall rules are created, read, updated, and deleted based on the clusters that you add to your fleet. These rules are similar to the rules that GKE creates to enable communication between Pods within a GKE cluster.

  • Traffic Director: MCS configures the Traffic Director resources that perform health checks against endpoints and distribute endpoints across clusters in your fleet. These resources are created, read, updated, and deleted based on the Services that you choose to export across clusters.

Requirements

MCS has the following requirements:

  • MCS only supports exporting Services from VPC-native GKE clusters on Google Cloud. You cannot use clusters that are not VPC-native. For more information, see Creating a VPC-native cluster.

  • Connectivity between clusters depends on clusters running within the same VPC network or in peered VPC networks.

Before you begin

Before you start, make sure you have performed the following tasks:

  1. Install the Cloud SDK.
  2. Enable the Google Kubernetes Engine API.

  3. Enable the GKE fleet (hub), Cloud DNS, Traffic Director, and Resource Manager APIs:

    gcloud services enable gkehub.googleapis.com --project PROJECT_ID
    gcloud services enable dns.googleapis.com --project PROJECT_ID
    gcloud services enable trafficdirector.googleapis.com --project PROJECT_ID
    gcloud services enable cloudresourcemanager.googleapis.com --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you plan to register your clusters to a fleet.
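
The API-enabling steps above can also be combined, because gcloud services enable accepts several service names in one invocation. The following is a minimal sketch, not an official script: the function names are hypothetical helpers, and container.googleapis.com is the service name of the Google Kubernetes Engine API from step 2.

```shell
# Service names for the APIs listed in the steps above, one per line.
# The function names are illustrative and not part of any Google tool.
mcs_prerequisite_apis() {
  printf '%s\n' \
    container.googleapis.com \
    gkehub.googleapis.com \
    dns.googleapis.com \
    trafficdirector.googleapis.com \
    cloudresourcemanager.googleapis.com
}

# Enable every prerequisite API with a single gcloud call.
# $1: project ID of the project where you plan to register your clusters.
enable_mcs_prerequisite_apis() {
  # Left unquoted on purpose: each service name becomes its own argument.
  gcloud services enable $(mcs_prerequisite_apis) --project "$1"
}
```

To use the sketch, run enable_mcs_prerequisite_apis PROJECT_ID after you install the Cloud SDK.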

Scenarios

MCS supports the following scenarios:

Scenario | Networking setup | Example with hub project and registered clusters
Two clusters in the same project and VPC network. | One VPC network (Shared VPC or standalone). | Two clusters located in the same project as hub project.
Shared VPC setup, two projects, and two clusters involved. | Shared VPC network in the host project. | Hub project located in network host project, and registered cluster located in service project.
Shared VPC setup, three projects, and two clusters involved. | Shared VPC network in the host project. | Hub project located in one service project, and registered cluster located in a different service project.
Two standalone VPC networks, two projects, and two clusters involved. | Each cluster uses a different VPC network that must be connected using VPC Network Peering. | Hub project located in one project and registered cluster located in the other project.

Enabling MCS

To enable MCS for two clusters in the same project and VPC network or to enable MCS for the hub project in a Shared VPC setup, complete the following steps:

  1. Enable the MCS API:

    gcloud services enable multiclusterservicediscovery.googleapis.com \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you plan to register your clusters to a fleet.

  2. Enable the MCS feature:

    gcloud container hub multi-cluster-services enable \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you plan to register your clusters to a fleet.

    If this command does not work, make sure that you are using the latest Cloud SDK version by running:

    gcloud components update
    
  3. Register your GKE clusters to a fleet. We strongly recommend that you register your cluster with Workload Identity enabled. If you do not use Workload Identity, you need to complete the additional steps in Authenticating service accounts.

    To register your cluster, run the following command:

    gcloud container hub memberships register MEMBERSHIP_NAME \
       --gke-cluster GKE_CLUSTER \
       --enable-workload-identity \
       --project PROJECT_ID
    

    Replace the following:

    • MEMBERSHIP_NAME: the membership name that you choose to uniquely represent the cluster being registered on the hub.
    • GKE_CLUSTER: the location/name of the GKE cluster from the current project. The location can be a zone or a region, for example: us-central1-a/my-gke-cluster.
  4. Grant the required Identity and Access Management (IAM) permissions for MCS Importer:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[gke-mcs/gke-mcs-importer]" \
        --role "roles/compute.networkViewer"
    

    Replace PROJECT_ID with the project ID from the project where you registered your clusters to a fleet.

  5. Ensure that each cluster in the fleet has a namespace to share Services in. You cannot export Services in the default and kube-system namespaces across clusters.

    If needed, create a namespace by using the following command:

    kubectl create ns NAMESPACE
    

    Replace NAMESPACE with a name for the namespace.

  6. To verify that MCS is enabled, run the following command:

    gcloud container hub multi-cluster-services describe
    

    The output is similar to the following:

    createTime: '2021-08-10T13:05:23.512044937Z'
    membershipStates:
      projects/PROJECT_ID/locations/global/memberships/MCS_NAME:
        state:
          code: OK
          description: Firewall successfully updated
          updateTime: '2021-08-10T13:14:45.173444870Z'
    name: projects/PROJECT_NAME/locations/global/features/multiclusterservicediscovery
    resourceState:
      state: ACTIVE
    spec: {}
    

    If the value of state is not ACTIVE, see the troubleshooting section.
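
The verification in the last step can be scripted. The following sketch assumes that the feature-level state appears under resourceState.state, as in the sample output above, and uses the standard gcloud --format flag to extract it. The function names are hypothetical.

```shell
# Succeed only when the feature-level state is ACTIVE.
# $1: the value of resourceState.state.
mcs_state_is_active() {
  [ "$1" = "ACTIVE" ]
}

# Query the MCS feature state and report whether it is ACTIVE.
# $1: project ID of the project where you registered your clusters.
check_mcs_enabled() {
  state=$(gcloud container hub multi-cluster-services describe \
    --project "$1" --format='value(resourceState.state)')
  if mcs_state_is_active "$state"; then
    echo "MCS is active"
  else
    echo "MCS state is '${state}'; see the troubleshooting section" >&2
    return 1
  fi
}
```

Because check_mcs_enabled returns a non-zero status for any state other than ACTIVE, it can act as a gate in a setup script before you export Services.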

Authenticating service accounts

If you registered your GKE clusters to a fleet using a service account, you need to take additional steps to authenticate the service account. MCS deploys a component called gke-mcs-importer. This component receives endpoint updates from Traffic Director, so as part of enabling MCS you need to grant your service account permission to read information from Traffic Director.

When you use a service account, you can use the Compute Engine default service account or your own node service account.

Using MCS

The following sections show you how to use MCS. MCS uses the Kubernetes multi-cluster services API.

Registering a Service for export

To register a Service for export to other clusters within your fleet, complete the following steps:

  1. Create a ServiceExport manifest in a file named export.yaml:

    # export.yaml
    kind: ServiceExport
    apiVersion: net.gke.io/v1
    metadata:
     namespace: NAMESPACE
     name: SERVICE_EXPORT_NAME
    

    Replace the following:

    • NAMESPACE: the namespace of the ServiceExport object. This namespace must match the namespace of the Service that you are exporting. Services in the default and kube-system namespaces are not exported across clusters.
    • SERVICE_EXPORT_NAME: the name of a Service in your cluster that you want to export to other clusters within your fleet.
  2. Create the ServiceExport resource by running the following command:

    kubectl apply -f export.yaml
    

The initial export of your Service takes approximately five minutes to sync to clusters registered in your fleet. After a Service is exported, subsequent endpoint syncs happen immediately.

You can export the same Service from multiple clusters. Services with the same namespace and name are treated as the same Service. For a given Service, endpoints from up to five exporting clusters are fully synced across clusters. When more than five clusters export the same Service, endpoint subsetting occurs, and consuming clusters might import only a subset of the endpoints. Service endpoints are distributed uniformly across all consuming clusters.
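
The export steps above can be scripted when you export several Services. The following sketch generates the same manifest as export.yaml for a given namespace and Service name. The function name service_export_manifest and the example names my-ns and my-svc are hypothetical.

```shell
# Print a ServiceExport manifest for an existing Service.
# $1: namespace of the Service (not default or kube-system).
# $2: name of the Service to export.
service_export_manifest() {
  cat <<EOF
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  namespace: $1
  name: $2
EOF
}
```

For example, service_export_manifest my-ns my-svc | kubectl apply -f - creates the ServiceExport without writing an intermediate file.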

Consuming cross-cluster Services

MCS supports ClusterSetIP and headless Services. Only DNS "A" records are available.

After you create a ServiceExport object, the following domain name resolves to your exported Service from any Pod in any fleet cluster:

 SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local

The domain name includes the following values:

  • SERVICE_EXPORT_NAME and NAMESPACE: the values you define in your ServiceExport object.
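
As an illustration of the naming pattern, the following sketch assembles the domain name from its two parts. The function name clusterset_domain and the example values are hypothetical.

```shell
# Build the fleet-wide DNS name of an exported Service.
# $1: SERVICE_EXPORT_NAME, $2: NAMESPACE.
clusterset_domain() {
  printf '%s.%s.svc.clusterset.local\n' "$1" "$2"
}
```

For a Service named my-svc exported from the namespace my-ns, clusterset_domain my-svc my-ns prints my-svc.my-ns.svc.clusterset.local, which a Pod in any fleet cluster can then pass to a client such as curl.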

For ClusterSetIP Services, the domain resolves to the ClusterSetIP. You can find this value by locating the ServiceImport object in a cluster in the namespace that the ServiceExport object was created in. The ServiceImport object is automatically created.

For example:

kind: ServiceImport
apiVersion: net.gke.io/v1
metadata:
 namespace: EXPORTED-SERVICE-NAMESPACE
 name: external-svc-SERVICE-EXPORT-TARGET
status:
 ports:
 - name: https
   port: 443
   protocol: TCP
   targetPort: 443
 ips: CLUSTER_SET_IP
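
To locate the ServiceImport object described above, you can list the objects in the namespace with kubectl. The following sketch assumes that kubectl accepts the resource name serviceimport in your cluster. The function name is hypothetical, and the KUBECTL variable exists only so that the command can be swapped out for a dry run.

```shell
# Command used to reach the cluster; override it (for example with
# "echo") to preview the call instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Show the automatically created ServiceImport objects in a namespace.
# $1: the namespace that the ServiceExport object was created in.
show_service_imports() {
  $KUBECTL get serviceimport -n "$1" -o yaml
}
```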

For headless Services, the domain resolves to the list of IP addresses of the endpoints in the exporting clusters. Each backend Pod with a hostname is also independently addressable with a domain name of the following form:

 HOSTNAME.MEMBERSHIP_NAME.SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local

The domain name includes the following values:

  • SERVICE_EXPORT_NAME and NAMESPACE: the values you define in your ServiceExport object.
  • MEMBERSHIP_NAME: the membership name in your fleet for the cluster that the Pod is in.
  • HOSTNAME: the hostname of the Pod.
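
The per-Pod form can be assembled the same way. In the following sketch, the function name headless_pod_domain and the example values are hypothetical.

```shell
# Build the DNS name of one backend Pod of an exported headless Service.
# $1: HOSTNAME, $2: MEMBERSHIP_NAME, $3: SERVICE_EXPORT_NAME, $4: NAMESPACE.
headless_pod_domain() {
  if [ "$#" -ne 4 ]; then
    echo "usage: headless_pod_domain HOSTNAME MEMBERSHIP SERVICE NAMESPACE" >&2
    return 1
  fi
  printf '%s.%s.%s.%s.svc.clusterset.local\n' "$1" "$2" "$3" "$4"
}
```

For a Pod with the hostname db-0 in the cluster registered as cluster-a, headless_pod_domain db-0 cluster-a my-db my-ns prints db-0.cluster-a.my-db.my-ns.svc.clusterset.local.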

Disabling MCS

To disable MCS, complete the following steps:

  1. For each cluster in your fleet, delete each ServiceExport object that you created:

    kubectl delete serviceexport SERVICE_EXPORT_NAME \
        -n NAMESPACE
    

    Replace the following:

    • SERVICE_EXPORT_NAME: the name of your ServiceExport object.
    • NAMESPACE: the namespace of the ServiceExport object.
  2. Unregister your clusters from the fleet if they don't need to be registered for another purpose.

  3. Disable the multiclusterservicediscovery feature:

    gcloud container hub multi-cluster-services disable \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you registered clusters.

  4. Disable the API for MCS:

    gcloud services disable multiclusterservicediscovery.googleapis.com \
        --project PROJECT_ID
    

    Replace PROJECT_ID with the project ID from the project where you registered clusters.
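
Step 1 must be repeated for each cluster in your fleet. The following sketch loops over kubeconfig contexts by using the standard kubectl --context flag. It assumes that you have one kubeconfig context per cluster; the function name and the context names in the usage example are hypothetical, and the KUBECTL variable exists only so that the commands can be previewed.

```shell
# Command used to reach the clusters; override it (for example with
# "echo") to preview the calls instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# Delete one ServiceExport object from several clusters.
# $1: SERVICE_EXPORT_NAME, $2: NAMESPACE,
# remaining arguments: one kubeconfig context per cluster.
delete_service_export() {
  name=$1
  namespace=$2
  shift 2
  for context in "$@"; do
    $KUBECTL --context "$context" delete serviceexport "$name" -n "$namespace"
  done
}
```

For example, delete_service_export my-svc my-ns ctx-a ctx-b removes the ServiceExport named my-svc from the clusters behind the contexts ctx-a and ctx-b.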

Limitations

The following limits are not enforced, and in some cases you can exceed these limits depending on the load in your clusters or project and the rate of endpoint churn. However, you might experience performance issues when these limitations are exceeded.

  • Exporting clusters: A single Service, identified by a namespaced name, can be safely exported from up to 5 clusters simultaneously. Beyond that limit, it's possible that only a subset of endpoints can be imported to consuming clusters. You can export different Services from different subsets of clusters.

  • The number of Pods behind a single Service: Keeping fewer than 250 Pods behind a single Service is safe. This is the same limitation that single cluster Services have. With relatively static workloads and a small number of multi-cluster Services, it might be possible to significantly exceed this number into thousands of endpoints per Service. As with single cluster Services, all endpoints are watched by kube-proxy on every node. When going beyond this limit, especially when exporting from multiple clusters simultaneously, larger nodes might be required.

  • The number of multi-cluster Services simultaneously exported: We recommend that you simultaneously export no more than 50 unique Service ports, identified by a Service's namespaced name and declared ports. For example, exporting a Service that exposes ports 80 and 443 counts as 2 of the 50 unique Service ports. Services with the same namespaced name exported from multiple clusters count as a single unique Service, so this two-port Service still counts as only 2 ports if it is exported from 5 clusters simultaneously. Each multi-cluster Service counts toward your Backend Services quota, and each exporting cluster or zone creates a network endpoint group (NEG).
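
The counting rule for unique Service ports can be checked with a short sketch. It assumes a hypothetical input format of one "NAMESPACE/NAME PORT" line per exported port per cluster, so the same Service exported from several clusters produces repeated lines that are counted once.

```shell
# Count unique Service ports from lines of the form
# "NAMESPACE/NAME PORT" read on standard input.
# Blank lines are ignored and duplicate lines count once.
count_unique_service_ports() {
  grep -v '^[[:space:]]*$' | sort -u | wc -l | tr -d ' '
}
```

A Service shop/web that exposes ports 80 and 443 and is exported from 5 clusters yields 10 input lines, and the function reports 2, which matches the example in the list above.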

Troubleshooting

The following sections provide you with troubleshooting tips for MCS.

Viewing the featureState

Viewing the feature state can help you confirm whether MCS was configured successfully. You can view the MCS feature state by using the following command:

gcloud container hub multi-cluster-services describe

The output is similar to the following:

createTime: '2021-08-10T13:05:23.512044937Z'
membershipStates:
 projects/PROJECT_ID/locations/global/memberships/MCS_NAME:
   state:
     code: OK
     description: Firewall successfully updated
     updateTime: '2021-08-10T13:14:45.173444870Z'
name: projects/PROJECT_NAME/locations/global/features/multiclusterservicediscovery
resourceState:
 state: ACTIVE
spec: {}

The most helpful fields for troubleshooting are code and description.
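
When a fleet has many memberships, the code and description fields can be pulled out of the describe output. The following sketch reads that output on standard input and assumes the layout of the sample above, in which each membership key ends in /memberships/NAME and code appears before description. The function name is hypothetical.

```shell
# Read `describe` output on standard input and print one line per
# membership in the form MEMBERSHIP<TAB>CODE<TAB>DESCRIPTION.
# Assumes each state block lists code before description.
mcs_membership_summary() {
  awk '
    /^ *projects\/.*\/memberships\/.*:$/ {
      sub(/^ */, ""); sub(/:$/, "")
      n = split($0, parts, "/")
      member = parts[n]
    }
    /^ *code: / { sub(/^ *code: /, ""); code = $0 }
    /^ *description: / {
      sub(/^ *description: /, "")
      print member "\t" code "\t" $0
    }
  '
}
```

For example, gcloud container hub multi-cluster-services describe | mcs_membership_summary prints one tab-separated line for each membership.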

Codes in the featureState

A code indicates the member's general state in relation to MCS. You can find the code in the state.code field. There are three possible codes:

  • OK: The membership was successfully added to MCS and is ready to use.

  • WARNING: MCS is in the process of reconciling membership setup. The description field can provide more information about what caused this code.

  • FAILED: This membership was not added to MCS. Other memberships in the fleet with an OK code are not affected by this FAILED membership. The description field can provide more information about what caused this code.

Descriptions in the featureState

A description gives you further information about the membership's state in MCS. You can find these descriptions in the state.description field and you can see the following descriptions:

  • Firewall successfully created: This message indicates that the member's firewall rule was successfully created or updated. The membership's code is OK.

  • Firewall creation pending: This message indicates that the member's firewall rule is pending creation or update. The membership's code is WARNING. This membership can experience issues updating and connecting to new multi-cluster Services and memberships added while the firewall rule is pending.

  • Project that member lives in is missing required permissions and/or has not enabled all required APIs - additional setup steps are required: This message indicates there are internal StatusForbidden (403) errors, and the membership's code is FAILED. This error occurs in the following scenarios:

    • You have not enabled the necessary APIs in the member's project.

      If the member's cluster is in a different project from the fleet, see cross-project setup to ensure you have completed all necessary steps. If you have completed all steps, ensure that the following APIs are enabled in the registration project with the following commands:

      gcloud services enable multiclusterservicediscovery.googleapis.com --project PROJECT_ID
      gcloud services enable dns.googleapis.com --project PROJECT_ID
      gcloud services enable trafficdirector.googleapis.com --project PROJECT_ID
      gcloud services enable cloudresourcemanager.googleapis.com --project PROJECT_ID
      

      Replace PROJECT_ID with the project ID from the project where you registered clusters.

    • The mcsd or gkehub service account requires more permissions in the member's project.

      The mcsd and gkehub service accounts should automatically have been created in the registration project with all the required permissions. To verify the service accounts exist, run the following commands:

      gcloud projects get-iam-policy PROJECT_ID | grep gcp-sa-mcsd
      gcloud projects get-iam-policy PROJECT_ID | grep gcp-sa-gkehub
      

      Replace PROJECT_ID with the project ID from the project where you registered clusters.

    These commands should show you the full name of the mcsd and gkehub service accounts.

  • Multiple VPCs detected in the hub - VPC must be peered with other VPCs for successful connectivity: This message occurs when clusters hosted in different VPCs are registered to the same fleet. Membership status is OK. A cluster's VPC network is defined by its NetworkConfig's network. Multi-cluster Services require a flat network, and these VPCs must be actively peered for multi-cluster Services to properly connect to each other. To learn more, see Example VPC Network Peering setup.

  • Member does not exist in the same project as hub - additional setup steps are required, errors may occur if not completed.: This message reminds you that cross-project clusters require additional setup steps. Membership status is OK. A cross-project membership is one whose cluster is not in the same project as the fleet. For more information, see cross-project setup.

  • Non-GKE clusters are currently not supported: This message reminds you that MCS only supports GKE clusters. Non-GKE clusters cannot be added to MCS. Membership status is FAILED.
