The Anthos Ingress controller manages Compute Engine resources.
MultiClusterIngress and MultiClusterService resources map to
different Compute Engine resources, so understanding the relationship
between these resources helps you troubleshoot. For example, examine the
following MultiClusterIngress resource:
apiVersion: extensions/v1beta1
kind: MultiClusterIngress
metadata:
  name: foo-ingress
spec:
  rules:
  - host: store.foo.com
    http:
      paths:
      - backend:
          serviceName: store-foo
          servicePort: 80
  - host: search.foo.com
    http:
      paths:
      - backend:
          serviceName: search-foo
          servicePort: 80
Compute Engine to Multi Cluster Ingress resource mappings
The table below shows the mapping of fleet resources to resources created in the Kubernetes clusters and Google Cloud:
|Kubernetes resource|Google Cloud resource|Description|
|---|---|---|
|MultiClusterIngress|Forwarding rule|HTTP(S) load balancer VIP.|
| |Target proxy|HTTP(S) termination settings taken from annotations and the TLS block.|
| |URL map|Virtual host path mapping from the rules section.|
|MultiClusterService|Kubernetes Service|Derived resource from template.|
| |Backend service|A backend service is created for each (Service, ServicePort) pair.|
| |Network endpoint groups|Set of backend Pods participating in the Service.|
Inspecting Compute Engine load balancer resources
After creating a load balancer, the Multi Cluster Ingress status will contain the names of every Compute Engine resource that was created to construct the load balancer. For example:
Name:         shopping-service
Namespace:    prod
Labels:       <none>
Annotations:  <none>
API Version:  networking.gke.io/v1beta1
Kind:         MultiClusterIngress
Metadata:
  Creation Timestamp:  2019-07-16T17:23:14Z
  Finalizers:
    mci.finalizer.networking.gke.io
Spec:
  Template:
    Spec:
      Backend:
        Service Name:  shopping-service
        Service Port:  80
Status:
  VIP:  18.104.22.168
  CloudResources:
    Firewalls: "mci-l7"
    ForwardingRules: "mci-abcdef-myforwardingrule"
    TargetProxies: "mci-abcdef-mytargetproxy"
    UrlMap: "mci-abcdef-myurlmap"
    HealthChecks: "mci-abcdef-80-myhealthcheck"
    BackendServices: "mci-abcdef-80-mybackendservice"
    NetworkEndpointGroups: "k8s1-neg1", "k8s1-neg2", "k8s1-neg3"
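To inspect one of these resources, you can pull its name out of the status and pass it to gcloud. The following is a minimal sketch rather than an official tool: mci_resource is a hypothetical helper name, and it assumes each key and its quoted names appear on a single line, as kubectl prints them.

```shell
# Hypothetical helper: read `kubectl describe mci` output on stdin and print
# the quoted resource names listed for one CloudResources key, one per line.
# Assumes the key and its names share a single line of the output.
mci_resource() {
  grep -F -- "$1:" | grep -o '"[^"]*"' | tr -d '"'
}

# Example usage against a live cluster (not run here):
#   kubectl describe mci shopping-service -n prod | mci_resource UrlMap
#   gcloud compute url-maps describe mci-abcdef-myurlmap --global
```

The helper only extracts names. Which gcloud subcommand to use for each key follows the resource mapping table earlier in this page.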
VIP not created
If you do not see a VIP, then an error may have occurred during its creation. To see if an error did occur, run the following command:
kubectl describe mci shopping-service
The output may look similar to:
Name:         shopping-service
Namespace:    prod
Labels:       <none>
Annotations:  <none>
API Version:  networking.gke.io/v1beta1
Kind:         MultiClusterIngress
Metadata:
  Creation Timestamp:  2019-07-16T17:23:14Z
  Finalizers:
    mci.finalizer.networking.gke.io
Spec:
  Template:
    Spec:
      Backend:
        Service Name:  shopping-service
        Service Port:  80
Status:
  VIP:  22.214.171.124
Events:
  Type     Reason  Age   From                              Message
  ----     ------  ----  ----                              -------
  Warning  SYNC    29s   multi-cluster-ingress-controller  error translating MCI prod/shopping-service: exceeded 4 retries with final error: error translating MCI prod/shopping-service: multiclusterservice prod/shopping-service does not exist
In this example, the error was that the user did not create a
MultiClusterService resource that was referenced by a
MultiClusterIngress resource.
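When the describe output is long, it can help to show only the warning events. The following is a small sketch under the assumption that the output has an Events section shaped like the example above; mci_warnings is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl describe` output on stdin and print only
# the rows of the Events section whose Type column is "Warning".
mci_warnings() {
  awk '/^Events:/ { in_events = 1; next } in_events && $1 == "Warning"'
}

# Example usage against a live cluster (not run here):
#   kubectl describe mci shopping-service -n prod | mci_warnings
```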
If your load balancer acquired a VIP but is consistently serving a 502 response, the load balancer health checks may be failing. Health checks could fail for two reasons:
- Application Pods are not healthy (see Cloud console debugging for example).
- A misconfigured firewall is blocking Google health checkers from performing health checks.
In the case of #1, make sure that your application is in fact serving a 200 response on the "/" path.
In the case of #2, make sure that a firewall named "mci-default-l7" exists in your VPC. The Ingress controller creates the firewall in your VPC to ensure Google health checkers can reach your backends. If the firewall does not exist, make sure there is no external automation that deletes this firewall upon its creation.
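One way to check for the firewall from the command line is to list the rule names and look for an exact match. This is a sketch under assumptions: has_firewall is a hypothetical helper name, and the gcloud invocation in the comment lists rules for the currently configured project.

```shell
# Hypothetical helper: succeed if the firewall rule named in $1 appears,
# as an exact whole-line match, in the list of rule names read on stdin.
has_firewall() {
  grep -Fxq -- "$1"
}

# Example usage (not run here):
#   gcloud compute firewall-rules list --format="value(name)" \
#     | has_firewall mci-default-l7 || echo "firewall mci-default-l7 is missing"
```

If the rule keeps disappearing after it is created, that points to the external automation case described above rather than to the controller.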
Traffic not added to or removed from cluster
When you add a new Membership, traffic should reach the backends in the
underlying cluster when applicable. Similarly, if a Membership is removed, no
traffic should reach the backends in the underlying cluster. If you are not
observing this behavior, check for errors on the MultiClusterIngress and
MultiClusterService resources.
Common cases in which this error would occur include adding a new Membership on a GKE cluster that is not in VPC-native mode or adding a new Membership but not deploying an application in the GKE cluster.
kubectl describe mcs zone-svc
kubectl describe mci zone-mci
Config cluster migration
To understand more about the use cases for migration, see the Config cluster design concept.
Config cluster migration can be a disruptive operation if not handled correctly. Follow these guidelines when performing a config cluster migration:
- Make sure to use the static IP annotation on your
MultiClusterIngress resources. Failing to do so will result in disrupted traffic while migrating. Ephemeral IPs will be recreated when migrating config clusters.
- MultiClusterIngress and MultiClusterService resources must be deployed identically to the existing and new config cluster. Differences between them will result in the reconciliation of
MultiClusterService and MultiClusterIngress resources that are different in the new config cluster.
- Only a single config cluster is active at any time. Until the config cluster
is changed, the MultiClusterIngress and
MultiClusterService resources in the new config cluster will not impact load balancer resources.
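As a partial check of the "deployed identically" guideline, you can compare which resources exist in each config cluster before migrating. The sketch below only compares kind, namespace, and name, not the specs themselves. missing_in_new is a hypothetical helper, and the kubectl context names old-config and new-config are placeholders.

```shell
# Hypothetical helper: print every line of file $1 that has no exact match in
# file $2, that is, resources present in the old cluster but not the new one.
# grep exits with 1 when nothing is printed, which is treated as success here.
missing_in_new() {
  grep -Fxv -f "$2" -- "$1" || [ "$?" -eq 1 ]
}

# Example usage (not run here); context names are placeholders:
#   fmt='{range .items[*]}{.kind}/{.metadata.namespace}/{.metadata.name}{"\n"}{end}'
#   kubectl --context=old-config get mci,mcs --all-namespaces -o jsonpath="$fmt" > old.txt
#   kubectl --context=new-config get mci,mcs --all-namespaces -o jsonpath="$fmt" > new.txt
#   missing_in_new old.txt new.txt
```

An empty result means every resource in the old config cluster has a same-named counterpart in the new one. It does not prove the specs match.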
To migrate the config cluster, run the following command:
gcloud container fleet ingress update \
  --config-membership=projects/project_id/locations/global/memberships/new_config_cluster
Verify the command worked by ensuring there are no visible errors in the Feature state:
gcloud container fleet ingress describe
In most cases, checking the exact state of the load balancer is helpful when debugging an issue. You can find the load balancer by going to Load balancing in the Google Cloud console.
Multi Cluster Ingress emits error and warning codes on MultiClusterIngress and
MultiClusterService resources, as well as in the gcloud
Description field, for known issues. These messages have documented error and
warning codes to make it easier to understand what it means when something is
not operating as expected. Each code consists of an error ID that includes a
unique number (for example, 123) corresponding to an error or
warning, along with suggestions on how to solve it.
Annotation [NAME] not recognized
This error displays when an annotation that is not recognized is specified on a
MultiClusterIngress or MultiClusterService manifest. There are a couple of
reasons why the annotation might not be recognized:
- The annotation is not supported in Multi Cluster Ingress. This may be expected if you are annotating resources that are not meant to be used by the Anthos Ingress controller.
- The annotation is supported, but is misspelled and thus not recognized.
In both cases, refer to the documentation to understand the supported annotations and how they are specified.
[RESOURCE_NAME] not found
This error displays when a supplementary resource is specified in a
MultiClusterIngress but cannot be found in the Config Membership. For example,
this error is thrown when a
MultiClusterIngress refers to a MultiClusterService
that cannot be found or a
MultiClusterService refers to a BackendConfig that
cannot be found. There are several reasons why a resource might not be found:
- It is not in the proper namespace. Ensure that resources which reference each other are all in the same namespace.
- The resource name is misspelled.
- The resource truly does not exist with the proper namespace + name. In this case, please create it.
[CLUSTER_SELECTOR] is invalid
This error displays when a cluster selector specified on a MultiClusterService
is invalid. There are a couple of reasons why this selector could be invalid:
- The provided string contains a typo.
- The provided string refers to a cluster membership that no longer exists in the fleet.
Cannot find NEGs for Service Port [SERVICE_PORT]
This error is thrown when the network endpoint groups (NEGs) for a given
MultiClusterService and service port pair cannot be found. NEGs are the
resources that contain the Pod endpoints in each of your backend clusters. The
main reason why the NEGs might not exist is that there was an error creating
or updating the Derived Services in your backend clusters. Check the Events on
the MultiClusterService resource for more information.
Missing Anthos license.
This error displays under Feature state, and indicates that the Anthos API (anthos.googleapis.com) is not enabled.
Derived service is invalid: [REASON].
This error displays under the events of the
MultiClusterService resource. One
common reason for this error is that the Service resource derived from
MultiClusterService has an invalid spec.
For example, this
MultiClusterService does not have any Service ports defined
in its spec.
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: zone-mcs
  namespace: whereami
spec:
  clusters:
  - link: "us-central1-a/gke-us"
  - link: "europe-west1-c/gke-eu"
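For comparison, here is a sketch of the same resource with a Service template that defines a port, wrapped in a shell function so it can be piped to kubectl. The selector label (app: whereami), the port name, and the port numbers are illustrative assumptions, not values from the example above. mcs_manifest is a hypothetical helper name.

```shell
# Print a MultiClusterService manifest whose template defines a Service port.
# The selector label, port name, and port numbers are placeholder assumptions.
mcs_manifest() {
  cat <<'EOF'
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: zone-mcs
  namespace: whereami
spec:
  template:
    spec:
      selector:
        app: whereami
      ports:
      - name: web
        protocol: TCP
        port: 8080
        targetPort: 8080
  clusters:
  - link: "us-central1-a/gke-us"
  - link: "europe-west1-c/gke-eu"
EOF
}

# Example usage against a live config cluster (not run here):
#   mcs_manifest | kubectl apply -f -
```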
Missing GKE cluster resource link in Membership.
This error displays under Feature state and occurs because there is no GKE cluster underlying the Membership resource. You can verify this by running the following command:
gcloud container fleet memberships describe membership-name
In the output, check whether a GKE cluster resource link appears under the endpoint field. If none appears, the Membership has no underlying GKE cluster.
GKE cluster [NAME] not found.
This error displays under Feature state and is thrown if the underlying GKE cluster for the Membership does not exist.
[NAME] is not a VPC-native GKE cluster.
This error displays under Feature state. This error is thrown if the specified GKE cluster is a route-based cluster. The Multi Cluster Ingress controller creates a container-native load balancer using NEGs. Clusters must be VPC-native to use a container-native load balancer.
For more information, see Creating a VPC-native cluster.
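To check a cluster yourself, you can read the alias IP setting from the cluster description. The sketch below assumes that gcloud prints the boolean field as True for a VPC-native cluster and prints nothing or False for a route-based one. is_vpc_native is a hypothetical helper, and CLUSTER and ZONE are placeholders.

```shell
# Hypothetical helper: succeed if $1 is the value gcloud is assumed to print
# for ipAllocationPolicy.useIpAliases on a VPC-native cluster ("True").
is_vpc_native() {
  case "$1" in
    [Tt]rue) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (not run here); CLUSTER and ZONE are placeholders:
#   value=$(gcloud container clusters describe CLUSTER --zone=ZONE \
#     --format="value(ipAllocationPolicy.useIpAliases)")
#   is_vpc_native "$value" || echo "CLUSTER is route-based"
```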
[IAM_PERMISSION] permission missing for GKE cluster [NAME].
This error displays under Feature state. There are a couple reasons for this error:
- The underlying GKE cluster for the Membership is located in a different project from the Membership itself.
- The specified IAM permission was removed from the
Failed to get Config Membership: [REASON].
This error displays under Feature state. The main reason this error occurs is that the Config Membership was deleted while the Feature was enabled.
You should never need to delete the Config Membership. If you would like to change it, follow the config cluster migration steps.
HTTPLoadBalancing Addon is disabled in GKE Cluster [NAME].
This error displays under Feature state and occurs when the HTTPLoadBalancing
addon is disabled in a GKE cluster. You can update your
GKE cluster to enable the HTTPLoadBalancing addon:
gcloud container clusters update name --update-addons=HttpLoadBalancing=ENABLED
This resource is orphaned.
In some cases, the usefulness of a resource depends on it being referenced by
another resource. This error is thrown when a Kubernetes resource is created but
is not referenced by another resource. For example, you will see this error if
you create a
BackendConfig resource that is not being referenced by a MultiClusterService.