Deploy multi-cluster Gateways

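This page describes how to deploy Kubernetes Gateway resources to load balance ingress traffic across multiple Google Kubernetes Engine (GKE) clusters (or a fleet). Before you deploy multi-cluster Gateways, see Enabling multi-cluster Gateways to prepare your environment.

To deploy Gateways that load balance traffic to a single GKE cluster, see Deploying Gateways.

Multi-cluster Gateways

A multi-cluster Gateway is a Gateway resource that load balances traffic across multiple Kubernetes clusters. In GKE, the gke-l7-cross-regional-internal-managed-mc, gke-l7-global-external-managed-mc, gke-l7-regional-external-managed-mc, gke-l7-rilb-mc, and gke-l7-gxlb-mc GatewayClasses deploy multi-cluster Gateways that provide HTTP routing, traffic splitting, traffic mirroring, health-based failover, and more across different GKE clusters, Kubernetes namespaces, and regions. Multi-cluster Gateways make managing application networking across many clusters and teams easy, secure, and scalable for infrastructure administrators.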

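This page introduces four examples that teach you how to deploy multi-cluster Gateways using the GKE Gateway controller:

Example 1: An external multi-cluster Gateway that load balances internet traffic across two GKE clusters.
Example 2: An internal (private) cross-region Layer 7 Gateway.
Example 3: Weight-based blue-green traffic splitting and traffic mirroring across two GKE clusters for internal VPC traffic.
Example 4: A capacity-based Gateway that load balances requests to different backends based on their maximum capacity.

Each example uses the same store and site applications to model a real-world scenario in which an online shopping service and a website service are owned and operated by separate teams and deployed across a fleet of shared GKE clusters. Each example highlights different topologies and use cases enabled by multi-cluster Gateways.

Multi-cluster Gateways require some environmental preparation before they can be deployed. Before you proceed, follow the steps in Enabling multi-cluster Gateways:

Deploy GKE clusters.
Register your clusters to a fleet.
Enable the multi-cluster Service and multi-cluster Gateway controllers.

Lastly, review the GKE Gateway controller limitations and known issues before you use the controller in your environment.

Multi-cluster, multi-region, external Gateway

In this tutorial, you create an external multi-cluster Gateway that serves external traffic for an application running in two GKE clusters.

store.example.com is deployed across two GKE clusters and exposed to the internet through a multi-cluster Gateway

In the following steps, you do the following:

Deploy the sample store application to the gke-west-1 and gke-east-1 clusters.
Configure Services on each cluster to be exported into your fleet (multi-cluster Services).
Deploy an external multi-cluster Gateway and an HTTPRoute to your config cluster (gke-west-1).

After the application and Gateway resources are deployed, you can control traffic across the two GKE clusters using path-based routing:

Requests to /west are routed to store Pods in the gke-west-1 cluster.
Requests to /east are routed to store Pods in the gke-east-1 cluster.
Requests to any other path are routed to either cluster, according to its health, capacity, and proximity to the requesting client.

Deploying the demo application

Create the store Deployment and Namespace in all three of the clusters that were deployed in Enabling multi-cluster Gateways: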

kubectl apply --context gke-west-1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml
kubectl apply --context gke-west-2 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml
kubectl apply --context gke-east-1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml

It deploys the following resources to each cluster:

namespace/store created
deployment.apps/store created

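All examples on this page use the app deployed in this step. Make sure that the app is deployed across all three clusters before trying any of the remaining steps. This example uses only the gke-west-1 and gke-east-1 clusters; gke-west-2 is used in another example.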
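
Because the same manifest is applied to every cluster, the three commands can be wrapped in a small loop. The following is a minimal sketch, not part of the official procedure: the function name deploy_store and the KUBECTL override variable are hypothetical helpers introduced here for illustration, and the context names are the ones this page assumes.

```shell
# Sketch: apply the store manifest to every cluster context passed as an argument.
# KUBECTL is a hypothetical override hook; set KUBECTL=echo to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"
STORE_MANIFEST="https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml"

deploy_store() {
  for ctx in "$@"; do
    # Stop at the first cluster that fails so the problem is easy to spot.
    "${KUBECTL}" apply --context "${ctx}" -f "${STORE_MANIFEST}" || return 1
  done
}
```

To deploy to all three clusters, you could run deploy_store gke-west-1 gke-west-2 gke-east-1.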

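Multi-cluster Services

Services are how Pods are exposed to clients. Because the GKE Gateway controller uses container-native load balancing, it does not use the ClusterIP or Kubernetes load balancing to reach Pods. Traffic is sent directly from the load balancer to the Pod IP addresses. However, Services still play a critical role as a logical identifier for Pod grouping.

Multi-cluster Services (MCS) is an API standard for Services that span clusters, and its GKE controller provides service discovery across GKE clusters. The multi-cluster Gateway controller uses MCS API resources to group Pods into a Service that is addressable across multiple clusters.

The multi-cluster Services API defines the following custom resources:

ServiceExports map to a Kubernetes Service, exporting the endpoints of that Service to all clusters registered to the fleet. When a Service has a corresponding ServiceExport, the Service can be addressed by a multi-cluster Gateway.
ServiceImports are automatically generated by the multi-cluster Service controller. ServiceExports and ServiceImports come in pairs. If a ServiceExport exists in the fleet, then a corresponding ServiceImport is created so that the Service mapped to the ServiceExport can be accessed from across clusters.

Exporting Services works in the following way. A store Service exists in gke-west-1 that selects a group of Pods in that cluster. A ServiceExport is created in the cluster, which makes the Pods in gke-west-1 accessible from the other clusters in the fleet. The ServiceExport maps to and exposes the Service that has the same name and namespace as the ServiceExport resource.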

apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: store

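The following diagram shows what happens after a ServiceExport is deployed. If a ServiceExport and Service pair exists, then the multi-cluster Service controller deploys a corresponding ServiceImport to every GKE cluster in the fleet. The ServiceImport is the local representation of the store Service in every cluster. This lets the client Pod in gke-east-1 use ClusterIP or headless Services to reach the store Pods in gke-west-1. When used in this manner, multi-cluster Services provide east-west load balancing between clusters without requiring an internal LoadBalancer Service. To use multi-cluster Services for cluster-to-cluster load balancing, see Configuring multi-cluster Services.

Multi-cluster Services export Services across clusters, which allows cluster-to-cluster communication

Multi-cluster Gateways also use ServiceImports, but not for cluster-to-cluster load balancing. Instead, Gateways use ServiceImports as logical identifiers for a Service that exists in another cluster or that stretches across multiple clusters. The following HTTPRoute references a ServiceImport instead of a Service resource. Referencing a ServiceImport indicates that the route forwards traffic to a group of backend Pods that run across one or more clusters.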

kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: store-route
  namespace: store
  labels:
    gateway: multi-cluster-gateway
spec:
  parentRefs:
  - kind: Gateway
    namespace: store
    name: external-http
  hostnames:
  - "store.example.com"
  rules:
  - backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store
      port: 8080

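The following diagram shows how the HTTPRoute routes store.example.com traffic to store Pods on gke-west-1 and gke-east-1. The load balancer treats them as one pool of backends. If the Pods from one of the clusters become unhealthy, unreachable, or have no traffic capacity, then traffic load is balanced to the remaining Pods on the other cluster. You can add or remove clusters by using the store Service and ServiceExport. This transparently adds or removes backend Pods without any explicit routing configuration changes.

MCS resources

Exporting Services

At this point, the application is running across both clusters. Next, you expose and export the application by deploying Services and ServiceExports to each cluster.

Apply the following manifest to the gke-west-1 cluster to create your store and store-west-1 Services and ServiceExports: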

cat << EOF | kubectl apply --context gke-west-1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: store
---
apiVersion: v1
kind: Service
metadata:
  name: store-west-1
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store-west-1
  namespace: store
EOF

Apply the following manifest to the gke-east-1 cluster to create your store and store-east-1 Services and ServiceExports:

cat << EOF | kubectl apply --context gke-east-1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: store
---
apiVersion: v1
kind: Service
metadata:
  name: store-east-1
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store-east-1
  namespace: store
EOF

Verify that the correct ServiceExports have been created in the clusters.
```
kubectl get serviceexports --context CLUSTER_NAME --namespace store
```
Replace CLUSTER_NAME with gke-west-1 and gke-east-1. The output resembles the following:
```
# gke-west-1
NAME           AGE
store          2m40s
store-west-1   2m40s

# gke-east-1
NAME           AGE
store          2m25s
store-east-1   2m25s
```
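This demonstrates that the store Service contains store Pods across both clusters, and the store-west-1 and store-east-1 Services only contain store Pods on their respective clusters. These overlapping Services are used to target the Pods across multiple clusters or a subset of Pods on a single cluster.
After a few minutes, verify that the accompanying ServiceImports have been automatically created by the multi-cluster Services controller across all clusters in the fleet.

Note: The first MCS you create in your fleet can take up to 20 minutes to be fully operational. Exporting new Services after the first one is created, or adding endpoints to existing multi-cluster Services, is faster (up to a few minutes in some cases).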
```
kubectl get serviceimports --context CLUSTER_NAME --namespace store
```
Replace CLUSTER_NAME with gke-west-1 and gke-east-1. The output resembles the following:
```
# gke-west-1
NAME           TYPE           IP                  AGE
store          ClusterSetIP   ["10.112.31.15"]    6m54s
store-east-1   ClusterSetIP   ["10.112.26.235"]   5m49s
store-west-1   ClusterSetIP   ["10.112.16.112"]   6m54s

# gke-east-1
NAME           TYPE           IP                  AGE
store          ClusterSetIP   ["10.72.28.226"]    5d10h
store-east-1   ClusterSetIP   ["10.72.19.177"]    5d10h
store-west-1   ClusterSetIP   ["10.72.28.68"]     4h32m
```
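This demonstrates that all three Services are accessible from both clusters in the fleet. However, because there is only a single active config cluster per fleet, you can only deploy Gateways and HTTPRoutes that reference these ServiceImports in gke-west-1. When an HTTPRoute in the config cluster references these ServiceImports as backends, the Gateway can forward traffic to these Services no matter which cluster they are exported from.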
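
Because ServiceImports can take several minutes to appear, it can be convenient to poll for them instead of re-running the command by hand. The following is a minimal sketch under stated assumptions: wait_for_serviceimport and the KUBECTL override are hypothetical helpers, and the defaults of 40 attempts spaced 30 seconds apart are chosen only to cover the 20-minute upper bound mentioned in the note above.

```shell
# KUBECTL is a hypothetical override hook so the helper can be exercised without a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_serviceimport CONTEXT NAME [ATTEMPTS] [DELAY_SECONDS]
# Polls until the named ServiceImport exists in the store namespace of the given cluster.
wait_for_serviceimport() {
  ctx="$1"; name="$2"; attempts="${3:-40}"; delay="${4:-30}"
  i=1
  while [ "${i}" -le "${attempts}" ]; do
    # kubectl get exits non-zero while the resource does not exist yet.
    if "${KUBECTL}" get serviceimports "${name}" --context "${ctx}" --namespace store >/dev/null 2>&1; then
      echo "ServiceImport ${name} is present in ${ctx}"
      return 0
    fi
    sleep "${delay}"
    i=$((i + 1))
  done
  echo "Timed out waiting for ServiceImport ${name} in ${ctx}" >&2
  return 1
}
```

For example, wait_for_serviceimport gke-west-1 store-east-1 blocks until the ServiceImport exported from gke-east-1 is visible in gke-west-1, or gives up after the configured number of attempts.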

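Deploying the Gateway and HTTPRoute

After the applications have been deployed, you can configure a Gateway using the gke-l7-global-external-managed-mc GatewayClass. This Gateway creates an external Application Load Balancer configured to distribute traffic across your target clusters.

Apply the following Gateway manifest to the config cluster, gke-west-1 in this example: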
```
cat << EOF | kubectl apply --context gke-west-1 -f -
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: external-http
  namespace: store
spec:
  gatewayClassName: gke-l7-global-external-managed-mc
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
EOF
```
Note: It might take several minutes (up to 10) for the Gateway to fully deploy and serve traffic.
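
Rather than guessing when the Gateway is ready, you can block until it reports the Programmed condition. The following is a sketch that relies on the generic kubectl wait command; the function name wait_for_gateway and the KUBECTL override are hypothetical, and the 600-second timeout simply mirrors the 10-minute upper bound in the note above.

```shell
# KUBECTL is a hypothetical override hook; set KUBECTL=echo to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_gateway GATEWAY_NAME CONTEXT NAMESPACE
# Blocks until the Gateway reports condition Programmed=True or the timeout expires.
wait_for_gateway() {
  "${KUBECTL}" wait --for=condition=Programmed \
    gateways.gateway.networking.k8s.io/"$1" \
    --context "$2" --namespace "$3" --timeout=600s
}
```

For the Gateway in this example, the call would be wait_for_gateway external-http gke-west-1 store.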

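This Gateway configuration deploys external Application Load Balancer resources with the following naming convention: gkemcg1-NAMESPACE-GATEWAY_NAME-HASH.

The default resources created with this configuration are:
- 1 load balancer: gkemcg1-store-external-http-HASH
- 1 public IP address: gkemcg1-store-external-http-HASH
- 1 forwarding rule: gkemcg1-store-external-http-HASH
- 2 backend services:
  - Default 404 backend service: gkemcg1-store-gw-serve404-HASH
  - Default 500 backend service: gkemcg1-store-gw-serve500-HASH
- 1 health check:
  - Default 404 health check: gkemcg1-store-gw-serve404-HASH
- 0 routing rules (the URLmap is empty)
At this stage, any request to GATEWAY_IP:80 results in a default page that displays the following message: fault filter abort.

Warning: In this example, we deploy an external multi-cluster Gateway that listens on port 80. If you deploy an external multi-cluster Gateway for production, make sure that you secure the Gateway properly by configuring a set of features appropriate to your context and environment. To learn more about securing a multi-cluster Gateway, see Secure a Gateway. To learn more about applying policies to a multi-cluster Gateway, see Configure Gateway resources using Policies.
Apply the following HTTPRoute manifest to the config cluster, gke-west-1 in this example: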
```
cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: public-store-route
  namespace: store
  labels:
    gateway: external-http
spec:
  hostnames:
  - "store.example.com"
  parentRefs:
  - name: external-http
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /west
    backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store-west-1
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /east
    backendRefs:
      - group: net.gke.io
        kind: ServiceImport
        name: store-east-1
        port: 8080
  - backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store
      port: 8080
EOF
```

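After it is deployed, this HTTPRoute configures the following routing behavior:
- Requests to /west are routed to store Pods in the gke-west-1 cluster, because Pods selected by the store-west-1 ServiceExport only exist in the gke-west-1 cluster.
- Requests to /east are routed to store Pods in the gke-east-1 cluster, because Pods selected by the store-east-1 ServiceExport only exist in the gke-east-1 cluster.
- Requests to any other path are routed to store Pods in either cluster, according to health, capacity, and proximity to the requesting client.
- Requests to GATEWAY_IP:80 result in a default page that displays the following message: fault filter abort.
Note that if all the Pods on a given cluster are unhealthy (or don't exist), then traffic to the store Service is sent only to clusters that actually have store Pods. The existence of a ServiceExport and Service on a given cluster does not guarantee that traffic is sent to that cluster. Pods must exist and respond affirmatively to the load balancer health check; otherwise, the load balancer sends traffic only to healthy store Pods in other clusters.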
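
The path rules above can be modeled as a small decision function, which is handy for reasoning about which ServiceImport a given request path should reach. This is only an illustrative sketch: backend_for_path and path_has_prefix are hypothetical names, and the matching follows the Gateway API description of PathPrefix as matching whole path elements. How the provisioned load balancer treats edge cases such as /western is an assumption you should verify against your own deployment.

```shell
# path_has_prefix PATH PREFIX
# Succeeds when PATH equals PREFIX or continues it at a "/" boundary.
path_has_prefix() {
  case "$1" in
    "$2"|"$2"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# backend_for_path PATH
# Prints the ServiceImport that the public-store-route rules would select.
backend_for_path() {
  if path_has_prefix "$1" /west; then
    echo store-west-1
  elif path_has_prefix "$1" /east; then
    echo store-east-1
  else
    # Catch-all rule: the fleet-wide store ServiceImport.
    echo store
  fi
}
```

For example, backend_for_path /west/cart prints store-west-1, while backend_for_path /checkout prints store.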

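New resources are created with this configuration:
- 3 backend services:
  - The store backend service: gkemcg1-store-store-8080-HASH
  - The store-east-1 backend service: gkemcg1-store-store-east-1-8080-HASH
  - The store-west-1 backend service: gkemcg1-store-store-west-1-8080-HASH
- 3 health checks:
  - The store health check: gkemcg1-store-store-8080-HASH
  - The store-east-1 health check: gkemcg1-store-store-east-1-8080-HASH
  - The store-west-1 health check: gkemcg1-store-store-west-1-8080-HASH
- 1 routing rule in the URLmap:
  - The store.example.com routing rule:
    - 1 host: store.example.com
    - Multiple matchRules to route to the new backend services

The following diagram shows the resources you've deployed across both clusters. Because gke-west-1 is the Gateway config cluster, it is the cluster in which the Gateway controller watches Gateways, HTTPRoutes, and ServiceImports. Each cluster has a store ServiceImport and another ServiceImport specific to that cluster. Both point at the same Pods. This lets the HTTPRoute specify exactly where traffic should go: to the store Pods on a specific cluster, or to the store Pods across all clusters.

This is the Gateway and multi-cluster Service resource model across both clusters

Note that this is a logical resource model, not a depiction of the traffic flow. The traffic path goes directly from the load balancer to backend Pods and has no direct relation to which cluster is the config cluster.

Validating deployment

You can now issue requests to the multi-cluster Gateway and distribute traffic across both GKE clusters.

Validate that the Gateway and HTTPRoute have been deployed successfully by inspecting the Gateway status and events.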

kubectl describe gateways.gateway.networking.k8s.io external-http --context gke-west-1 --namespace store

The output should look similar to the following:

Name:         external-http
Namespace:    store
Labels:       <none>
Annotations:  networking.gke.io/addresses: /projects/PROJECT_NUMBER/global/addresses/gkemcg1-store-external-http-laup24msshu4
              networking.gke.io/backend-services:
                /projects/PROJECT_NUMBER/global/backendServices/gkemcg1-store-gw-serve404-80-n65xmts4xvw2, /projects/PROJECT_NUMBER/global/backendServices/gke...
              networking.gke.io/firewalls: /projects/PROJECT_NUMBER/global/firewalls/gkemcg1-l7-default-global
              networking.gke.io/forwarding-rules: /projects/PROJECT_NUMBER/global/forwardingRules/gkemcg1-store-external-http-a5et3e3itxsv
              networking.gke.io/health-checks:
                /projects/PROJECT_NUMBER/global/healthChecks/gkemcg1-store-gw-serve404-80-n65xmts4xvw2, /projects/PROJECT_NUMBER/global/healthChecks/gkemcg1-s...
              networking.gke.io/last-reconcile-time: 2023-10-12T17:54:24Z
              networking.gke.io/ssl-certificates:
              networking.gke.io/target-http-proxies: /projects/PROJECT_NUMBER/global/targetHttpProxies/gkemcg1-store-external-http-94oqhkftu5yz
              networking.gke.io/target-https-proxies:
              networking.gke.io/url-maps: /projects/PROJECT_NUMBER/global/urlMaps/gkemcg1-store-external-http-94oqhkftu5yz
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2023-10-12T06:59:32Z
  Finalizers:
    gateway.finalizer.networking.gke.io
  Generation:        1
  Resource Version:  467057
  UID:               1dcb188e-2917-404f-9945-5f3c2e907b4c
Spec:
  Gateway Class Name:  gke-l7-global-external-managed-mc
  Listeners:
    Allowed Routes:
      Kinds:
        Group:  gateway.networking.k8s.io
        Kind:   HTTPRoute
      Namespaces:
        From:  Same
    Name:      http
    Port:      80
    Protocol:  HTTP
Status:
  Addresses:
    Type:   IPAddress
    Value:  34.36.127.249
  Conditions:
    Last Transition Time:  2023-10-12T07:00:41Z
    Message:               The OSS Gateway API has deprecated this condition, do not depend on it.
    Observed Generation:   1
    Reason:                Scheduled
    Status:                True
    Type:                  Scheduled
    Last Transition Time:  2023-10-12T07:00:41Z
    Message:
    Observed Generation:   1
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2023-10-12T07:00:41Z
    Message:
    Observed Generation:   1
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
    Last Transition Time:  2023-10-12T07:00:41Z
    Message:               The OSS Gateway API has altered the "Ready" condition semantics and reservedit for future use.  GKE Gateway will stop emitting it in a future update, use "Programmed" instead.
    Observed Generation:   1
    Reason:                Ready
    Status:                True
    Type:                  Ready
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2023-10-12T07:00:41Z
      Message:
      Observed Generation:   1
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2023-10-12T07:00:41Z
      Message:               The OSS Gateway API has altered the "Ready" condition semantics and reservedit for future use.  GKE Gateway will stop emitting it in a future update, use "Programmed" instead.
      Observed Generation:   1
      Reason:                Ready
      Status:                True
      Type:                  Ready
    Name:                    http
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   HTTPRoute
Events:
  Type    Reason  Age                    From                   Message
  ----    ------  ----                   ----                   -------
  Normal  UPDATE  35m (x4 over 10h)      mc-gateway-controller  store/external-http
  Normal  SYNC    4m22s (x216 over 10h)  mc-gateway-controller  SYNC on store/external-http was a success

After the Gateway has deployed successfully, retrieve the external IP address from the external-http Gateway.

kubectl get gateways.gateway.networking.k8s.io external-http -o=jsonpath="{.status.addresses[0].value}" --context gke-west-1 --namespace store

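Replace VIP in the following steps with the IP address you receive as output.

Send traffic to the root path of the domain. This load balances traffic to the store ServiceImport, which spans the gke-west-1 and gke-east-1 clusters. The load balancer sends your traffic to the region closest to you, so you might not see responses from the other region.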

curl -H "host: store.example.com" http://VIP

The output confirms that the request was served by a Pod in the gke-east-1 cluster:

{
  "cluster_name": "gke-east-1",
  "zone": "us-east1-b",
  "host_header": "store.example.com",
  "node_name": "gke-gke-east-1-default-pool-7aa30992-t2lp.c.agmsb-k8s.internal",
  "pod_name": "store-5f5b954888-dg22z",
  "pod_name_emoji": "⏭",
  "project_id": "agmsb-k8s",
  "timestamp": "2021-06-01T17:32:51"
}

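Next, send traffic to the /west path. This routes traffic to the store-west-1 ServiceImport, which only has Pods running on the gke-west-1 cluster. A cluster-specific ServiceImport, like store-west-1, lets an application owner explicitly send traffic to a specific cluster, rather than letting the load balancer make the decision.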

curl -H "host: store.example.com" http://VIP/west

The output confirms that the request was served by a Pod in the gke-west-1 cluster:

{
  "cluster_name": "gke-west-1",
  "zone": "us-west1-a",
  "host_header": "store.example.com",
  "node_name": "gke-gke-west-1-default-pool-65059399-2f41.c.agmsb-k8s.internal",
  "pod_name": "store-5f5b954888-d25m5",
  "pod_name_emoji": "🍾",
  "project_id": "agmsb-k8s",
  "timestamp": "2021-06-01T17:39:15"
}

Finally, send traffic to the /east path.

curl -H "host: store.example.com" http://VIP/east

The output confirms that the request was served by a Pod in the gke-east-1 cluster:

{
  "cluster_name": "gke-east-1",
  "zone": "us-east1-b",
  "host_header": "store.example.com",
  "node_name": "gke-gke-east-1-default-pool-7aa30992-7j7z.c.agmsb-k8s.internal",
  "pod_name": "store-5f5b954888-hz6mw",
  "pod_name_emoji": "🧜🏾",
  "project_id": "agmsb-k8s",
  "timestamp": "2021-06-01T17:40:48"
}
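
When you send many requests, it is easier to look only at the cluster_name field than at the full JSON response. The following sketch extracts that field with sed so that no JSON tooling is required; the function name cluster_of is hypothetical, and the sketch assumes the response format shown in the sample outputs above.

```shell
# cluster_of
# Reads a store response on stdin and prints the value of its cluster_name field.
cluster_of() {
  sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

One possible use is to summarize how repeated requests to the root path are distributed, for example by running curl -s -H "host: store.example.com" http://VIP | cluster_of several times in a loop and piping the combined output through sort | uniq -c.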

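Deploy a cross-region internal multi-cluster Gateway

You can deploy multi-cluster Gateways that provide internal Layer 7 load balancing across GKE clusters in multiple regions. These Gateways use the gke-l7-cross-regional-internal-managed-mc GatewayClass. This GatewayClass provisions a cross-region internal Application Load Balancer that is managed by Google Cloud and that enables internal VIPs that clients within your VPC network can access. You can expose these Gateways through frontends in the regions of your choice by using the Gateway to request addresses in those regions. The internal VIP can be a single IP address, or it can be IP addresses in multiple regions, with one IP address per region specified in the Gateway. Traffic is directed to the closest healthy backend GKE cluster that can serve the request.

Before you begin

Set up your project and shell by configuring your gcloud environment with your project ID: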

export PROJECT_ID="YOUR_PROJECT_ID"
gcloud config set project ${PROJECT_ID}

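Create GKE clusters in different regions.

This example uses two clusters, gke-west-1 in us-west1 and gke-east-1 in us-east1. Ensure that the Gateway API is enabled (--gateway-api=standard) and that the clusters are registered to a fleet.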

gcloud container clusters create gke-west-1 \
    --location=us-west1-a \
    --workload-pool=${PROJECT_ID}.svc.id.goog \
    --project=${PROJECT_ID} \
    --enable-fleet \
    --gateway-api=standard

gcloud container clusters create gke-east-1 \
    --location=us-east1-c \
    --workload-pool=${PROJECT_ID}.svc.id.goog \
    --project=${PROJECT_ID} \
    --enable-fleet \
    --gateway-api=standard

Get credentials for both clusters and rename their contexts for easier access:

gcloud container clusters get-credentials gke-west-1 \
  --location=us-west1-a \
  --project=${PROJECT_ID}

gcloud container clusters get-credentials gke-east-1 \
  --location=us-east1-c \
  --project=${PROJECT_ID}
kubectl config rename-context gke_${PROJECT_ID}_us-west1-a_gke-west-1 gke-west1
kubectl config rename-context gke_${PROJECT_ID}_us-east1-c_gke-east-1 gke-east1

Enable multi-cluster Services (MCS) and multi-cluster Ingress (MCI/Gateway):

gcloud container fleet multi-cluster-services enable --project=${PROJECT_ID}

# Set the config membership to one of your clusters (e.g., gke-west-1)
# This cluster will be the source of truth for multi-cluster Gateway and Route resources.
gcloud container fleet ingress enable \
    --config-membership=projects/${PROJECT_ID}/locations/us-west1/memberships/gke-west-1 \
    --project=${PROJECT_ID}

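Configure proxy-only subnets. A proxy-only subnet is required in each region where your GKE clusters are located and in each region where the load balancer will operate. Cross-region internal Application Load Balancers require the purpose of this subnet to be set to GLOBAL_MANAGED_PROXY.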

# Proxy-only subnet for us-west1
gcloud compute networks subnets create us-west1-proxy-only-subnet \
    --purpose=GLOBAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=us-west1 \
    --network=default \
    --range=10.129.0.0/23 # Choose an appropriate unused CIDR range

# Proxy-only subnet for us-east1
gcloud compute networks subnets create us-east1-proxy-only-subnet \
    --purpose=GLOBAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=us-east1 \
    --network=default \
    --range=10.130.0.0/23 # Choose an appropriate unused CIDR range

If you're not using the default network, replace default with the name of your VPC network. Ensure that the CIDR ranges are unique and don't overlap.
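
The non-overlap requirement can be checked locally before you run the gcloud commands. The following is a minimal sketch for IPv4 ranges; ip_to_int and cidrs_overlap are hypothetical helper names. It relies on the fact that two CIDR blocks overlap exactly when they agree on the bits covered by the shorter of the two prefixes.

```shell
# ip_to_int DOTTED_QUAD
# Converts an IPv4 address such as 10.129.0.0 to its 32-bit integer value.
ip_to_int() {
  (
    # Split on "." inside a subshell so the IFS change stays local.
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
  )
}

# cidrs_overlap CIDR_A CIDR_B
# Succeeds (exit status 0) when the two IPv4 CIDR ranges share at least one address.
cidrs_overlap() {
  a_ip="${1%/*}"; a_len="${1#*/}"
  b_ip="${2%/*}"; b_len="${2#*/}"
  # Compare both network addresses under the shorter prefix length.
  if [ "${a_len}" -lt "${b_len}" ]; then len="${a_len}"; else len="${b_len}"; fi
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  a=$(( $(ip_to_int "${a_ip}") & mask ))
  b=$(( $(ip_to_int "${b_ip}") & mask ))
  [ "${a}" -eq "${b}" ]
}
```

With the ranges used in this example, cidrs_overlap 10.129.0.0/23 10.130.0.0/23 fails, which means that the two proxy-only subnets do not collide.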

Deploy your demo application, such as store, to both clusters. The example store.yaml file from gke-networking-recipes creates a store namespace and a Deployment.

kubectl apply --context gke-west1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml
kubectl apply --context gke-east1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store.yaml

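Export Services from each cluster by creating Kubernetes Service resources and ServiceExport resources in each cluster, which makes the Services discoverable across the fleet. The following example exports a generic store Service and region-specific Services (store-west-1, store-east-1) from each cluster, all within the store namespace.

Apply to gke-west1: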

cat << EOF | kubectl apply --context gke-west1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: store
---
apiVersion: v1
kind: Service
metadata:
  name: store-west-1 # Specific to this cluster
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store-west-1 # Exporting the region-specific service
  namespace: store
EOF

Apply to gke-east1:

cat << EOF | kubectl apply --context gke-east1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: store
---
apiVersion: v1
kind: Service
metadata:
  name: store-east-1 # Specific to this cluster
  namespace: store
spec:
  selector:
    app: store
  ports:
  - port: 8080
    targetPort: 8080
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store-east-1 # Exporting the region-specific service
  namespace: store
EOF

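Check ServiceImports: Verify that ServiceImport resources are created in each cluster within the store namespace. It might take a few minutes for them to be created.

kubectl get serviceimports --context gke-west1 -n store
kubectl get serviceimports --context gke-east1 -n store

You should see store, store-west-1, and store-east-1 listed (or the relevant entries, depending on propagation).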

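Configure an internal multi-region Gateway

Define a Gateway resource that references the gke-l7-cross-regional-internal-managed-mc GatewayClass. You apply this manifest to your designated config cluster, such as gke-west-1.

The spec.addresses field lets you request ephemeral IP addresses in specific regions or use pre-allocated static IP addresses.

To use ephemeral IP addresses, save the following Gateway manifest as cross-regional-gateway.yaml: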

# cross-regional-gateway.yaml
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-cross-region-gateway
  namespace: store # Namespace for the Gateway resource
spec:
  gatewayClassName: gke-l7-cross-regional-internal-managed-mc
  addresses:
  # Addresses across regions. Address value is allowed to be empty or matching
  # the region name.
  - type: networking.gke.io/ephemeral-ipv4-address/us-west1
    value: "us-west1"
  - type: networking.gke.io/ephemeral-ipv4-address/us-east1
    value: "us-east1"
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute # Only allow HTTPRoute to attach

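The following list defines some of the fields in the preceding YAML file:

metadata.namespace: the namespace where the Gateway resource is created, for example, store.
spec.gatewayClassName: the name of the GatewayClass. Must be gke-l7-cross-regional-internal-managed-mc.
spec.listeners.allowedRoutes.kinds: the kinds of Route objects that can be attached, for example, HTTPRoute.
spec.addresses:
- type: networking.gke.io/ephemeral-ipv4-address/REGION: requests an ephemeral IP address.
- value: specifies the region for the address, for example, "us-west1" or "us-east1".

Apply the manifest to your config cluster, for example, gke-west1: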

kubectl apply --context gke-west1 -f cross-regional-gateway.yaml

Attach HTTPRoutes to the Gateway

Define HTTPRoute resources to manage traffic routing, and apply them to your config cluster.

Save the following HTTPRoute manifest as store-route.yaml:

# store-route.yaml
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: store-route
  namespace: store
  labels:
    gateway: cross-regional-internal
spec:
  parentRefs:
  - name: internal-cross-region-gateway
    namespace: store # Namespace where the Gateway is deployed
  hostnames:
  - "store.example.internal" # Hostname clients will use
  rules:
  - matches: # Rule for traffic to /west
    - path:
        type: PathPrefix
        value: /west
    backendRefs:
    - group: net.gke.io # Indicates a multi-cluster ServiceImport
      kind: ServiceImport
      name: store-west-1 # Targets the ServiceImport for the west cluster
      port: 8080
  - matches: # Rule for traffic to /east
    - path:
        type: PathPrefix
        value: /east
    backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store-east-1 # Targets the ServiceImport for the east cluster
      port: 8080
  - backendRefs: # Default rule for other paths (e.g., /)
    - group: net.gke.io
      kind: ServiceImport
      name: store # Targets the generic 'store' ServiceImport (any region)
      port: 8080

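The following list defines some of the fields in the preceding YAML file:

spec.parentRefs: attaches this route to internal-cross-region-gateway in the store namespace.
spec.hostnames: the hostname that clients use to access the service.
spec.rules: defines the routing logic. This example uses path-based routing:
- /west traffic goes to the store-west-1 ServiceImport.
- /east traffic goes to the store-east-1 ServiceImport.
- All other traffic, such as /, goes to the generic store ServiceImport.
backendRefs:
- group: net.gke.io and kind: ServiceImport target multi-cluster Services.

Apply the HTTPRoute manifest to your config cluster: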

kubectl apply --context gke-west1 -f store-route.yaml

Verify the status of the Gateway and route

Check the Gateway status:
```
kubectl get gateway internal-cross-region-gateway -n store -o yaml --context gke-west1
```
Look for a condition with type: Programmed and status: "True". You should see IP addresses assigned in the status.addresses field, corresponding to the regions you specified (for example, one for us-west1 and one for us-east1).
Check the HTTPRoute status:
```
kubectl get httproute store-route -n store -o yaml --context gke-west1
```
In status.parents[].conditions, look for a condition with type: Accepted (or ResolvedRefs) and status: "True".
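
Instead of scanning the full YAML output, you can ask kubectl to print only the status of the condition you care about by using a JSONPath filter expression. The following is a sketch: gateway_programmed, route_accepted, and the KUBECTL override are hypothetical helper names, and the resource names and namespace are the ones used in this example.

```shell
# KUBECTL is a hypothetical override hook; set KUBECTL=echo to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# gateway_programmed GATEWAY_NAME CONTEXT
# Prints the status ("True" or "False") of the Gateway's Programmed condition.
gateway_programmed() {
  "${KUBECTL}" get gateway "$1" --namespace store --context "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
}

# route_accepted ROUTE_NAME CONTEXT
# Prints the status of the Accepted condition for each parent of the HTTPRoute.
route_accepted() {
  "${KUBECTL}" get httproute "$1" --namespace store --context "$2" \
    -o jsonpath='{.status.parents[*].conditions[?(@.type=="Accepted")].status}'
}
```

For example, gateway_programmed internal-cross-region-gateway gke-west1 should print True after the load balancer has been provisioned.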

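Confirm traffic

After IP addresses are assigned to the Gateway, you can test traffic from a client VM that is within your VPC network and in one of the regions, or in a region that can connect to the Gateway IP address.

Retrieve the Gateway IP addresses.

The following command attempts to parse the JSON output. You might need to adjust the jsonpath based on the exact structure.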
```
kubectl get gateway internal-cross-region-gateway -n store --context gke-west1 -o=jsonpath="{.status.addresses[*].value}"
```
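The output of this command should include the VIPs, one per region. The following steps refer to them as VIP_WEST and VIP_EAST.

Send test requests. From a client VM in your VPC, run the following commands: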

# Assuming VIP_WEST is an IP in us-west1 and VIP_EAST is an IP in us-east1
# Traffic to /west should ideally be served by gke-west-1
curl -H "host: store.example.internal" http://VIP_WEST/west
curl -H "host: store.example.internal" http://VIP_EAST/west # Still targets store-west-1 due to path

# Traffic to /east should ideally be served by gke-east-1
curl -H "host: store.example.internal" http://VIP_WEST/east # Still targets store-east-1 due to path
curl -H "host: store.example.internal" http://VIP_EAST/east

# Traffic to / (default) could be served by either cluster
curl -H "host: store.example.internal" http://VIP_WEST/
curl -H "host: store.example.internal" http://VIP_EAST/

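The response should include details from the store application that indicate which backend Pod served the request, such as cluster_name or zone.

Use static IP addresses

Instead of ephemeral IP addresses, you can use pre-allocated static internal IP addresses.

Create static IP addresses in the regions you want to use: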

gcloud compute addresses create cross-region-gw-ip-west --region us-west1 --subnet default --project=${PROJECT_ID}
gcloud compute addresses create cross-region-gw-ip-east --region us-east1 --subnet default --project=${PROJECT_ID}

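If you're not using the default subnet, replace default with the name of the subnet that has the IP address you want to allocate. These subnets are regular subnets, not the proxy-only subnets.

Update the Gateway manifest by modifying the spec.addresses section in your cross-regional-gateway.yaml file: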

# cross-regional-gateway.yaml (updated to use static IP addresses)
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-cross-region-gateway # Or a new name if deploying alongside
  namespace: store
spec:
  gatewayClassName: gke-l7-cross-regional-internal-managed-mc
  addresses:
  - type: networking.gke.io/named-address-with-region # Use for named static IP
    value: "regions/us-west1/addresses/cross-region-gw-ip-west"
  - type: networking.gke.io/named-address-with-region
    value: "regions/us-east1/addresses/cross-region-gw-ip-east"
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute

Apply the updated Gateway manifest.

kubectl apply --context gke-west1 -f cross-regional-gateway.yaml

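Special considerations for non-default subnets

Be aware of the following considerations when you use non-default subnets:

Same VPC network: All user-created resources, such as static IP addresses, proxy-only subnets, and GKE clusters, must reside within the same VPC network.
Address subnet: When you create static IP addresses for the Gateway, they are allocated from regular subnets in the specified regions.
Cluster subnet naming: Each region must have a subnet with the same name as the subnet that the MCG config cluster resides in.
- For example, if your gke-west-1 config cluster is in projects/YOUR_PROJECT/regions/us-west1/subnetworks/my-custom-subnet, then the regions you request addresses for must also have a my-custom-subnet subnet. If you request addresses in the us-east1 and us-central1 regions, then a subnet named my-custom-subnet must also exist in those regions.

Blue-green, multi-cluster routing with Gateway

The gke-l7-global-external-managed-*, gke-l7-regional-external-managed-*, and gke-l7-rilb-* GatewayClasses have many advanced traffic routing capabilities, including traffic splitting, header matching, header manipulation, traffic mirroring, and more. In this example, you use weight-based traffic splitting to explicitly control the proportion of traffic across two GKE clusters.

This example goes through some realistic steps that a service owner would take when moving or expanding their application to a new GKE cluster. The goal of blue-green deployments is to reduce risk through multiple validation steps that confirm that the new cluster is operating correctly. This example walks through four stages of deployment:

100% - Header-based canary: Use HTTP header routing to send only test or synthetic traffic to the new cluster.
100% - Mirror traffic: Mirror user traffic to the canary cluster. This tests the capacity of the canary cluster by copying all of the user traffic to this cluster.
90% - 10%: Canary a traffic split of 10% to slowly expose the new cluster to live traffic.
0% - 100%: Cut over fully to the new cluster, with the option of switching back if any errors are observed.

Blue-green traffic splitting across two GKE clusters

This example is similar to the previous one, except that it deploys an internal multi-cluster Gateway instead. This deploys an internal Application Load Balancer that is only privately accessible from within the VPC. You use the clusters and the same application that you deployed in the previous steps, except that you deploy them through a different Gateway.

Prerequisites

The following example builds on some of the steps in Deploying an external multi-cluster Gateway. Ensure that you have completed the following steps before proceeding with this example:

Enabling multi-cluster Gateways
Deploying a demo application

This example uses the gke-west-1 and gke-west-2 clusters that you already set up. These clusters are in the same region because the gke-l7-rilb-mc GatewayClass is regional and only supports cluster backends in the same region.

Deploy the Services and ServiceExports needed on each cluster. If you deployed Services and ServiceExports in the previous example, then some of these are already deployed.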

kubectl apply --context gke-west-1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store-west-1-service.yaml
kubectl apply --context gke-west-2 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/main/gateway/gke-gateway-controller/multi-cluster-gateway/store-west-2-service.yaml

It deploys a similar set of resources to each cluster:

service/store created
serviceexport.net.gke.io/store created
service/store-west-2 created
serviceexport.net.gke.io/store-west-2 created

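Configuring a proxy-only subnet

If you have not already done so, configure a proxy-only subnet for each region in which you are deploying internal Gateways. This subnet is used to provide internal IP addresses to the load balancer proxies and must be configured with --purpose set to REGIONAL_MANAGED_PROXY only.

You must create a proxy-only subnet before you create Gateways that manage internal Application Load Balancers. Each region of a Virtual Private Cloud (VPC) network in which you use internal Application Load Balancers must have a proxy-only subnet.

The gcloud compute networks subnets create command creates a proxy-only subnet.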

gcloud compute networks subnets create SUBNET_NAME \
    --purpose=REGIONAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=REGION \
    --network=VPC_NETWORK_NAME \
    --range=CIDR_RANGE

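Replace the following:

SUBNET_NAME: the name of the proxy-only subnet.
REGION: the region of the proxy-only subnet.
VPC_NETWORK_NAME: the name of the VPC network that contains the subnet.
CIDR_RANGE: the primary IP address range of the subnet. You must use a subnet mask no longer than /26 so that at least 64 IP addresses are available for proxies in the region. The recommended subnet mask is /23.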
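
The relationship between the prefix length and the number of addresses is simple arithmetic: an IPv4 range with prefix length N contains 2^(32 - N) addresses. The following sketch makes the /26 and /23 figures above concrete; the function names are hypothetical.

```shell
# proxy_subnet_addresses PREFIX_LENGTH
# Prints how many IPv4 addresses a range with the given prefix length contains.
proxy_subnet_addresses() {
  echo $(( 1 << (32 - $1) ))
}

# prefix_ok_for_proxy_subnet PREFIX_LENGTH
# Succeeds when the prefix is no longer than /26, that is, at least 64 addresses.
prefix_ok_for_proxy_subnet() {
  [ "$1" -le 26 ]
}
```

For example, proxy_subnet_addresses 26 prints 64 and proxy_subnet_addresses 23 prints 512, while prefix_ok_for_proxy_subnet 28 fails because a /28 range has only 16 addresses.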

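Deploying the Gateway

The following Gateway is created from the gke-l7-rilb-mc GatewayClass, which is a regional internal Gateway that can target only GKE clusters in the same region.

Apply the following Gateway manifest to the config cluster, gke-west-1 in this example: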

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-http
  namespace: store
spec:
  gatewayClassName: gke-l7-rilb-mc
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
EOF

Validate that the Gateway has come up successfully. You can filter for just the events from this Gateway with the following command:

kubectl get events --field-selector involvedObject.kind=Gateway,involvedObject.name=internal-http --context=gke-west-1 --namespace store

The Gateway deployment was successful if the output resembles the following:

LAST SEEN   TYPE     REASON   OBJECT                  MESSAGE
5m18s       Normal   ADD      gateway/internal-http   store/internal-http
3m44s       Normal   UPDATE   gateway/internal-http   store/internal-http
3m9s        Normal   SYNC     gateway/internal-http   SYNC on store/internal-http was a success

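Header-based canary

Header-based canarying lets the service owner match synthetic test traffic that does not come from real users. This is an easy way to validate that the basic networking of the application is functioning without exposing the application to users.

Apply the following HTTPRoute manifest to the config cluster, gke-west-1 in this example: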

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-store-route
  namespace: store
  labels:
    gateway: internal-http
spec:
  parentRefs:
  - kind: Gateway
    namespace: store
    name: internal-http
  hostnames:
  - "store.example.internal"
  rules:
  # Matches for env=canary and sends it to store-west-2 ServiceImport
  - matches:
    - headers:
      - name: env
        value: canary
    backendRefs:
      - group: net.gke.io
        kind: ServiceImport
        name: store-west-2
        port: 8080
  # All other traffic goes to store-west-1 ServiceImport
  - backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store-west-1
      port: 8080
EOF

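After it is deployed, this HTTPRoute configures the following routing behavior:

Internal requests to store.example.internal without the env: canary HTTP header are routed to store Pods on the gke-west-1 cluster.
Internal requests to store.example.internal with the env: canary HTTP header are routed to store Pods on the gke-west-2 cluster.

HTTPRoute enables routing to different clusters based on HTTP headers

Validate that the HTTPRoute is functioning correctly by sending traffic to the Gateway IP address.

Retrieve the internal IP address from internal-http.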
```
kubectl get gateways.gateway.networking.k8s.io internal-http -o=jsonpath="{.status.addresses[0].value}" --context gke-west-1 --namespace store
```
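In the following steps, replace VIP with the IP address that you received as output.

Note: In the following steps, all requests to the VIP are sent from a client that has internal VPC connectivity and is in the same region as the GKE clusters (unless you have configured global access on the Gateway). You can create a VM in us-west1 and connect to it over SSH for this purpose. This is required because the internal-http Gateway is an internal regional load balancer. Also, set the host header to store.example.internal so that you don't need to configure DNS for this example to work.

Send a request to the Gateway using the env: canary HTTP header. This confirms that traffic is being routed to gke-west-2. Use a private client in the same VPC as the GKE clusters to confirm that requests are routed correctly. The following command must be run on a machine that has private access to the Gateway IP address; otherwise it does not work.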

curl -H "host: store.example.internal" -H "env: canary" http://VIP

The output confirms that the request was served by a Pod in the gke-west-2 cluster:

{
    "cluster_name": "gke-west-2", 
    "host_header": "store.example.internal",
    "node_name": "gke-gke-west-2-default-pool-4cde1f72-m82p.c.agmsb-k8s.internal",
    "pod_name": "store-5f5b954888-9kdb5",
    "pod_name_emoji": "😂",
    "project_id": "agmsb-k8s",
    "timestamp": "2021-05-31T01:21:55",
    "zone": "us-west1-a"
}

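Traffic mirror

This stage sends traffic to the intended cluster but also mirrors that traffic to the canary cluster.

Mirroring helps determine how traffic load affects application performance without affecting responses to your clients in any way. It might not be necessary for every kind of rollout, but it is useful when rolling out large changes that could affect performance or load.

Apply the following HTTPRoute manifest to the config cluster (gke-west-1 in this example):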

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-store-route
  namespace: store
  labels:
    gateway: internal-http
spec:
  parentRefs:
  - kind: Gateway
    namespace: store
    name: internal-http
  hostnames:
  - "store.example.internal"
  rules:
  # Sends all traffic to store-west-1 ServiceImport
  - backendRefs:
    - name: store-west-1
      group: net.gke.io
      kind: ServiceImport
      port: 8080
    # Also mirrors all traffic to store-west-2 ServiceImport
    filters:
    - type: RequestMirror
      requestMirror:
        backendRef:
          group: net.gke.io
          kind: ServiceImport
          name: store-west-2
          port: 8080
EOF

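Using your private client, send a request to the internal-http Gateway. Use the /mirror path so that you can uniquely identify this request in the application logs in a later step.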
```
curl -H "host: store.example.internal" http://VIP/mirror
```

The output confirms that the client received a response from a Pod in the gke-west-1 cluster:

{
    "cluster_name": "gke-west-1", 
    "host_header": "store.example.internal",
    "node_name": "gke-gke-west-1-default-pool-65059399-ssfq.c.agmsb-k8s.internal",
    "pod_name": "store-5f5b954888-brg5w",
    "pod_name_emoji": "🎖",
    "project_id": "agmsb-k8s",
    "timestamp": "2021-05-31T01:24:51",
    "zone": "us-west1-a"
}

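This confirms that the primary cluster is responding to traffic. You still need to confirm that the cluster you are migrating to is receiving the mirrored traffic.

Check the application logs of a store Pod on the gke-west-2 cluster. The logs should confirm that the Pod received mirrored traffic from the load balancer.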
```
kubectl logs deployment/store --context gke-west-2 -n store | grep /mirror
```

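This output confirms that Pods on the gke-west-2 cluster also receive the same requests, but their responses to these requests are not sent back to the client. The IP addresses shown in the logs are the internal IP addresses of the load balancer that communicates with your Pods.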

Found 2 pods, using pod/store-5c65bdf74f-vpqbs
[2023-10-12 21:05:20,805] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:20] "GET /mirror HTTP/1.1" 200 -
[2023-10-12 21:05:27,158] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:27] "GET /mirror HTTP/1.1" 200 -
[2023-10-12 21:05:27,805] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:27] "GET /mirror HTTP/1.1" 200 -
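
When many requests are mirrored, counting the matching log lines can be easier than reading them. The following is a minimal sketch that counts /mirror requests in a captured sample; the sample is the illustrative log output above, not live data, and in a real run the same grep -c filter would be applied to the kubectl logs output instead.

```shell
# Sample log lines copied from the illustrative output above (not live data).
sample_logs='[2023-10-12 21:05:20,805] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:20] "GET /mirror HTTP/1.1" 200 -
[2023-10-12 21:05:27,158] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:27] "GET /mirror HTTP/1.1" 200 -
[2023-10-12 21:05:27,805] INFO in _internal: 192.168.21.3 - - [12/Oct/2023 21:05:27] "GET /mirror HTTP/1.1" 200 -'

# grep -c prints the number of lines that contain a GET request for /mirror.
mirrored=$(printf '%s\n' "$sample_logs" | grep -c '"GET /mirror ')
echo "Mirrored requests seen: ${mirrored}"
```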

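Traffic split

Traffic splitting is one of the most common methods of rolling out new code or deploying to new environments safely. The service owner sets an explicit percentage of traffic to send to the canary backends, typically a very small share of overall traffic, so that the success of the rollout can be judged with an acceptable amount of risk to real user requests.

Splitting off a minority of the traffic lets the service owner inspect the health of the application and its responses. If all the signals look healthy, the service owner can proceed to the full cutover.

Apply the following HTTPRoute manifest to the config cluster (gke-west-1 in this example):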

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-store-route
  namespace: store
  labels:
    gateway: internal-http
spec:
  parentRefs:
  - kind: Gateway
    namespace: store
    name: internal-http
  hostnames:
  - "store.example.internal"
  rules:
  - backendRefs:
    # 90% of traffic to store-west-1 ServiceImport
    - name: store-west-1
      group: net.gke.io
      kind: ServiceImport
      port: 8080
      weight: 90
    # 10% of traffic to store-west-2 ServiceImport
    - name: store-west-2
      group: net.gke.io
      kind: ServiceImport
      port: 8080
      weight: 10
EOF

Using your private client, send continuous curl requests to the internal-http Gateway.

while true; do curl -H "host: store.example.internal" -s VIP | grep "cluster_name"; sleep 1; done

The output is similar to the following, indicating that a 90/10 traffic split is occurring.

"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-2",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
...
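
A ratio is easier to judge from a tally than from scrolling output. The following is a minimal sketch that counts responses per cluster; the sample is the illustrative output above rather than live data, and with only 11 responses the observed ratio is just a rough indication of the configured 90/10 weights. In a real run, a bounded number of curl responses would be piped into the same sort | uniq -c pipeline.

```shell
# Sample responses copied from the illustrative output above (not live data).
sample='"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-2",
"cluster_name": "gke-west-1",
"cluster_name": "gke-west-1",'

# Extract only the cluster name from each line, then count occurrences.
tally=$(printf '%s\n' "$sample" | grep -o 'gke-west-[0-9]*' | sort | uniq -c)
printf '%s\n' "$tally"
```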

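Traffic cutover

The last stage of the blue-green migration is to fully cut over to the new cluster and remove the old cluster. If the service owner were instead onboarding a second cluster alongside an existing cluster, this last step would be different, because the final state would have traffic going to both clusters. In that scenario, a single store ServiceImport that contains Pods from both the gke-west-1 and gke-west-2 clusters is recommended. This lets the load balancer decide where traffic for an active-active application should go, based on proximity, health, and capacity.

Apply the following HTTPRoute manifest to the config cluster (gke-west-1 in this example):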

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: internal-store-route
  namespace: store
  labels:
    gateway: internal-http
spec:
  parentRefs:
  - kind: Gateway
    namespace: store
    name: internal-http
  hostnames:
  - "store.example.internal"
  rules:
    - backendRefs:
      # No traffic to the store-west-1 ServiceImport
      - name: store-west-1
        group: net.gke.io
        kind: ServiceImport
        port: 8080
        weight: 0
      # All traffic to the store-west-2 ServiceImport
      - name: store-west-2
        group: net.gke.io
        kind: ServiceImport
        port: 8080
        weight: 100
EOF

Using your private client, send continuous curl requests to the internal-http Gateway.

while true; do curl -H "host: store.example.internal" -s VIP | grep "cluster_name"; sleep 1; done

The output is similar to the following, indicating that all traffic now goes to gke-west-2.

"cluster_name": "gke-west-2",
"cluster_name": "gke-west-2",
"cluster_name": "gke-west-2",
"cluster_name": "gke-west-2",
...

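This final step completes a full blue-green application migration from one GKE cluster to another GKE cluster.

Deploy capacity-based load balancing

The exercise in this section demonstrates global load balancing and Service capacity concepts by deploying an application across two GKE clusters in different regions. Generated traffic is sent at various requests per second (RPS) levels to show how traffic is load balanced across clusters and regions.

The following diagram shows the topology that you will deploy and how traffic overflows between clusters and regions when traffic exceeds Service capacity:

Traffic overflowing from one cluster to another

To learn more about traffic management, see GKE traffic management.

Prepare your environment

Follow the instructions in Enabling multi-cluster Gateways to prepare your environment.

Confirm that the GatewayClass resources are installed on the config cluster: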

kubectl get gatewayclasses --context=gke-west-1

The output is similar to the following:

NAME                                  CONTROLLER                  ACCEPTED   AGE
gke-l7-global-external-managed        networking.gke.io/gateway   True       16h
gke-l7-global-external-managed-mc     networking.gke.io/gateway   True       14h
gke-l7-gxlb                           networking.gke.io/gateway   True       16h
gke-l7-gxlb-mc                        networking.gke.io/gateway   True       14h
gke-l7-regional-external-managed      networking.gke.io/gateway   True       16h
gke-l7-regional-external-managed-mc   networking.gke.io/gateway   True       14h
gke-l7-rilb                           networking.gke.io/gateway   True       16h
gke-l7-rilb-mc                        networking.gke.io/gateway   True       14h

Deploy an application

Deploy the sample web application server to both clusters:

kubectl apply --context gke-west-1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/master/gateway/docs/store-traffic-deploy.yaml
kubectl apply --context gke-east-1 -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/master/gateway/docs/store-traffic-deploy.yaml

The output is similar to the following:

namespace/store created
deployment.apps/store created

Deploy a Service, Gateway, and HTTPRoute

Apply the following Service manifest to both the gke-west-1 and gke-east-1 clusters:

cat << EOF | kubectl apply --context gke-west-1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: traffic-test
  annotations:
    networking.gke.io/max-rate-per-endpoint: "10"
spec:
  ports:
  - port: 8080
    targetPort: 8080
    name: http
  selector:
    app: store
  type: ClusterIP
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: traffic-test
EOF

cat << EOF | kubectl apply --context gke-east-1 -f -
apiVersion: v1
kind: Service
metadata:
  name: store
  namespace: traffic-test
  annotations:
    networking.gke.io/max-rate-per-endpoint: "10"
spec:
  ports:
  - port: 8080
    targetPort: 8080
    name: http
  selector:
    app: store
  type: ClusterIP
---
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store
  namespace: traffic-test
EOF

The Service is annotated with max-rate-per-endpoint set to 10 requests per second. With 2 replicas per cluster, the Service has 20 RPS of capacity in each cluster.
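
The capacity figure is the per-endpoint rate multiplied by the number of endpoints. The following sketch works through that arithmetic with the values from the manifest and text above; the variable names are illustrative only, and the combined total assumes that both clusters run the same number of replicas.

```shell
# Values taken from the manifest and text above.
max_rate_per_endpoint=10   # networking.gke.io/max-rate-per-endpoint annotation
replicas_per_cluster=2     # store replicas in each cluster
clusters=2                 # gke-west-1 and gke-east-1

# Capacity of the Service in one cluster, and across both clusters.
cluster_capacity=$((max_rate_per_endpoint * replicas_per_cluster))
total_capacity=$((cluster_capacity * clusters))

echo "Per-cluster capacity: ${cluster_capacity} RPS"
echo "Total capacity: ${total_capacity} RPS"
```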

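For more information about how to choose a capacity level for your Service, see Determine your Service's capacity.

Apply the following Gateway manifest to the config cluster (gke-west-1 in this example):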

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: store
  namespace: traffic-test
spec:
  gatewayClassName: gke-l7-global-external-managed-mc
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
EOF

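The manifest describes an external, global, multi-cluster Gateway that deploys an external Application Load Balancer with a publicly accessible IP address.

Apply the following HTTPRoute manifest to the config cluster (gke-west-1 in this example):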

cat << EOF | kubectl apply --context gke-west-1 -f -
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: store
  namespace: traffic-test
  labels:
    gateway: store
spec:
  parentRefs:
  - kind: Gateway
    namespace: traffic-test
    name: store
  rules:
  - backendRefs:
    - name: store
      group: net.gke.io
      kind: ServiceImport
      port: 8080
EOF

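The manifest describes an HTTPRoute that configures the Gateway with a routing rule that directs all traffic to the store ServiceImport. The store ServiceImport groups the store Service Pods across both clusters so that the load balancer can address them as a single Service.

After a few minutes, you can check the Gateway's events to see whether it has finished deploying: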

kubectl describe gateway store -n traffic-test --context gke-west-1

The output is similar to the following:

...
Status:
  Addresses:
    Type:   IPAddress
    Value:  34.102.159.147
  Conditions:
    Last Transition Time:  2023-10-12T21:40:59Z
    Message:               The OSS Gateway API has deprecated this condition, do not depend on it.
    Observed Generation:   1
    Reason:                Scheduled
    Status:                True
    Type:                  Scheduled
    Last Transition Time:  2023-10-12T21:40:59Z
    Message:
    Observed Generation:   1
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2023-10-12T21:40:59Z
    Message:
    Observed Generation:   1
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
    Last Transition Time:  2023-10-12T21:40:59Z
    Message:               The OSS Gateway API has altered the "Ready" condition semantics and reservedit for future use.  GKE Gateway will stop emitting it in a future update, use "Programmed" instead.
    Observed Generation:   1
    Reason:                Ready
    Status:                True
    Type:                  Ready
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2023-10-12T21:40:59Z
      Message:
      Observed Generation:   1
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2023-10-12T21:40:59Z
      Message:               The OSS Gateway API has altered the "Ready" condition semantics and reservedit for future use.  GKE Gateway will stop emitting it in a future update, use "Programmed" instead.
      Observed Generation:   1
      Reason:                Ready
      Status:                True
      Type:                  Ready
    Name:                    http
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   HTTPRoute
Events:
  Type    Reason  Age                  From                   Message
  ----    ------  ----                 ----                   -------
  Normal  ADD     12m                  mc-gateway-controller  traffic-test/store
  Normal  SYNC    6m43s                mc-gateway-controller  traffic-test/store
  Normal  UPDATE  5m40s (x4 over 12m)  mc-gateway-controller  traffic-test/store
  Normal  SYNC    118s (x6 over 10m)   mc-gateway-controller  SYNC on traffic-test/store was a success

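This output shows that the Gateway has deployed successfully. After the Gateway deploys, it might still take a few minutes for traffic to start passing. Take note of the IP address in this output, because it is used in a later step.

Confirm traffic

Confirm that traffic is reaching the application by testing the Gateway IP address with a curl command: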

curl GATEWAY_IP_ADDRESS

The output is similar to the following:

{
  "cluster_name": "gke-west-1",
  "host_header": "34.117.182.69",
  "pod_name": "store-54785664b5-mxstv",
  "pod_name_emoji": "👳🏿",
  "project_id": "project",
  "timestamp": "2021-11-01T14:06:38",
  "zone": "us-west1-a"
}

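This output shows the Pod metadata, which indicates the region that served the request.

Verify traffic using load testing

To verify that the load balancer is working, you can deploy a traffic generator in your gke-west-1 cluster. The traffic generator produces traffic at different levels of load to demonstrate the capacity and overflow capabilities of the load balancer. The following steps demonstrate three levels of load:

10 RPS, which is under the capacity of the store Service in gke-west-1.
30 RPS, which is over the capacity of the gke-west-1 store Service and causes traffic to overflow to gke-east-1.
60 RPS, which is over the capacity of the Services in both clusters.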
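
To build intuition for the three load levels listed above, the following is a deliberately simplified model of capacity-based overflow between a nearby backend and a remote one. It is not the load balancer's actual algorithm: the split_traffic function is hypothetical, it assumes all clients are closest to the first backend, and it assumes that traffic beyond total capacity is shared in proportion to capacity. The 20 RPS capacities come from the Service configuration described earlier.

```shell
# Simplified, hypothetical model of capacity-based overflow for two backends.
# Usage: split_traffic OFFERED_RPS LOCAL_CAPACITY REMOTE_CAPACITY
# Prints "LOCAL_RPS REMOTE_RPS".
split_traffic() {
  offered=$1; local_cap=$2; remote_cap=$3
  total_cap=$((local_cap + remote_cap))
  if [ "$offered" -le "$local_cap" ]; then
    # Under local capacity: everything stays in the nearest cluster.
    echo "$offered 0"
  elif [ "$offered" -le "$total_cap" ]; then
    # Local cluster is full: the excess overflows to the remote cluster.
    echo "$local_cap $((offered - local_cap))"
  else
    # Everything is over capacity: share the load in proportion to capacity.
    local_share=$((offered * local_cap / total_cap))
    echo "$local_share $((offered - local_share))"
  fi
}

# gke-west-1 and gke-east-1 each have 20 RPS of capacity.
for rps in 10 30 60; do
  echo "$rps RPS -> $(split_traffic "$rps" 20 20)"
done
```

Under these assumptions the model prints 10 0, 20 10, and 30 30 for the three levels, which lines up with the behavior that the load tests in this section are expected to show.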

Configure the dashboard

Get the name of the underlying URLmap for your Gateway:

kubectl get gateway store -n traffic-test --context=gke-west-1 -o=jsonpath="{.metadata.annotations.networking\.gke\.io/url-maps}"

The output is similar to the following:

/projects/PROJECT_NUMBER/global/urlMaps/gkemcg1-traffic-test-store-armvfyupay1t
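
The annotation value is a full resource path. Assuming that the metric filter used in the query for this dashboard expects only the final path segment (the short URLmap name), which is an assumption worth checking against your own metrics, shell parameter expansion can strip the prefix. The path below is the sample value from the output above.

```shell
# Full resource path as printed by the command above (sample value).
url_map_path='/projects/PROJECT_NUMBER/global/urlMaps/gkemcg1-traffic-test-store-armvfyupay1t'

# Remove everything up to and including the last "/".
url_map_name=${url_map_path##*/}
echo "$url_map_name"
```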

In the Google Cloud console, go to the Metrics Explorer page.

Under Select a metric, click CODE: MQL.

Enter the following query to observe traffic metrics for the store Service across your two clusters:

fetch https_lb_rule
| metric 'loadbalancing.googleapis.com/https/backend_request_count'
| filter (resource.url_map_name == 'GATEWAY_URL_MAP')
| align rate(1m)
| every 1m
| group_by [resource.backend_scope],
    [value_backend_request_count_aggregate:
        aggregate(value.backend_request_count)]

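Replace GATEWAY_URL_MAP with the URLmap name from the previous step.

Click Run query. After you deploy the load generator in the next section, wait at least 5 minutes for the metrics to appear in the chart.

Test with 10 RPS

Deploy a Pod to your gke-west-1 cluster: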
```
kubectl run --context gke-west-1 -i --tty --rm loadgen  \
    --image=cyrilbkr/httperf  \
    --restart=Never  \
    -- /bin/sh -c 'httperf  \
    --server=GATEWAY_IP_ADDRESS  \
    --hog --uri="/zone" --port 80  --wsess=100000,1,1 --rate 10'
```
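Replace GATEWAY_IP_ADDRESS with the Gateway IP address from the previous step.

The output is similar to the following, indicating that the traffic generator is sending traffic: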
```
If you don't see a command prompt, try pressing enter.
```
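The load generator continuously sends 10 RPS to the Gateway. Even though the traffic comes from inside a Google Cloud region, the load balancer treats it as client traffic coming from the US West Coast. To simulate realistic client diversity, the load generator sends each HTTP request as a new TCP connection, which means that traffic is distributed more evenly across the backend Pods.

The generator takes up to 5 minutes to generate traffic for the dashboard.
View your Metrics Explorer dashboard. Two lines appear, indicating how much traffic is load balanced to each of the clusters:

You should see that us-west1-a receives approximately 10 RPS of traffic while us-east1-b receives no traffic. Because the traffic generator runs in us-west1, all traffic is sent to the Service in the gke-west-1 cluster.
Stop the load generator using Ctrl+C, then delete the Pod: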
```
kubectl delete pod loadgen --context gke-west-1
```

Test with 30 RPS

Deploy the load generator again, but configure it to send 30 RPS:

kubectl run --context gke-west-1 -i --tty --rm loadgen  \
    --image=cyrilbkr/httperf  \
    --restart=Never  \
    -- /bin/sh -c 'httperf  \
    --server=GATEWAY_IP_ADDRESS  \
    --hog --uri="/zone" --port 80  --wsess=100000,1,1 --rate 30'

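The generator takes up to 5 minutes to generate traffic for the dashboard.

View your Cloud Operations dashboard.

You should see that approximately 20 RPS is sent to us-west1-a and 10 RPS to us-east1-b. This indicates that the Service in gke-west-1 is fully utilized and is overflowing 10 RPS of traffic to the Service in gke-east-1.
Stop the load generator using Ctrl+C, then delete the Pod: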
```
kubectl delete pod loadgen --context gke-west-1
```

Test with 60 RPS

Deploy the load generator configured to send 60 RPS:

kubectl run --context gke-west-1 -i --tty --rm loadgen  \
    --image=cyrilbkr/httperf  \
    --restart=Never  \
    -- /bin/sh -c 'httperf  \
    --server=GATEWAY_IP_ADDRESS  \
    --hog --uri="/zone" --port 80  --wsess=100000,1,1 --rate 60'

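Wait 5 minutes and view your Cloud Operations dashboard. It should now show that both clusters receive roughly 30 RPS. Because all Services are overutilized globally, no traffic spills over, and each Service absorbs all the traffic it can.
Stop the load generator using Ctrl+C, then delete the Pod: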
```
kubectl delete pod loadgen --context gke-west-1
```

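Clean up

After completing the exercises on this page, follow these steps to remove resources and prevent unwanted charges to your account:

Delete the clusters.
Unregister the clusters from the fleet if they don't need to be registered for another purpose.

Disable the multiclusterservicediscovery feature: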

gcloud container fleet multi-cluster-services disable

Disable Multi Cluster Ingress:
```
gcloud container fleet ingress disable
```

Disable the APIs:

gcloud services disable \
    multiclusterservicediscovery.googleapis.com \
    multiclusteringress.googleapis.com \
    trafficdirector.googleapis.com \
    --project=PROJECT_ID

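Use multi-cluster Gateway with Shared VPC

A multi-cluster Gateway can also be deployed in a Shared VPC environment, with different topologies depending on your use case.

The following table describes the supported multi-cluster Gateway topologies within a Shared VPC environment:

Scenario	Fleet host project	Config cluster	Workload clusters
1	Shared VPC host project	Shared VPC host project	Shared VPC host project
2	Shared VPC service project	Shared VPC service project (same as the fleet service project)	Shared VPC service project (same as the fleet service project)

To create multi-cluster Gateways in a Shared VPC environment, follow these steps:

Follow the steps to set up multi-cluster Services with Shared VPC.
Create your Services and export them to the config cluster.
If you plan to use a multi-cluster internal Gateway, create a proxy-only subnet.
Create your multi-cluster external or internal Gateway and HTTPRoutes.

After you complete these steps, you can validate your deployment, depending on your topology.

Troubleshooting

Proxy-only subnet for internal Gateway does not exist

If the following event appears on your internal Gateway, a proxy-only subnet does not exist for that region. To resolve this issue, deploy a proxy-only subnet.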

generic::invalid_argument: error ensuring load balancer: Insert: Invalid value for field 'resource.target': 'regions/us-west1/targetHttpProxies/gkegw-x5vt-default-internal-http-2jzr7e3xclhj'. A reserved and active subnetwork is required in the same region and VPC as the forwarding rule.

No healthy upstream

Symptom:

The following issue might occur when you create a Gateway but cannot access the backend services (503 response code):

no healthy upstream

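Reason:

This error message indicates that the health check prober cannot find healthy backend services. Your backend services might actually be healthy, but you might need to customize the health checks.

Workaround:

To resolve this issue, use a HealthCheckPolicy to customize the health check based on your application's requirements (for example, /health).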
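
As an illustration of the workaround, the following sketch writes a HealthCheckPolicy manifest to a local file for review before it is applied. The policy name, the store namespace, the store ServiceImport target, and the /health path are all assumptions chosen for this example; adjust them to match the Service that reports the error and the path that your application actually serves, and check the field names against the HealthCheckPolicy reference for your GKE version.

```shell
# Write a hypothetical HealthCheckPolicy manifest to a local file for review.
cat << 'EOF' > store-health-check-policy.yaml
kind: HealthCheckPolicy
apiVersion: networking.gke.io/v1
metadata:
  name: store-health-check        # hypothetical policy name
  namespace: store                # assumed namespace of the backend
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health      # path the application is assumed to serve
  targetRef:
    group: net.gke.io
    kind: ServiceImport
    name: store                   # assumed ServiceImport name
EOF

# After reviewing the file, apply it to the config cluster, for example:
# kubectl apply --context gke-west-1 -f store-health-check-policy.yaml
```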

What's next

Learn more about the Gateway controller.