Configuring horizontal Pod autoscaling

Autopilot Standard

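This page explains how to use horizontal Pod autoscaling to autoscale a Deployment based on different types of metrics. You can use the same guidelines to configure a HorizontalPodAutoscaler for any scalable Deployment object.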

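Before you begin

Before you start, make sure that you have performed the following tasks:

Enable the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region and compute/zone properties. By setting default locations, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.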

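API versions for `HorizontalPodAutoscaler` objects

When you use the Google Cloud console, HorizontalPodAutoscaler objects are created using the autoscaling/v2 API.

When you use kubectl to create or view information about a Horizontal Pod Autoscaler, you can specify either the autoscaling/v1 API or the autoscaling/v2 API.

apiVersion: autoscaling/v1 is the default, and allows you to autoscale based only on CPU utilization. To autoscale based on other metrics, use apiVersion: autoscaling/v2. The kubectl apply example in Autoscaling based on resource utilization uses apiVersion: autoscaling/v1.
apiVersion: autoscaling/v2 is recommended for creating new HorizontalPodAutoscaler objects. It allows you to autoscale based on multiple metrics, including custom or external metrics. All other examples on this page use apiVersion: autoscaling/v2.

To check which API versions are supported, use the kubectl api-versions command.

You can specify which API to use when viewing details about a Horizontal Pod Autoscaler that uses apiVersion: autoscaling/v2.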
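
To illustrate how the two API versions relate, the following Python sketch rewrites an autoscaling/v1 spec as an autoscaling/v2 spec. It assumes the spec is already loaded as a plain dictionary. The function name v1_spec_to_v2 is hypothetical and is not part of kubectl or any GKE tooling. The field names are the ones used in the manifests on this page.

```python
def v1_spec_to_v2(v1_spec):
    """Return an autoscaling/v2-style spec equivalent to an autoscaling/v1 spec.

    Hypothetical helper for illustration only. It handles the single metric
    that autoscaling/v1 supports: average CPU utilization.
    """
    # Copy every field except the v1-only CPU target.
    v2_spec = {
        key: value
        for key, value in v1_spec.items()
        if key != "targetCPUUtilizationPercentage"
    }
    cpu_target = v1_spec.get("targetCPUUtilizationPercentage")
    if cpu_target is not None:
        # In autoscaling/v2, the CPU target becomes one entry in spec.metrics.
        v2_spec["metrics"] = [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": cpu_target,
                    },
                },
            }
        ]
    return v2_spec


v1_spec = {
    "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "nginx"},
    "minReplicas": 1,
    "maxReplicas": 10,
    "targetCPUUtilizationPercentage": 50,
}
print(v1_spec_to_v2(v1_spec)["metrics"])
```

Because autoscaling/v2 stores targets in a list, you can append additional metric entries to the same spec, which is not possible with autoscaling/v1.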

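Create the example Deployment

Before you can create a Horizontal Pod Autoscaler, you must create the workload that it monitors. The examples on this page apply different Horizontal Pod Autoscaler configurations to the following nginx Deployment. Separate examples show a Horizontal Pod Autoscaler based on resource utilization, based on a custom or external metric, and based on multiple metrics.

Save the following to a file named nginx.yaml: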

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: "250m"

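This manifest specifies a value for CPU requests. If you want to autoscale based on a resource's utilization as a percentage, you must specify requests for that resource. If you do not specify requests, you can autoscale based only on the absolute value of the resource's utilization, such as milliCPUs for CPU utilization.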
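
The following Python sketch shows why a request is needed for percentage targets: utilization is usage divided by the request, so without a request there is nothing to divide by. The function names are hypothetical, and the quantity parser is deliberately simplified to values such as "250m" or "1".

```python
from typing import Optional


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "1" to millicores.

    Simplified for illustration: only the "m" suffix and plain core counts
    are handled.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_utilization_percent(usage: str, request: Optional[str]) -> Optional[float]:
    """Return usage as a percentage of the request, or None without a request."""
    if request is None:
        # No request: only the absolute value (for example, milliCPU) is usable.
        return None
    return 100 * parse_cpu_millicores(usage) / parse_cpu_millicores(request)


# A container that requests 250m and currently uses 125m is at 50% utilization.
print(cpu_utilization_percent("125m", "250m"))
# Without a request, no percentage can be computed.
print(cpu_utilization_percent("125m", None))
```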

To create the Deployment, apply the nginx.yaml manifest:

kubectl apply -f nginx.yaml

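The Deployment has spec.replicas set to 3, so three Pods are deployed. You can verify this using the kubectl get deployment nginx command.

Each of the examples on this page applies a different Horizontal Pod Autoscaler to the example nginx Deployment.

Autoscaling based on resource utilization

This example creates a HorizontalPodAutoscaler object that autoscales the nginx Deployment when CPU utilization surpasses 50%, and ensures that there is always a minimum of 1 replica and a maximum of 10 replicas.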
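
As a rough illustration of how a 50% target and the 1 to 10 replica bounds interact, the following Python sketch applies the replica formula from the Kubernetes documentation, ceil(currentReplicas * currentMetricValue / desiredMetricValue), and then clamps the result. The function name is hypothetical, and the sketch leaves out the tolerance and stabilization behavior that the real controller applies.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10):
    """Estimate the replica count for a single utilization metric.

    Illustrative only: ignores tolerance, stabilization windows, and Pods
    that are not ready.
    """
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result within the configured minimum and maximum.
    return max(min_replicas, min(max_replicas, raw))


# 3 replicas at 100% average CPU with a 50% target: scale out to 6.
print(desired_replicas(3, 100, 50))
# 3 replicas at 10% average CPU: scale in to the minimum of 1.
print(desired_replicas(3, 10, 50))
# 3 replicas at 200% average CPU: 12 is computed, capped at the maximum of 10.
print(desired_replicas(3, 200, 50))
```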

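You can create a Horizontal Pod Autoscaler that targets CPU using the Google Cloud console, the kubectl apply command, or, for average CPU only, the kubectl autoscale command.

Console

Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.
Click Actions > Autoscale.
Specify the following values:
- Minimum number of replicas: 1
- Maximum number of replicas: 10
- Autoscaling metric: CPU
- Target: 50
- Unit: %
Click Done.
Click Autoscale.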

`kubectl apply`

Save the following YAML manifest as a file named nginx-hpa.yaml:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

To create the HPA, apply the manifest using the following command:

kubectl apply -f nginx-hpa.yaml

`kubectl autoscale`

To create a HorizontalPodAutoscaler object that only targets average CPU utilization, you can use the kubectl autoscale command:

kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=10

To get a list of Horizontal Pod Autoscalers in the cluster, use the following command:

kubectl get hpa

The output is similar to the following:

NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/50%    1         10        3          61s

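To get details about the Horizontal Pod Autoscaler, you can use the Google Cloud console or the kubectl command.

Console

Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.
View the Horizontal Pod Autoscaler configuration in the Autoscaler section.
View more details about autoscaling events in the Events tab.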

`kubectl get`

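To get details about the Horizontal Pod Autoscaler, you can use kubectl get hpa with the -o yaml flag. The status field contains information about the current number of replicas and any recent autoscaling events.

kubectl get hpa nginx -o yaml

The output is similar to the following: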

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ScaleDownStabilized","message":"recent
      recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ValidMetricFound","message":"the
      HPA was able to successfully calculate a replica count from cpu resource utilization
      (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"DesiredWithinRange","message":"the
      desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"maxReplicas":10,"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"nginx"},"targetCPUUtilizationPercentage":50}}
  creationTimestamp: "2019-10-30T19:42:43Z"
  name: nginx
  namespace: default
  resourceVersion: "220050"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/nginx
  uid: 70d1067d-fb4d-11e9-8b2a-42010a8e013f
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  targetCPUUtilizationPercentage: 50
status:
  currentCPUUtilizationPercentage: 0
  currentReplicas: 3
  desiredReplicas: 3

Before following the remaining examples on this page, delete the HPA:

kubectl delete hpa nginx

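When you delete a Horizontal Pod Autoscaler, the number of replicas of the Deployment remains the same. A Deployment does not automatically revert to its state before the Horizontal Pod Autoscaler was applied.

For more information, see Deleting a Horizontal Pod Autoscaler.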

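Autoscaling based on load balancer traffic

Traffic-based autoscaling is a capability of GKE that integrates traffic utilization signals from load balancers to autoscale Pods.

Traffic is a useful autoscaling signal because it is a leading indicator of load that complements CPU and memory. The built-in integration with GKE keeps setup simple and lets autoscaling react quickly to traffic spikes to meet demand.

Traffic-based autoscaling is enabled by the Gateway controller and its global traffic management capabilities. To learn more, see Traffic-based autoscaling.

Autoscaling based on load balancer traffic is only available for Gateway workloads.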

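Requirements

Traffic-based autoscaling has the following requirements:

Supported on GKE versions 1.24 and later.
Gateway API enabled in your GKE cluster.
Supported for traffic that goes through load balancers deployed using the Gateway API and either the gke-l7-global-external-managed, gke-l7-regional-external-managed, gke-l7-rilb, or gke-l7-gxlb GatewayClass.

Limitations

Traffic-based autoscaling has the following limitations:

Not supported by the multi-cluster GatewayClasses (gke-l7-global-external-managed-mc, gke-l7-regional-external-managed-mc, gke-l7-rilb-mc, and gke-l7-gxlb-mc).
Not supported for traffic using Services of type ClusterIP or LoadBalancer.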

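Deploy traffic-based autoscaling

The following exercise uses the HorizontalPodAutoscaler to autoscale the store-autoscale Deployment based on the traffic it receives. A Gateway accepts ingress traffic from the internet for the Pods. The autoscaler compares traffic signals from the Gateway with the per-Pod traffic capacity that is configured on the store-autoscale Service resource. By generating traffic to the Gateway, you influence the number of Pods deployed.

Figure: How traffic-based autoscaling works. The HorizontalPodAutoscaler scales a Deployment based on traffic.

To deploy traffic-based autoscaling, perform the following steps:

For Standard clusters, confirm that the GatewayClasses are installed in your cluster. For Autopilot clusters, the GatewayClasses are installed by default.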

kubectl get gatewayclass

The output confirms that the GKE GatewayClass resources are ready to use in your cluster:

NAME                               CONTROLLER                  ACCEPTED   AGE
gke-l7-global-external-managed     networking.gke.io/gateway   True       16h
gke-l7-regional-external-managed   networking.gke.io/gateway   True       16h
gke-l7-gxlb                        networking.gke.io/gateway   True       16h
gke-l7-rilb                        networking.gke.io/gateway   True       16h

If you do not see this output, enable the Gateway API in your GKE cluster.

Deploy the sample application and Gateway load balancer to your cluster:
```
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/master/gateway/docs/store-autoscale.yaml
```
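The sample application creates the following:
- A Deployment with 2 replicas.
- A Service capacity with max-rate-per-endpoint set to 10. While this feature is in Preview, use an annotation in the Service. When this feature is generally available, the Service Policy will replace the annotation. To learn more about Gateway capabilities, see GatewayClass capabilities.
- An external Gateway for accessing the application on the internet. To learn more about how to use Gateway load balancers, see Deploying Gateways.
- An HTTPRoute that matches all traffic and sends it to the store-autoscale Service.
Service capacity is a critical element of traffic-based autoscaling because it determines the amount of per-Pod traffic that triggers an autoscaling event. It is configured using the Service annotation networking.gke.io/max-rate-per-endpoint, which defines the maximum traffic that a Service should receive, in requests per second, per Pod. Service capacity is specific to your application. For more information, see Determining your Service's capacity.
Save the following manifest as hpa.yaml: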
```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: store-autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: store-autoscale
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        kind: Service
        name: store-autoscale
      metric:
        name: "autoscaling.googleapis.com|gclb-capacity-utilization"
      target:
        averageValue: 70
        type: AverageValue
```
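This manifest describes a HorizontalPodAutoscaler with the following properties:
- minReplicas and maxReplicas: set the minimum and maximum number of replicas for this Deployment. In this configuration, the number of Pods can scale from 1 to 10 replicas.
- describedObject.name: store-autoscale: the reference to the store-autoscale Service that defines the traffic capacity.
- scaleTargetRef.name: store-autoscale: the reference to the store-autoscale Deployment that defines the resource scaled by the Horizontal Pod Autoscaler.
- averageValue: 70: the target average value of capacity utilization. This gives the Horizontal Pod Autoscaler a growth margin so that the running Pods can process excess traffic while new Pods are being created.
Note: The ratio between the Gateway, Horizontal Pod Autoscaler, Deployment, and Service must be 1:1:1:1. This means that a Deployment or Service cannot be referenced by more than one Horizontal Pod Autoscaler, and a Service referenced by a Horizontal Pod Autoscaler cannot be targeted by more than one load balancer. If this condition is not met, the Horizontal Pod Autoscaler stops autoscaling and an error appears in the Horizontal Pod Autoscaler events.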

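The Horizontal Pod Autoscaler results in the following traffic behavior:

The number of Pods is adjusted between 1 and 10 replicas to achieve 70% of the max rate per endpoint. With max-rate-per-endpoint=10, this results in 7 RPS per Pod.
At more than 7 RPS per Pod, the Deployment scales out until it reaches its maximum of 10 replicas or until the average traffic is 7 RPS per Pod.
If traffic decreases, the Deployment scales in at a reasonable rate using the Horizontal Pod Autoscaler algorithm.

You can also deploy a traffic generator to validate traffic-based autoscaling behavior.

At 30 RPS, the Deployment scales to 5 replicas so that each replica ideally receives 6 RPS of traffic, which is 60% utilization per Pod. This is under the 70% target utilization, so the Pods are scaled appropriately. Depending on traffic fluctuations, the number of autoscaled replicas might also fluctuate. For a more detailed description of how the number of replicas is computed, see Autoscaling behavior.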
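
The 30 RPS figure can be checked with a short calculation. The following Python sketch is an approximation under stated assumptions: traffic is spread evenly across Pods, and the replica count is the smallest number that keeps each Pod at or below the target share of max-rate-per-endpoint. The function name and parameters are hypothetical and do not correspond to a GKE API.

```python
import math


def replicas_for_traffic(total_rps, max_rate_per_endpoint, target_utilization_percent,
                         min_replicas=1, max_replicas=10):
    """Return (replicas, per-Pod capacity utilization in percent).

    Illustrative arithmetic only; the real autoscaler also applies tolerance
    and stabilization, and traffic is rarely perfectly even.
    """
    # 70% of a capacity of 10 RPS means each Pod should handle about 7 RPS.
    target_rps_per_pod = max_rate_per_endpoint * target_utilization_percent / 100
    raw = math.ceil(total_rps / target_rps_per_pod)
    replicas = max(min_replicas, min(max_replicas, raw))
    per_pod_utilization = 100 * total_rps / (replicas * max_rate_per_endpoint)
    return replicas, per_pod_utilization


# 30 RPS with a capacity of 10 RPS per Pod and a 70% target.
print(replicas_for_traffic(30, 10, 70))
```

With these inputs, 30 / 7 is about 4.3, so 5 replicas are needed, and each replica then carries 6 RPS, or 60% of its capacity.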

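Autoscaling based on a custom or external metric

To create Horizontal Pod Autoscalers for custom metrics and external metrics, see Optimize Pod autoscaling based on metrics.

Autoscaling based on multiple metrics

This example creates a Horizontal Pod Autoscaler that autoscales based on CPU utilization and memory usage, and optionally on a custom metric named packets_per_second.

If you followed the previous example and still have a Horizontal Pod Autoscaler named nginx, delete it before following this example.

This example requires apiVersion: autoscaling/v2. For more information about the available APIs, see API versions for HorizontalPodAutoscaler objects.

Save this YAML manifest as a file named nginx-multiple.yaml: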

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 100Mi
  # Uncomment these lines if you create the custom packets_per_second metric and
  # configure your app to export the metric.
  # - type: Pods
  #   pods:
  #     metric:
  #       name: packets_per_second
  #     target:
  #       type: AverageValue
  #       averageValue: 100

Apply the YAML manifest:

kubectl apply -f nginx-multiple.yaml

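When created, the Horizontal Pod Autoscaler monitors the nginx Deployment for average CPU utilization, average memory usage, and, if you uncommented it, the custom packets_per_second metric. The Horizontal Pod Autoscaler computes a replica count for each metric and autoscales the Deployment based on the metric whose value would create the larger autoscale event.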
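
The following Python sketch illustrates that selection: it computes a proposed replica count for each metric and keeps the largest. The function name and the sample metric values are hypothetical, and the sketch omits the tolerance and stabilization logic of the real controller.

```python
import math


def desired_replicas_multi(current_replicas, metrics, min_replicas=1, max_replicas=10):
    """Pick the largest per-metric proposal, clamped to the replica bounds.

    metrics is a list of (current_value, target_value) pairs, each pair in
    its own consistent unit (percent, MiB, packets per second, and so on).
    """
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


# Hypothetical readings with 2 replicas running:
# CPU at 80% against a 50% target proposes 4 replicas,
# memory at 60 MiB against a 100 MiB target proposes 2 replicas.
print(desired_replicas_multi(2, [(80, 50), (60, 100)]))
```

Taking the maximum means that a single saturated metric is enough to trigger a scale-out, while a scale-in happens only when every metric allows it.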

Viewing details about a Horizontal Pod Autoscaler

To view a Horizontal Pod Autoscaler's configuration and statistics, use the following command:

kubectl describe hpa HPA_NAME

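Replace HPA_NAME with the name of your HorizontalPodAutoscaler object.

If the Horizontal Pod Autoscaler uses apiVersion: autoscaling/v2 and is based on multiple metrics, the kubectl describe hpa command only shows the CPU metric. To see all metrics, use the following command instead: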

kubectl describe hpa.v2.autoscaling HPA_NAME

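Replace HPA_NAME with the name of your HorizontalPodAutoscaler object.

Each Horizontal Pod Autoscaler's current status is shown in the Conditions field, and autoscaling events are listed in the Events field.

The output is similar to the following: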

Name:                                                  nginx
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"s...
CreationTimestamp:                                     Tue, 05 May 2020 20:07:11 +0000
Reference:                                             Deployment/nginx
Metrics:                                               ( current / target )
  resource memory on pods:                             2220032 / 100Mi
  resource cpu on pods  (as a percentage of request):  0% (0) / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:                                                <none>

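Deleting a Horizontal Pod Autoscaler

You can delete a Horizontal Pod Autoscaler using the Google Cloud console or the kubectl delete command.

Console

To delete the nginx Horizontal Pod Autoscaler:

Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.
Click Actions > Autoscale.
Click Delete.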

`kubectl delete`

To delete the nginx Horizontal Pod Autoscaler, use the following command:

kubectl delete hpa nginx

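When you delete a Horizontal Pod Autoscaler, the Deployment (or other deployment object) remains at its existing scale, and does not revert to the number of replicas in the Deployment's original manifest. To manually scale the Deployment back to three Pods, you can use the kubectl scale command: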

kubectl scale deployment nginx --replicas=3

Cleaning up

Delete the Horizontal Pod Autoscaler, if you have not done so:
```
kubectl delete hpa nginx
```
Delete the nginx Deployment:
```
kubectl delete deployment nginx
```
Optionally, delete the cluster.

Troubleshooting

When you set up a Horizontal Pod Autoscaler, you might see warning messages like the following:

unable to fetch pod metrics for pod

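It is normal to see this message when the metrics server starts up. However, if you continue to see the warnings and you notice that Pods are not scaling for your workload, make sure that you have specified resource requests for each container in your workload. To use resource utilization percentage targets with horizontal Pod autoscaling, you must configure requests for that resource for each container running in each Pod in the workload. Otherwise, the Horizontal Pod Autoscaler cannot perform the calculations it needs to, and takes no action related to that metric.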
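
One way to check for this condition is to inspect the Deployment manifest for containers that lack a request. The following Python sketch assumes the Deployment has already been loaded as a dictionary, for example from the output of kubectl get deployment nginx -o json. The function name and the sidecar container in the sample data are hypothetical.

```python
def containers_missing_request(deployment, resource="cpu"):
    """Return the names of containers that do not request the given resource."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    missing = []
    for container in containers:
        # resources or requests can be absent or empty in a manifest.
        requests = (container.get("resources") or {}).get("requests") or {}
        if resource not in requests:
            missing.append(container["name"])
    return missing


# Sample data: nginx sets a CPU request, the hypothetical sidecar does not.
example = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "nginx", "resources": {"requests": {"cpu": "250m"}}},
                    {"name": "sidecar"},
                ]
            }
        }
    }
}
print(containers_missing_request(example))
```

An empty list means that every container requests the resource, so a percentage target for that resource can be evaluated.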