Topology spread constraints

Topology spread constraints can be used to evenly distribute Pods among Nodes according to their zones, regions, node, or other custom-defined topology.

The following example manifest shows a Deployment that spreads replicas evenly among all schedulable Nodes using topology spread constraints:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: topology-spread-deployment
  labels:
    app: myapp
spec:
  replicas: 30
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      topologySpreadConstraints:
      - maxSkew: 1 # Default. Spreads evenly. Maximum difference in scheduled Pods per Node.
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule # Default. Alternatively can be ScheduleAnyway
        labelSelector:
          matchLabels:
            app: myapp
        matchLabelKeys: # beta in 1.27
        - pod-template-hash
      containers:
      # pause is a lightweight container that simply sleeps
      - name: pause
        image: registry.k8s.io/pause:3.2

The following considerations apply when using topology spread constraints:

- A Pod's `labels.app: myapp` is matched by the constraint's `labelSelector`.
- The `topologyKey` specifies `kubernetes.io/hostname`. This label is automatically attached to all Nodes and is populated with the Node's hostname.
- `matchLabelKeys` prevents rollouts of new Deployments from considering Pods of old revisions when calculating where to schedule a Pod. The `pod-template-hash` label is automatically populated by a Deployment.
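To confirm that the constraint has the intended effect, you can count how many replicas land on each Node. The following is a minimal check that assumes the Deployment above was created in the `default` namespace:

# Count Pods per Node; with maxSkew: 1 the counts should differ by at most 1.
kubectl get pods -n default -l app=myapp -o custom-columns=NODE:.spec.nodeName --no-headers | sort | uniq -c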
Pod anti-affinity

Pod anti-affinity lets you define constraints for which Pods can be co-located on the same Node.

The following example manifest shows a Deployment that uses anti-affinity to limit replicas to one Pod per Node:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-affinity-deployment
  labels:
    app: myapp
spec:
  replicas: 30
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      name: with-pod-affinity
      labels:
        app: myapp
    spec:
      affinity:
        podAntiAffinity:
          # requiredDuringSchedulingIgnoredDuringExecution
          # prevents a Pod from being scheduled on a Node if it
          # does not meet the criteria.
          # Alternatively, you can use 'preferred' with a weight
          # rather than 'required'.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - myapp
            # Your nodes might be configured with other keys
            # to use as `topologyKey`. `topology.kubernetes.io/region`
            # and `topology.kubernetes.io/zone` are common.
            topologyKey: kubernetes.io/hostname
      containers:
      # pause is a lightweight container that simply sleeps
      - name: pause
        image: registry.k8s.io/pause:3.2
This example Deployment specifies 30 replicas, but it only expands to as many Nodes as are available in your cluster.
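With the required rule above, any replicas beyond the number of schedulable Nodes stay Pending. A quick way to spot them, assuming the `default` namespace and with `POD_NAME` as a placeholder for one of the Pending Pod names, is:

# Pods that cannot satisfy the anti-affinity rule remain Pending.
kubectl get pods -n default -l app=myapp --field-selector=status.phase=Pending

# The Events section explains why a specific Pod was not scheduled.
kubectl describe pod POD_NAME -n default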
The following considerations apply when using Pod anti-affinity:
- A Pod's `labels.app: myapp` is matched by the constraint's `labelSelector`.
- The `topologyKey` specifies `kubernetes.io/hostname`. This label is automatically attached to all Nodes and is populated with the Node's hostname. You can choose to use other labels if your cluster supports them, such as `topology.kubernetes.io/region` or `topology.kubernetes.io/zone`.
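The comment in the manifest above notes that you can use a preferred rule with a weight instead of a required rule. The following fragment is a minimal sketch of that variant and would replace the `affinity` block in the Pod template above; the weight of 100 is an arbitrary example value:

affinity:
  podAntiAffinity:
    # A preferred rule is soft: kube-scheduler tries to spread the Pods,
    # but still schedules them on the same Node if nothing else fits.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100 # 1-100; higher values weigh this rule more heavily
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - myapp
        topologyKey: kubernetes.io/hostname

With a preferred rule, all 30 replicas can still be scheduled on a smaller cluster; the rule only biases the scheduler toward spreading them.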
Pre-pull container images
In the absence of any other constraints, `kube-scheduler` by default prefers to schedule Pods on Nodes that already have the container image downloaded onto them. This behavior might be desirable in smaller clusters without other scheduling configuration, where it's feasible to download the images onto every Node. However, relying on this behavior should be treated as a last resort. A better solution is to use `nodeSelector`, topology spread constraints, or affinity / anti-affinity. For more information, see Assigning Pods to Nodes.
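As a minimal illustration of the first of those alternatives, the following Pod template fragment uses a `nodeSelector`; the `disktype: ssd` label is a hypothetical example and must already exist on the target Nodes:

spec:
  # Schedule this Pod only on Nodes labeled disktype=ssd (hypothetical label).
  nodeSelector:
    disktype: ssd
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.2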
If you want to make sure that container images are pre-pulled onto all Nodes, you can use a DaemonSet like the following example:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: prepulled-images
spec:
  selector:
    matchLabels:
      name: prepulled-images
  template:
    metadata:
      labels:
        name: prepulled-images
    spec:
      initContainers:
      - name: prepulled-image
        image: IMAGE # Replace with the image you want to pre-pull
        # Use a command that terminates immediately
        command: ["sh", "-c", "'true'"]
      containers:
      # pause is a lightweight container that simply sleeps
      - name: pause
        image: registry.k8s.io/pause:3.2
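To check that the image has been pulled everywhere, you can watch the DaemonSet until its desired, ready, and available counts match your Node count. This is a minimal check that assumes the DaemonSet above was created in the `default` namespace:

# One Pod per schedulable Node; the counters should all converge to the same number.
kubectl get daemonset prepulled-images -n default

# Confirm which Node each Pod landed on.
kubectl get pods -n default -l name=prepulled-images -o wide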
After the Pods are Running on all Nodes, redeploy your Pods to check whether the containers are now evenly distributed across the Nodes.
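One way to do that, assuming the `topology-spread-deployment` example from earlier on this page and the `default` namespace, is to trigger a rolling restart and then inspect the NODE column of the Pod list:

# Recreate the Pods so that kube-scheduler places them again.
kubectl rollout restart deployment topology-spread-deployment -n default

# The NODE column shows where each replacement Pod was scheduled.
kubectl get pods -n default -l app=myapp -o wide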
[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-08-31 (世界標準時間)。"],[],[],null,["This pages shows you how to resolve issues with the Kubernetes scheduler\n(`kube-scheduler`) for Google Distributed Cloud.\n\nKubernetes always schedules Pods to the same set of nodes\n\nThis error might be observed in a few different ways:\n\n- **Unbalanced cluster utilization.** You can inspect cluster utilization for\n each Node with the `kubectl top nodes` command. The following exaggerated\n example output shows pronounced utilization on certain Nodes:\n\n NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%\n XXX.gke.internal 222m 101% 3237Mi 61%\n YYY.gke.internal 91m 0% 2217Mi 0%\n ZZZ.gke.internal 512m 0% 8214Mi 0%\n\n- **Too many requests.** If you schedule a lot of Pods at once onto the same\n Node and those Pods make HTTP requests, it's possible for the Node to be rate\n limited. The common error returned by the server in this scenario is `429 Too\n Many Requests`.\n\n- **Service unavailable.** A webserver, for example, hosted on a Node under high\n load might respond to all requests with `503 Service Unavailable` errors until\n it's under lighter load.\n\nTo check if you have Pods that are always scheduled to the same nodes, use the\nfollowing steps:\n\n1. Run the following `kubectl` command to view the status of the Pods:\n\n kubectl get pods -o wide -n default\n\n To see the distribution of Pods across Nodes, check the `NODE` column in the\n output. In the following example output, all of the Pods are scheduled on the\n same Node: \n\n NAME READY STATUS RESTARTS AGE IP NODE\n nginx-deployment-84c6674589-cxp55 1/1 Running 0 55s 10.20.152.138 10.128.224.44\n nginx-deployment-84c6674589-hzmnn 1/1 Running 0 55s 10.20.155.70 10.128.226.44\n nginx-deployment-84c6674589-vq4l2 1/1 Running 0 55s 10.20.225.7 10.128.226.44\n\nPods have a number of features that allow you to fine tune their scheduling\nbehavior. These features include topology spread constraints and anti-affinity\nrules. You can use one, or a combination, of these features. The requirements\nyou define are ANDed together by `kube-scheduler`.\n\nThe scheduler logs aren't captured at the default logging verbosity level. If\nyou need the scheduler logs for troubleshooting, do the following steps to\ncapture the scheduler logs:\n\n1. Increase the logging verbosity level:\n\n 1. Edit the `kube-scheduler` Deployment:\n\n kubectl --kubeconfig \u003cvar translate=\"no\"\u003eADMIN_CLUSTER_KUBECONFIG\u003c/var\u003e edit deployment kube-scheduler \\\n -n \u003cvar translate=\"no\"\u003eUSER_CLUSTER_NAMESPACE\u003c/var\u003e\n\n 2. Add the flag `--v=5` under the `spec.containers.command` section:\n\n containers:\n - command:\n - kube-scheduler\n - --profiling=false\n - --kubeconfig=/etc/kubernetes/scheduler.conf\n - --leader-elect=true\n - --v=5\n\n2. When you are finished troubleshooting, reset the verbosity level back\n to the default level:\n\n 1. Edit the `kube-scheduler` Deployment:\n\n kubectl --kubeconfig \u003cvar translate=\"no\"\u003eADMIN_CLUSTER_KUBECONFIG\u003c/var\u003e edit deployment kube-scheduler \\\n -n \u003cvar translate=\"no\"\u003eUSER_CLUSTER_NAMESPACE\u003c/var\u003e\n\n 2. 
Set the verbosity level back to the default value:\n\n containers:\n - command:\n - kube-scheduler\n - --profiling=false\n - --kubeconfig=/etc/kubernetes/scheduler.conf\n - --leader-elect=true\n\nTopology spread constraints\n\n[Topology spread constraints](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/)\ncan be used to evenly distribute Pods among Nodes according to their `zones`,\n`regions`, `node`, or other custom-defined topology.\n\nThe following example manifest shows a Deployment that spreads replicas evenly\namong all schedulable Nodes using topology spread constraints: \n\n apiVersion: apps/v1\n kind: Deployment\n metadata:\n name: topology-spread-deployment\n labels:\n app: myapp\n spec:\n replicas: 30\n selector:\n matchLabels:\n app: myapp\n template:\n metadata:\n labels:\n app: myapp\n spec:\n topologySpreadConstraints:\n - maxSkew: 1 # Default. Spreads evenly. Maximum difference in scheduled Pods per Node.\n topologyKey: kubernetes.io/hostname\n whenUnsatisfiable: DoNotSchedule # Default. Alternatively can be ScheduleAnyway\n labelSelector:\n matchLabels:\n app: myapp\n matchLabelKeys: # beta in 1.27\n - pod-template-hash\n containers:\n # pause is a lightweight container that simply sleeps\n - name: pause\n image: registry.k8s.io/pause:3.2\n\nThe following considerations apply when using topology spread constraints:\n\n- A Pod's `labels.app: myapp` is matched by the constraint's `labelSelector`.\n- The `topologyKey` specifies `kubernetes.io/hostname`. This label is automatically attached to all Nodes and is populated with the Node's hostname.\n- The `matchLabelKeys` prevents rollouts of new Deployments from considering Pods of old revisions when calculating where to schedule a Pod. The `pod-template-hash` label is automatically populated by a Deployment.\n\nPod anti-affinity\n\n[Pod anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity)\nlets you define constraints for which Pods can be co-located on the same Node.\n\nThe following example manifest shows a Deployment that uses anti-affinity to\nlimit replicas to one Pod per Node: \n\n apiVersion: apps/v1\n kind: Deployment\n metadata:\n name: pod-affinity-deployment\n labels:\n app: myapp\n spec:\n replicas: 30\n selector:\n matchLabels:\n app: myapp\n template:\n metadata:\n name: with-pod-affinity\n labels:\n app: myapp\n spec:\n affinity:\n podAntiAffinity:\n # requiredDuringSchedulingIgnoredDuringExecution\n # prevents Pod from being scheduled on a Node if it\n # does not meet criteria.\n # Alternatively can use 'preferred' with a weight\n # rather than 'required'.\n requiredDuringSchedulingIgnoredDuringExecution:\n - labelSelector:\n matchExpressions:\n - key: app\n operator: In\n values:\n - myapp\n # Your nodes might be configured with other keys\n # to use as `topologyKey`. `kubernetes.io/region`\n # and `kubernetes.io/zone` are common.\n topologyKey: kubernetes.io/hostname\n containers:\n # pause is a lightweight container that simply sleeps\n - name: pause\n image: registry.k8s.io/pause:3.2\n\nThis example Deployment specifies `30` replicas, but only expands to as many Nodes are\navailable in your cluster.\n\nThe following considerations apply when using Pod anti-affinity:\n\n- A Pod's `labels.app: myapp` is matched by the constraint's `labelSelector`.\n- The `topologyKey` specifies `kubernetes.io/hostname`. This label is automatically attached to all Nodes and is populated with the Node's hostname. 
You can choose to use other labels if your cluster supports them, such as `region` or `zone`.\n\nPre-pull container images\n\nIn the absence of any other constraints, by default `kube-scheduler` prefers to\nschedule Pods on Nodes that already have the container image downloaded onto\nthem. This behavior might be of interest in smaller clusters without other\nscheduling configurations where it would be possible to download the images on\nevery Node. However, relying on this concept should be seen as a last resort. A\nbetter solution is to use `nodeSelector`, topology spread constraints, or\naffinity / anti-affinity. For more information, see\n[Assigning Pods to Nodes](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node).\n\nIf you want to make sure container images are pre-pulled onto all Nodes, you\ncan use a `DaemonSet` like the following example: \n\n apiVersion: apps/v1\n kind: DaemonSet\n metadata:\n name: prepulled-images\n spec:\n selector:\n matchLabels:\n name: prepulled-images\n template:\n metadata:\n labels:\n name: prepulled-images\n spec:\n initContainers:\n - name: prepulled-image\n image: \u003cvar label=\"image\" translate=\"no\"\u003e\u003cspan class=\"devsite-syntax-l devsite-syntax-l-Scalar devsite-syntax-l-Scalar-Plain\"\u003eIMAGE\u003c/span\u003e\u003c/var\u003e\n # Use a command the terminates immediately\n command: [\"sh\", \"-c\", \"'true'\"]\n containers:\n # pause is a lightweight container that simply sleeps\n - name: pause\n image: registry.k8s.io/pause:3.2\n\nAfter the Pod is `Running` on all Nodes, redeploy your Pods again to see if the\ncontainers are now evenly distributed across Nodes.\n\nWhat's next\n\nIf you need additional assistance, reach out to\n\n[Cloud Customer Care](/support-hub).\nYou can also see\n[Getting support](/kubernetes-engine/distributed-cloud/bare-metal/docs/getting-support) for more information about support resources, including the following:\n\n- [Requirements](/kubernetes-engine/distributed-cloud/bare-metal/docs/getting-support#intro-support) for opening a support case.\n- [Tools](/kubernetes-engine/distributed-cloud/bare-metal/docs/getting-support#support-tools) to help you troubleshoot, such as your environment configuration, logs, and metrics.\n- Supported [components](/kubernetes-engine/distributed-cloud/bare-metal/docs/getting-support#what-we-support)."]]