This page provides a set of recommendations for planning, architecting, deploying,
scaling, and operating large workloads on Google Kubernetes Engine (GKE) clusters. Follow
these recommendations to keep your scaling workloads within your
service-level objectives (SLOs).
Available recommendations for scalability
Before planning and designing a GKE architecture, map the parameters specific to your
workload (for example, the number of active users, the expected response time, and the
required compute resources) to the resources that Kubernetes uses (such as Pods,
Services, and CustomResourceDefinitions). With this information mapped, review
the GKE scalability recommendations.
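For example, the following sketch (using purely hypothetical numbers and a simple request-rate model, not an official sizing formula) shows one way to translate workload parameters into an approximate Pod count and aggregate resource requests, which you can then compare against the limits of your planned clusters and node pools:

```python
# A rough capacity-mapping sketch. All values and the sizing model are
# hypothetical assumptions; replace them with measurements from your workload.
import math

active_users = 50_000               # expected concurrent active users
requests_per_user_per_sec = 0.2     # average request rate per user
rps_per_pod = 100                   # measured throughput of a single Pod
cpu_per_pod_millicores = 500        # CPU request per Pod
memory_per_pod_mib = 512            # memory request per Pod

total_rps = active_users * requests_per_user_per_sec
pods_needed = math.ceil(total_rps / rps_per_pod)

print(f"Expected load:     {total_rps:,.0f} requests/s")
print(f"Pods needed:       {pods_needed}")
print(f"CPU to request:    {pods_needed * cpu_per_pod_millicores / 1000:.1f} cores")
print(f"Memory to request: {pods_needed * memory_per_pod_mib / 1024:.1f} GiB")
```

Numbers like these help you decide whether a single cluster is enough or whether you need to distribute the workload across projects and clusters, as described in the recommendations that follow.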
The scalability recommendations are divided into the following planning scopes:

- Plan for scalability: Learn general best practices for designing workloads and
  clusters that perform reliably on both small and large clusters. These
  recommendations are useful for architects, platform administrators, and
  Kubernetes developers. To learn more, see Plan for scalability.
- Plan for large GKE clusters: Learn how to plan for running very large GKE
  clusters, including the known limits of Kubernetes and GKE and ways to avoid
  reaching them. These recommendations are useful for architects and platform
  administrators. To learn more, see Plan for large GKE clusters.
- Plan for large workloads: Learn how to plan architectures that run large
  Kubernetes workloads on GKE, including recommendations for distributing the
  workload among projects and clusters and for adjusting the quotas that these
  workloads require. These recommendations are useful for architects and
  platform administrators. To learn more, see Plan for large workloads.
These scalability recommendations apply to GKE in general and to both the
GKE Standard and GKE Autopilot modes. Because GKE Autopilot provisions and manages
the cluster's underlying infrastructure for you, some recommendations don't apply
to Autopilot clusters.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[],[],null,["# About GKE Scalability\n\n[Autopilot](/kubernetes-engine/docs/concepts/autopilot-overview) [Standard](/kubernetes-engine/docs/concepts/choose-cluster-mode)\n\n*** ** * ** ***\n\nThis page provides a set of recommendations for planning, architecting, deploying, scaling, and operating large workloads on Google Kubernetes Engine (GKE) clusters. We recommend you follow these recommendations to keep your scaling workloads within [service-level objectives (SLOs)](https://landing.google.com/sre/sre-book/chapters/service-level-objectives).\n\n\u003cbr /\u003e\n\nAvailable recommendations for scalability\n-----------------------------------------\n\nBefore planning and designing a GKE architecture, map parameters specific to your\nworkload (for example the number of active users, expected response time,\nrequired compute resources) with the resources used by Kubernetes (such as Pods,\nServices, and 'CustomResourceDefinition'). With this information mapped, review\nthe GKE scalability recommendations.\n\nThe scalability recommendations are divided based in the following planning scopes:\n\n- **Plan for scalability** : To learn about the general best practices for designing your workloads and clusters for reliable performance when running on both small and large clusters. These recommendations are useful for architects, platform administrators, and Kubernetes developers. To learn more, see [Plan for scalability](/kubernetes-engine/docs/concepts/planning-scalability).\n- **Plan for large-size GKE clusters** : To learn how to plan to run very big-size GKE clusters. Learn about known limits of Kubernetes and GKE and ways to avoid reaching them. These recommendations are useful for architects and platform administrators. To learn more, see [Plan for large GKE clusters](/kubernetes-engine/docs/concepts/planning-large-clusters).\n- **Plan for large workloads** : To learn how to plan architectures that run large Kubernetes workloads on GKE. It covers recommendations for distributing the workload among projects and clusters, and adjusting these workload required quotas. These recommendations are useful for architects and platform administrators. To learn more, see [Plan for large workloads](/kubernetes-engine/docs/concepts/planning-large-workloads).\n\nThese scalability recommendations are general to GKE and are applicable to both\nGKE Standard and GKE Autopilot modes. GKE Autopilot provisions and manages\nthe cluster's underlying infrastructure for you, therefore some recommendations\nare not applicable.\n| **Caution:** Test your planned cluster configuration before its implementation. Some design decisions might include fixed parameters, for example, CIDRs definition. 
What's next?

- Plan for scalability
- Plan for large GKE clusters
- Plan for large workloads
- See our episodes about building large GKE clusters (https://www.youtube.com/watch?v=542XwAPKh4g).