Quotas and Limits

This page lists the usage quotas and limits that apply when using Knative serving.

Knative serving is subject to the Google Kubernetes Engine quotas and limits.

The number of Knative serving resources is limited by the configuration of the cluster as well as other dependencies. The following limits are recommended for a properly scaled Kubernetes Engine cluster.

Resource   Description                                               Limit        Can be increased  Scope
Services   Maximum number of services                                150          No                per cluster
Revisions  Maximum number of revisions                               300          No                per cluster
Timeout    Maximum time before timeout for 0.16.0-gke.1 and later    24 hours     No                per request
Timeout    Maximum time before timeout for 0.15.0-gke.3 and earlier  900 seconds  No                per request
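The limits in the table above can be encoded as a small lookup, for example to validate a planned deployment before applying it. The following is a minimal sketch, not part of any Knative or Google Cloud API: the function names (`parse_version`, `max_timeout_seconds`, `within_cluster_limits`) are hypothetical, and the version format `X.Y.Z-gke.N` is assumed from the two version strings shown in the table. Versions that fall between the two listed ranges are not covered by the table, so the sketch returns `None` for them rather than guessing.

```python
import re

# Values copied from the limits table above.
MAX_SERVICES_PER_CLUSTER = 150
MAX_REVISIONS_PER_CLUSTER = 300
TIMEOUT_NEW_SECONDS = 24 * 60 * 60  # 24 hours, 0.16.0-gke.1 and later
TIMEOUT_OLD_SECONDS = 900           # 0.15.0-gke.3 and earlier


def parse_version(version):
    """Turn a string such as '0.16.0-gke.1' into the tuple (0, 16, 0, 1).

    Assumption: versions always follow the 'X.Y.Z-gke.N' pattern.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def max_timeout_seconds(version):
    """Return the per-request timeout limit in seconds, or None if the
    version is not covered by either row of the table."""
    parsed = parse_version(version)
    if parsed >= (0, 16, 0, 1):
        return TIMEOUT_NEW_SECONDS
    if parsed <= (0, 15, 0, 3):
        return TIMEOUT_OLD_SECONDS
    return None


def within_cluster_limits(service_count, revision_count):
    """Check planned per-cluster counts against the recommended limits."""
    return (service_count <= MAX_SERVICES_PER_CLUSTER
            and revision_count <= MAX_REVISIONS_PER_CLUSTER)
```

For example, `max_timeout_seconds("0.15.0-gke.3")` gives the 900-second limit, while `within_cluster_limits(151, 10)` is false because the service count exceeds 150.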

Other resource limitations are imposed by the configuration of the Kubernetes Engine cluster that the services are running in. For example, you cannot request more memory than is available on the nodes in the cluster.
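The memory example can be illustrated with a rough pre-check that compares a requested amount with the memory each node offers. This is only a sketch under stated assumptions: the helper names `parse_memory` and `fits_on_a_node` are hypothetical, the quantity strings use Kubernetes-style suffixes (only whole numbers and a subset of suffixes are handled here), and the check ignores memory already consumed by other workloads, so it gives an upper bound rather than a scheduling guarantee.

```python
import re

# Multipliers for a subset of Kubernetes-style quantity suffixes.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def parse_memory(quantity):
    """Convert a quantity such as '512Mi' or '2Gi' to bytes.

    Assumption: whole-number quantities only; fractional and
    exponent forms are not handled in this sketch.
    """
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti|k|M|G|T)?", quantity)
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix or ""]


def fits_on_a_node(requested, node_memory):
    """Return True if at least one node offers enough memory for the request.

    node_memory is a list of per-node quantities, e.g. ['4Gi', '8Gi'].
    """
    needed = parse_memory(requested)
    return any(parse_memory(available) >= needed for available in node_memory)
```

With nodes offering `["2Gi", "8Gi"]`, a `4Gi` request passes the check because the second node is large enough, whereas a `16Gi` request does not fit on any node.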