Use the Google Cloud console or the Vertex AI SDK for Python to enable a Ray cluster's autoscaling feature.
### Ray on Vertex AI SDK
```python
from google.cloud import aiplatform
import vertex_ray
from vertex_ray import AutoscalingSpec, Resources

# Autoscaling bounds for the worker pool.
autoscaling_spec = AutoscalingSpec(
    min_replica_count=1,
    max_replica_count=3,
)

head_node_type = Resources(
    machine_type="n1-standard-16",
    node_count=1,
)

worker_node_types = [Resources(
    machine_type="n1-standard-16",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    autoscaling_spec=autoscaling_spec,
)]

# Create the Ray cluster on Vertex AI
CLUSTER_RESOURCE_NAME = vertex_ray.create_ray_cluster(
    head_node_type=head_node_type,
    worker_node_types=worker_node_types,
    ...
)
```
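Once the cluster exists, the autoscaler reacts to the resources that queued tasks and actors request. As an illustration only, here is a minimal sketch of exercising it from a Ray client; the `vertex_ray://` address format and the task shape are assumptions based on the common Ray on Vertex AI connection pattern, not something this page specifies.

```python
import ray
import vertex_ray  # assumption: importing vertex_ray registers the "vertex_ray://" client address handler

# Connect to the cluster created above. CLUSTER_RESOURCE_NAME is the value
# returned by vertex_ray.create_ray_cluster().
ray.init(f"vertex_ray://{CLUSTER_RESOURCE_NAME}")

@ray.remote(num_gpus=1)
def square(i: int) -> int:
    return i * i

# Each task requests one GPU. Queuing more GPU tasks than the current workers
# can serve prompts the autoscaler to add replicas, up to max_replica_count=3.
print(ray.get([square.remote(i) for i in range(8)]))
```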
### Console
In accordance with the [OSS Ray best practice](https://docs.ray.io/en/latest/cluster/vms/user-guides/large-cluster-best-practices.html#configuring-the-head-node) recommendation, the logical CPU count on the Ray head node is forced to 0 so that no workload runs on the head node (a way to observe this from a Ray client is sketched after the steps below).

1. In the Google Cloud console, go to the Ray on Vertex AI page.

   [Go to the Ray on Vertex AI page](https://console.cloud.google.com/vertex-ai/ray)
2. Click **Create cluster** to open the **Create cluster** panel.
3. For each step in the **Create cluster** panel, review or replace the default cluster information. Click **Continue** to complete each step:
   1. For **Name and region**, specify a **Name** and choose a location for your cluster.
   2. For **Compute settings**, specify the configuration of the Ray cluster on the head node, including its machine type, accelerator type and count, disk type and size, and replica count. Optionally, add a custom image URI to specify a custom container image that provides Python dependencies not included in the default container image. See [Custom image](/vertex-ai/docs/open-source/ray-on-vertex-ai/create-cluster#custom-image).

      Under **Advanced options**, you can:
      - Specify your own encryption key.
      - Specify a [custom service account](/vertex-ai/docs/general/custom-service-account).
      - If you don't need to monitor the resource statistics of your workload during training, disable metrics collection.
   3. To create a cluster with an autoscaling worker pool, provide a value for the worker pool's maximum replica count.
4. Click **Create**.
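To observe the enforced head-node configuration from a connected client, one hedged option is to list the node resources that Ray reports; the connection pattern below repeats the earlier sketch and is an assumption, not part of the console flow.

```python
import ray
import vertex_ray  # assumed client integration, as in the earlier sketch

ray.init(f"vertex_ray://{CLUSTER_RESOURCE_NAME}")

# Print each node's advertised resources. The head node is expected to report
# 0 logical CPUs, so tasks and actors are scheduled only on worker nodes.
for node in ray.nodes():
    print(node["NodeManagerAddress"], node["Resources"])
```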
Manual scaling
--------------

As your workloads surge or decrease on your Ray clusters on Vertex AI, manually scale the number of replicas to match demand. For example, if you have excess capacity, scale down your worker pools to save costs.

### Limitations with VPC Peering

When you scale clusters, you can change only the number of replicas in your existing worker pools. For example, you can't add or remove worker pools from your cluster or change the machine type of your worker pools. Also, the number of replicas for your worker pools can't be lower than one.

If you use a VPC peering connection to connect to your clusters, a limitation exists on the maximum number of nodes. The maximum number of nodes depends on the number of nodes the cluster had when you created it. For more information, see [Max number of nodes calculation](#calc). This maximum number includes not just your worker pools but also your head node. If you use the default network configuration, the number of nodes can't exceed the upper limits described in the [create clusters](/vertex-ai/docs/open-source/ray-on-vertex-ai/create-cluster) documentation.

#### Maximum number of nodes calculation

If you use private services access (VPC peering) to connect to your nodes, use the following formulas to check that you don't exceed the maximum number of nodes (`M`), assuming `f(x) = min(29, 32 - ceiling(log2(x)))`:

- `f(2 * M) = f(2 * N)`
- `f(64 * M) = f(64 * N)`
- `f(max(32, 16 + M)) = f(max(32, 16 + N))`

The maximum total number of nodes that the Ray on Vertex AI cluster can scale up to (`M`) depends on the initial total number of nodes you set up (`N`). After you create the Ray on Vertex AI cluster, you can scale the total number of nodes to any amount between `P` and `M` inclusive, where `P` is the number of pools in your cluster.
The initial total number of nodes in the cluster and the scale-up target number must be in the same color block.
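For illustration only, the following brute-force helper (a name of our own, not part of any SDK) evaluates the three constraints above to find the largest `M` reachable from a given initial node count `N`.

```python
import math

def f(x: int) -> int:
    # f(x) = min(29, 32 - ceiling(log2(x))), as defined above.
    return min(29, 32 - math.ceil(math.log2(x)))

def max_scalable_nodes(n: int, search_limit: int = 2000) -> int:
    """Largest M (up to search_limit) for which all three constraints hold."""
    best = n
    for m in range(n, search_limit + 1):
        if (f(2 * m) == f(2 * n)
                and f(64 * m) == f(64 * n)
                and f(max(32, 16 + m)) == f(max(32, 16 + n))):
            best = m
    return best

# Example: a cluster created with 30 total nodes.
print(max_scalable_nodes(30))  # prints the upper bound M implied by the formulas
```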
### Update replica count
Use the Google Cloud console or the Vertex AI SDK for Python to update your worker pool's replica count. If your cluster includes multiple worker pools, you can individually change each of their replica counts in a single request.
### Ray on Vertex AI SDK
```python
import vertexai
import vertex_ray

vertexai.init()
cluster = vertex_ray.get_ray_cluster("CLUSTER_NAME")

# Get the resource name.
cluster_resource_name = cluster.cluster_resource_name

# Create the new worker pools
new_worker_node_types = []
for worker_node_type in cluster.worker_node_types:
    worker_node_type.node_count = REPLICA_COUNT  # new worker pool size
    new_worker_node_types.append(worker_node_type)

# Make update call
updated_cluster_resource_name = vertex_ray.update_ray_cluster(
    cluster_resource_name=cluster_resource_name,
    worker_node_types=new_worker_node_types,
)
```
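The loop above applies the same `REPLICA_COUNT` to every pool. Because each pool's count can be changed individually in one request, a hedged variant is sketched below; the per-pool target sizes are hypothetical, and the read-back at the end simply reuses the calls already shown.

```python
import vertexai
import vertex_ray

vertexai.init()
cluster = vertex_ray.get_ray_cluster("CLUSTER_NAME")

# Illustrative: give each worker pool its own target size in a single update call.
target_sizes = [2, 5]  # hypothetical sizes, one entry per worker pool
new_worker_node_types = []
for worker_node_type, size in zip(cluster.worker_node_types, target_sizes):
    worker_node_type.node_count = size
    new_worker_node_types.append(worker_node_type)

updated_cluster_resource_name = vertex_ray.update_ray_cluster(
    cluster_resource_name=cluster.cluster_resource_name,
    worker_node_types=new_worker_node_types,
)

# Read back the cluster and print each pool's size to confirm the change.
updated_cluster = vertex_ray.get_ray_cluster(updated_cluster_resource_name)
for worker_node_type in updated_cluster.worker_node_types:
    print(worker_node_type.machine_type, worker_node_type.node_count)
```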
### Console

1. In the Google Cloud console, go to the Ray on Vertex AI page.

   [Go to the Ray on Vertex AI page](https://console.cloud.google.com/vertex-ai/ray)
2. From the list of clusters, click the cluster to modify.
3. On the **Cluster details** page, click **Edit cluster**.
4. In the **Edit cluster** pane, select the worker pool to update and then modify the replica count.
5. Click **Update**.

   Wait a few minutes for your cluster to update. When the update is complete, you can see the updated replica count on the **Cluster details** page.