Try Gemini 1.5 models, our newest multimodal models in Vertex AI, and see what you can build with a 1M token context window.

Create a Ray cluster on Vertex AI

You can use the Google Cloud console or the Vertex AI SDK for Python to create a Ray cluster. A cluster can have up to 2,000 nodes. There is an upper limit of 1,000 nodes within one worker pool. There's no limit on the number of worker pools, but having a large number of worker pools, such as having 1,000 worker pools with one node each, can negatively affect cluster performance.
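The size limits above can be checked before you request a cluster. The following is a minimal sketch, not part of the Vertex AI SDK: `PoolSpec` and `validate_cluster_size` are hypothetical names, and the sketch assumes the head node counts toward the 2,000-node cluster limit, which this page doesn't state.

```python
from dataclasses import dataclass
from typing import List

MAX_CLUSTER_NODES = 2000   # cluster-wide limit stated above
MAX_NODES_PER_POOL = 1000  # per-worker-pool limit stated above


@dataclass
class PoolSpec:
    """Hypothetical stand-in for the settings of one worker pool."""
    machine_type: str
    node_count: int


def validate_cluster_size(head_node_count: int, worker_pools: List[PoolSpec]) -> List[str]:
    """Return a list of problems; an empty list means the sizes fit the limits."""
    problems = []
    for index, pool in enumerate(worker_pools):
        if pool.node_count < 1:
            problems.append(f"worker pool {index}: node_count must be >= 1")
        if pool.node_count > MAX_NODES_PER_POOL:
            problems.append(
                f"worker pool {index}: {pool.node_count} nodes exceeds "
                f"the per-pool limit of {MAX_NODES_PER_POOL}"
            )
    # Assumption: the head node is included in the cluster-wide total.
    total = head_node_count + sum(pool.node_count for pool in worker_pools)
    if total > MAX_CLUSTER_NODES:
        problems.append(
            f"cluster: {total} nodes exceeds the limit of {MAX_CLUSTER_NODES}"
        )
    return problems
```

For example, `validate_cluster_size(1, [PoolSpec("n1-standard-16", 2)])` returns an empty list, while a single pool of 1,001 nodes produces one problem message.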

Before you begin, make sure to read the Ray on Vertex AI overview and set up all the prerequisite tools you need.

A Ray cluster on Vertex AI may take 10-20 minutes to start up after you create it.

Console

In the Google Cloud console, go to the Ray on Vertex AI page.

Click Create Cluster to open the Create Cluster panel.
For each step in the Create Cluster panel, review or replace the default cluster information. Click Continue to complete each step:
1. For Name and region, specify a Name and choose a Location for your cluster.
2. For Compute settings, specify the configuration of the head node of the Ray cluster on Vertex AI, including its machine type, accelerator type and count, disk type and size, and replica count. Optionally, you can add a custom image URI to specify a custom container image that adds Python dependencies not provided by the default container image. See Custom image.
  
  Under Advanced options, you can:
  - Specify your own encryption key.
  - Specify a custom service account.
  - Disable metrics collection, if you won't be using model monitoring.
3. (Optional) To set a private endpoint instead of a public endpoint for your cluster, specify a VPC network to use with Ray on Vertex AI. For more information, see Private and public connectivity.
  
  If you haven't set up a connection for your VPC network, click Set up connection. In the Create a private services access connection panel, complete each of the following steps, clicking Continue after each one:
  1. Enable the Service Networking API.
  2. For Allocate an IP range, you can select, create, or allow Google to automatically allocate an IP range.
  3. For Create a connection, review the Network and Allocated IP Range information.
  4. Click Create connection.
Click Create.

Ray on Vertex AI SDK

From an interactive Python environment, use the following to create the Ray cluster on Vertex AI:

import ray
import vertex_ray
from google.cloud import aiplatform
from vertex_ray import Resources

# Define a default CPU cluster: machine type n1-standard-16, with 1 head node and 1 worker node.
head_node_type = Resources()
worker_node_types = [Resources()]

# Or define a GPU cluster. These assignments replace the defaults above.
head_node_type = Resources(
  machine_type="n1-standard-16",
  node_count=1,
  custom_image="us-docker.pkg.dev/my-project/ray-custom.2-9.py310:latest",  # Optional. When not specified, a prebuilt image is used.
)

worker_node_types = [Resources(
  machine_type="n1-standard-16",
  node_count=2,  # Must be >= 1
  accelerator_type="NVIDIA_TESLA_T4",
  accelerator_count=1,
  custom_image="us-docker.pkg.dev/my-project/ray-custom.2-9.py310:latest",  # When not specified, a prebuilt image is used.
)]

# Initialize Vertex AI to retrieve projects for downstream operations.
aiplatform.init()

# Create the Ray cluster on Vertex AI.
CLUSTER_RESOURCE_NAME = vertex_ray.create_ray_cluster(
  head_node_type=head_node_type,
  network=NETWORK,  # Optional
  worker_node_types=worker_node_types,
  python_version="3.10",  # Optional
  ray_version="2.9",  # Optional
  cluster_name=CLUSTER_NAME,  # Optional
  service_account=SERVICE_ACCOUNT,  # Optional
  enable_metrics_collection=True,  # Optional. Enable metrics collection for monitoring.
  labels=LABELS,  # Optional
)

Where:

CLUSTER_NAME: A name for the Ray cluster on Vertex AI that must be unique across your project.
NETWORK: (Optional) The full name of your VPC network, in the format of projects/PROJECT_ID/global/networks/VPC_NAME. To set a private endpoint instead of a public endpoint for your cluster, specify a VPC network to use with Ray on Vertex AI. For more information, see Private and public connectivity.
VPC_NAME: (Optional) The VPC on which the VM is operating.
PROJECT_ID: Your Google Cloud project ID. You can find the project ID in the Google Cloud console welcome page.
SERVICE_ACCOUNT: (Optional) The service account used to run Ray applications on the cluster. Grant the required roles to this service account.
LABELS: (Optional) The labels with user-defined metadata used to organize Ray clusters. Label keys and values can be no longer than 64 characters (Unicode codepoints), and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
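The label rules above can be checked locally before passing LABELS to the SDK. This is a minimal sketch under one reading of those rules: `validate_labels` is a hypothetical helper, not an SDK function, and it treats "lowercase letters" as any Unicode letter that has no uppercase form or is already lowercase, so that international characters pass.

```python
from typing import Dict, List

MAX_LABEL_LENGTH = 64  # Unicode code points, per the description above


def _is_allowed_char(ch: str) -> bool:
    """Allow underscores, dashes, digits, and letters that are not uppercase."""
    if ch in ("_", "-"):
        return True
    if ch.isdigit():
        return True
    # Letters without case (for example, CJK) equal their own lowercase form.
    return ch.isalpha() and ch == ch.lower()


def _check(kind: str, text: str) -> List[str]:
    """Check one key or value against the length and character rules."""
    problems = []
    if len(text) > MAX_LABEL_LENGTH:
        problems.append(f"{kind} {text!r} is longer than {MAX_LABEL_LENGTH} characters")
    bad = sorted({ch for ch in text if not _is_allowed_char(ch)})
    if bad:
        problems.append(f"{kind} {text!r} contains disallowed characters: {bad}")
    return problems


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every label passed."""
    problems = []
    for key, value in labels.items():
        problems += _check("key", key)
        problems += _check("value", value)
    return problems
```

With this sketch, `validate_labels({"team": "ml-platform", "env": "dev_1"})` returns an empty list, while a key such as `"Team"` is reported because of the uppercase letter. The service may enforce additional rules that this page doesn't list.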

You should see the following output until the status changes to RUNNING:

[Ray on Vertex AI]: Cluster State = State.PROVISIONING
Waiting for cluster provisioning; attempt 1; sleeping for 0:02:30 seconds
...
[Ray on Vertex AI]: Cluster State = State.RUNNING
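The output above suggests that the SDK call polls the cluster state on its own while it waits. If you need the same wait-and-retry pattern elsewhere, such as in a script that checks on a cluster created from the console, the following sketch shows the general idea. It is an illustration only: `wait_until_running`, the `get_state` callable, the `"RUNNING"` string, and the default of 10 attempts are all assumptions, and only the 2 minute 30 second interval is taken from the output above.

```python
import time
from datetime import timedelta
from typing import Callable


def wait_until_running(
    get_state: Callable[[], str],
    poll_seconds: float = 150.0,  # 0:02:30, the interval shown in the output above
    max_attempts: int = 10,       # hypothetical cap, not an SDK setting
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_state until it returns "RUNNING"; return the number of polls used."""
    for attempt in range(1, max_attempts + 1):
        state = get_state()
        print(f"Cluster state = {state}")
        if state == "RUNNING":
            return attempt
        print(
            f"Waiting for cluster provisioning; attempt {attempt}; "
            f"sleeping for {timedelta(seconds=poll_seconds)}"
        )
        sleep(poll_seconds)
    raise TimeoutError(f"cluster not running after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays. `get_state` is whatever function you supply to look up the current cluster state.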

Note the following:

The first node is used as the head node.
TPU machine types are not supported.

Custom image (optional)

Prebuilt images suit most use cases. If you want to build your own image, we recommend that you use the Ray on Vertex AI prebuilt images as a base image. See the Docker documentation for how to build your images from a base image.

These base images include an installation of Python, Ubuntu, and Ray. They also include dependencies such as:

python-json-logger
google-cloud-resource-manager
ca-certificates-java
libatlas-base-dev
liblapack-dev
g++
libio-all-perl
libyaml-0-2
rsync

If you want to build your own image without our base image (advanced), be sure that your image includes:

Ray 2.9.3
Python 3.10
python-json-logger==2.0.7
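One way to confirm that a custom image meets these requirements is to run a small check inside the container. The following is a sketch under stated assumptions: `find_mismatches` and `installed_versions` are hypothetical helpers, the Ray distribution is looked up under its pip name `ray`, and the versions are compared as exact matches, which may be stricter than the service requires.

```python
import sys
from importlib import metadata
from typing import Dict, List, Optional, Sequence

REQUIRED_PYTHON = (3, 10)
REQUIRED_PACKAGES = {"ray": "2.9.3", "python-json-logger": "2.0.7"}


def find_mismatches(
    python_version: Sequence[int], installed: Dict[str, Optional[str]]
) -> List[str]:
    """Compare the given versions with the requirements listed above."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python is {found}, expected 3.10")
    for name, wanted in REQUIRED_PACKAGES.items():
        found_version = installed.get(name)
        if found_version is None:
            problems.append(f"{name} is not installed, expected {wanted}")
        elif found_version != wanted:
            problems.append(f"{name} is {found_version}, expected {wanted}")
    return problems


def installed_versions() -> Dict[str, Optional[str]]:
    """Look up the installed version of each required package, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in REQUIRED_PACKAGES:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# Inside the image, you could run:
#   print(find_mismatches(sys.version_info, installed_versions()))
```

An empty list means the interpreter and both packages match the versions listed above.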

Private and public connectivity

By default, Ray on Vertex AI creates a public, secure endpoint for interactive development with the Ray Client on Ray clusters on Vertex AI. It's recommended that you use public connectivity for development or ephemeral use cases. This public endpoint is accessible through the internet. Only authorized users who have, at a minimum, Vertex AI user role permissions on the Ray cluster's user project can access the cluster.

If you require a private connection to your cluster or if you're using VPC Service Controls, VPC peering is supported for Ray clusters on Vertex AI. Clusters with a private endpoint are only accessible from a client within a VPC network that is peered with Vertex AI.

To set up private connectivity with VPC Peering for Ray on Vertex AI, select a VPC network when you create your cluster. The VPC network requires a private services connection between your VPC network and Vertex AI. If you're using Ray on Vertex AI in the console, you can set up your private services access connection when creating the cluster.
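When you create the cluster from the SDK, the VPC network is passed as a full resource name in the format projects/PROJECT_ID/global/networks/VPC_NAME. A small helper can build that string and catch obvious mistakes. This is a sketch only: `vpc_network_name` is a hypothetical function, and the example project and network names are placeholders.

```python
def vpc_network_name(project_id: str, vpc_name: str) -> str:
    """Build the full VPC network name in the format described on this page."""
    for label, value in (("project_id", project_id), ("vpc_name", vpc_name)):
        if not value:
            raise ValueError(f"{label} must not be empty")
        if "/" in value:
            # A slash usually means a full path was passed instead of a bare name.
            raise ValueError(f"{label} must be a bare name, got {value!r}")
    return f"projects/{project_id}/global/networks/{vpc_name}"


# Placeholder values for illustration only.
NETWORK = vpc_network_name("my-project", "my-vpc")
```

The resulting `NETWORK` value is what you would pass as the `network` argument when creating the cluster.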

After you create your Ray cluster on Vertex AI, you can connect to the head node using the Vertex AI SDK for Python. The connecting environment, such as a Compute Engine VM or Vertex AI Workbench instance, must be in the VPC network that is peered with Vertex AI. Note that a private services connection has a limited number of IP addresses, which could result in IP address exhaustion. It's therefore recommended to use private connections for long-running clusters.
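A connection from a client in the peered network might look like the following sketch. It assumes that the Ray Client address uses a `vertex_ray://` prefix followed by the cluster resource name and that importing `vertex_ray` is what makes Ray recognize that prefix. Neither detail is stated on this page, so confirm both against the connection guide before relying on them. The helper names are hypothetical.

```python
def ray_client_address(cluster_resource_name: str) -> str:
    """Build the Ray Client address (assumed vertex_ray:// prefix)."""
    if not cluster_resource_name:
        raise ValueError("cluster_resource_name must not be empty")
    return f"vertex_ray://{cluster_resource_name}"


def connect(cluster_resource_name: str):
    """Connect to the cluster head node. Run this from inside the peered VPC."""
    # Imports live here so that the address helper works without the SDK installed.
    import ray
    import vertex_ray  # noqa: F401  (assumed to register the vertex_ray:// prefix)
    from google.cloud import aiplatform

    aiplatform.init()
    return ray.init(ray_client_address(cluster_resource_name))


# Usage, with the value returned by vertex_ray.create_ray_cluster:
#   connect(CLUSTER_RESOURCE_NAME)
```

For a cluster with a private endpoint, this call succeeds only when it runs from an environment in the peered VPC network, such as a Compute Engine VM or a Vertex AI Workbench instance.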

Notebook Tutorial

Get started with Gemma on Ray on Vertex AI

What's next

Develop a Ray application on Vertex AI