Ray on Vertex AI overview

Ray is an open-source framework for scaling AI and Python applications. Ray provides the infrastructure to perform distributed computing and parallel processing for your machine learning (ML) workflow.

Ray and Vertex AI comparison

If you already use Ray, you can run the same open source Ray code on Vertex AI with minimal changes to write programs and develop applications. You can then use Vertex AI's integrations with other Google Cloud services, such as Vertex AI Prediction and BigQuery, as part of your machine learning workflow.

If you already use Vertex AI and need a simpler way to scale compute resources, you can use Ray code to optimize the performance of your training, hyperparameter tuning, prediction, and online serving steps.

Workflow for using Ray on Vertex AI

The process for using Ray on Vertex AI is as follows:

1. Set up for Ray on Vertex AI: Set up your Google Cloud project, install the version of the Vertex AI SDK for Python that includes Ray Client functionality, and set up a VPC peering network.
2. Create a Ray cluster on Vertex AI: Create the cluster with the machine configuration you need.
3. Develop a Ray application on Vertex AI: Connect to the Ray cluster on Vertex AI and develop an application.
4. (Optional) Use Ray on Vertex AI with BigQuery: Read, write, and transform data with BigQuery.
5. (Optional) Deploy a model on Vertex AI and get predictions: Deploy a model to a Vertex AI online endpoint and get predictions.
6. View logs for your Ray cluster on Vertex AI: View the generated logs in Cloud Logging.
7. Delete a Ray cluster on Vertex AI: Delete the Ray cluster to avoid unnecessary billing.

Architecture

The following diagram shows the architecture and workflow for Ray on Vertex AI after you set up your Google Cloud project and VPC network:

Ray on Vertex AI architecture

  1. Create the Ray cluster on Vertex AI using one of the following options:

    • Use the Google Cloud console.

    • Use the Vertex AI SDK for Python.

  2. Connect to the Ray cluster on Vertex AI through the VPC peering network.

  3. Develop your application and train your model on the Ray cluster on Vertex AI using the following options:

    • Use the Vertex AI SDK for Python in your preferred environment (Colab Enterprise or a Vertex AI Workbench notebook).

    • Write a Python script using your preferred environment. Submit a Ray Job to the Ray cluster on Vertex AI using the Vertex AI SDK for Python, Ray Job CLI, or Ray dashboard.

  4. Deploy the trained model to an online Vertex AI endpoint for predictions.

  5. Use BigQuery to manage your data.

Pricing

Pricing for Ray on Vertex AI is calculated as follows:

  • The compute resources you use are charged based on the machine configuration you select when you create your Ray cluster on Vertex AI. During Preview, Ray on Vertex AI usage is charged at the same rate as custom-trained models. After General Availability (GA), the price will increase to reflect Ray on Vertex AI pricing, and you'll be notified of the change when Ray on Vertex AI reaches GA.

  • When you perform tasks using the Ray cluster on Vertex AI, logs are automatically generated and charged based on Cloud Logging pricing.

  • If you deploy your model to an endpoint for online predictions, see the "Prediction and explanation" section of the Vertex AI pricing page.

  • If you use BigQuery with Ray on Vertex AI, see BigQuery pricing.

What's next