Provision GPUs and enable Vertex AI pre-trained APIs

This page walks you through enabling the Vertex AI pre-trained APIs on Google Distributed Cloud (GDC) air-gapped so that you can start using Vertex AI capabilities. It also describes how to provision graphics processing unit (GPU) resources for containers so that you can run artificial intelligence (AI) and machine learning (ML) workloads in a GPU environment.

If you lack the necessary permissions, ask your administrator to enable the Vertex AI pre-trained APIs on your behalf.

Vertex AI on GDC includes three pre-trained APIs, one for each of its pre-trained models: Optical Character Recognition (OCR), Speech-to-Text, and Translation. To learn more about these models, see the documentation for each service.

Vertex AI on GDC also includes the following services, which provide their own APIs:

Use the GDC console to enable, disable, and view the endpoints of the Vertex AI pre-trained APIs.

Ask an administrator to set up Vertex AI for you

Most tasks to enable or disable Vertex AI pre-trained APIs and create clusters require platform administrator access. This section describes how an administrator obtains the roles to create clusters with GPUs and manage Vertex AI pre-trained APIs.

To set up a cluster and enable the Vertex AI pre-trained APIs so that an organization can run AI and ML workloads in a project namespace, an administrator must complete the following steps:

  1. Set up a project to use Vertex AI.
  2. Ask your Organization IAM Admin to grant you the following roles:

    • To enable Vertex AI pre-trained APIs, obtain the AI Platform Admin (ai-platform-admin) role in the project namespace.
    • To create a user cluster with GPUs, obtain the User Cluster Admin (user-cluster-admin) role.

    For information about these roles, see Predefined role descriptions. To learn how to grant permissions to a subject, see Grant and revoke access.

  3. Create a cluster that supports GPU container workloads. GPU resources on a cluster let developers run AI and ML models.

  4. Allocate GPU machines for the correct cluster types.

  5. Enable Vertex AI pre-trained APIs by following the instructions in this document.

Before you begin

Follow these steps before enabling the pre-trained APIs:

  1. Ensure that your project has adequate ingress communication configured. For more information, see Configure a project network policy.
  2. Enable GPU resources on your containers to run GPU workloads.
  3. Set up the GDC domain name system (DNS). If you haven't set up the DNS, work with your Infrastructure Operator (IO) to complete this prerequisite.
  4. To enable Vertex AI pre-trained APIs, ask your Organization IAM Admin to grant you the AI Platform Admin (ai-platform-admin) role in your project namespace.
  5. Sign in to the GDC console. If you can't sign in, see Connect to an identity provider.
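
The GPU prerequisite above can be pictured with a short example. The following Python sketch builds a Kubernetes Pod manifest whose container requests a GPU through its resource limits. It is an illustration under stated assumptions, not the GDC procedure: the resource name `nvidia.com/gpu` is the conventional NVIDIA device-plugin name in upstream Kubernetes and might differ on your GDC cluster, and the pod name, namespace, and image are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name, namespace, image,
                     gpu_resource="nvidia.com/gpu", gpu_count=1):
    """Build a Pod manifest (as a dict) whose container requests GPUs.

    gpu_resource is an assumption: check which GPU resource name your
    cluster actually exposes before using it.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested through resource limits.
                    "resources": {"limits": {gpu_resource: gpu_count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All three arguments are hypothetical placeholder values.
    manifest = gpu_pod_manifest(
        "gpu-smoke-test", "my-project", "registry.example/cuda-sample:latest"
    )
    print(json.dumps(manifest, indent=2))
```

The script prints the manifest as JSON, which Kubernetes tooling accepts in place of YAML. Returning a plain dictionary keeps the sketch free of third-party dependencies.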

Enable pre-trained APIs

You can enable the OCR, Speech-to-Text, and Translation pre-trained APIs using the GDC console.

After meeting the prerequisites, follow these steps to enable the pre-trained APIs:

  1. Sign in to the GDC console.
  2. On the navigation menu, click Vertex AI > Pre-trained APIs.
  3. On the Pre-trained APIs page, perform one of the following actions:

    • Click Enable all APIs to enable all the pre-trained APIs.
    • Click Enable on a specific service to enable that API.
  4. In the confirmation dialog, click Enable. A progress message displays.

Enablement can take between 15 and 45 minutes, depending on the state of the cluster.

To view the status of the pre-trained APIs, see View service status and endpoints.

The VAI-A0001 alert (Enabling State Time Limit Reached) triggers if the services remain in the enabling state beyond the time limit. If this alert triggers, ask your Infrastructure Operator (IO) to review the VAI-R0001 runbook for details.
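
The waiting period described above can be modeled as a bounded polling loop. The following Python sketch is illustrative only and doesn't call any GDC API: the `get_state` callable and the `"ENABLED"` state string are hypothetical placeholders for however you check the service status, and the 45-minute default reflects the upper end of the enablement range stated on this page, not the alert's actual threshold.

```python
import time


def wait_until_enabled(get_state, timeout_s=45 * 60, poll_s=60,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it reports "ENABLED" or the timeout elapses.

    get_state is a hypothetical callable that returns a state string.
    Returns True if the service became enabled, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "ENABLED":
            return True
        if clock() >= deadline:
            # Time limit reached: escalate to your Infrastructure Operator.
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated status source: enabling twice, then enabled.
    states = iter(["ENABLING", "ENABLING", "ENABLED"])
    ok = wait_until_enabled(lambda: next(states), timeout_s=5, poll_s=0)
    print("enabled" if ok else "timed out")
```

Injecting `clock` and `sleep` as parameters lets you exercise the timeout path without waiting in real time.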

Disable pre-trained APIs

You can disable the OCR, Speech-to-Text, and Translation pre-trained APIs using the GDC console.

After meeting the prerequisites, follow these steps to disable the pre-trained APIs:

  1. Sign in to the GDC console.
  2. On the navigation menu, click Vertex AI > Pre-trained APIs.
  3. On the Pre-trained APIs page, perform one of the following actions:

    • Click Disable all APIs to disable all the pre-trained APIs.
    • Click Disable on a specific service to disable that API.
  4. In the confirmation dialog, enter disable in the text field to confirm the action, and then click Disable. A progress message displays.

To view the status of the pre-trained APIs, see View service status and endpoints.