Overview of partner models on Vertex AI

Vertex AI supports a curated list of models developed by Google partners. Partner models are available on Vertex AI as model as a service (MaaS) offerings and are exposed as managed APIs. When you use a partner model, you continue to send your requests to Vertex AI endpoints. Partner models are serverless, so there's no need to provision or manage infrastructure.

You can discover and deploy partner models in Model Garden. For more information, see Explore AI models in Model Garden. Each available partner model has a model card in Model Garden, but this guide documents only the third-party models that are offered as MaaS on Vertex AI.

Anthropic Claude and Mistral models are examples of third-party managed models that are available to use on Vertex AI.
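Because requests for partner models still go to Vertex AI endpoints, the request URL can be assembled from the project, region, publisher, and model. The following is a minimal sketch, not a definitive reference: the helper name partner_model_url is hypothetical, and the URL shape, the default rawPredict method, and the anthropic publisher ID are assumptions that you should confirm on the model card. The request body format is defined by each partner and isn't shown here.

```shell
# Hypothetical helper: build the request URL for a partner model.
# Assumption: the publisher-model URL shape below; confirm it on the model card.
# Arguments: PROJECT_ID LOCATION PUBLISHER MODEL_ID [METHOD]
partner_model_url() {
  printf 'https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/%s/models/%s:%s\n' \
    "$2" "$1" "$2" "$3" "$4" "${5:-rawPredict}"
}

# Example call, left commented out because it needs credentials, an enabled
# model, and a request body (request.json) in the partner's own format:
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json; charset=utf-8" \
#   -d @request.json \
#   "$(partner_model_url PROJECT_ID LOCATION anthropic MODEL_ID)"
```

Keeping URL construction in one function makes it easy to switch regions or models without editing the curl command itself.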

Enable partner models for users

Before you can enable partner models and make a prompt request, a Google Cloud administrator must set the required permissions and verify that the organization policy allows the use of the required APIs.

Set required permissions

The following roles and permissions are required to use partner models:

  • You must have the Consumer Procurement Entitlement Manager Identity and Access Management (IAM) role. Anyone who's been granted this role can enable partner models in Model Garden.

  • You must have the aiplatform.endpoints.predict permission. This permission is included in the Vertex AI User IAM role. For more information, see Vertex AI User and Access control.

Console

  1. To grant the Consumer Procurement Entitlement Manager and Vertex AI User IAM roles to a user, go to the IAM page.

    Go to IAM

  2. In the Principal column, find the user principal for which you want to enable access to Anthropic Claude models, and then click Edit principal in that row.

  3. In the Edit access pane, click Add another role.

  4. In Select a role, select Consumer Procurement Entitlement Manager.

  5. In the Edit access pane, click Add another role.

  6. In Select a role, select Vertex AI User.

  7. Click Save.

gcloud

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

  2. Grant the Consumer Procurement Entitlement Manager role that's required to enable Anthropic Claude models in Model Garden:

    gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=PRINCIPAL --role=roles/consumerprocurement.entitlementManager
    
  3. Grant the Vertex AI User role, which includes the aiplatform.endpoints.predict permission that's required to make prompt requests:

    gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=PRINCIPAL --role=roles/aiplatform.user
    

    Replace PROJECT_ID with the ID of your project and PRINCIPAL with the identifier for the principal. The identifier takes the form user|group|serviceAccount:email or domain:domain. For example: user:cloudysanfrancisco@gmail.com, group:admins@example.com, serviceAccount:test123@example.domain.com, or domain:example.domain.com.

    The output is a list of policy bindings that includes the following:

    - members:
      - PRINCIPAL
      role: roles/consumerprocurement.entitlementManager
    

    For more information, see Grant a single role and gcloud projects add-iam-policy-binding.
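
After granting the roles, it can help to confirm that both bindings exist for the principal. The following is a minimal sketch under stated assumptions: the function names list_principal_roles and has_partner_model_roles are hypothetical, and the check only inspects bindings set directly on the project, so roles inherited from a folder, an organization, or a group membership won't appear.

```shell
# Hypothetical helper: list the roles bound to one principal on a project.
# Arguments: PROJECT_ID PRINCIPAL (for example, user:alice@example.com)
list_principal_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$2" \
    --format="value(bindings.role)"
}

# Hypothetical helper: read role names on stdin (one per line) and succeed
# only if both roles that this guide requires are present.
has_partner_model_roles() {
  roles="$(cat)"
  for required in roles/consumerprocurement.entitlementManager roles/aiplatform.user; do
    printf '%s\n' "$roles" | grep -qxF "$required" || return 1
  done
}

# Example use (needs gcloud credentials, so it is left commented out):
# list_principal_roles PROJECT_ID PRINCIPAL | has_partner_model_roles \
#   && echo "Both roles are bound on the project."
```

Splitting the lookup from the check keeps the second function free of gcloud calls, so it can be exercised with any list of role names.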

Set the organization policy

To enable partner models, your organization policy must allow the following APIs:

  • Cloud Commerce Consumer Procurement API - cloudcommerceconsumerprocurement.googleapis.com
  • Commerce Agreement API - commerceagreement.googleapis.com

If your organization sets an organization policy to restrict service usage, then an organization administrator must verify that cloudcommerceconsumerprocurement.googleapis.com and commerceagreement.googleapis.com are allowed by setting the organization policy.
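
An administrator can inspect the effective policy from the command line before enabling a partner model. The following is a rough sketch, not a definitive procedure: it assumes that the restriction is implemented with the gcp.restrictServiceUsage constraint, which this guide doesn't name, and the function names are hypothetical. The second function only reports which required APIs appear in the policy text. Whether a match means allowed or denied depends on whether the value is listed under allowedValues or deniedValues.

```shell
# The two APIs that this guide says the organization policy must allow.
REQUIRED_APIS="cloudcommerceconsumerprocurement.googleapis.com commerceagreement.googleapis.com"

# Hypothetical helper: print the effective service-usage restriction policy.
# Assumption: the restriction uses the gcp.restrictServiceUsage constraint.
# Argument: PROJECT_ID
show_service_usage_policy() {
  gcloud org-policies describe gcp.restrictServiceUsage --effective --project="$1"
}

# Hypothetical helper: read policy text on stdin and print each required API
# that is mentioned anywhere in it, one per line.
required_apis_mentioned() {
  policy="$(cat)"
  for api in $REQUIRED_APIS; do
    case "$policy" in
      *"$api"*) printf '%s\n' "$api" ;;
    esac
  done
}

# Example use (needs gcloud credentials, so it is left commented out):
# show_service_usage_policy PROJECT_ID | required_apis_mentioned
```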

Predictable performance with capacity assurance

Google offers provisioned throughput for some partner models, which reserves throughput capacity for your models for a fixed fee. You decide how much throughput capacity to reserve and in which regions to reserve it. Because provisioned throughput requests are prioritized over standard pay-as-you-go requests, provisioned throughput provides increased availability. When the system is overloaded, your requests can still be completed as long as your throughput stays under your reserved capacity. For more information or to subscribe to the service, contact sales.