This page provides details to consider before subscribing to Provisioned Throughput, the permissions you must have to place or to view a Provisioned Throughput order, and the instructions for placing and viewing your orders.
What to consider before purchasing
To help you decide whether you want to purchase Provisioned Throughput, consider the following:
You can't cancel your order in the middle of your term.
Your Provisioned Throughput purchase is a commitment, which means that you can't cancel the order in the middle of your term. However, you can increase the number of purchased GSUs. If you accidentally purchase a commitment or there's a problem with your configuration, contact your Google Cloud account representative for assistance.
You can auto-renew your subscription.
When you submit your order, you can choose to auto-renew your subscription at the end of its term or let the subscription expire. You can cancel auto-renewal after the order is placed. To stop your subscription from renewing, cancel auto-renewal 30 days before the start of the next term.
You can configure monthly subscriptions to renew automatically each month. Weekly terms don't support automatic renewal.
For more information, see Change Provisioned Throughput order. You can also contact your Google Cloud account representative for assistance.
You can change your model version or region with notice.
After you've chosen your project, region, model, and version and your order is approved and activated, Provisioned Throughput is enabled. You can change your Google model or model version to a new Google model or model version by using the Google Cloud console.
For more information, see Change Provisioned Throughput order. You can also contact your Google Cloud account representative for assistance.
Changes are processed on a best-effort basis and are usually fulfilled within 10 business days of the initial request. Changing your region or project requires enabling a new order before cancelling the earlier order.
You can only change between models from the same vendor. For example, you can switch between Google's models or between partner A's models. However, you can't switch between Google's models and partner A's models.
By default, the overage is billed as pay-as-you-go.
If your throughput exceeds your Provisioned Throughput order amount, overages are processed and billed as standard pay-as-you-go. You can control overages on a per-request basis. For more information, see Use Provisioned Throughput.
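As a rough illustration of the default overage behavior described above, the following Python sketch splits observed throughput into the portion covered by the order and the portion that would be billed as pay-as-you-go. The function name `split_usage` and the unit of measurement are assumptions made for this example; they aren't part of any Google Cloud API, and actual billing is computed by Google Cloud.

```python
from typing import Tuple


def split_usage(provisioned: float, observed: float) -> Tuple[float, float]:
    """Split observed throughput into (covered, overage).

    Both arguments must use the same unit. The unit is an assumption of
    this sketch and isn't defined on this page.
    """
    if provisioned < 0 or observed < 0:
        raise ValueError("throughput values must be non-negative")
    # Usage up to the order amount is covered by Provisioned Throughput.
    covered = min(provisioned, observed)
    # Anything above the order amount is billed as pay-as-you-go by default.
    overage = max(0.0, observed - provisioned)
    return covered, overage


if __name__ == "__main__":
    covered, overage = split_usage(provisioned=1000.0, observed=1250.0)
    print(f"covered={covered}, overage={overage}")
```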
For information about pricing, see Provisioned Throughput.
Permissions
The following role must be granted to use Provisioned Throughput:
- roles/aiplatform.provisionedThroughputAdmin: You can access Vertex AI Provisioned Throughput resources.
The following permissions are granted to you by this role:
- aiplatform.provisionedThroughputs.create: You can submit a Provisioned Throughput order in a project.
- aiplatform.provisionedThroughputs.list: You can view all Provisioned Throughput orders in a project.
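To make the role and its permissions concrete, here is a minimal Python sketch that builds an IAM policy binding (a role plus a list of members) for the role named above and maps each permission to the action it allows. The helper names `make_binding` and `permission_for` and the example email address are hypothetical. The sketch only constructs the data structure and doesn't call any Google Cloud API.

```python
import json
from typing import Dict, List

ROLE = "roles/aiplatform.provisionedThroughputAdmin"

# Permissions listed on this page, keyed by the action each one allows.
PERMISSIONS: Dict[str, str] = {
    "place_order": "aiplatform.provisionedThroughputs.create",
    "view_orders": "aiplatform.provisionedThroughputs.list",
}


def make_binding(members: List[str]) -> dict:
    """Build an IAM policy binding that grants ROLE to the given members."""
    # De-duplicate and sort so that the output is stable.
    return {"role": ROLE, "members": sorted(set(members))}


def permission_for(action: str) -> str:
    """Return the permission that an action requires (hypothetical helper)."""
    return PERMISSIONS[action]


if __name__ == "__main__":
    # The member below is a placeholder, not a real account.
    print(json.dumps(make_binding(["user:alice@example.com"]), indent=2))
    print(permission_for("place_order"))
```

Applying a binding like this to a project still requires an administrator to update the project's IAM policy, for example through the Google Cloud console.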
Place a Provisioned Throughput order
Before you place your order to use Imagen models, submit the Request to grant permissions form to be granted permissions.
Before you place an order to use MedLM-large-1.5, contact your Google Cloud account representative to request access.
If you expect your queries per minute (QPM) to exceed 30,000, request an increase to your default Vertex AI system quota so that you can make full use of your Provisioned Throughput order. Use the following information in the request:
- Service: The Vertex AI API.
- Name: Online prediction requests per minute per region
- Service type: A quota.
- Dimensions: The region where you ordered Provisioned Throughput.
- Value: This is your chosen online-prediction traffic limit.
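The quota check above can be sketched in Python as follows. The function `quota_request`, the dictionary keys, and the example region are assumptions made for illustration. The sketch only assembles the fields listed above and doesn't submit a quota request.

```python
from typing import Optional

# Threshold stated on this page: a quota increase is relevant above this QPM.
QPM_THRESHOLD = 30_000


def quota_request(expected_qpm: int, region: str) -> Optional[dict]:
    """Return the quota-increase details, or None if no increase is needed."""
    if expected_qpm <= QPM_THRESHOLD:
        return None
    return {
        "service": "Vertex AI API",
        "name": "Online prediction requests per minute per region",
        "service_type": "Quota",
        # Use the region where you ordered Provisioned Throughput.
        "dimensions": {"region": region},
        # Your chosen online-prediction traffic limit.
        "value": expected_qpm,
    }


if __name__ == "__main__":
    # "us-central1" is only an example region.
    print(quota_request(45_000, "us-central1"))
    print(quota_request(12_000, "us-central1"))
```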
Follow these steps to purchase Provisioned Throughput:
Console
- In the Google Cloud console, go to the Provisioned Throughput page.
- To start a new order, click Create.
- Enter an Order name.
- Select the Model.
- Select the Region.
- Enter the Number of generative AI scale units (GSUs) that you want to purchase. If you need to estimate the number of GSUs, click the Estimation tool.
  - Select your Model.
  - Enter the number of Queries per second.
  - Enter the number of Input characters per query.
  - Enter the number of Input images per query.
  - Enter the number of Video seconds per query.
  - Enter the number of Audio seconds per query.
  - Enter the number of Output characters per query.
  - If you want to use the values that you entered into the estimation tool, click Use calculated.
- Select your Term.
If you choose one week, you can optionally provide a start date and time that falls within two weeks after you place the order. If you provide no start date and time, we process the order as soon as we can ensure that the capacity is available. Requested start dates and times are processed on a best-effort basis, and orders aren't guaranteed to be fulfilled by these dates until the order status is set to Approved.
If your requested start date is too close to the current date, your order might be approved and activated after your requested start date. In that case, your end date is still seven days after the activation date, not seven days after the requested start date.
- Select your Renewal option.
- Click Continue.
- In the Summary section, review the price and throughput estimates for your order. Read the terms listed and linked in the form.
- To finalize your order, click Confirm.
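The timing rules for one-week terms in the steps above can be sketched in Python. This is an illustration only: the function names are hypothetical, and treating the two-week window as inclusive of its endpoints is an assumption that this page doesn't confirm.

```python
from datetime import datetime, timedelta, timezone

MAX_LEAD_TIME = timedelta(weeks=2)  # start date window for one-week terms
ONE_WEEK_TERM = timedelta(days=7)   # term length counted from activation


def is_valid_start(placed_at: datetime, requested_start: datetime) -> bool:
    """Check that the requested start is within two weeks of placing the order.

    Assumption: both ends of the window are inclusive.
    """
    return placed_at <= requested_start <= placed_at + MAX_LEAD_TIME


def term_end(activated_at: datetime) -> datetime:
    """The end date is seven days after activation, not after the requested start."""
    return activated_at + ONE_WEEK_TERM


if __name__ == "__main__":
    placed = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
    requested = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
    # Suppose approval took longer, so activation happened after the request.
    activated = datetime(2025, 3, 4, 12, 0, tzinfo=timezone.utc)
    print(is_valid_start(placed, requested))
    print(term_end(activated).isoformat())
```

Because the term is counted from activation, a late activation shifts the end date rather than shortening the term.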
Change Provisioned Throughput order
This table describes how you can modify your Provisioned Throughput orders through the Google Cloud console based on the status of your order and any existing conditions. To request access to these preview features, fill out and submit the Provisioned Throughput access control form.
Order status | Action | Note | Steps in Google Cloud console |
---|---|---|---|
Pending review | You can cancel your order. | If you have additional changes to your order, then cancel the pending order, and place a new order. If you have multiple models, each model can have only one pending order revision or pending order at a time. | To cancel your pending order in the Google Cloud console, do the following: |
Active | You can increase GSUs on existing orders. You can enable or disable automatic renewals. You can change the model's version. | If both of these conditions are met, you can't change your order: | To change your active order in the Google Cloud console, use one of the following methods: |
Check order status
After you submit your Provisioned Throughput order, the order status might appear as one of the following:
- Pending review: You placed your order. Because approval depends on available capacity to provision your order, your order is waiting for review and approval. For more information about the status of your pending order, contact your Google Cloud account representative.
- Approved: Google has approved your order.
- Active: Google has activated your order. Billing starts when the order becomes active.
- Expired: Your order has expired.
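The statuses above, together with the actions listed in the Change Provisioned Throughput order table, can be modeled as a small lookup. The Python sketch below is a hypothetical client-side model, not a Google Cloud API. The action names are invented for this example, and the sketch assumes that Approved and Expired orders allow no changes because this page lists none for them. It also doesn't model the conditions under which an active order can't be changed.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING_REVIEW = "Pending review"
    APPROVED = "Approved"
    ACTIVE = "Active"
    EXPIRED = "Expired"


# Actions taken from the Change Provisioned Throughput order table.
# Statuses without an entry are assumed to allow no changes.
ALLOWED_ACTIONS = {
    OrderStatus.PENDING_REVIEW: {"cancel"},
    OrderStatus.ACTIVE: {
        "increase_gsus",
        "toggle_auto_renew",
        "change_model_version",
    },
}


def can_perform(status: OrderStatus, action: str) -> bool:
    """Return True if the page lists the action for the given status."""
    return action in ALLOWED_ACTIONS.get(status, set())


def is_billing(status: OrderStatus) -> bool:
    """Billing starts when the order becomes active."""
    return status is OrderStatus.ACTIVE


if __name__ == "__main__":
    print(can_perform(OrderStatus.PENDING_REVIEW, "cancel"))
    print(can_perform(OrderStatus.ACTIVE, "cancel"))
    print(is_billing(OrderStatus.APPROVED))
```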
View Provisioned Throughput orders
Follow these steps to view your Provisioned Throughput orders:
Console
- In the Google Cloud console, go to the Provisioned Throughput page.
- Select the Region. Your list of orders appears.