Consumption options

This page describes the consumption options that AI Hypercomputer supports. A consumption option is the method you use to request capacity. A provisioning model is how you specify which type of capacity to use when you create your VMs or clusters.

When deploying your VMs or clusters, you must specify a provisioning model that matches your required consumption option. For more information about provisioning models, see Provisioning models.

Reservations
How it works: You request compute resources in advance for a specific amount of time. These resources are dedicated to you for that period. Reservations provide the highest level of capacity assurance and are cost effective because they are available at a much lower price than an on-demand request.
Best used for: Long-running training jobs and inference workloads.
GPU machine types supported: All GPU machine types

  • For A3 Ultra, you must request reservations by using Hypercompute Cluster. To make this request, see Request capacity. Hypercompute Cluster reserves a dense allocation of A3 Ultra machines.
  • For A3 High and A3 Mega, you can also request densely allocated capacity by contacting Technical Account Management (TAM) services.
  • For all GPU machine types except A3 Ultra, you can also request capacity by using the Reservations API. These reservations won't be densely allocated.

Dynamic Workload Scheduler (DWS)
How it works: You request compute resources for a specific amount of time, around 28 days. Because these resources are delivered from a secured pool, their availability is much higher than that of an on-demand request.
Best used for: Workloads that need to run at a specific time, such as small model pre-training jobs, model fine-tuning jobs, HPC simulation workloads, and short-term expected increases in inference workloads.
GPU machine types supported: All GPU machine types except A3 Ultra

Spot
How it works: You request compute resources, which are delivered based on availability. Spot resources might be easier to obtain than on-demand resources, but the system can delete them at any time. They are cost effective because they are available at a much lower price than the standard model.
Best used for: Lower-priority workloads that tolerate availability disruptions, such as model pre-training, model fine-tuning, and simulation jobs.
GPU machine types supported: All GPU machine types except A3 Ultra
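
The support rules above can be captured in a small lookup, which can help when scripting capacity planning. The following Python sketch is illustrative only: the class, function, and constant names are hypothetical and are not part of any Google Cloud API, and only the machine types named on this page are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumptionOption:
    """One consumption option and the machine types it excludes (hypothetical model)."""
    name: str
    excluded_machine_types: frozenset = frozenset()

    def supports(self, machine_type: str) -> bool:
        # An option supports every GPU machine type that is not explicitly excluded.
        return machine_type not in self.excluded_machine_types


# Mirrors the support rules described on this page.
OPTIONS = (
    ConsumptionOption("Reservations"),
    ConsumptionOption("DWS", frozenset({"A3 Ultra"})),
    ConsumptionOption("Spot", frozenset({"A3 Ultra"})),
)


def options_for(machine_type: str) -> list[str]:
    """Return the names of the consumption options available for a machine type."""
    return [option.name for option in OPTIONS if option.supports(machine_type)]


def reservation_paths(machine_type: str) -> list[str]:
    """Return the ways to request a reservation, per the notes under Reservations."""
    if machine_type == "A3 Ultra":
        # A3 Ultra reservations must go through Hypercompute Cluster.
        return ["Hypercompute Cluster (dense)"]
    paths = ["Reservations API (not dense)"]
    if machine_type in {"A3 High", "A3 Mega"}:
        # Dense capacity for these types is requested through TAM services.
        paths.insert(0, "Technical Account Management (dense)")
    return paths
```

For example, options_for("A3 Ultra") returns only ["Reservations"], while options_for("A3 Mega") returns all three options.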

Pricing and discounts

Accelerator-optimized machine types are billed for their attached GPUs, predefined vCPUs, memory, and bundled Local SSD (if applicable). For more pricing information, see the Accelerator-optimized machine type family section on the VM instance pricing page.
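
As a rough illustration of how the billed components combine, the following Python sketch sums per-component hourly charges. It assumes a simple additive model with per-unit rates. The names are hypothetical, and the rates in the usage note are placeholders rather than real prices, so consult the VM instance pricing page for actual figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HourlyRates:
    """Hypothetical per-unit hourly rates; the values are supplied by the caller."""
    per_gpu: float
    per_vcpu: float
    per_memory_gb: float
    per_local_ssd_gb: float = 0.0  # Only relevant when Local SSD is bundled.


def hourly_cost(rates: HourlyRates, gpus: int, vcpus: int,
                memory_gb: float, local_ssd_gb: float = 0.0) -> float:
    """Sum the charges for attached GPUs, vCPUs, memory, and bundled Local SSD."""
    return (
        gpus * rates.per_gpu
        + vcpus * rates.per_vcpu
        + memory_gb * rates.per_memory_gb
        + local_ssd_gb * rates.per_local_ssd_gb
    )
```

With placeholder rates such as HourlyRates(per_gpu=2.0, per_vcpu=0.25, per_memory_gb=0.125, per_local_ssd_gb=0.5), calling hourly_cost(rates, gpus=8, vcpus=4, memory_gb=16, local_ssd_gb=2) adds 16.0 + 1.0 + 2.0 + 1.0 for a total of 20.0 per hour.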