Class DeploymentResourcePool (1.69.0)

DeploymentResourcePool(
    deployment_resource_pool_name: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
)

Retrieves a DeploymentResourcePool.

Parameters
Name	Description
`deployment_resource_pool_name`	`str` Required. The fully-qualified resource name or ID of the deployment resource pool. Example: "projects/123/locations/us-central1/deploymentResourcePools/456" or "456" when project and location are initialized or passed.
`project`	`str` Optional. Project containing the deployment resource pool to retrieve. If not set, the project given to `aiplatform.init` will be used.
`location`	`str` Optional. Location containing the deployment resource pool to retrieve. If not set, the location given to `aiplatform.init` will be used.
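The two accepted forms of `deployment_resource_pool_name` can be sketched as follows. This is a minimal illustration, not the SDK's own code: `pool_resource_name` and `get_pool` are hypothetical helper names, and the project, location, and pool values are the placeholders from the parameter table. The SDK import is deferred so the name-building helper runs without credentials or network access.

```python
from typing import Optional


def pool_resource_name(project: str, location: str, pool_id: str) -> str:
    """Build the fully-qualified name in the format shown in the parameter table."""
    return (
        f"projects/{project}/locations/{location}"
        f"/deploymentResourcePools/{pool_id}"
    )


def get_pool(
    pool_name: str,
    project: Optional[str] = None,
    location: Optional[str] = None,
):
    """Retrieve an existing pool by full resource name or by short ID."""
    # Imported lazily so the helper above works without the SDK installed.
    from google.cloud.aiplatform import models

    return models.DeploymentResourcePool(
        deployment_resource_pool_name=pool_name,
        project=project,
        location=location,
    )


# Placeholder values taken from the example in the parameter table.
print(pool_resource_name("123", "us-central1", "456"))
```

With a short ID such as "456", `project` and `location` must either be passed to `get_pool` or have been set through `aiplatform.init`.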

Properties

create_time

Time this resource was created.

display_name

Display name of this resource.

encryption_spec

Customer-managed encryption key options for this Vertex AI resource.

If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.

gca_resource

The underlying resource proto representation.

labels

User-defined labels containing metadata about this resource.

Read more about labels at https://goo.gl/xmQnxf

name

Name of this resource.

resource_name

Fully qualified resource name.

update_time

Time this resource was last updated.

Methods

create

create(
    deployment_resource_pool_id: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    metadata: typing.Sequence[typing.Tuple[str, str]] = (),
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    machine_type: typing.Optional[str] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    accelerator_type: typing.Optional[str] = None,
    accelerator_count: typing.Optional[int] = None,
    autoscaling_target_cpu_utilization: typing.Optional[int] = None,
    autoscaling_target_accelerator_duty_cycle: typing.Optional[int] = None,
    sync=True,
    create_request_timeout: typing.Optional[float] = None,
    reservation_affinity_type: typing.Optional[str] = None,
    reservation_affinity_key: typing.Optional[str] = None,
    reservation_affinity_values: typing.Optional[typing.List[str]] = None,
    spot: bool = False,
) -> google.cloud.aiplatform.models.DeploymentResourcePool

Creates a new DeploymentResourcePool.

Parameters
Name	Description
`create_request_timeout`	`float` Optional. The create request timeout in seconds.
`reservation_affinity_type`	`str` Optional. The type of reservation affinity. One of NO_RESERVATION, ANY_RESERVATION, SPECIFIC_RESERVATION, SPECIFIC_THEN_ANY_RESERVATION, or SPECIFIC_THEN_NO_RESERVATION.
`reservation_affinity_key`	`str` Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use `compute.googleapis.com/reservation-name` as the key and specify the name of your reservation as its value.
`reservation_affinity_values`	`List[str]` Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation. Format: 'projects/{project_id_or_number}/zones/{zone}/reservations/{reservation_name}'
`spot`	`bool` Optional. Whether to schedule the deployment workload on spot VMs.
`deployment_resource_pool_id`	`str` Required. User-specified name for the new deployment resource pool.
`project`	`str` Optional. Project in which to create the deployment resource pool. If not set, the project given to `aiplatform.init` will be used.
`location`	`str` Optional. Location in which to create the deployment resource pool. If not set, the location given to `aiplatform.init` will be used.
`metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`machine_type`	`str` Optional. Machine type to use for the deployment resource pool. If not set, the default machine type of `n1-standard-2` is used.
`min_replica_count`	`int` Optional. The minimum replica count of the new deployment resource pool. Each replica serves a copy of each model deployed on the deployment resource pool. If this value is less than `max_replica_count`, then autoscaling is enabled, and the actual number of replicas will be adjusted to bring resource usage in line with the autoscaling targets.
`max_replica_count`	`int` Optional. The maximum replica count of the new deployment resource pool.
`accelerator_type`	`str` Optional. Hardware accelerator type. Must also set `accelerator_count` if used. One of NVIDIA_TESLA_K80, NVIDIA_TESLA_P100, NVIDIA_TESLA_V100, NVIDIA_TESLA_P4, NVIDIA_TESLA_T4, or NVIDIA_TESLA_A100.
`accelerator_count`	`int` Optional. The number of accelerators attached to each replica.
`autoscaling_target_cpu_utilization`	`int` Optional. Target CPU utilization value for autoscaling. A default value of 60 will be used if not specified.
`autoscaling_target_accelerator_duty_cycle`	`int` Optional. Target accelerator duty cycle percentage to use for autoscaling. Must also set `accelerator_type` and `accelerator_count` if specified. A default value of 60 will be used if accelerators are requested and this is not specified.
`sync`	`bool` Optional. Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.
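The constraints in the table above (autoscaling when `min_replica_count` is below `max_replica_count`, `accelerator_count` required alongside `accelerator_type`, and the reservation key and value format) can be checked before calling `create`. The sketch below is one way to do that. All helper names are hypothetical, the `min > max` check is a sanity check added here and not quoted from the reference, and the SDK import is deferred so the helpers run without credentials.

```python
from typing import Any, Dict, Optional


def build_create_kwargs(
    pool_id: str,
    machine_type: str = "n1-standard-2",
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate replica and accelerator settings, then return create() kwargs."""
    # Sanity check added by this sketch; not quoted from the reference.
    if min_replica_count > max_replica_count:
        raise ValueError("min_replica_count must not exceed max_replica_count")
    # The parameter table says accelerator_count must accompany accelerator_type.
    if accelerator_type is not None and not accelerator_count:
        raise ValueError("accelerator_count is required with accelerator_type")

    kwargs: Dict[str, Any] = {
        "deployment_resource_pool_id": pool_id,
        "machine_type": machine_type,
        "min_replica_count": min_replica_count,
        "max_replica_count": max_replica_count,
    }
    if accelerator_type is not None:
        kwargs["accelerator_type"] = accelerator_type
        kwargs["accelerator_count"] = accelerator_count
    return kwargs


def specific_reservation_kwargs(
    project: str, zone: str, reservation: str
) -> Dict[str, Any]:
    """Build the three reservation_* arguments for a named reservation."""
    return {
        "reservation_affinity_type": "SPECIFIC_RESERVATION",
        "reservation_affinity_key": "compute.googleapis.com/reservation-name",
        "reservation_affinity_values": [
            f"projects/{project}/zones/{zone}/reservations/{reservation}"
        ],
    }


def autoscaling_enabled(kwargs: Dict[str, Any]) -> bool:
    """Autoscaling is enabled when min_replica_count < max_replica_count."""
    return kwargs["min_replica_count"] < kwargs["max_replica_count"]


def create_pool(**kwargs: Any):
    """Create the pool; requires the SDK, credentials, and network access."""
    from google.cloud.aiplatform import models

    return models.DeploymentResourcePool.create(**kwargs)
```

A call might look like `create_pool(**build_create_kwargs("my-pool", max_replica_count=3))`, where "my-pool" is a placeholder ID.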

delete

delete(sync: bool = True) -> None

Deletes this Vertex AI resource. WARNING: This deletion is permanent.

list

list(
    filter: typing.Optional[str] = None,
    order_by: typing.Optional[str] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> typing.List[google.cloud.aiplatform.models.DeploymentResourcePool]

Lists the deployment resource pools.

Parameters
Name	Description
`filter`	`str` Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported.
`order_by`	`str` Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: `display_name`, `create_time`, `update_time`.
`project`	`str` Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.
`location`	`str` Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.
`credentials`	`auth_credentials.Credentials` Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.
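As an illustration of the `order_by` format described for `list`, the sketch below assembles the comma-separated string from (field, descending) pairs and rejects fields outside the supported set. `build_order_by` and `list_pools` are hypothetical helper names, and the SDK import is deferred so the string builder runs on its own.

```python
from typing import Iterable, Optional, Tuple

# Fields the reference lists as supported for ordering.
SUPPORTED_ORDER_FIELDS = ("display_name", "create_time", "update_time")


def build_order_by(fields: Iterable[Tuple[str, bool]]) -> str:
    """Turn (field, descending) pairs into an order_by string."""
    parts = []
    for name, descending in fields:
        if name not in SUPPORTED_ORDER_FIELDS:
            raise ValueError(f"unsupported order_by field: {name}")
        # Ascending is the default; append "desc" only when requested.
        parts.append(f"{name} desc" if descending else name)
    return ", ".join(parts)


def list_pools(order_by: Optional[str] = None, filter: Optional[str] = None):
    """List pools; requires the SDK, credentials, and network access."""
    from google.cloud.aiplatform import models

    return models.DeploymentResourcePool.list(filter=filter, order_by=order_by)


# Newest pools first, ties broken by display name.
print(build_order_by([("create_time", True), ("display_name", False)]))
```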

query_deployed_models

query_deployed_models(
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> typing.List[google.cloud.aiplatform_v1.types.deployed_model_ref.DeployedModelRef]

Lists the deployed models using this resource pool.

Parameters
Name	Description
`project`	`str` Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.
`location`	`str` Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.
`credentials`	`auth_credentials.Credentials` Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.
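The returned references can be grouped to see which endpoints share the pool. The sketch below assumes each `DeployedModelRef` exposes `endpoint` and `deployed_model_id` attributes; `group_by_endpoint` is a hypothetical helper, and the stand-in objects and their values are placeholders used so the example runs without calling the service.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, Iterable, List


def group_by_endpoint(refs: Iterable) -> Dict[str, List[str]]:
    """Map each endpoint name to the deployed model IDs it hosts on this pool."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ref in refs:
        grouped[ref.endpoint].append(ref.deployed_model_id)
    return dict(grouped)


# Stand-ins for the objects returned by pool.query_deployed_models().
sample_refs = [
    SimpleNamespace(endpoint="endpoint-a", deployed_model_id="111"),
    SimpleNamespace(endpoint="endpoint-a", deployed_model_id="222"),
    SimpleNamespace(endpoint="endpoint-b", deployed_model_id="333"),
]
print(group_by_endpoint(sample_refs))
```

With a real pool, the call would be `group_by_endpoint(pool.query_deployed_models())`.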

to_dict

to_dict() -> typing.Dict[str, typing.Any]

Returns the resource proto as a dictionary.

wait

wait()

Helper method that blocks until all futures are complete.
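`wait` pairs with methods called with `sync=False`. A minimal sketch, assuming `pool` is a `DeploymentResourcePool` instance (`delete_and_wait` is a hypothetical helper name):

```python
def delete_and_wait(pool) -> None:
    """Start a non-blocking delete, then block until it has completed."""
    # With sync=False the call returns at once and runs in a concurrent Future.
    pool.delete(sync=False)
    # Other work could run here while the deletion is in progress.
    pool.wait()  # blocks until all pending futures are complete
```

Because deletion is permanent, this is best tried on a pool that no deployed model depends on.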