PrivateEndpoint(
endpoint_name: str,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
)
Represents a Vertex AI PrivateEndpoint resource.
Inheritance
builtins.object > google.cloud.aiplatform.base.VertexAiResourceNoun > builtins.object > google.cloud.aiplatform.base.FutureManager > google.cloud.aiplatform.base.VertexAiResourceNounWithFutureManager > google.cloud.aiplatform.models.Endpoint > PrivateEndpoint
Properties
explain_http_uri
HTTP path to send explain requests to, used when calling PrivateEndpoint.explain()
health_http_uri
HTTP path to send health check requests to, used when calling PrivateEndpoint.health_check()
predict_http_uri
HTTP path to send prediction requests to, used when calling PrivateEndpoint.predict()
Methods
PrivateEndpoint
PrivateEndpoint(
endpoint_name: str,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
)
Retrieves a PrivateEndpoint resource.
Example usage:
my_private_endpoint = aiplatform.PrivateEndpoint(
    endpoint_name="projects/123/locations/us-central1/endpoints/1234567891234567890"
)
or (when project and location are initialized)
my_private_endpoint = aiplatform.PrivateEndpoint(
    endpoint_name="1234567891234567890"
)
Name | Description |
endpoint_name |
str
Required. A fully-qualified endpoint resource name or endpoint ID. Example: "projects/123/locations/us-central1/endpoints/my_endpoint_id" or "my_endpoint_id" when project and location are initialized or passed. |
project |
str
Optional. Project to retrieve endpoint from. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to retrieve endpoint from. If not set, location set in aiplatform.init will be used. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to retrieve this endpoint. Overrides credentials set in aiplatform.init. |
Type | Description |
ValueError | If the Endpoint being retrieved is not a PrivateEndpoint. |
ImportError | If there is an issue importing the `urllib3` package. |
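The two accepted forms of endpoint_name described above can be related with a small helper. This is only a sketch: `to_endpoint_resource_name` is a hypothetical function, not part of the SDK, and it assumes that any value without a slash is a bare endpoint ID.

```python
from typing import Optional


def to_endpoint_resource_name(
    endpoint_name: str,
    project: Optional[str] = None,
    location: Optional[str] = None,
) -> str:
    """Return a fully-qualified endpoint resource name (hypothetical helper)."""
    # Assumption: a value containing "/" is already fully qualified.
    if "/" in endpoint_name:
        return endpoint_name
    if not project or not location:
        raise ValueError("project and location are required for a bare endpoint ID")
    return f"projects/{project}/locations/{location}/endpoints/{endpoint_name}"


full_name = to_endpoint_resource_name(
    "1234567891234567890", project="123", location="us-central1"
)
print(full_name)
```

The resulting string has the same shape as the fully-qualified example shown for endpoint_name.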
create
create(
display_name: str,
project: Optional[str] = None,
location: Optional[str] = None,
network: Optional[str] = None,
description: Optional[str] = None,
labels: Optional[Dict[str, str]] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
encryption_spec_key_name: Optional[str] = None,
sync=True,
)
Creates a new PrivateEndpoint.
Example usage:
my_private_endpoint = aiplatform.PrivateEndpoint.create(
    display_name="my_endpoint_name",
    project="my_project_id",
    location="us-central1",
    network="projects/123456789123/global/networks/my_vpc"
)
or (when project and location are initialized)
my_private_endpoint = aiplatform.PrivateEndpoint.create(
    display_name="my_endpoint_name",
    network="projects/123456789123/global/networks/my_vpc"
)
Name | Description |
display_name |
str
Required. The user-defined name of the Endpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters. |
project |
str
Optional. Project to create the endpoint in. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to create the endpoint in. If not set, location set in aiplatform.init will be used. |
network |
str
Optional. The full name of the Compute Engine network to which this Endpoint will be peered. E.g. "projects/123456789123/global/networks/my_vpc". Private services access must already be configured for the network. If not set, network set in aiplatform.init will be used. |
description |
str
Optional. The description of the Endpoint. |
labels |
Dict[str, str]
Optional. The labels with user-defined metadata to organize your Endpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to create this endpoint. Overrides credentials set in aiplatform.init. |
encryption_spec_key_name |
str
Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect the model. Has the form: |
sync |
bool
Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future, and any downstream object will be immediately returned and synced when the Future has completed. |
Type | Description |
ValueError | A network must be instantiated when creating a PrivateEndpoint. |
Type | Description |
endpoint (aiplatform.PrivateEndpoint) | Created endpoint. |
delete
delete(force: bool = False, sync: bool = True)
Deletes this Vertex AI PrivateEndpoint resource. If force is set to True, all models on this PrivateEndpoint will be undeployed prior to deletion.
Name | Description |
force |
bool
Optional. If force is set to True, all deployed models on this Endpoint will be undeployed first. Default is False. |
sync |
bool
Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future, and any downstream object will be immediately returned and synced when the Future has completed. |
Type | Description |
FailedPrecondition | If models are deployed on this Endpoint and force = False. |
deploy
deploy(
model: google.cloud.aiplatform.models.Model,
deployed_model_display_name: Optional[str] = None,
machine_type: Optional[str] = None,
min_replica_count: int = 1,
max_replica_count: int = 1,
accelerator_type: Optional[str] = None,
accelerator_count: Optional[int] = None,
service_account: Optional[str] = None,
explanation_metadata: Optional[
google.cloud.aiplatform_v1.types.explanation_metadata.ExplanationMetadata
] = None,
explanation_parameters: Optional[
google.cloud.aiplatform_v1.types.explanation.ExplanationParameters
] = None,
metadata: Optional[Sequence[Tuple[str, str]]] = (),
sync=True,
)
Deploys a Model to the PrivateEndpoint.
Example usage:
my_private_endpoint.deploy(model=my_model)
Name | Description |
deployed_model_display_name |
str
Optional. The display name of the DeployedModel. If not provided upon creation, the Model's display_name is used. |
machine_type |
str
Optional. The type of machine. If no machine type is specified, the model is deployed with automatic resources. |
min_replica_count |
int
Optional. The minimum number of machine replicas this deployed model will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed. |
max_replica_count |
int
Optional. The maximum number of replicas this deployed model may be deployed on when the traffic against it increases. If requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the deployed model increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, the larger value of min_replica_count or 1 will be used. If value provided is smaller than min_replica_count, it will automatically be increased to be min_replica_count. |
accelerator_type |
str
Optional. Hardware accelerator type. Must also set accelerator_count if used. One of ACCELERATOR_TYPE_UNSPECIFIED, NVIDIA_TESLA_K80, NVIDIA_TESLA_P100, NVIDIA_TESLA_V100, NVIDIA_TESLA_P4, NVIDIA_TESLA_T4 |
accelerator_count |
int
Optional. The number of accelerators to attach to a worker replica. |
service_account |
str
The service account that the DeployedModel's container runs as. Specify the email address of the service account. If this service account is not specified, the container runs as a service account that doesn't have access to the resource project. Users deploying the Model must have the |
explanation_metadata |
aiplatform.explain.ExplanationMetadata
Optional. Metadata describing the Model's input and output for explanation. |
explanation_parameters |
aiplatform.explain.ExplanationParameters
Optional. Parameters to configure explaining for Model's predictions. For more details, see |
metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
model |
aiplatform.Model
Required. Model to be deployed. |
sync |
bool
Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future, and any downstream object will be immediately returned and synced when the Future has completed. |
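The replica-count rules in the max_replica_count description, and the requirement that accelerator_count accompany accelerator_type, can be written out as plain functions. This is a sketch that mirrors the parameter descriptions, not the SDK's own code. Both function names are hypothetical, and "not provided" is modeled as None as an assumption, since the signature above shows a default of 1.

```python
from typing import Optional, Tuple


def resolve_replica_counts(
    min_replica_count: int = 1,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Apply the documented defaulting rules for replica counts (hypothetical helper)."""
    if max_replica_count is None:
        # Not provided: use the larger of min_replica_count and 1.
        max_replica_count = max(min_replica_count, 1)
    elif max_replica_count < min_replica_count:
        # Smaller than the minimum: raise it to the minimum.
        max_replica_count = min_replica_count
    return min_replica_count, max_replica_count


def check_accelerator_arguments(
    accelerator_type: Optional[str],
    accelerator_count: Optional[int],
) -> None:
    """Raise ValueError if accelerator_type is given without accelerator_count."""
    if accelerator_type is not None and not accelerator_count:
        raise ValueError("accelerator_count must be set when accelerator_type is used")


print(resolve_replica_counts(3))
print(resolve_replica_counts(2, 1))
check_accelerator_arguments("NVIDIA_TESLA_T4", 1)
```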
explain
explain()
Make a prediction with explanations against this Endpoint.
Example usage:
response = my_endpoint.explain(instances=[...])
my_explanations = response.explanations
Name | Description |
instances |
List
Required. The instances that are the input to the prediction call. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the prediction call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1beta1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1beta1.Model.predict_schemata] |
parameters |
Dict
The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1beta1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1beta1.Model.predict_schemata] |
deployed_model_id |
str
Optional. If specified, this ExplainRequest will be served by the chosen DeployedModel, overriding this Endpoint's traffic split. |
timeout |
float
Optional. The timeout for this request in seconds. |
Type | Description |
prediction (aiplatform.Prediction) | Prediction with returned predictions, explanations, and Model ID. |
health_check
health_check()
Makes a request to this PrivateEndpoint's health check URI. Must be within network that this PrivateEndpoint is in.
Example usage:
if my_private_endpoint.health_check():
    print("PrivateEndpoint is healthy!")
Type | Description |
RuntimeError | If a model has not been deployed a request cannot be made. |
Type | Description |
bool | Whether calls can be made to this PrivateEndpoint. |
list
list(
filter: Optional[str] = None,
order_by: Optional[str] = None,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
)
List all PrivateEndpoint resource instances.
Example usage:
my_private_endpoints = aiplatform.PrivateEndpoint.list()
or
my_private_endpoints = aiplatform.PrivateEndpoint.list(
    filter='labels.my_label="my_label_value" OR display_name!="old_endpoint"',
)
Name | Description |
filter |
str
Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported. |
order_by |
str
Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: |
project |
str
Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init. |
Type | Description |
List[models.PrivateEndpoint] | A list of PrivateEndpoint resource objects. |
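Filter expressions of the kind shown in the example can be assembled from a dictionary of labels. This is a minimal sketch: `build_label_filter` is a hypothetical helper, it only combines clauses with OR as in the example above, and it assumes label values contain no double quotes (no escaping is attempted).

```python
from typing import Dict


def build_label_filter(labels: Dict[str, str]) -> str:
    """Build a filter string such as 'labels.a="x" OR labels.b="y"'."""
    # Sort the keys so the generated expression is deterministic.
    clauses = [f'labels.{key}="{value}"' for key, value in sorted(labels.items())]
    return " OR ".join(clauses)


label_filter = build_label_filter({"my_label": "my_label_value", "env": "prod"})
print(label_filter)
```

The resulting string could then be passed as the filter argument of list.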
predict
predict(instances: List, parameters: Optional[Dict] = None)
Make a prediction against this PrivateEndpoint using an HTTP request.
This method must be called within the network the PrivateEndpoint is peered to. Otherwise, the predict() call will fail with error code 404. To check, use PrivateEndpoint.network.
Example usage:
response = my_private_endpoint.predict(instances=[...])
my_predictions = response.predictions
Name | Description |
instances |
List
Required. The instances that are the input to the prediction call. Instance types must be JSON serializable. A DeployedModel may have an upper limit on the number of instances it supports per request, and when it is exceeded the prediction call errors in case of AutoML Models, or, in case of customer created Models, the behaviour is as documented by that Model. The schema of any single instance may be specified via Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1beta1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1beta1.Model.predict_schemata] |
parameters |
Dict
The parameters that govern the prediction. The schema of the parameters may be specified via Endpoint's DeployedModels' [Model's][google.cloud.aiplatform.v1beta1.DeployedModel.model] [PredictSchemata's][google.cloud.aiplatform.v1beta1.Model.predict_schemata] |
Type | Description |
RuntimeError | If a model has not been deployed a request cannot be made. |
Type | Description |
prediction (aiplatform.Prediction) | Prediction object with returned predictions and Model ID. |
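Since instances must be JSON serializable and a DeployedModel may cap the number of instances per request, it can help to validate and split the input before calling predict. The sketch below assumes a caller-chosen limit: `batched_instances` and `max_per_request` are hypothetical, and the real limit, if any, depends on the deployed model.

```python
import json
from typing import Any, Iterator, List


def batched_instances(
    instances: List[Any], max_per_request: int
) -> Iterator[List[Any]]:
    """Yield lists of at most max_per_request JSON-serializable instances."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for instance in instances:
        # json.dumps raises TypeError for values that cannot be serialized.
        json.dumps(instance)
    for start in range(0, len(instances), max_per_request):
        yield instances[start:start + max_per_request]


batches = list(batched_instances([{"x": 1}, {"x": 2}, {"x": 3}], max_per_request=2))
print(batches)
```

Each yielded batch could then be passed as the instances argument of a separate predict call.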
raw_predict
raw_predict(body: bytes, headers: Dict[str, str])
Make a prediction request using arbitrary headers.
This method must be called within the network the PrivateEndpoint is peered to. Otherwise, the raw_predict() call will fail with error code 404. To check, use PrivateEndpoint.network.
Example usage:
my_endpoint = aiplatform.PrivateEndpoint(ENDPOINT_ID)
response = my_endpoint.raw_predict(
    body=b'{"instances":[{"feat_1":val_1, "feat_2":val_2}]}',
    headers={'Content-Type':'application/json'},
)
status_code = response.status_code
results = json.loads(response.text)
Name | Description |
body |
bytes
The body of the prediction request in bytes. This must not exceed 1.5 MB per request. |
headers |
Dict[str, str]
The header of the request as a dictionary. There are no restrictions on the header. |
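The body and headers arguments can be prepared from ordinary Python objects. The following sketch makes two assumptions that are not stated in this reference: the 1.5 MB limit is read as 1.5 * 1024 * 1024 bytes, and the request body is a JSON object with an "instances" key, as in the example usage. `build_raw_predict_request` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List, Tuple

# Assumption: "1.5 MB" is interpreted as 1.5 * 1024 * 1024 bytes.
MAX_BODY_BYTES = int(1.5 * 1024 * 1024)


def build_raw_predict_request(
    instances: List[Any],
) -> Tuple[bytes, Dict[str, str]]:
    """Encode instances as a JSON body and pair it with a JSON content-type header."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes, above the assumed limit of {MAX_BODY_BYTES}"
        )
    headers = {"Content-Type": "application/json"}
    return body, headers


body, headers = build_raw_predict_request([{"feat_1": 1.0, "feat_2": 2.0}])
print(len(body), headers)
```

The returned pair matches the body and headers parameters of raw_predict.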
undeploy
undeploy(deployed_model_id: str, sync=True)
Undeploys a deployed model from the PrivateEndpoint.
Example usage:
my_private_endpoint.undeploy(
    deployed_model_id="1234567891232567891"
)
or
my_deployed_model_id = my_private_endpoint.list_models()[0].id
my_private_endpoint.undeploy(
    deployed_model_id=my_deployed_model_id
)
Name | Description |
deployed_model_id |
str
Required. The ID of the DeployedModel to be undeployed from the PrivateEndpoint. Use PrivateEndpoint.list_models() to get the deployed model ID. |
sync |
bool
Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future, and any downstream object will be immediately returned and synced when the Future has completed. |
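The list_models() and undeploy() calls shown above can be combined to remove every deployed model from an endpoint. This is a sketch, not an SDK feature: `undeploy_all` and the two fake classes are hypothetical and exist only so the loop can be exercised without a real PrivateEndpoint.

```python
from typing import List


def undeploy_all(endpoint) -> List[str]:
    """Undeploy every model listed by endpoint.list_models(); return their IDs."""
    undeployed = []
    for deployed_model in endpoint.list_models():
        endpoint.undeploy(deployed_model_id=deployed_model.id)
        undeployed.append(deployed_model.id)
    return undeployed


class _FakeDeployedModel:
    """Stand-in exposing only the .id attribute used above."""

    def __init__(self, model_id: str):
        self.id = model_id


class _FakePrivateEndpoint:
    """Stand-in that mimics list_models() and undeploy() in memory."""

    def __init__(self, model_ids):
        self._models = [_FakeDeployedModel(model_id) for model_id in model_ids]

    def list_models(self):
        return list(self._models)

    def undeploy(self, deployed_model_id: str, sync: bool = True) -> None:
        self._models = [m for m in self._models if m.id != deployed_model_id]


fake_endpoint = _FakePrivateEndpoint(["111", "222"])
removed_ids = undeploy_all(fake_endpoint)
print(removed_ids)
```

When the endpoint itself is no longer needed, delete with force set to True undeploys all models before deletion, as described under delete.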