MatchingEngineIndexEndpoint(
index_endpoint_name: str,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
)
Matching Engine index endpoint resource for Vertex AI.
Inheritance
builtins.object > google.cloud.aiplatform.base.VertexAiResourceNoun > builtins.object > google.cloud.aiplatform.base.FutureManager > google.cloud.aiplatform.base.VertexAiResourceNounWithFutureManager > MatchingEngineIndexEndpoint
Properties
create_time
Time this resource was created.
deployed_indexes
Returns a list of deployed indexes on this endpoint.
description
Description of the index endpoint.
display_name
Display name of this resource.
encryption_spec
Customer-managed encryption key options for this Vertex AI resource.
If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.
gca_resource
The underlying resource proto representation.
labels
User-defined labels containing metadata about this resource.
Read more about labels at https://goo.gl/xmQnxf
name
Name of this resource.
resource_name
Fully qualified resource name.
update_time
Time this resource was last updated.
Methods
MatchingEngineIndexEndpoint
MatchingEngineIndexEndpoint(
index_endpoint_name: str,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
)
Retrieves an existing index endpoint given a name or ID.
Example Usage:
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name='projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id'
)
or
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name='my_index_endpoint_id'
)
Name | Description |
index_endpoint_name |
str
Required. A fully qualified index endpoint resource name or an index endpoint ID. Example: "projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id" or "my_index_endpoint_id" when project and location are initialized or passed. |
project |
str
Optional. Project to retrieve index endpoint from. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to retrieve index endpoint from. If not set, location set in aiplatform.init will be used. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to retrieve this IndexEndpoint. Overrides credentials set in aiplatform.init. |
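The two accepted forms of index_endpoint_name can be illustrated with a small sketch. The helper below is hypothetical and not part of the SDK; it assumes the fully qualified form uses the "indexEndpoints" collection segment and that any name containing a slash is already fully qualified.

```python
from typing import Optional


def resolve_index_endpoint_name(
    index_endpoint_name: str,
    project: Optional[str] = None,
    location: Optional[str] = None,
    collection: str = "indexEndpoints",  # assumed collection segment
) -> str:
    """Expand a bare ID into a full resource name, or pass a full name through."""
    # Assumption: a name containing "/" is already fully qualified.
    if "/" in index_endpoint_name:
        return index_endpoint_name
    # A bare ID is only meaningful together with a project and a location.
    if not project or not location:
        raise ValueError("A bare ID requires both project and location.")
    return (
        f"projects/{project}/locations/{location}/"
        f"{collection}/{index_endpoint_name}"
    )
```

In the SDK itself, project and location fall back to the values set in aiplatform.init; the sketch requires them explicitly to stay self-contained.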
create
create(
display_name: str,
network: Optional[str] = None,
description: Optional[str] = None,
labels: Optional[Dict[str, str]] = None,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
sync: bool = True,
)
Creates a MatchingEngineIndexEndpoint resource.
Example Usage:
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name='my_endpoint',
)
Name | Description |
display_name |
str
Required. The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters. |
network |
str
Optional. The full name of the Google Compute Engine |
description |
str
Optional. The description of the IndexEndpoint. |
labels |
Dict[str, str]
Optional. Labels with user-defined metadata to organize your IndexEndpoint. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can contain only lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one IndexEndpoint (system labels are excluded). System-reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable. |
project |
str
Optional. Project to create IndexEndpoint in. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to create IndexEndpoint in. If not set, location set in aiplatform.init will be used. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to create IndexEndpoints. Overrides credentials set in aiplatform.init. |
request_metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
sync |
bool
Optional. Whether to execute this creation synchronously. If False, this method is executed in a concurrent Future, and any downstream object is returned immediately and synced when the Future has completed. |
Type | Description |
ValueError | Raised if no network is specified when creating an IndexEndpoint. |
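The label constraints described for create can be checked locally before sending a request. The following is a minimal sketch, not the service's own validation: the helper names are hypothetical, and "lowercase letters" is interpreted as any alphanumeric character that is not uppercase, so that international characters pass.

```python
from typing import Dict, List

MAX_USER_LABELS = 64
MAX_LABEL_LENGTH = 64  # len() counts Unicode code points in Python 3
RESERVED_PREFIX = "aiplatform.googleapis.com/"


def _is_valid_label_text(text: str) -> bool:
    """Check length and the allowed character set for one key or value."""
    if len(text) > MAX_LABEL_LENGTH:
        return False
    # Assumption: any non-uppercase letter or digit, plus "_" and "-", is allowed.
    return all(
        (ch.isalnum() and not ch.isupper()) or ch in "_-" for ch in text
    )


def find_label_problems(labels: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the labels look valid."""
    problems = []
    # System labels are excluded from the count and are not validated here.
    user_labels = {
        k: v for k, v in labels.items() if not k.startswith(RESERVED_PREFIX)
    }
    if len(user_labels) > MAX_USER_LABELS:
        problems.append(f"too many user labels: {len(user_labels)}")
    for key, value in user_labels.items():
        if not _is_valid_label_text(key):
            problems.append(f"invalid label key: {key!r}")
        if not _is_valid_label_text(value):
            problems.append(f"invalid label value for key {key!r}")
    return problems
```

A caller could run find_label_problems on the dictionary it intends to pass as labels and abort early if the returned list is not empty.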
delete
delete(force: bool = False, sync: bool = True)
Deletes this MatchingEngineIndexEndpoint resource. If force is set to True, all indexes on this endpoint will be undeployed prior to deletion.
Name | Description |
force |
bool
Optional. If True, all deployed indexes on this endpoint are undeployed first. Default is False. |
sync |
bool
Optional. Whether to execute this method synchronously. If False, this method is executed in a concurrent Future, and any downstream object is returned immediately and synced when the Future has completed. |
Type | Description |
FailedPrecondition | If indexes are deployed on this MatchingEngineIndexEndpoint and force = False. |
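One way to avoid the FailedPrecondition error is to inspect deployed_indexes before deleting. The wrapper below is a hypothetical sketch that relies only on the deployed_indexes property and the delete(force=...) method documented on this page; it accepts any object that exposes them.

```python
def delete_endpoint(endpoint, allow_undeploy: bool = False) -> None:
    """Delete an endpoint, forcing undeployment only when explicitly allowed."""
    deployed = list(endpoint.deployed_indexes)
    if deployed and not allow_undeploy:
        # Fail locally instead of letting the service reject the request.
        raise RuntimeError(
            f"{len(deployed)} index(es) still deployed; "
            "pass allow_undeploy=True to undeploy them first."
        )
    # force=True is only needed when something is still deployed.
    endpoint.delete(force=bool(deployed))
```

Requiring an explicit allow_undeploy flag keeps a routine cleanup script from silently taking deployed indexes offline.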
deploy_index
deploy_index(
index: google.cloud.aiplatform.matching_engine.matching_engine_index.MatchingEngineIndex,
deployed_index_id: str,
display_name: Optional[str] = None,
machine_type: Optional[str] = None,
min_replica_count: Optional[int] = None,
max_replica_count: Optional[int] = None,
enable_access_logging: Optional[bool] = None,
reserved_ip_ranges: Optional[Sequence[str]] = None,
deployment_group: Optional[str] = None,
auth_config_audiences: Optional[Sequence[str]] = None,
auth_config_allowed_issuers: Optional[Sequence[str]] = None,
request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
)
Deploys an existing index resource to this endpoint resource.
Name | Description |
index |
MatchingEngineIndex
Required. The Index to deploy. This Index may be referred to as the DeployedIndex's "original" Index. |
deployed_index_id |
str
Required. The user specified ID of the DeployedIndex. The ID can be up to 128 characters long and must start with a letter and only contain letters, numbers, and underscores. The ID must be unique within the project it is created in. |
display_name |
str
The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used. |
machine_type |
str
Optional. The type of machine. If no machine type is specified, the index is deployed with automatic resources. |
min_replica_count |
int
Optional. The minimum number of machine replicas this deployed model will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed. If this value is not provided, the value of 2 will be used. |
max_replica_count |
int
Optional. The maximum number of replicas this deployed model may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the deployed model increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, the larger value of min_replica_count or 2 will be used. If the value provided is smaller than min_replica_count, it will automatically be increased to min_replica_count. |
enable_access_logging |
bool
Optional. If true, the private endpoint's access logs are sent to Stackdriver Logging. These logs are like standard server access logs, containing information such as the timestamp and latency for each MatchRequest. Note that Stackdriver logs may incur a cost, especially if the deployed index receives a high queries per second (QPS) rate. Estimate your costs before enabling this option. |
reserved_ip_ranges |
Sequence[str]
Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, the index is deployed within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: 'vertex-ai-ip-range'. |
deployment_group |
str
Optional. The deployment group can be no longer than 64 characters (eg: 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating |
auth_config_audiences |
Sequence[str]
The list of JWT |
auth_config_allowed_issuers |
Sequence[str]
A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: |
request_metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
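The defaulting rules stated for min_replica_count and max_replica_count can be written out as a small function. This is a sketch of the documented behavior only; the function name and the default parameter are hypothetical, and the SDK or service may apply the rules in a different place.

```python
from typing import Optional, Tuple


def resolve_replica_counts(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    default: int = 2,  # documented fallback for deploy_index
) -> Tuple[int, int]:
    """Apply the documented defaults and return (min, max)."""
    # A missing minimum falls back to the default value.
    min_count = default if min_replica_count is None else min_replica_count
    if max_replica_count is None:
        # A missing maximum is the larger of the minimum and the default.
        max_count = max(min_count, default)
    else:
        # A maximum below the minimum is raised to the minimum.
        max_count = max(max_replica_count, min_count)
    return min_count, max_count
```

For example, resolve_replica_counts(4, 2) yields (4, 4), because a maximum smaller than the minimum is increased to match it.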
list
list(
filter: Optional[str] = None,
order_by: Optional[str] = None,
project: Optional[str] = None,
location: Optional[str] = None,
credentials: Optional[google.auth.credentials.Credentials] = None,
parent: Optional[str] = None,
)
List all instances of this Vertex AI Resource.
Example Usage:
aiplatform.BatchPredictionJob.list(
    filter='state="JOB_STATE_SUCCEEDED" AND display_name="my_job"',
)
aiplatform.Model.list(order_by="create_time desc, display_name")
Name | Description |
filter |
str
Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported. |
order_by |
str
Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: |
project |
str
Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used. |
location |
str
Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used. |
credentials |
auth_credentials.Credentials
Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init. |
parent |
str
Optional. The parent resource name if any to retrieve list from. |
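The filter and order_by strings shown in the example usage can be assembled programmatically. The helpers below are hypothetical and cover only the equality-with-AND form and the "field desc" form that appear in the examples; values containing a double quote are rejected because the escaping rules are not described here.

```python
from typing import Mapping, Sequence, Tuple


def build_filter(conditions: Mapping[str, str]) -> str:
    """Join field="value" equality terms with AND."""
    parts = []
    for field_name, value in conditions.items():
        if '"' in value:
            # Escaping is not documented on this page, so refuse to guess.
            raise ValueError(f"unsupported character in value for {field_name!r}")
        parts.append(f'{field_name}="{value}"')
    return " AND ".join(parts)


def build_order_by(fields: Sequence[Tuple[str, bool]]) -> str:
    """Build a comma-separated order_by string from (field, descending) pairs."""
    return ", ".join(
        f"{name} desc" if descending else name for name, descending in fields
    )
```

The resulting strings can then be passed as the filter and order_by arguments of list.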
match
match(
deployed_index_id: str,
queries: List[List[float]],
num_neighbors: int = 1,
filter: Optional[
List[
google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.Namespace
]
] = [],
)
Retrieves nearest neighbors for the given embedding queries on the specified deployed index.
Name | Description |
deployed_index_id |
str
Required. The ID of the DeployedIndex to match the queries against. |
queries |
List[List[float]]
Required. A list of queries. Each query is a list of floats, representing a single embedding. |
num_neighbors |
int
Optional. The number of nearest neighbors to retrieve from the database for each query. Defaults to 1. |
filter |
List[Namespace]
Optional. A list of Namespaces for filtering the matching results. For example, [Namespace("color", ["red"], []), Namespace("shape", [], ["squared"])] matches datapoints that have the "red" color and excludes datapoints that have the "squared" shape. See https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#json for more detail. |
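The allow and deny semantics of the filter argument can be modeled in memory. The sketch below uses a local stand-in dataclass rather than the SDK's Namespace type, with assumed field names (name, allow_tokens, deny_tokens), and a simplified rule: a datapoint passes when, for every namespace, it carries at least one allowed token (if any are listed) and none of the denied tokens. The actual service-side matching may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Namespace:
    """Local stand-in for the SDK's Namespace; field names are assumptions."""

    name: str
    allow_tokens: List[str] = field(default_factory=list)
    deny_tokens: List[str] = field(default_factory=list)


def passes_filter(
    datapoint: Dict[str, Set[str]], namespaces: List[Namespace]
) -> bool:
    """Return True if the datapoint's tokens satisfy every namespace restriction."""
    for ns in namespaces:
        tokens = datapoint.get(ns.name, set())
        # With an allow list, at least one allowed token must be present.
        if ns.allow_tokens and not (tokens & set(ns.allow_tokens)):
            return False
        # Any denied token excludes the datapoint.
        if tokens & set(ns.deny_tokens):
            return False
    return True
```

With the filter from the parameter description, a datapoint tagged {"color": {"red"}, "shape": {"round"}} passes, while one tagged {"color": {"red"}, "shape": {"squared"}} does not.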
mutate_deployed_index
mutate_deployed_index(
deployed_index_id: str,
min_replica_count: int = 1,
max_replica_count: int = 1,
request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
)
Updates an existing deployed index under this endpoint resource.
Name | Description |
deployed_index_id |
str
Required. The user specified ID of the DeployedIndex. The ID can be up to 128 characters long and must start with a letter and only contain letters, numbers, and underscores. The ID must be unique within the project it is created in. |
min_replica_count |
int
Optional. The minimum number of machine replicas this deployed model will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed. |
max_replica_count |
int
Optional. The maximum number of replicas this deployed model may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the deployed model increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, the larger value of min_replica_count or 1 will be used. If the value provided is smaller than min_replica_count, it will automatically be increased to min_replica_count. |
request_metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
index_id |
str
Required. The ID of the MatchingEngineIndex associated with the DeployedIndex. |
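The format rules given for deployed_index_id can be checked with a regular expression before calling the API. This is a hypothetical helper that assumes "letters" means ASCII letters; it does not check uniqueness within the project, which only the service can verify.

```python
import re

# One leading letter, then up to 127 letters, digits, or underscores
# (128 characters in total). ASCII-only letters are an assumption.
_DEPLOYED_INDEX_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")


def is_valid_deployed_index_id(deployed_index_id: str) -> bool:
    """Return True if the ID satisfies the documented format rules."""
    return _DEPLOYED_INDEX_ID.fullmatch(deployed_index_id) is not None
```

For instance, "my_index_1" is accepted, while "1index" (leading digit) and "my-index" (dash) are rejected.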
to_dict
to_dict()
Returns the resource proto as a dictionary.
undeploy_all
undeploy_all(sync: bool = True)
Undeploys every index deployed to this MatchingEngineIndexEndpoint.
Name | Description |
sync |
bool
Optional. Whether to execute this method synchronously. If False, this method is executed in a concurrent Future, and any downstream object is returned immediately and synced when the Future has completed. |
undeploy_index
undeploy_index(
deployed_index_id: str, request_metadata: Optional[Sequence[Tuple[str, str]]] = ()
)
Undeploys a deployed index from this endpoint resource.
Name | Description |
deployed_index_id |
str
Required. The ID of the DeployedIndex to be undeployed from the IndexEndpoint. |
request_metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
update
update(
display_name: str,
description: Optional[str] = None,
labels: Optional[Dict[str, str]] = None,
request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
)
Updates an existing index endpoint resource.
Name | Description |
display_name |
str
Required. The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters. |
description |
str
Optional. The description of the IndexEndpoint. |
labels |
Dict[str, str]
Optional. Labels with user-defined metadata to organize your IndexEndpoint. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can contain only lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one IndexEndpoint (system labels are excluded). System-reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable. |
request_metadata |
Sequence[Tuple[str, str]]
Optional. Strings which should be sent along with the request as metadata. |
wait
wait()
Helper method that blocks until all futures are complete.
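The relationship between sync=False and wait can be pictured with the standard library's concurrent.futures module. The class below is a simplified, hypothetical model and not the SDK's FutureManager: work submitted with sync=False runs on a background thread, and wait blocks until every pending future has finished.

```python
from concurrent import futures
from typing import Any, Callable, List


class FutureTracker:
    """Simplified model of running calls synchronously or in the background."""

    def __init__(self) -> None:
        self._executor = futures.ThreadPoolExecutor(max_workers=2)
        self._pending: List[futures.Future] = []

    def submit(self, fn: Callable[..., Any], *args: Any, sync: bool = True) -> Any:
        if sync:
            # Synchronous mode: run immediately and return the result.
            return fn(*args)
        # Asynchronous mode: return a Future and remember it for wait().
        future = self._executor.submit(fn, *args)
        self._pending.append(future)
        return future

    def wait(self) -> None:
        """Block until all pending futures are complete, surfacing any errors."""
        futures.wait(self._pending)
        for future in self._pending:
            future.result()  # re-raises an exception from the background call
        self._pending.clear()
```

In this model, calling wait after a series of sync=False submissions plays the same role as calling wait on the resource: code after it can assume the background work has completed.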