- NAME
-
- gcloud ai index-endpoints deploy-index - deploy an index to a Vertex AI index endpoint
- SYNOPSIS
-
-
gcloud ai index-endpoints deploy-index
(INDEX_ENDPOINT
:--region
=REGION
)--deployed-index-id
=DEPLOYED_INDEX_ID
--display-name
=DISPLAY_NAME
--index
=INDEX
[--allowed-issuers
=[ALLOWED_ISSUERS
,…]] [--audiences
=[AUDIENCES
,…]] [--deployment-group
=DEPLOYMENT_GROUP
] [--enable-access-logging
] [--machine-type
=MACHINE_TYPE
] [--max-replica-count
=MAX_REPLICA_COUNT
] [--min-replica-count
=MIN_REPLICA_COUNT
] [--reserved-ip-ranges
=[RESERVED_IP_RANGES
,…]] [GCLOUD_WIDE_FLAG …
]
-
- DESCRIPTION
- Deploy an index to a Vertex AI index endpoint.
- EXAMPLES
-
To deploy index
to an index endpoint345
with 2 min replica count and 10 max replica count under project456
in regionexample
, within reserved ip rangesus-central1
andvertex-ai-ip-range-1
run:vertex-ai-ip-range-2
gcloud ai index-endpoints deploy-index 456 --project=example --region=us-central1 --index=345 --deployed-index-id=deployed-index-345 --display-name=deployed-index-345 --min-replica-count=2 --max-replica-count=10 --reserved-ip-ranges=vertex-ai-ip-range-1,vertex-ai-ip-range-2
- POSITIONAL ARGUMENTS
-
-
Index endpoint resource - The index endpoint to deploy an index. The arguments
in this group can be used to specify the attributes of this resource. (NOTE)
Some attributes are not given arguments in this group but can be set in other
ways.
To set the
project
attribute:-
provide the argument
index_endpoint
on the command line with a fully specified name; -
provide the argument
--project
on the command line; -
set the property
core/project
.
This must be specified.
INDEX_ENDPOINT
-
ID of the index_endpoint or fully qualified identifier for the index_endpoint.
To set the
name
attribute:-
provide the argument
index_endpoint
on the command line.
This positional argument must be specified if any of the other arguments in this group are specified.
-
provide the argument
--region
=REGION
-
Cloud region for the index_endpoint.
To set the
region
attribute:-
provide the argument
index_endpoint
on the command line with a fully specified name; -
provide the argument
--region
on the command line; -
set the property
ai/region
; - choose one from the prompted list of available regions.
-
provide the argument
-
provide the argument
-
Index endpoint resource - The index endpoint to deploy an index. The arguments
in this group can be used to specify the attributes of this resource. (NOTE)
Some attributes are not given arguments in this group but can be set in other
ways.
- REQUIRED FLAGS
-
--deployed-index-id
=DEPLOYED_INDEX_ID
- Id of the deployed index.
--display-name
=DISPLAY_NAME
- Display name of the deployed index.
--index
=INDEX
- ID of the index.
- OPTIONAL FLAGS
-
--allowed-issuers
=[ALLOWED_ISSUERS
,…]-
List of allowed JWT issuers for a deployed index.
Each entry must be a valid Google service account, in the following format:
service-account-name@project-id.iam.gserviceaccount.com
--audiences
=[AUDIENCES
,…]-
List of JWT audiences that are allowed to access a deployed index.
JWT containing any of these audiences (https://tools.ietf.org/html/draft-ietf-oauth-json-web-token-32#section -4.1.3) will be accepted.
--deployment-group
=DEPLOYMENT_GROUP
-
Deployment group can be no longer than 64 characters (eg:
test
,prod
). If not set, we will use thedefault
deployment group.Creating deployment_groups with
reserved_ip_ranges
is a recommended practice when the peered network has multiple peering ranges.This creates your deployments from predictable IP spaces for easier traffic administration. --enable-access-logging
-
If true, online prediction access logs are sent to Cloud Logging.
These logs are standard server access logs, containing information like timestamp and latency for each prediction request.
--machine-type
=MACHINE_TYPE
- The machine resources to be used for each node of this deployment. For available machine types, see https://cloud.google.com/ai-platform-unified/docs/predictions/machine-types.
--max-replica-count
=MAX_REPLICA_COUNT
- Maximum number of machine replicas the deployed index will be always deployed on.
--min-replica-count
=MIN_REPLICA_COUNT
- Minimum number of machine replicas the deployed index will be always deployed on. If specified, the value must be equal to or larger than 1.
--reserved-ip-ranges
=[RESERVED_IP_RANGES
,…]- List of reserved IP ranges deployed index will be deployed to.
- GCLOUD WIDE FLAGS
-
These flags are available to all commands:
--access-token-file
,--account
,--billing-project
,--configuration
,--flags-file
,--flatten
,--format
,--help
,--impersonate-service-account
,--log-http
,--project
,--quiet
,--trace-token
,--user-output-enabled
,--verbosity
.Run
$ gcloud help
for details. - NOTES
-
These variants are also available:
gcloud alpha ai index-endpoints deploy-index
gcloud beta ai index-endpoints deploy-index
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-02-06 UTC.