To customize how Vertex AI serves online predictions from your
custom-trained model, you can specify a custom container instead of a pre-built
container when you create a Model
resource. When you use a custom container, Vertex AI runs a
Docker container of your choice on each prediction node.
You might want to use a custom container for any of the following reasons:
- to serve predictions from an ML model trained using a framework other than TensorFlow, scikit-learn, or XGBoost
- to preprocess prediction requests or postprocess the predictions generated by your model
- to run a prediction server written in a programming language of your choice
- to install dependencies that you want to use to customize prediction
This guide describes how to create a Model
that uses a custom container. It
does not provide detailed instructions about designing and creating a Docker
container image.
Prepare a container image
To create a Model
that uses a custom container, you must provide a
Docker container image as the basis of that container. This container image must
meet the requirements described in Custom container
requirements.
If you plan to use an existing container image created by a third party that you trust, then you might be able to skip one or both of the following sections.
Create a container image
Design and build a Docker container image that meets the container image requirements.
To learn the basics of designing and building a Docker container image, read the Docker documentation's quickstart.
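For example, assuming your Dockerfile defines an HTTP prediction server, you might build the image and smoke-test it locally along these lines. The image name, port, and routes below are hypothetical placeholders, not values mandated by Vertex AI; your server just needs to listen on the port and routes that you later configure on the `Model`:

```sh
# Minimal local smoke test (hypothetical names). Build the image from the
# Dockerfile in the current directory.
docker build -t my-prediction-server:latest .

# Run the container and check that its HTTP server responds on the port
# and health route you plan to configure.
docker run -d -p 8080:8080 --name local-test my-prediction-server:latest
curl http://localhost:8080/health    # hypothetical health route

# Clean up the test container.
docker stop local-test && docker rm local-test
```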
Push the container image to Artifact Registry or Container Registry
Push your container image to an Artifact Registry repository or a Container Registry repository that meets the container image publishing requirements.
Learn how to push a container image to Artifact Registry or push a container image to Container Registry.
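As a sketch, pushing to Artifact Registry typically looks like the following. The region, repository, and image names are hypothetical placeholders:

```sh
# (If needed, create a Docker repository first:
#  gcloud artifacts repositories create REPOSITORY \
#      --repository-format=docker --location=us-central1)

# One-time setup: let Docker authenticate to Artifact Registry in your region.
gcloud auth configure-docker us-central1-docker.pkg.dev

# Tag the local image with its Artifact Registry destination, then push it.
docker tag my-prediction-server:latest \
    us-central1-docker.pkg.dev/PROJECT_ID/REPOSITORY/my-prediction-server:latest
docker push \
    us-central1-docker.pkg.dev/PROJECT_ID/REPOSITORY/my-prediction-server:latest
```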
Create a Model
To create a `Model` that uses a custom container, you can use the gcloud CLI or one of the Vertex AI client libraries. The following sections show how to configure the API fields related to custom containers when you create a `Model` in one of these ways.
Container-related API fields
When you create the `Model`, make sure to configure the `containerSpec` field with your custom container details, rather than with a pre-built container.

You must specify a `ModelContainerSpec` message in the `Model.containerSpec` field. Within this message, you can specify the following subfields:
- `imageUri` (required): The Artifact Registry or Container Registry URI of your container image. If you are using the `gcloud ai models upload` command, then you can use the `--container-image-uri` flag to specify this field.
- `command` (optional): An array of an executable and arguments to override the container's `ENTRYPOINT`. To learn more about how to format this field and how it interacts with the `args` field, read the API reference for `ModelContainerSpec`. If you are using the `gcloud ai models upload` command, then you can use the `--container-command` flag to specify this field.
- `args` (optional): An array of an executable and arguments to override the container's `CMD`. To learn more about how to format this field and how it interacts with the `command` field, read the API reference for `ModelContainerSpec`. If you are using the `gcloud ai models upload` command, then you can use the `--container-args` flag to specify this field.
- `ports` (optional): An array of ports. Vertex AI sends liveness checks, health checks, and prediction requests to your container on the first port listed, or `8080` by default. Specifying additional ports has no effect. If you are using the `gcloud ai models upload` command, then you can use the `--container-ports` flag to specify this field.
- `env` (optional): An array of environment variables that the container's entrypoint command, as well as the `command` and `args` fields, can reference. To learn more about how other fields can reference these environment variables, read the API reference for `ModelContainerSpec`. If you are using the `gcloud ai models upload` command, then you can use the `--container-env-vars` flag to specify this field.
- `healthRoute` (optional): The path on your container's HTTP server where you want Vertex AI to send health checks. If you don't specify this field, then when you deploy the `Model` as a `DeployedModel` to an `Endpoint` resource, it defaults to `/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL`, where ENDPOINT is replaced by the last segment of the `Endpoint`'s `name` field (following `endpoints/`) and DEPLOYED_MODEL is replaced by the `DeployedModel`'s `id` field. If you are using the `gcloud ai models upload` command, then you can use the `--container-health-route` flag to specify this field.
- `predictRoute` (optional): The path on your container's HTTP server where you want Vertex AI to forward prediction requests. If you don't specify this field, then when you deploy the `Model` as a `DeployedModel` to an `Endpoint` resource, it defaults to `/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL`, where ENDPOINT is replaced by the last segment of the `Endpoint`'s `name` field (following `endpoints/`) and DEPLOYED_MODEL is replaced by the `DeployedModel`'s `id` field. If you are using the `gcloud ai models upload` command, then you can use the `--container-predict-route` flag to specify this field.
In addition to the variables that you set in the `Model.containerSpec.env` field, Vertex AI sets several other variables based on your configuration. Learn more about using these environment variables in these fields and in the container's entrypoint command.
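Putting these subfields together, a `models:upload` REST request with a custom `containerSpec` might look like the following sketch. The routes, port, and environment variable here are illustrative values, not required defaults, and `PROJECT_ID`, `LOCATION`, and the image URI are placeholders:

```sh
# Sketch: upload a Model with a custom containerSpec through the REST API.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/models:upload" \
  -d '{
    "model": {
      "displayName": "MODEL_NAME",
      "artifactUri": "gs://BUCKET/PATH_TO_MODEL_ARTIFACT_DIRECTORY",
      "containerSpec": {
        "imageUri": "LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE",
        "ports": [{"containerPort": 8080}],
        "env": [{"name": "EXAMPLE_VAR", "value": "example-value"}],
        "healthRoute": "/health",
        "predictRoute": "/predict"
      }
    }
  }'
```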
Model import examples
The following examples show how to specify container-related API fields when you import a model.
gcloud CLI
The following example uses the `gcloud ai models upload` command:

```sh
gcloud ai models upload \
  --region=LOCATION \
  --display-name=MODEL_NAME \
  --container-image-uri=IMAGE_URI \
  --container-command=COMMAND \
  --container-args=ARGS \
  --container-ports=PORTS \
  --container-env-vars=ENV \
  --container-health-route=HEALTH_ROUTE \
  --container-predict-route=PREDICT_ROUTE \
  --artifact-uri=PATH_TO_MODEL_ARTIFACT_DIRECTORY
```
The `--container-image-uri` flag is required; all other flags that begin with `--container-` are optional. To learn about the values for these fields, see the preceding section of this guide.
Java
To learn how to install and use the client library for Vertex AI, see Vertex AI client libraries. For more information, see the Vertex AI Java API reference documentation.
Node.js
To learn how to install and use the client library for Vertex AI, see Vertex AI client libraries. For more information, see the Vertex AI Node.js API reference documentation.
Python
To learn how to install and use the client library for Vertex AI, see Vertex AI client libraries. For more information, see the Vertex AI Python API reference documentation.
For more context, read the Model import guide.
Send prediction requests
To send an online prediction request to your `Model`, follow the instructions in Get predictions from a custom-trained model: this process works the same regardless of whether you use a custom container.
Read about predict request and response requirements for custom containers.
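For example, after you deploy the `Model` to an endpoint, a raw REST prediction request looks roughly like the following. The endpoint ID is a placeholder, and the shape of each instance depends entirely on what your container's prediction server expects:

```sh
# Sketch: send an online prediction request to a deployed model. Vertex AI
# forwards the request body to your container on its predictRoute.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID:predict" \
  -d '{"instances": [{"feature_1": 1.0, "feature_2": "a"}]}'
```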
What's next
- To learn about everything to consider when you design a custom container to use with Vertex AI, read Custom container requirements.