This guide describes how to import models into the Model Registry. After you import a model, it appears in the Model Registry, where you can deploy it to an endpoint and run predictions.
Required roles
To get the permissions that you need to import models, ask your administrator to grant you the Vertex AI User (roles/aiplatform.user) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
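For example, an administrator can grant this role with the gcloud CLI. The following is a minimal sketch that assumes placeholder values for PROJECT_ID and USER_EMAIL:

gcloud projects add-iam-policy-binding PROJECT_ID \
--member="user:USER_EMAIL" \
--role="roles/aiplatform.user"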
Prebuilt or custom containers
When you import a model, you associate it with a container for Vertex AI to run prediction requests. You can use prebuilt containers provided by Vertex AI, or use your own custom containers that you build and push to Artifact Registry.
You can use a prebuilt container if your model meets the following requirements:
- Trained in Python 3.7 or later
- Trained using TensorFlow, PyTorch, scikit-learn, or XGBoost
- Exported to meet framework-specific requirements for one of the prebuilt prediction containers
If you are importing a tabular AutoML model that you previously exported, you must use a specific custom container provided by Vertex AI.
Otherwise, create a new custom container, or use an existing custom container that you have in Artifact Registry.
Upload model artifacts to Cloud Storage
You must store your model artifacts in a Cloud Storage bucket whose region matches the regional endpoint you're using.
If your Cloud Storage bucket is in a different Google Cloud project, you need to grant Vertex AI access to read your model artifacts.
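One way to grant that access is to give the Vertex AI service agent the Storage Object Viewer role on the bucket. The following sketch assumes the service agent follows the usual service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com naming pattern, where PROJECT_NUMBER is the number of the project where you use Vertex AI:

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
--member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"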
If you're using a prebuilt container, ensure that your model artifacts have filenames that exactly match the following examples:
- TensorFlow SavedModel: saved_model.pb
- PyTorch: model.mar
- scikit-learn: model.joblib or model.pkl
- XGBoost: model.bst, model.joblib, or model.pkl
Learn more about exporting model artifacts for prediction.
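For example, you can copy exported artifacts from a local directory into your bucket with the gcloud CLI. This sketch assumes a local directory named ./model and a placeholder bucket name:

gcloud storage cp --recursive ./model gs://BUCKET_NAME/models/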
Import a model by using the Google Cloud console
To import a model by using the Google Cloud console:
In the Google Cloud console, go to the Vertex AI Models page.
Click Import.
Select Import as new model to import a new model.
Select Import as new version to import a model as a version of an existing model. To learn more about model versioning, see Model versioning.
Name and region: Enter a name for your model. Select the region that matches both your bucket's region and the Vertex AI regional endpoint you're using. Click Continue.
Optionally, expand Advanced options to add a customer-managed encryption key.
Depending on the type of container you are using, select the appropriate tab below.
Prebuilt container
Select Import model artifacts into a new prebuilt container.
Select the Model framework and Model framework version you used to train your model.
If you want to use GPUs for serving predictions, set the Accelerator type to GPUs.
You select the type of GPU later on, when you deploy the model to an endpoint.
Specify the Cloud Storage path to the directory that contains your model artifacts.
For example, gs://BUCKET_NAME/models/.
Leave the Predict schemata blank.
To import your model without Vertex Explainable AI settings, click Import.
After the import has completed, your model appears on the Models page.
Otherwise, continue configuring your model by entering your explainability settings on the Explainability tab. Learn more about the Explainability settings.
Custom container
Select Import an existing custom container.
Set the container image URI.
If you want to provide model artifacts in addition to a container image, specify the Cloud Storage path to the directory that contains your model artifacts.
For example, gs://BUCKET_NAME/models/.
Specify values for any of the other fields.
Learn more about these optional fields.
To import your model without Vertex Explainable AI settings, click Import.
After the import has completed, your model appears on the Models page.
Otherwise, continue configuring your model by entering your explainability settings on the Explainability tab. Learn more about the Explainability settings.
AutoML tabular container
Select Import an existing custom container.
In the Container image field, enter MULTI_REGION-docker.pkg.dev/vertex-ai/automl-tabular/prediction-server-v1:latest.
Replace MULTI_REGION with us, europe, or asia to select which Docker repository you want to pull the Docker image from. Each repository provides the same Docker image, but choosing the Artifact Registry multi-region closest to the machine where you are running Docker might reduce latency.
In the Package location field, specify the Cloud Storage path to the directory that contains your model artifacts.
The path looks similar to the following example:
gs://BUCKET_NAME/models-MODEL_ID/tf-saved-model/TIMESTAMP/
Leave all other fields blank.
Click Import.
After the import has completed, your model appears on the Models page. You can use this model just like other AutoML tabular models, except imported AutoML tabular models don't support Vertex Explainable AI.
Import a model programmatically
The following examples show how to import a model using various tools:
gcloud
The following example uses the gcloud ai models upload command:
gcloud ai models upload \
--region=LOCATION_ID \
--display-name=MODEL_NAME \
--container-image-uri=IMAGE_URI \
--artifact-uri=PATH_TO_MODEL_ARTIFACT_DIRECTORY
Replace the following:
- LOCATION_ID: The region where you are using Vertex AI.
- MODEL_NAME: A display name for the Model.
- IMAGE_URI: The URI of the container image to use for serving predictions. For example, us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest. Use a prebuilt container or a custom container.
- PATH_TO_MODEL_ARTIFACT_DIRECTORY: The Cloud Storage URI (beginning with gs://) of a directory in Cloud Storage that contains your model artifacts.
The preceding example demonstrates all the flags necessary to import most models. If you are not using a prebuilt container for prediction, then you likely need to specify some additional optional flags so that Vertex AI can use your container image. These flags, which begin with --container-, correspond to fields of your Model's containerSpec.
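For example, a custom container that listens on port 8080 might be imported as follows. This is a sketch with placeholder values, and the /predict and /health routes are assumptions; use the routes that your container actually serves:

gcloud ai models upload \
--region=LOCATION_ID \
--display-name=MODEL_NAME \
--container-image-uri=IMAGE_URI \
--container-ports=8080 \
--container-predict-route=/predict \
--container-health-route=/health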
REST
Use the following code sample to upload a model using the upload method of the model resource.
Before using any of the request data, make the following replacements:
- LOCATION_ID: The region where you are using Vertex AI.
- PROJECT_ID: Your project ID.
- MODEL_NAME: A display name for the Model.
- MODEL_DESCRIPTION: Optional. A description for the model.
- IMAGE_URI: The URI of the container image to use for serving predictions. For example, us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest. Use a prebuilt container or a custom container.
- PATH_TO_MODEL_ARTIFACT_DIRECTORY: The Cloud Storage URI (beginning with gs://) of a directory in Cloud Storage that contains your model artifacts. This variable and the artifactUri field are optional if you're using a custom container.
- labels: Optional. Any set of key-value pairs to organize your models. For example: "env": "prod" or "tier": "backend". Specify the LABEL_NAME and LABEL_VALUE for any labels that you want to apply to this model.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload
Request JSON body:
{ "model": { "displayName": "MODEL_NAME", "predictSchemata": {}, "containerSpec": { "imageUri": "IMAGE_URI" }, "artifactUri": "PATH_TO_MODEL_ARTIFACT_DIRECTORY", "labels": { "LABEL_NAME_1": "LABEL_VALUE_1", "LABEL_NAME_2": "LABEL_VALUE_2" } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload" | Select-Object -Expand Content
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
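The following is a minimal sketch using the Vertex AI SDK for Python; all values are placeholders:

from google.cloud import aiplatform

# Initialize the SDK with your project and region.
aiplatform.init(project="PROJECT_ID", location="LOCATION_ID")

# Upload the model. artifact_uri points to the Cloud Storage directory
# that contains your model artifacts; it's optional for custom containers.
model = aiplatform.Model.upload(
    display_name="MODEL_NAME",
    artifact_uri="gs://BUCKET_NAME/models/",
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest",
)

# Block until the upload operation completes, then print identifiers.
model.wait()
print(model.display_name)
print(model.resource_name)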
To import a model with Vertex Explainable AI settings enabled, refer to the Vertex Explainable AI model import examples.
Get operation status
Some requests start long-running operations that require time to complete. These requests return an operation name, which you can use to view the operation's status or cancel the operation. Vertex AI provides helper methods to make calls against long-running operations. For more information, see Working with long-running operations.
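For example, the response from the models:upload request includes an operation name of the form projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID. As a sketch, you can poll its status with a GET request:

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID"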
Limitations
- Maximum supported model size is 10 GiB.
What's next
- Deploy your model to an endpoint, programmatically or by using the Google Cloud console; a command-line sketch follows.
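With the gcloud CLI, deployment is typically two steps: create an endpoint, then deploy the model to it. This sketch uses placeholder values:

gcloud ai endpoints create \
--region=LOCATION_ID \
--display-name=ENDPOINT_NAME

gcloud ai endpoints deploy-model ENDPOINT_ID \
--region=LOCATION_ID \
--model=MODEL_ID \
--display-name=DEPLOYED_MODEL_NAME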