Export model artifacts for prediction

Vertex AI offers pre-built containers to serve predictions and explanations from models trained using the following machine learning (ML) frameworks:

  • TensorFlow
  • XGBoost
  • scikit-learn

To use one of these pre-built containers, you must save your model as one or more model artifacts that comply with the requirements of the pre-built container. These requirements apply whether or not your model artifacts are created on Vertex AI.

If you want to use a custom container to serve predictions, you don't need to comply with these requirements, but you can still use them as guides for how to persist trained ML models for custom prediction containers.

If you use PyTorch as your ML framework, you must use a custom container to serve predictions. For guidance on saving and loading models for inference in PyTorch, see the PyTorch documentation.
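For reference, the core pattern from those docs is saving the model's state_dict and loading it back into an instance of the same model class at serving time. A minimal sketch (TinyClassifier is a hypothetical stand-in for your own nn.Module):

import torch
from torch import nn

# Hypothetical model class; any nn.Module works the same way.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(64, 10)

    def forward(self, x):
        return self.linear(x)

model = TinyClassifier()
# ... train the model ...

# Save only the learned parameters, the approach the PyTorch docs
# recommend for inference.
torch.save(model.state_dict(), 'model.pth')

# At serving time, recreate the model and load the saved parameters.
restored = TinyClassifier()
restored.load_state_dict(torch.load('model.pth'))
restored.eval()  # switch dropout/batch-norm layers to inference mode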

Maximum model size

The total file size of the model artifacts that you specify in the artifactUri field when you create a Model resource must be 10 GB or less.
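If you're not sure whether your artifacts fit, one way to check is to sum the object sizes under the Cloud Storage prefix you plan to pass as artifactUri. A sketch using the google-cloud-storage client (the bucket name and prefix are placeholders):

from google.cloud import storage

client = storage.Client()

# Placeholder bucket and prefix; substitute the components of your artifactUri.
blobs = client.list_blobs('your-bucket', prefix='path/to/model/')

total_bytes = sum(blob.size for blob in blobs)
print(f'Total artifact size: {total_bytes / 1e9:.2f} GB')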

Framework-specific requirements

Depending on which ML framework you plan to use for prediction, you must export model artifacts in different formats. The following sections describe the acceptable model formats for each ML framework.

TensorFlow

If you use TensorFlow to train a model, export your model as a TensorFlow SavedModel directory.

There are several ways to export SavedModels from TensorFlow training code. The following list describes a few ways that work for various TensorFlow APIs:

  • If you use Keras for training, use tf.keras.Model.save to export a SavedModel directory, as in the sketch after this list.
  • If you use an Estimator for training, use tf.estimator.Estimator.export_saved_model to export a SavedModel directory.
  • Otherwise, call tf.saved_model.save directly.
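For example, a minimal Keras sketch, assuming TensorFlow 2 (where passing a directory path to model.save writes a SavedModel) and the same AIP_MODEL_DIR environment variable used in the examples later on this page; the toy data stands in for your own dataset:

import os

import numpy as np
import tensorflow as tf

# Toy training data; substitute your own dataset.
x = np.random.rand(100, 8).astype('float32')
y = np.random.randint(0, 2, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y, epochs=2)

# TensorFlow can write directly to Cloud Storage (gs://) paths, so the
# SavedModel directory can be exported straight to AIP_MODEL_DIR.
model.save(os.environ['AIP_MODEL_DIR'])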

To serve predictions using these artifacts, create a Model with the pre-built container for prediction matching the version of TensorFlow that you used for training.
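For instance, a sketch using the Vertex AI SDK for Python; the project, region, artifact path, and container image tag are placeholders, and the tag must match the TensorFlow version you trained with:

from google.cloud import aiplatform

aiplatform.init(project='your-project', location='us-central1')

model = aiplatform.Model.upload(
    display_name='my-tf-model',
    # Directory that contains the SavedModel (placeholder path).
    artifact_uri='gs://your-bucket/path/to/model/',
    # Pre-built TensorFlow prediction image; the 2.11 tag is a placeholder
    # that you replace with the version used for training.
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-11:latest',
)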

TensorFlow for Vertex Explainable AI

If you want to get explanations from a Model that uses a TensorFlow pre-built container to serve predictions, then read the additional requirements for exporting a TensorFlow model for Vertex Explainable AI.

Enable server-side request batching for TensorFlow

If you want to enable request batching for a Model that uses a TensorFlow pre-built container to serve predictions, then include config/batching_parameters_config in the same Cloud Storage directory as the saved_model.pb file. To configure the batching parameters file, see the official TensorFlow Serving documentation.
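The file is a text-format BatchingParameters protobuf as used by TensorFlow Serving. A sketch with illustrative (not recommended) values:

# config/batching_parameters_config
max_batch_size { value: 128 }
batch_timeout_micros { value: 5000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 4 }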

XGBoost

If you use XGBoost to train a model, you may export the trained model in one of three ways:

  • Use xgboost.Booster's save_model method to export a file named model.bst.
  • Use the joblib library to export a file named model.joblib.
  • Use Python's pickle module to export a file named model.pkl.

Your model artifact's filename must exactly match one of these options.

The following tabbed examples show how to train and export a model in each of the three ways:

xgboost.Booster

import os

from google.cloud import storage
from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.bst'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
bst.save_model(local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

joblib

import os

from google.cloud import storage
from sklearn import datasets
import joblib
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.joblib'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
joblib.dump(bst, local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

pickle

import os
import pickle

from google.cloud import storage
from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.pkl'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
with open(local_path, 'wb') as model_file:
  pickle.dump(bst, model_file)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

To serve predictions using this artifact, create a Model with the pre-built container for prediction matching the version of XGBoost that you used for training.
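Whichever format you choose, it can be worth a quick sanity check that the artifact loads back cleanly before you create the Model. A sketch for the model.bst case (the joblib and pickle variants load with joblib.load and pickle.load instead):

from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()

# Load the saved booster back and run a small prediction to confirm
# the artifact round-trips correctly.
restored = xgb.Booster()
restored.load_model('model.bst')
print(restored.predict(xgb.DMatrix(digits.data[:5])))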

scikit-learn

If you use scikit-learn to train a model, you may export it in one of two ways:

  • Use the joblib library to export a file named model.joblib.
  • Use Python's pickle module to export a file named model.pkl.

Your model artifact's filename must exactly match one of these options. You can export standard scikit-learn estimators or scikit-learn pipelines; a pipeline sketch follows the examples below.

The following tabbed examples show how to train and export a model in each of the two ways:

joblib

import os

from google.cloud import storage
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
import joblib

digits = datasets.load_digits()
classifier = RandomForestClassifier()
classifier.fit(digits.data, digits.target)

artifact_filename = 'model.joblib'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
joblib.dump(classifier, local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

pickle

import os
import pickle

from google.cloud import storage
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

digits = datasets.load_digits()
classifier = RandomForestClassifier()
classifier.fit(digits.data, digits.target)

artifact_filename = 'model.pkl'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
with open(local_path, 'wb') as model_file:
  pickle.dump(classifier, model_file)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

To serve predictions using this artifact, create a Model with the pre-built container for prediction matching the version of scikit-learn that you used for training.
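As noted earlier, scikit-learn pipelines export the same way as plain estimators. A minimal sketch that bundles preprocessing with the classifier under the same model.joblib naming requirement (the Cloud Storage upload step is identical to the examples above):

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
import joblib

digits = datasets.load_digits()

# Preprocessing and the model travel together in one artifact, so the
# serving container applies the same scaling at prediction time.
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier()),
])
pipeline.fit(digits.data, digits.target)

joblib.dump(pipeline, 'model.joblib')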
