Export model artifacts for prediction

Vertex AI offers pre-built containers to serve predictions and explanations from models trained using the following machine learning (ML) frameworks:

  • TensorFlow
  • XGBoost
  • scikit-learn

To use one of these pre-built containers, you must save your model as one or more model artifacts that comply with the requirements of the pre-built container. These requirements apply whether or not your model artifacts are created on Vertex AI.

If you want to use a custom container to serve predictions, you don't need to comply with these requirements, but you can still use them as guides for how to persist trained ML models for custom prediction containers.

If you use PyTorch as your ML framework, you must use a custom container to serve predictions. For guidance on saving and loading models for inference in PyTorch, see the PyTorch documentation.
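For reference, the core pattern from those docs is saving the model's state_dict and loading it back into an instance of the same model class at serving time. A minimal sketch (TinyClassifier is a hypothetical stand-in for your own nn.Module):

import torch
from torch import nn

# Hypothetical model class; any nn.Module works the same way.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(64, 10)

    def forward(self, x):
        return self.linear(x)

model = TinyClassifier()
# ... train the model ...

# Save only the learned parameters, the approach the PyTorch docs
# recommend for inference.
torch.save(model.state_dict(), 'model.pth')

# At serving time, recreate the model and load the saved parameters.
restored = TinyClassifier()
restored.load_state_dict(torch.load('model.pth'))
restored.eval()  # switch dropout/batch-norm layers to inference mode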

Maximum model size

The total file size of the model artifacts that you specify in the artifactUri field when you create a Model resource must be 10 GB or less.
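If you're not sure whether your artifacts fit, one way to check is to sum the object sizes under the Cloud Storage prefix you plan to pass as artifactUri. A sketch using the google-cloud-storage client (the bucket name and prefix are placeholders):

from google.cloud import storage

client = storage.Client()

# Placeholder bucket and prefix; substitute the components of your artifactUri.
blobs = client.list_blobs('your-bucket', prefix='path/to/model/')

total_bytes = sum(blob.size for blob in blobs)
print(f'Total artifact size: {total_bytes / 1e9:.2f} GB')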

Framework-specific requirements

Depending on which ML framework you plan to use for prediction, you must export model artifacts in different formats. The following sections describe the acceptable model formats for each ML framework.

TensorFlow

If you use TensorFlow to train a model, export your model as a TensorFlow SavedModel directory.

There are several ways to export SavedModels from TensorFlow training code. The following list describes a few ways that work for various TensorFlow APIs:

  • If you use Keras for training, use tf.keras.Model.save to export a SavedModel directory, as in the sketch after this list.
  • If you use an Estimator for training, use tf.estimator.Estimator.export_saved_model to export a SavedModel directory.
  • Otherwise, call tf.saved_model.save directly.
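For example, a minimal Keras sketch, assuming TensorFlow 2 (where passing a directory path to model.save writes a SavedModel) and the same AIP_MODEL_DIR environment variable used in the examples later on this page; the toy data stands in for your own dataset:

import os

import numpy as np
import tensorflow as tf

# Toy training data; substitute your own dataset.
x = np.random.rand(100, 8).astype('float32')
y = np.random.randint(0, 2, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y, epochs=2)

# TensorFlow can write directly to Cloud Storage (gs://) paths, so the
# SavedModel directory can be exported straight to AIP_MODEL_DIR.
model.save(os.environ['AIP_MODEL_DIR'])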

To serve predictions using these artifacts, create a Model with the pre-built container for prediction matching the version of TensorFlow that you used for training.
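For instance, a sketch using the Vertex AI SDK for Python; the project, region, artifact path, and container image tag are placeholders, and the tag must match the TensorFlow version you trained with:

from google.cloud import aiplatform

aiplatform.init(project='your-project', location='us-central1')

model = aiplatform.Model.upload(
    display_name='my-tf-model',
    # Directory that contains the SavedModel (placeholder path).
    artifact_uri='gs://your-bucket/path/to/model/',
    # Pre-built TensorFlow prediction image; the 2.11 tag is a placeholder
    # that you replace with the version used for training.
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-11:latest',
)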

TensorFlow for Vertex Explainable AI

If you want to get explanations from a Model that uses a TensorFlow pre-built container to serve predictions, then read the additional requirements for exporting a TensorFlow model for Vertex Explainable AI.

Enable server-side request batching for TensorFlow

If you want to enable request batching for a Model that uses a TensorFlow pre-built container to serve predictions, then include config/batching_parameters_config in the same Cloud Storage directory as the saved_model.pb file. To configure the batching parameters file, see the official TensorFlow Serving documentation.
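The file is a text-format BatchingParameters protobuf as used by TensorFlow Serving. A sketch with illustrative (not recommended) values:

# config/batching_parameters_config
max_batch_size { value: 128 }
batch_timeout_micros { value: 5000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 4 }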

XGBoost

If you use XGBoost to train a model, you may export the trained model in one of three ways:

  • Use xgboost.Booster's save_model method to export a file named model.bst.
  • Use the joblib library to export a file named model.joblib.
  • Use Python's pickle module to export a file named model.pkl.

Your model artifact's filename must exactly match one of these options.

The following tabbed examples show how to train and export a model in each of the three ways:

xgboost.Booster

import os

from google.cloud import storage
from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.bst'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
bst.save_model(local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

joblib

import os

from google.cloud import storage
from sklearn import datasets
import joblib
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.joblib'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
joblib.dump(bst, local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

pickle

import os
import pickle

from google.cloud import storage
from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()
dtrain = xgb.DMatrix(digits.data, label=digits.target)
bst = xgb.train({}, dtrain, 20)

artifact_filename = 'model.pkl'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
with open(local_path, 'wb') as model_file:
  pickle.dump(bst, model_file)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

To serve predictions using this artifact, create a Model with the pre-built container for prediction matching the version of XGBoost that you used for training.
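Whichever format you choose, it can be worth a quick sanity check that the artifact loads back cleanly before you create the Model. A sketch for the model.bst case (the joblib and pickle variants load with joblib.load and pickle.load instead):

from sklearn import datasets
import xgboost as xgb

digits = datasets.load_digits()

# Load the saved booster back and run a small prediction to confirm
# the artifact round-trips correctly.
restored = xgb.Booster()
restored.load_model('model.bst')
print(restored.predict(xgb.DMatrix(digits.data[:5])))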

scikit-learn

If you use scikit-learn to train a model, you may export it in one of two ways:

  • Use the joblib library to export a file named model.joblib.
  • Use Python's pickle module to export a file named model.pkl.

Your model artifact's filename must exactly match one of these options. You can export standard scikit-learn estimators or scikit-learn pipelines; a pipeline sketch follows the examples below.

The following tabbed examples show how to train and export a model in each of the two ways:

joblib

import os

from google.cloud import storage
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
import joblib

digits = datasets.load_digits()
classifier = RandomForestClassifier()
classifier.fit(digits.data, digits.target)

artifact_filename = 'model.joblib'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
joblib.dump(classifier, local_path)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

pickle

import os
import pickle

from google.cloud import storage
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

digits = datasets.load_digits()
classifier = RandomForestClassifier()
classifier.fit(digits.data, digits.target)

artifact_filename = 'model.pkl'

# Save model artifact to local filesystem (doesn't persist)
local_path = artifact_filename
with open(local_path, 'wb') as model_file:
  pickle.dump(classifier, model_file)

# Upload model artifact to Cloud Storage
model_directory = os.environ['AIP_MODEL_DIR']
storage_path = os.path.join(model_directory, artifact_filename)
blob = storage.blob.Blob.from_string(storage_path, client=storage.Client())
blob.upload_from_filename(local_path)

To serve predictions using this artifact, create a Model with the pre-built container for prediction matching the version of scikit-learn that you used for training.
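As noted earlier, scikit-learn pipelines export the same way as plain estimators. A minimal sketch that bundles preprocessing with the classifier under the same model.joblib naming requirement (the Cloud Storage upload step is identical to the examples above):

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
import joblib

digits = datasets.load_digits()

# Preprocessing and the model travel together in one artifact, so the
# serving container applies the same scaling at prediction time.
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier()),
])
pipeline.fit(digits.data, digits.target)

joblib.dump(pipeline, 'model.joblib')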
