Deploying models

This page explains how to deploy your model to AI Platform to get predictions.

To deploy your trained model on AI Platform, you must:

  • Upload your saved model to a Cloud Storage bucket.
  • Create an AI Platform model resource.
  • Create an AI Platform version resource, specifying the Cloud Storage path to your saved model.

Before you begin

After you have trained your model, you must make the adjustments described in the following sections before deploying it to AI Platform for predictions.

If you have chosen to use a custom prediction routine (beta), read the guide to custom prediction routines to learn about the additional artifacts and code that you must upload to Cloud Storage, as well as the additional parameters that you must specify when you create a version.

Store your model in Cloud Storage

Generally, it is easiest to use a dedicated Cloud Storage bucket in the same project you're using for AI Platform.

If you're using a bucket in a different project, you must ensure that your AI Platform service account can access your model in Cloud Storage. Without the appropriate permissions, your request to create an AI Platform model version fails. See more about granting permissions for storage.
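
In practice, granting access usually means adding the AI Platform service account to the bucket's IAM policy. The project number below is a placeholder, and the service-account naming pattern shown is an assumption to verify against your own project in the IAM console:

```shell
# Hypothetical project number; the AI Platform service account typically
# follows this pattern (confirm the exact account in the IAM console).
PROJECT_NUMBER="123456789012"
SERVICE_ACCOUNT="service-${PROJECT_NUMBER}@cloud-ml.google.com.iam.gserviceaccount.com"
echo "$SERVICE_ACCOUNT"

# Grant that account read access to the bucket in the other project
# (cloud call, requires gsutil and permission to edit the bucket's policy):
# gsutil iam ch "serviceAccount:${SERVICE_ACCOUNT}:roles/storage.objectViewer" gs://your-bucket
```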

Set up your Cloud Storage bucket

This section shows you how to create a new bucket. You can use an existing bucket, but if it is not part of the project you are using to run AI Platform, you must explicitly grant access to the AI Platform service accounts.

  1. Specify a name for your new bucket. The name must be unique across all buckets in Cloud Storage.

    BUCKET_NAME="your_bucket_name"

    For example, use your project name with -mlengine appended:

    PROJECT_ID=$(gcloud config list project --format "value(core.project)")
    BUCKET_NAME=${PROJECT_ID}-mlengine
  2. Check the bucket name that you specified.

    echo $BUCKET_NAME
  3. Select a region for your bucket and set a REGION environment variable.

    For example, the following code creates REGION and sets it to us-central1:

    REGION=us-central1
  4. Create the new bucket:

    gsutil mb -l $REGION gs://$BUCKET_NAME

    Note: Use the same region where you plan on running AI Platform jobs. The example uses us-central1 because that is the region used in the getting-started instructions.

Upload the exported model to Cloud Storage

Run the following command to upload your saved model to your bucket in Cloud Storage:

SAVED_MODEL_DIR=./your-export-dir-base/$(ls ./your-export-dir-base | tail -1)
gsutil cp -r $SAVED_MODEL_DIR gs://your-bucket

When you export a SavedModel from tf.keras or from a TensorFlow estimator, it gets saved as a timestamped subdirectory of a base export directory that you choose, like your-export-dir-base/1487877383942. This example shows how to upload the directory with the most recent timestamp. If you created your SavedModel in a different way, it may be in a different location on your local filesystem.
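
Picking the newest export can be made more robust with a numeric sort. The directories created below are placeholders standing in for real timestamped exports, so the snippet can be run end to end:

```shell
# Create two dummy timestamped export directories for illustration only.
EXPORT_BASE="./your-export-dir-base"
mkdir -p "$EXPORT_BASE/1487877383942" "$EXPORT_BASE/1553208972357"

# A numeric sort picks the newest timestamp even if digit counts differ.
LATEST=$(ls "$EXPORT_BASE" | sort -n | tail -1)
SAVED_MODEL_DIR="$EXPORT_BASE/$LATEST"
echo "$SAVED_MODEL_DIR"

# Then upload it to your bucket (cloud call, shown for reference):
# gsutil cp -r "$SAVED_MODEL_DIR" gs://your-bucket
```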

If you are deploying a custom prediction routine (beta), upload all your model artifacts to a model directory in your Cloud Storage bucket.

The total file size of your model directory must be 250 MB or less. You can request a higher quota to deploy larger models.

When you create subsequent versions of your model, organize them by placing each one into its own separate directory within your Cloud Storage bucket.
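
For instance, a layout like the following keeps each version's artifacts isolated; the bucket, model, and version names here are hypothetical:

```shell
# Example layout: one Cloud Storage "directory" per model version,
# so a new version never overwrites an older one.
BUCKET="gs://your_bucket_name"
for VERSION in v1 v2; do
  echo "${BUCKET}/census/${VERSION}/"
done
```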

Upload custom code

If you are deploying a custom prediction routine, you must also upload the source distribution package containing your custom code. For example:

gsutil cp dist/my_custom_code-0.1.tar.gz gs://your-bucket/my_custom_code-0.1.tar.gz

You may upload this tarball to the same directory in Cloud Storage as your model file, but you don't have to. In fact, keeping them separate may provide better organization, especially if you deploy many versions of your model and code.
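
A tarball like the one in the example above can be produced with setuptools. The sketch below builds a throwaway package in a temporary directory; the package name, version, and module name are all hypothetical:

```shell
# Minimal sketch: package custom predictor code as a source distribution.
# Names here are examples, not requirements of AI Platform.
WORKDIR=$(mktemp -d)
cd "$WORKDIR"
cat > setup.py <<'EOF'
from setuptools import setup
setup(name='my_custom_code', version='0.1', py_modules=['predictor'])
EOF
touch predictor.py
python3 setup.py sdist --formats=gztar >/dev/null 2>&1
ls dist/
```

The resulting `dist/*.tar.gz` file is what you upload with `gsutil cp`.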

The total file size of your custom code must be 250 MB or less. You can request a higher quota to deploy more custom code.

Test your model with local predictions

You can use gcloud to serve predictions from your model locally. This optional step helps you save time by sanity-checking your model before deploying it to AI Platform. Using the model file you uploaded to Cloud Storage, you can run online prediction locally and preview the results that the AI Platform prediction server would return.

Use local prediction with a small subset of your test data to debug a mismatch between training and serving features. For example, if the data you send with your prediction request does not match what your model expects, you can find that out before you incur costs for cloud online prediction requests.
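
As an example, local predict accepts a newline-delimited JSON file, one instance per line. The feature names below are invented for illustration and must match your own model's serving signature:

```shell
# Each line is one JSON instance; the keys must match your model's
# serving input signature (these feature names are made up).
cat > input.json <<'EOF'
{"age": 25, "workclass": "Private"}
{"age": 42, "workclass": "Self-emp-not-inc"}
EOF
wc -l < input.json
```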

See more about using gcloud ai-platform local predict.

  1. Set environment variables for the Cloud Storage directory that contains your model ("gs://your-bucket/"), framework, and the name of your input file, if you have not already done so:

    MODEL_DIR="gs://your-bucket/"
    INPUT_FILE="input.json"
    FRAMEWORK="TENSORFLOW"
    
  2. Send the prediction request:

    gcloud ai-platform local predict --model-dir=$MODEL_DIR \
        --json-instances $INPUT_FILE \
        --framework $FRAMEWORK
    

Deploy models and versions

AI Platform organizes your trained models using model and version resources. An AI Platform model is a container for the versions of your machine learning model.

To deploy a model, you create a model resource in AI Platform, create a version of that model, then link the model version to the model file stored in Cloud Storage.

Create a model resource

AI Platform uses model resources to organize different versions of your model.

console

  1. Open the AI Platform models page in the GCP Console.

  2. If needed, create the model to add your new version to:

    1. Click the New Model button at the top of the Models page. This brings you to the Create model page.

    2. Enter a unique name for your model in the Model name box. Optionally, enter a description for your model in the Description field.

    3. Click Create.

    4. Verify that you have returned to the Models page, and that your new model appears in the list.

gcloud

Create a model resource for your model versions, filling in your desired name for your model without the enclosing brackets:

    gcloud ai-platform models create "[YOUR-MODEL-NAME]"

REST API

  1. Format your request by placing the model object in the request body. At minimum, you must specify a name for your model. Fill in your desired name for your model without the enclosing brackets:

      {"name": "[YOUR-MODEL-NAME]" }
    
  2. Make your REST API call to the following path, replacing [VALUES_IN_BRACKETS] with the appropriate values:

      POST https://ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models/
    

    For example, you can make the following request using cURL:

      curl -X POST -H "Content-Type: application/json" \
        -d '{"name": "[YOUR-MODEL-NAME]"}' \
        -H "Authorization: Bearer `gcloud auth print-access-token`" \
        "https://ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models"
    

    You should see output similar to this:

      {
        "name": "projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]",
        "regions": [
          "us-central1"
        ]
      }
    

See the AI Platform model API for more details.

Create a model version

Now you are ready to create a model version with the trained model you previously uploaded to Cloud Storage. When you create a version, specify the following parameters:

  • name: must be unique within the AI Platform model.
  • deploymentUri: the path to your SavedModel directory in Cloud Storage. This is often a timestamped directory such as gs://your_bucket_name/job_20190321/export/1553208972357/

    Or, if you're deploying a custom prediction routine (beta), the model directory containing all your model artifacts.

    The total size of this directory must be 250 MB or less.

  • framework: TENSORFLOW. Omit this parameter if you're deploying a custom prediction routine.

  • runtimeVersion: a runtime version based on the version of TensorFlow and other dependencies your model needs. If you're deploying a custom prediction routine, this must be at least 1.4.

  • packageUris (optional): a list of paths to your custom code distribution packages (.tar.gz files) in Cloud Storage. Only provide this parameter if you are deploying a custom prediction routine (beta).

  • predictionClass (optional): the name of your Predictor class in module_name.class_name format. Only provide this parameter if you are deploying a custom prediction routine (beta).

  • pythonVersion: must be set to "3.5" to be compatible with model files exported using Python 3. If not set, this defaults to "2.7".

See more information about each of these parameters in the AI Platform Training and Prediction API for a version resource.

See the full details for each runtime version.

console

  1. On the Models page, select the name of the model resource you would like to use to create your version. This brings you to the Model Details page.

  2. Click the New Version button at the top of the Model Details page. This brings you to the Create version page.

  3. Enter your version name in the Name field. Optionally, enter a description for your version in the Description field.

  4. Enter the following information about how you trained your model in the corresponding dropdown boxes:

    • Select the Python version you used to train your model.
    • Select the Framework and Framework version.
    • Select the ML runtime version. Learn more about AI Platform runtime versions.
  5. Optionally, select a Machine type to run online prediction. This field defaults to "Single core CPU".

  6. In the Model URI field, enter the Cloud Storage bucket location where you uploaded your model file. You may use the Browse button to find the correct path.

    Make sure to specify the path to the directory containing the file, not the path to the model file itself. For example, use "gs://your_bucket_name/model-dir/" instead of "gs://your_bucket_name/model-dir/model.pkl".

  7. Select a Scaling option for online prediction deployment:

    • If you select "Auto scaling", the optional Minimum number of nodes field displays. You can enter the minimum number of nodes to keep running at all times, even when the service has scaled down. This field defaults to 0.

    • If you select "Manual scaling", you must enter the Number of nodes you want to keep running at all times.

      Learn more about pricing for prediction costs.

  8. To finish creating your model version, click Save.

gcloud

  1. Set environment variables to store the path to the Cloud Storage directory where your model binary is located, your model name, your version name, and your framework choice.

    Replace [VALUES_IN_BRACKETS] with the appropriate values:

    MODEL_DIR="gs://your_bucket_name/"
    VERSION_NAME="[YOUR-VERSION-NAME]"
    MODEL_NAME="[YOUR-MODEL-NAME]"
    FRAMEWORK="TENSORFLOW"
    

    For a custom prediction routine (beta), omit the FRAMEWORK variable and set additional variables with the path to your custom code tarball and the name of your predictor class:

    MODEL_DIR="gs://your_bucket_name/"
    VERSION_NAME="[YOUR-VERSION-NAME]"
    MODEL_NAME="[YOUR-MODEL-NAME]"
    CUSTOM_CODE_PATH="gs://your_bucket_name/my_custom_code-0.1.tar.gz"
    PREDICTOR_CLASS="[MODULE_NAME].[CLASS_NAME]"
    
  2. Create the version:

    gcloud ai-platform versions create $VERSION_NAME \
      --model $MODEL_NAME \
      --origin $MODEL_DIR \
      --runtime-version=1.13 \
      --framework $FRAMEWORK \
      --python-version=3.5
    

    For a custom prediction routine (beta), use the gcloud beta component, omit the --framework flag, and set the --package-uris and --prediction-class flags:

    gcloud components install beta
    
    gcloud beta ai-platform versions create $VERSION_NAME \
      --model $MODEL_NAME \
      --origin $MODEL_DIR \
      --runtime-version=1.13 \
      --python-version=3.5 \
      --package-uris=$CUSTOM_CODE_PATH \
      --prediction-class=$PREDICTOR_CLASS
    

    Creating the version takes a few minutes. When it is ready, you should see the following output:

    Creating version (this might take a few minutes)......done.

  3. Get information about your new version:

    gcloud ai-platform versions describe $VERSION_NAME \
      --model $MODEL_NAME
    

    You should see output similar to this:

    createTime: '2018-02-28T16:30:45Z'
    deploymentUri: gs://your_bucket_name
    framework: TENSORFLOW
    machineType: mls1-highmem-1
    name: projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions/[YOUR-VERSION-NAME]
    pythonVersion: '3.5'
    runtimeVersion: '1.13'
    state: READY

REST API

  1. Format your request body to contain the version object. This example specifies the version name, deploymentUri, runtimeVersion, framework, and pythonVersion. Replace [VALUES_IN_BRACKETS] with the appropriate values:

      {
        "name": "[YOUR-VERSION-NAME]",
        "deploymentUri": "gs://your_bucket_name/",
        "runtimeVersion": "1.13",
        "framework": "TENSORFLOW",
        "pythonVersion": "3.5"
      }
    
  2. Make your REST API call to the following path, replacing [VALUES_IN_BRACKETS] with the appropriate values:

      POST https://ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions
    

    For example, you can make the following request using cURL:

        curl -X POST -H "Content-Type: application/json" \
          -d '{"name": "[YOUR-VERSION-NAME]", "deploymentUri": "gs://your_bucket_name/", "runtimeVersion": "1.13", "framework": "TENSORFLOW", "pythonVersion": "3.5"}' \
          -H "Authorization: Bearer `gcloud auth print-access-token`" \
          "https://ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions"
    

    Creating the version takes a few minutes. When it is ready, you should see output similar to this:

      {
        "name": "projects/[YOUR-PROJECT-ID]/operations/create_[YOUR-MODEL-NAME]_[YOUR-VERSION-NAME]-[TIMESTAMP]",
        "metadata": {
          "@type": "type.googleapis.com/google.cloud.ml.v1.OperationMetadata",
          "createTime": "2018-07-07T02:51:50Z",
          "operationType": "CREATE_VERSION",
          "modelName": "projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]",
          "version": {
            "name": "projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions/[YOUR-VERSION-NAME]",
            "deploymentUri": "gs://your_bucket_name",
            "createTime": "2018-07-07T02:51:49Z",
            "runtimeVersion": "1.13",
            "framework": "TENSORFLOW",
            "machineType": "mls1-highmem-1",
            "pythonVersion": "3.5"
          }
        }
      }
    
