Get ML predictions from scikit-learn or XGBoost models
The AI Platform Prediction online prediction service manages computing resources in the cloud to run your models. These models can be scikit-learn or XGBoost models that you have trained elsewhere (locally, or via another service) and exported to a file. This page describes how to get online predictions from these exported models using AI Platform Prediction.
Overview
In this tutorial, you train a simple model to predict the species of flowers, using the Iris dataset. After you train and save the model locally, you deploy it to AI Platform Prediction and query it to get online predictions.
Before you begin
Complete the following steps to set up a GCP account, activate the AI Platform Prediction API, and install and activate the Cloud SDK.
Set up your GCP project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the AI Platform Training & Prediction and Compute Engine APIs.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:

  gcloud init
Set up your environment
Choose one of the options below to set up your environment locally on macOS or in a remote environment on Cloud Shell.
For macOS users, we recommend that you set up your environment using the MACOS tab below. Cloud Shell, shown on the CLOUD SHELL tab, is available on macOS, Linux, and Windows. Cloud Shell provides a quick way to try AI Platform Prediction, but isn't suitable for ongoing development work.
macOS
- Check Python installation

  Confirm that you have Python installed and, if necessary, install it:

  python -V

- Check pip installation

  pip is Python's package manager, included with current versions of Python. Check if you already have pip installed by running pip --version. If not, see how to install pip.

  You can upgrade pip using the following command:

  pip install -U pip

  See the pip documentation for more details.

- Install virtualenv

  virtualenv is a tool to create isolated Python environments. Check if you already have virtualenv installed by running virtualenv --version. If not, install virtualenv:

  pip install --user --upgrade virtualenv

  To create an isolated development environment for this guide, create a new virtual environment with virtualenv. For example, the following commands create and activate an environment named aip-env:

  virtualenv aip-env
  source aip-env/bin/activate

- For the purposes of this tutorial, run the rest of the commands within your virtual environment. See more information about using virtualenv. To exit virtualenv, run deactivate.
Cloud Shell
- Open the Google Cloud console.
- Click the Activate Google Cloud Shell button at the top of the console window.

  A Cloud Shell session opens inside a new frame at the bottom of the console and displays a command-line prompt. It can take a few seconds for the shell session to be initialized. Your Cloud Shell session is ready to use.

- Configure the gcloud command-line tool to use your selected project:

  gcloud config set project [selected-project-id]

  where [selected-project-id] is your project ID. (Omit the enclosing brackets.)
Install frameworks
macOS
Within your virtual environment, run the following command to install the versions of scikit-learn, XGBoost, and pandas used in AI Platform Prediction runtime version 2.11:
(aip-env)$ pip install scikit-learn==1.0.2 xgboost==1.6.2 pandas==1.3.5
By providing version numbers in the preceding command, you ensure that the dependencies in your virtual environment match the dependencies in the runtime version. This helps prevent unexpected behavior when your code runs on AI Platform Prediction.
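To confirm that the pinned versions were installed, you can optionally print them from Python. This is a minimal check; the expected values are the versions pinned above:

import sklearn
import xgboost
import pandas

# These should print 1.0.2, 1.6.2, and 1.3.5 to match runtime version 2.11.
print(sklearn.__version__, xgboost.__version__, pandas.__version__)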
For more details, installation options, and troubleshooting information, refer to the installation instructions for each framework.
Cloud Shell
Run the following command to install scikit-learn, XGBoost, and pandas:
pip install --user scikit-learn xgboost pandas
For more details, installation options, and troubleshooting information, refer to the installation instructions for each framework.
Versions of scikit-learn and XGBoost
AI Platform Prediction runtime versions are updated periodically to include support for new releases of scikit-learn and XGBoost. See the full details for each runtime version.
Train and save a model
Start by training a simple model for the Iris dataset.
scikit-learn
Following the scikit-learn example on model persistence, you can train and export a model as shown below:
import joblib
from sklearn import datasets
from sklearn import svm
# Load the Iris dataset
iris = datasets.load_iris()
# Train a classifier
classifier = svm.SVC()
classifier.fit(iris.data, iris.target)
# Export the classifier to a file
joblib.dump(classifier, 'model.joblib')
Alternatively, you can export the model by using the pickle library:
import pickle
with open('model.pkl', 'wb') as model_file:
pickle.dump(classifier, model_file)
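Before uploading either file, you can optionally reload it and confirm that it still serves predictions. A minimal sketch, reusing the iris variable defined above:

import joblib

# Reload the exported model and predict on the first two rows of the dataset.
restored = joblib.load('model.joblib')
print(restored.predict(iris.data[:2]))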
XGBoost
You can export the model by using the save_model method of the Booster object.
For the purposes of this tutorial, scikit-learn is used with XGBoost only to load the Iris dataset.
from sklearn import datasets
import xgboost as xgb
# Load the Iris dataset
iris = datasets.load_iris()
# Load data into DMatrix object
dtrain = xgb.DMatrix(iris.data, label=iris.target)
# Train XGBoost model
bst = xgb.train({}, dtrain, 20)
# Export the model to a file
bst.save_model('./model.bst')
Alternatively, you can export the model by using the pickle library:
import pickle
with open('model.pkl', 'wb') as model_file:
pickle.dump(bst, model_file)
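As with scikit-learn, you can optionally reload the exported booster to verify it before uploading. A minimal sketch, reusing the iris variable defined above:

import xgboost as xgb

# Reload the exported booster and predict on the first two rows of the dataset.
restored = xgb.Booster()
restored.load_model('./model.bst')
print(restored.predict(xgb.DMatrix(iris.data[:2])))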
Model file naming requirements
For online prediction, the saved model file that you upload to Cloud Storage must be named model.pkl, model.joblib, or model.bst, depending on which library you used to export it. This restriction ensures that AI Platform Prediction uses the same pattern to reconstruct the model on import as was used during export.
This requirement does not apply if you create a custom prediction routine (beta).
scikit-learn
| Library used to export model | Correct model name |
| --- | --- |
| pickle | model.pkl |
| joblib | model.joblib |
XGBoost
| Library used to export model | Correct model name |
| --- | --- |
| pickle | model.pkl |
| joblib | model.joblib |
| xgboost.Booster | model.bst |
For future iterations of your model, organize your Cloud Storage bucket so that each new model has a dedicated directory.
Store your model in Cloud Storage
For the purposes of this tutorial, it is easiest to use a dedicated Cloud Storage bucket in the same project you're using for AI Platform Prediction.
If you're using a bucket in a different project, you must ensure that your AI Platform Prediction service account can access your model in Cloud Storage. Without the appropriate permissions, your request to create an AI Platform Prediction model version fails. See more about granting permissions for storage.
Set up your Cloud Storage bucket
This section shows you how to create a new bucket. You can use an existing bucket, but it must be in the same region where you plan on running AI Platform jobs. Additionally, if it is not part of the project you are using to run AI Platform Prediction, you must explicitly grant access to the AI Platform Prediction service accounts.
- Specify a name for your new bucket. The name must be unique across all buckets in Cloud Storage.

  BUCKET_NAME="YOUR_BUCKET_NAME"

  For example, use your project name with -aiplatform appended:

  PROJECT_ID=$(gcloud config list project --format "value(core.project)")
  BUCKET_NAME=${PROJECT_ID}-aiplatform

- Check the bucket name that you created.

  echo $BUCKET_NAME

- Select a region for your bucket and set a REGION environment variable.

  Use the same region where you plan on running AI Platform Prediction jobs. See the available regions for AI Platform Prediction services.

  For example, the following code creates REGION and sets it to us-central1:

  REGION=us-central1

- Create the new bucket:

  gcloud storage buckets create gs://$BUCKET_NAME --location=$REGION
Upload the exported model file to Cloud Storage
Run the following command to upload the model file you exported earlier in this tutorial to your bucket in Cloud Storage:
gcloud storage cp ./model.joblib gs://$BUCKET_NAME/model.joblib
You can use the same Cloud Storage bucket for multiple model files. Each model file must be within its own directory inside the bucket.
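For example, if you later store a second model file in the same bucket, you might give each one a dedicated directory (the directory names here are illustrative):

gcloud storage cp ./model.joblib gs://$BUCKET_NAME/iris_sklearn/model.joblib
gcloud storage cp ./model.bst gs://$BUCKET_NAME/iris_xgboost/model.bst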
Format data for prediction
gcloud
Create an input.json
file with each input instance on a separate line:
[6.8, 2.8, 4.8, 1.4]
[6.0, 3.4, 4.5, 1.6]
Note that the format of input instances needs to match what your model expects. In this example, the Iris model requires 4 features, so your input must be a matrix of shape (num_instances, 4).
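Rather than writing input.json by hand, you can generate it from the dataset. A minimal Python sketch; the rows selected here are arbitrary, and any rows with 4 features work:

import json
from sklearn import datasets

# Write one JSON list per line, the format gcloud ai-platform predict expects.
iris = datasets.load_iris()
with open('input.json', 'w') as f:
    for instance in iris.data[:2].tolist():
        f.write(json.dumps(instance) + '\n')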
REST API
Create an input.json file that wraps your input instances in an instances field:
{
"instances": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
Note that the format of input instances needs to match what your model expects. In this example, the Iris model requires 4 features, so your input must be a matrix of shape (num_instances, 4).
For XGBoost, AI Platform Prediction does not support sparse representation of input instances. If the value of a feature is zero, use 0.0 in the corresponding input. If the value of a feature is missing, use NaN in the corresponding input.
See more information on formatting your input for online prediction.
Test your model with local predictions
You can use the gcloud ai-platform local predict command to test how your model serves predictions before you deploy it to AI Platform Prediction. The command uses dependencies in your local environment to perform prediction and returns results in the same format that gcloud ai-platform predict uses when it performs online predictions. Testing predictions locally can help you discover errors before you incur costs for online prediction requests.

For the --model-dir argument, specify a directory containing your exported machine learning model, either on your local machine or in Cloud Storage. For the --framework argument, specify tensorflow, scikit-learn, or xgboost. You cannot use the gcloud ai-platform local predict command with a custom prediction routine.
The following example shows how to perform local prediction:
gcloud ai-platform local predict --model-dir LOCAL_OR_CLOUD_STORAGE_PATH_TO_MODEL_DIRECTORY/ \
--json-instances LOCAL_PATH_TO_PREDICTION_INPUT.JSON \
--framework NAME_OF_FRAMEWORK
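For example, to test the scikit-learn model exported earlier against the input.json file from the previous section (assuming model.joblib is in the current directory):

gcloud ai-platform local predict --model-dir ./ \
  --json-instances input.json \
  --framework scikit-learn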
Deploy models and versions
AI Platform Prediction organizes your trained models using model and version resources. An AI Platform Prediction model is a container for the versions of your machine learning model.
To deploy a model, you create a model resource in AI Platform Prediction, create a version of that model, then link the model version to the model file stored in Cloud Storage.
Create a model resource
AI Platform Prediction uses model resources to organize different versions of your model.
You must decide at this time whether you want model versions belonging to this model to use a regional endpoint or the global endpoint. In most cases, choose a regional endpoint. If you need functionality that is only available on legacy (MLS1) machine types, then use the global endpoint.
You must also decide at this time if you want model versions belonging to this model to export any logs when they serve predictions. The following examples do not enable logging. Learn how to enable logging.
Console
Open the AI Platform Prediction Models page in the Google Cloud console.
Click the New Model button at the top of the Models page. This brings you to the Create model page.
Enter a unique name for your model in the Model name field.
When the Use regional endpoint checkbox is selected, AI Platform Prediction uses a regional endpoint. To use the global endpoint instead, clear the Use regional endpoint checkbox.
From the Region drop-down list, select a location for your prediction nodes. The available regions differ depending on whether you use a regional endpoint or the global endpoint.
Click Create.
Verify that you have returned to the Models page, and that your new model appears in the list.
gcloud
Regional endpoint
Run the following command:
gcloud ai-platform models create MODEL_NAME \
--region=REGION
Replace the following:
- MODEL_NAME: A name that you choose for your model.
- REGION: The region of the regional endpoint where you want prediction nodes to run. This must be a region that supports Compute Engine (N1) machine types.
If you don't specify the --region flag, then the gcloud CLI prompts you to select a regional endpoint (or to use us-central1 on the global endpoint).

Alternatively, you can set the ai_platform/region property to a specific region in order to make sure the gcloud CLI always uses the corresponding regional endpoint for AI Platform Prediction, even when you don't specify the --region flag. (This configuration doesn't apply to commands in the gcloud ai-platform operations command group.)
Global endpoint
Run the following command:
gcloud ai-platform models create MODEL_NAME \
--regions=REGION
Replace the following:
- MODEL_NAME: A name that you choose for your model.
- REGION: The region on the global endpoint where you want prediction nodes to run. This must be a region that supports legacy (MLS1) machine types.
If you don't specify the --regions flag, then the gcloud CLI prompts you to select a regional endpoint (or to use us-central1 on the global endpoint).
REST API
Regional endpoint
Format your request by placing the model object in the request body. At minimum, specify a name for your model by replacing MODEL_NAME in the following sample:
{ "name": "MODEL_NAME" }
Make a REST API call to the following URL, replacing PROJECT_ID with your Google Cloud project ID:
POST https://REGION-ml.googleapis.com/v1/projects/PROJECT_ID/models/
Replace the following:
- REGION: The region of the regional endpoint to deploy your model to. This must be a region that supports Compute Engine (N1) machine types.
- PROJECT_ID: Your Google Cloud project ID.
For example, you can make the following request using the curl command. This command authorizes the request using the credentials associated with your Google Cloud CLI installation.

curl -X POST -H "Content-Type: application/json" \
  -d '{"name": "MODEL_NAME"}' \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  "https://REGION-ml.googleapis.com/v1/projects/PROJECT_ID/models"
The API returns a response similar to the following:
{ "name": "projects/PROJECT_ID/models/MODEL_NAME", "regions": [ "REGION" ] }
Global endpoint
Format your request by placing the model object in the request body. At minimum, specify a name for your model by replacing MODEL_NAME in the following sample, and specify a region by replacing REGION with a region that supports legacy (MLS1) machine types:
{ "name": "MODEL_NAME", "regions": ["REGION"] }
Make a REST API call to the following URL, replacing PROJECT_ID with your Google Cloud project ID:
POST https://ml.googleapis.com/v1/projects/PROJECT_ID/models/
For example, you can make the following request using the curl command. This command authorizes the request using the credentials associated with your Google Cloud CLI installation.

curl -X POST -H "Content-Type: application/json" \
  -d '{"name": "MODEL_NAME", "regions": ["REGION"]}' \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  "https://ml.googleapis.com/v1/projects/PROJECT_ID/models"
The API returns a response similar to the following:
{ "name": "projects/PROJECT_ID/models/MODEL_NAME", "regions": [ "REGION" ] }
See the AI Platform Prediction model API for more details.
Create a model version
Now you are ready to create a model version with the trained model you previously uploaded to Cloud Storage. When you create a version, you can specify a number of parameters. The following list describes common parameters, some of which are required:
- name: must be unique within the AI Platform Prediction model.
- deploymentUri: the path to your model directory in Cloud Storage.
  - If you're deploying a TensorFlow model, this is a SavedModel directory.
  - If you're deploying a scikit-learn or XGBoost model, this is the directory containing your model.joblib, model.pkl, or model.bst file.
  - If you're deploying a custom prediction routine, this is the directory containing all your model artifacts. The total size of this directory must be 500 MB or less.
- framework: TENSORFLOW, SCIKIT_LEARN, or XGBOOST.
- runtimeVersion: a runtime version based on the dependencies your model needs. If you're deploying a scikit-learn model or an XGBoost model, this must be at least 1.4. If you plan to use the model version for batch prediction, then you must use runtime version 2.1 or earlier.
- pythonVersion: must be set to "3.5" (for runtime versions 1.4 through 1.14) or "3.7" (for runtime versions 1.15 and later) to be compatible with model files exported using Python 3. Can also be set to "2.7" if used with runtime version 1.15 or earlier.
- machineType (optional): the type of virtual machine that AI Platform Prediction uses for the nodes that serve predictions. Learn more about machine types. If not set, this defaults to n1-standard-2 on regional endpoints and mls1-c1-m2 on the global endpoint.
See more information about each of these parameters, as well as additional less common parameters, in the API reference for the version resource.
Additionally, if you created your model on a regional endpoint, make sure to also create the version on the same regional endpoint.
Console
Open the AI Platform Prediction Models page in the Google Cloud console.
On the Models page, select the name of the model resource you would like to use to create your version. This brings you to the Model Details page.
Click the New Version button at the top of the Model Details page. This brings you to the Create version page.
Enter your version name in the Name field. Optionally, enter a description for your version in the Description field.
Enter the following information about how you trained your model in the corresponding dropdown boxes:
- Select the Python version you used to train your model.
- Select the Framework and Framework version.
- Select the ML runtime version. Learn more about AI Platform Prediction runtime versions.
Select a Machine type to run online prediction.
In the Model URI field, enter the Cloud Storage bucket location where you uploaded your model file. You may use the Browse button to find the correct path.
Make sure to specify the path to the directory containing the file, not the path to the model file itself. For example, use gs://your_bucket_name/model-dir/ instead of gs://your_bucket_name/model-dir/saved_model.pb or gs://your_bucket_name/model-dir/model.pkl.

Select a Scaling option for online prediction deployment:
If you select "Auto scaling", the optional Minimum number of nodes field displays. You can enter the minimum number of nodes to keep running at all times, when the service has scaled down.
If you select "Manual scaling", you must enter the Number of nodes you want to keep running at all times.
Learn how scaling options differ depending on machine type.
Learn more about pricing for prediction costs.
To finish creating your model version, click Save.
gcloud
Set environment variables to store the path to the Cloud Storage directory where your model binary is located, your model name, your version name, and your framework choice.

When you create a version with the gcloud CLI, you may provide the framework name in capital letters with underscores (for example, SCIKIT_LEARN) or in lowercase letters with hyphens (for example, scikit-learn). Both options lead to identical behavior.

Replace [VALUES_IN_BRACKETS] with the appropriate values:

MODEL_DIR="gs://your_bucket_name/"
VERSION_NAME="[YOUR-VERSION-NAME]"
MODEL_NAME="[YOUR-MODEL-NAME]"
FRAMEWORK="[YOUR-FRAMEWORK_NAME]"
Create the version:
gcloud ai-platform versions create $VERSION_NAME \
  --model=$MODEL_NAME \
  --origin=$MODEL_DIR \
  --runtime-version=2.11 \
  --framework=$FRAMEWORK \
  --python-version=3.7 \
  --region=REGION \
  --machine-type=MACHINE_TYPE
Replace the following:
- REGION: The region of the regional endpoint on which you created the model. If you created the model on the global endpoint, omit the --region flag.
- MACHINE_TYPE: A machine type, determining the computing resources available to your prediction nodes.
Creating the version takes a few minutes. When it is ready, you should see the following output:
Creating version (this might take a few minutes)......done.
Get information about your new version:
gcloud ai-platform versions describe $VERSION_NAME \
  --model=$MODEL_NAME
You should see output similar to this:
createTime: '2018-02-28T16:30:45Z'
deploymentUri: gs://your_bucket_name
framework: [YOUR-FRAMEWORK-NAME]
machineType: mls1-c1-m2
name: projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions/[YOUR-VERSION-NAME]
pythonVersion: '3.7'
runtimeVersion: '2.11'
state: READY
REST API
Format your request body to contain the version object. This example specifies the version name, deploymentUri, runtimeVersion, framework, and machineType. Replace [VALUES_IN_BRACKETS] with the appropriate values:

{
  "name": "[YOUR-VERSION-NAME]",
  "deploymentUri": "gs://your_bucket_name/",
  "runtimeVersion": "2.11",
  "framework": "[YOUR_FRAMEWORK_NAME]",
  "pythonVersion": "3.7",
  "machineType": "[YOUR_MACHINE_TYPE]"
}
Make your REST API call to the following path, replacing [VALUES_IN_BRACKETS] with the appropriate values:

POST https://REGION-ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions

Replace REGION with the region of the regional endpoint where you created your model. If you created your model on the global endpoint, use ml.googleapis.com.

For example, you can make the following request using the curl command:

curl -X POST -H "Content-Type: application/json" \
  -d '{"name": "[YOUR-VERSION-NAME]", "deploymentUri": "gs://your_bucket_name/", "runtimeVersion": "2.11", "framework": "[YOUR_FRAMEWORK_NAME]", "pythonVersion": "3.7", "machineType": "[YOUR_MACHINE_TYPE]"}' \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  "https://REGION-ml.googleapis.com/v1/projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions"
Creating the version takes a few minutes. When it is ready, you should see output similar to this:
{ "name": "projects/[YOUR-PROJECT-ID]/operations/create_[YOUR-MODEL-NAME]_[YOUR-VERSION-NAME]-[TIMESTAMP]", "metadata": { "@type": "type.googleapis.com/google.cloud.ml.v1.OperationMetadata", "createTime": "2018-07-07T02:51:50Z", "operationType": "CREATE_VERSION", "modelName": "projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]", "version": { "name": "projects/[YOUR-PROJECT-ID]/models/[YOUR-MODEL-NAME]/versions/[YOUR-VERSION-NAME]", "deploymentUri": "gs://your_bucket_name", "createTime": "2018-07-07T02:51:49Z", "runtimeVersion": "2.11", "framework": "[YOUR_FRAMEWORK_NAME]", "machineType": "[YOUR_MACHINE_TYPE]", "pythonVersion": "3.7" } } }
Send online prediction request
After you have successfully created a model version, AI Platform Prediction starts a new server that is ready to serve prediction requests.
gcloud
Set environment variables for your model name, version name, and the name of your input file:
MODEL_NAME="iris" VERSION_NAME="v1" INPUT_FILE="input.json"
Send the prediction request:
gcloud ai-platform predict --model $MODEL_NAME \
  --version $VERSION_NAME \
  --json-instances $INPUT_FILE
Python
This sample assumes that you are familiar with the Google API Client Library for Python. If you aren't familiar with it, see Using the Python Client Library.
import googleapiclient.discovery

def predict_json(project, model, instances, version=None):
    """Send json data to a deployed model for prediction.

    Args:
        project (str): project where the AI Platform Prediction Model is deployed.
        model (str): model name.
        instances ([[float]]): List of input instances, where each input
            instance is a list of floats.
        version (str): version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the AI Platform Prediction service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']
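For example, you might call the helper with the two instances used earlier in this tutorial (the project ID below is a placeholder; the model and version names match those set in the gcloud tab):

predictions = predict_json(
    'YOUR_PROJECT_ID',
    'iris',
    [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]],
    version='v1')
print(predictions)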
See more information about each of these parameters in the AI Platform Prediction API for prediction input.
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.