Method: projects.predict

Performs online prediction on the data in the request.

AI Platform Prediction implements a custom predict verb on top of an HTTP POST method. The predict method performs prediction on the data in the request.

The URL is described in Google API HTTP annotation syntax:

POST https://ml.googleapis.com/v1/{name=projects/**}:predict

The name parameter is required. It must contain the name of your model and, optionally, a version. If you specify a model without a version, the default version for that model is used.

Example specifying both model and version:

POST https://ml.googleapis.com/v1/projects/YOUR_PROJECT_ID/models/YOUR_MODEL_NAME/versions/YOUR_VERSION_NAME:predict

Example specifying only a model. The default version for that model is used:

POST https://ml.googleapis.com/v1/projects/YOUR_PROJECT_ID/models/YOUR_MODEL_NAME:predict

This page describes the format of the prediction request body and of the response body. For a code sample showing how to send a prediction request, see the guide to requesting online predictions.

Request body details

TensorFlow

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <value>|<simple/nested list>|<object>,
    ...
  ]
}

The instances[] object is required, and must contain the list of instances to get predictions for.

The structure of each element of the instances list is determined by your model's input definition. Instances can include named inputs (as objects) or can contain only unlabeled values.

Not all data includes named inputs. Some instances are simple JSON values (boolean, number, or string). However, instances are often lists of simple values, or complex nested lists.

Below are some examples of request bodies.

CSV data with each row encoded as a string value:

{"instances": ["1.0,true,\\"x\\"", "-2.0,false,\\"y\\""]}

Plain text:

{"instances": ["the quick brown fox", "the lazy dog"]}

Sentences encoded as lists of words (vectors of strings):

{
  "instances": [
    ["the","quick","brown"],
    ["the","lazy","dog"],
    ...
  ]
}

Floating point scalar values:

{"instances": [0.0, 1.1, 2.2]}

Vectors of integers:

{
  "instances": [
    [0, 1, 2],
    [3, 4, 5],
    ...
  ]
}

Tensors (in this case, two-dimensional tensors):

{
  "instances": [
    [
      [0, 1, 2],
      [3, 4, 5]
    ],
    ...
  ]
}

Images, which can be represented different ways. In this encoding scheme the first two dimensions represent the rows and columns of the image, and the third dimension contains lists (vectors) of the R, G, and B values for each pixel:

{
  "instances": [
    [
      [
        [138, 30, 66],
        [130, 20, 56],
        ...
      ],
      [
        [126, 38, 61],
        [122, 24, 57],
        ...
      ],
      ...
    ],
    ...
  ]
}

Data encoding

JSON strings must be encoded as UTF-8. To send binary data, you must base64-encode the data and mark it as binary. To mark a JSON string as binary, replace it with a JSON object with a single attribute named b64:

{"b64": "..."} 

The following example shows two serialized tf.Examples instances, requiring base64 encoding (fake data, for illustrative purposes only):

{"instances": [{"b64": "X5ad6u"}, {"b64": "IA9j4nx"}]}

The following example shows two JPEG image byte strings, requiring base64 encoding (fake data, for illustrative purposes only):

{"instances": [{"b64": "ASa8asdf"}, {"b64": "JLK7ljk3"}]}

Multiple input tensors

Some models have an underlying TensorFlow graph that accepts multiple input tensors. In this case, use the names of JSON name/value pairs to identify the input tensors.

For a graph with input tensor aliases "tag" (string) and "image" (base64-encoded string):

{
  "instances": [
    {
      "tag": "beach",
      "image": {"b64": "ASa8asdf"}
    },
    {
      "tag": "car",
      "image": {"b64": "JLK7ljk3"}
    }
  ]
}

For a graph with input tensor aliases "tag" (string) and "image" (3-dimensional array of 8-bit ints):

{
  "instances": [
    {
      "tag": "beach",
      "image": [
        [
          [138, 30, 66],
          [130, 20, 56],
          ...
        ],
        [
          [126, 38, 61],
          [122, 24, 57],
          ...
        ],
        ...
      ]
    },
    {
      "tag": "car",
      "image": [
        [
          [255, 0, 102],
          [255, 0, 97],
          ...
        ],
        [
          [254, 1, 101],
          [254, 2, 93],
          ...
        ],
        ...
      ]
    },
    ...
  ]
}

scikit-learn

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <simple list>,
    ...
  ]
}

The instances[] object is required, and must contain the list of instances to get predictions for. In the following example, each input instance is a list of floats:

{
  "instances": [
    [0.0, 1.1, 2.2],
    [3.3, 4.4, 5.5],
    ...
  ]
}

The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.

XGBoost

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <simple list>,
    ...
  ]
}

The instances[] object is required, and must contain the list of instances to get predictions for. In the following example, each input instance is a list of floats:

{
  "instances": [
    [0.0, 1.1, 2.2],
    [3.3, 4.4, 5.5],
    ...
  ]
}

The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.

AI Platform Prediction does not support sparse representation of input instances for XGBoost.

The Online Prediction service interprets zeros and NaNs differently. If the value of a feature is zero, use 0.0 in the corresponding input. If the value of a feature is missing, use NaN in the corresponding input.

The following example represents a prediction request with a single input instance, where the value of the first feature is 0.0, the value of the second feature is 1.1, and the value of the third feature is missing:

{"instances": [[0.0, 1.1, NaN]]}

Custom prediction routine

The request body contains data with the following structure (JSON representation):

{
  "instances": [
    <value>|<simple/nested list>|<object>,
    ...
  ],
  "<other-key>": <value>|<simple/nested list>|<object>,
  ...
}

The instances[] object is required, and must contain the list of instances to get predictions for.

You may optionally provide any other valid JSON key-value pairs. AI Platform Prediction parses the JSON and provides these fields to the predict method of your Predictor class as entries in the **kwargs dictionary.

How to structure the list of instances

The structure of each element of the instances list is determined by the predict method of your Predictor class. Instances can include named inputs (as objects) or can contain only unlabeled values.

Not all data includes named inputs. Some instances are simple JSON values (boolean, number, or string). However, instances are often lists of simple values, or complex nested lists.

Below are some examples of request bodies.

CSV data with each row encoded as a string value:

{"instances": ["1.0,true,\\"x\\"", "-2.0,false,\\"y\\""]}

Plain text:

{"instances": ["the quick brown fox", "the lazy dog"]}

Sentences encoded as lists of words (vectors of strings):

{
  "instances": [
    ["the","quick","brown"],
    ["the","lazy","dog"],
    ...
  ]
}

Floating point scalar values:

{"instances": [0.0, 1.1, 2.2]}

Vectors of integers:

{
  "instances": [
    [0, 1, 2],
    [3, 4, 5],
    ...
  ]
}

Tensors (in this case, two-dimensional tensors):

{
  "instances": [
    [
      [0, 1, 2],
      [3, 4, 5]
    ],
    ...
  ]
}

Images, which can be represented different ways. In this encoding scheme the first two dimensions represent the rows and columns of the image, and the third dimension contains lists (vectors) of the R, G, and B values for each pixel:

{
  "instances": [
    [
      [
        [138, 30, 66],
        [130, 20, 56],
        ...
      ],
      [
        [126, 38, 61],
        [122, 24, 57],
        ...
      ],
      ...
    ],
    ...
  ]
}

Multiple input tensors

Some models have an underlying TensorFlow graph that accepts multiple input tensors. In this case, use the names of JSON name/value pairs to identify the input tensors.

For a graph with input tensor aliases "tag" (string) and "image" (3-dimensional array of 8-bit ints):

{
  "instances": [
    {
      "tag": "beach",
      "image": [
        [
          [138, 30, 66],
          [130, 20, 56],
          ...
        ],
        [
          [126, 38, 61],
          [122, 24, 57],
          ...
        ],
        ...
      ]
    },
    {
      "tag": "car",
      "image": [
        [
          [255, 0, 102],
          [255, 0, 97],
          ...
        ],
        [
          [254, 1, 101],
          [254, 2, 93],
          ...
        ],
        ...
      ]
    },
    ...
  ]
}

Response body details

Responses are very similar to requests.

If the call is successful, the response body contains one prediction entry per instance in the request body, given in the same order:

{
  "predictions": [
    {
      object
    }
  ]
}

If prediction fails for any instance, the response body contains no predictions. Instead, it contains a single error entry:

{
  "error": string
}

The predictions[] object contains the list of predictions, one for each instance in the request. For a custom prediction routine (beta), predictions contains the return value of the predict method of your Predictor class, serialized as JSON.

On error, the error string contains a message describing the problem. The error is returned instead of a prediction list if an error occurred while processing any instance.

Even though there is one prediction per instance, the format of a prediction is not directly related to the format of an instance. Predictions take whatever format is specified in the outputs collection defined in the model. The collection of predictions is returned in a JSON list. Each member of the list can be a simple value, a list, or a JSON object of any complexity. If your model has more than one output tensor, each prediction will be a JSON object containing a name/value pair for each output. The names identify the output aliases in the graph.

Response body examples

TensorFlow

The following examples show some possible responses:

  • A simple set of predictions for three input instances, where each prediction is an integer value:

    {"predictions": [5, 4, 3]}
    
  • A more complex set of predictions, each containing two named values that correspond to output tensors, named label and scores respectively. The value of label is the predicted category ("car" or "beach") and scores contains a list of probabilities for that instance across the possible categories.

    {
      "predictions": [
        {
          "label": "beach",
          "scores": [0.1, 0.9]
        },
        {
          "label": "car",
          "scores": [0.75, 0.25]
        }
      ]
    }
    
  • A response when there is an error processing an input instance:

    {"error": "Divide by zero"}
    

scikit-learn

The following examples show some possible responses:

  • A simple set of predictions for three input instances, where each prediction is an integer value:

    {"predictions": [5, 4, 3]}
    
  • A response when there is an error processing an input instance:

    {"error": "Divide by zero"}
    

XGBoost

The following examples show some possible responses:

  • A simple set of predictions for three input instances, where each prediction is an integer value:

    {"predictions": [5, 4, 3]}
    
  • A response when there is an error processing an input instance:

    {"error": "Divide by zero"}
    

Custom prediction routine

The following examples show some possible responses:

  • A simple set of predictions for three input instances, where each prediction is an integer value:

    {"predictions": [5, 4, 3]}
    
  • A more complex set of predictions, each containing two named values that correspond to output tensors, named label and scores respectively. The value of label is the predicted category ("car" or "beach") and scores contains a list of probabilities for that instance across the possible categories.

    {
      "predictions": [
        {
          "label": "beach",
          "scores": [0.1, 0.9]
        },
        {
          "label": "car",
          "scores": [0.75, 0.25]
        }
      ]
    }
    
  • A response when there is an error processing an input instance:

    {"error": "Divide by zero"}
    

API specification

The following section describes the specification of the predict method as defined in the AI Platform Training and Prediction API discovery document. Refer to the previous sections of this document for detailed information about the method.

HTTP request

POST https://{endpoint}/v1/{name=projects/**}:predict

Where {endpoint} is one of the supported service endpoints.

The URLs use gRPC Transcoding syntax.

Path parameters

Parameters
name

string

Required. The resource name of a model or a version.

Authorization: requires the predict permission on the specified resource.

Authorization requires one or more of the following IAM permissions on the specified resource name:

  • ml.models.predict
  • ml.versions.predict

Request body

The request body contains data with the following structure:

JSON representation
{
  "httpBody": {
    object (HttpBody)
  }
}
Fields
httpBody

object (HttpBody)

Required. The prediction request body. Refer to the request body details section for more information on how to structure your request.

Response body

If successful, the response is a generic HTTP response whose format is defined by the method.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.