Method: projects.locations.models.predict

Perform an online prediction. The prediction result is returned directly in the response. Available for the following ML problems, with their expected request payloads:

  • Image Classification - Image in .JPEG, .GIF or .PNG format, imageBytes up to 30MB.

  • Image Object Detection - Image in .JPEG, .GIF or .PNG format, imageBytes up to 30MB.

  • Text Classification - TextSnippet, content up to 60,000 characters, UTF-8 encoded.

  • Text Extraction - TextSnippet, content up to 30,000 characters, UTF-8 NFC encoded.

  • Translation - TextSnippet, content up to 25,000 characters, UTF-8 encoded.

  • Tables - Row, with column values matching the columns of the model, up to 5MB. Not available for the FORECASTING predictionType.

  • Text Sentiment - TextSnippet, content up to 500 characters, UTF-8 encoded.

HTTP request

POST https://automl.googleapis.com/v1beta1/{name}:predict

Path parameters

Parameters
name

string

Name of the model requested to serve the prediction.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.predict

Request body

The request body contains data with the following structure:

JSON representation
{
  "payload": {
    object (ExamplePayload)
  },
  "params": {
    string: string,
    ...
  }
}
Fields
payload

object (ExamplePayload)

Required. Payload to perform a prediction on. The payload must match the problem type that the model was trained to solve.

params

map (key: string, value: string)

Additional domain-specific parameters; any string value must be no longer than 25,000 characters.

  • For Image Classification: score_threshold - (float) A value from 0.0 to 1.0. When the model makes predictions for an image, it only produces results that have at least this confidence score. The default is 0.5.

  • For Image Object Detection: score_threshold - (float) When the model detects objects in the image, it only produces bounding boxes that have at least this confidence score. Value in the 0 to 1 range; the default is 0.5. max_bounding_box_count - (int64) No more than this number of bounding boxes will be returned in the response. The default is 100; the requested value may be limited by the server.

  • For Tables: feature_importance - (boolean) Whether feature_importance should be populated in the returned TablesAnnotation(-s). The default is false.
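As an illustration of how the request pieces fit together, the following Python sketch posts a Text Classification request to the endpoint above using the plain REST surface. The project, location, and model IDs are placeholders, and the access token is assumed to come from the gcloud CLI; this is a minimal sketch, not a definitive client implementation.

import subprocess

import requests

# Hypothetical model resource name -- substitute your own project/location/model.
name = "projects/my-project/locations/us-central1/models/TCN1234567890"
url = f"https://automl.googleapis.com/v1beta1/{name}:predict"

# Assumes the gcloud CLI is installed and application-default credentials exist.
token = subprocess.check_output(
    ["gcloud", "auth", "application-default", "print-access-token"],
    text=True,
).strip()

# Request body per the JSON representation above; params is omitted
# because score_threshold and friends apply to other problem types.
body = {
    "payload": {
        "textSnippet": {
            "content": "A short review to classify.",
            "mimeType": "text/plain",
        }
    }
}

resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json())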

Response body

If successful, the response body contains data with the following structure:

Response message for PredictionService.Predict.

JSON representation
{
  "payload": [
    {
      object (AnnotationPayload)
    }
  ],
  "preprocessedInput": {
    object (ExamplePayload)
  },
  "metadata": {
    string: string,
    ...
  }
}
Fields
payload[]

object (AnnotationPayload)

Prediction result. Translation and Text Sentiment will return precisely one payload.

preprocessedInput

object (ExamplePayload)

The preprocessed example that AutoML actually makes the prediction on. Empty if AutoML does not preprocess the input example.

  • For Text Extraction: If the input is a .pdf file, the OCR'ed text is provided in documentText.

metadata

map (key: string, value: string)

Additional domain-specific prediction response metadata.

  • For Image Object Detection: max_bounding_box_count - (int64) At most that many bounding boxes per image could have been returned.

  • For Text Sentiment: sentiment_score - (float, deprecated) A value between -1 and 1, where -1 maps to the least positive sentiment and 1 to the most positive; the higher the score, the more positive the sentiment in the document. These values are relative to the training data, so if, for example, all of the training data was positive, then -1 is also positive (though the least positive). The sentiment_score should not be confused with "score" or "magnitude" from the previous Natural Language Sentiment Analysis API.
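Assuming a classification response shaped as above, a minimal way to read it could look like the following; resp is the requests response from the earlier sketch, and the field names come straight from the JSON representation.

result = resp.json()

# Each element of payload[] is an AnnotationPayload; for classification
# models the detail union carries a "classification" member with a score.
for annotation in result.get("payload", []):
    label = annotation.get("displayName", "<unnamed>")
    score = annotation.get("classification", {}).get("score")
    print(f"{label}: {score}")

# Domain-specific metadata, if present, is a string-to-string map.
for key, value in result.get("metadata", {}).items():
    print(f"metadata {key} = {value}")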

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ExamplePayload

Example data used for training or prediction.

JSON representation
{

  // Union field payload can be only one of the following:
  "image": {
    object (Image)
  },
  "textSnippet": {
    object (TextSnippet)
  },
  "document": {
    object (Document)
  },
  "row": {
    object (Row)
  }
  // End of list of possible types for union field payload.
}
Fields
Union field payload. Required. Input only. The example data. payload can be only one of the following:
image

object (Image)

Example image.

textSnippet

object (TextSnippet)

Example text.

document

object (Document)

Example document.

row

object (Row)

Example relational table row.
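Because payload is a union field, exactly one member may be set per request. A hedged illustration of the four shapes as JSON-ready Python dicts; all values are placeholders, and the inputUris field name is assumed from the GcsSource message referenced under DocumentInputConfig below.

# Exactly one of these keys may appear in a given ExamplePayload.
image_payload = {"image": {"imageBytes": "<base64-encoded image bytes>"}}
text_payload = {"textSnippet": {"content": "Some text.", "mimeType": "text/plain"}}
document_payload = {
    "document": {"inputConfig": {"gcsSource": {"inputUris": ["gs://my-bucket/doc.pdf"]}}}
}
row_payload = {"row": {"values": ["blue", 3.5, True]}}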

Image

A representation of an image. Only images up to 30MB in size are supported.

JSON representation
{
  "thumbnailUri": string,

  // Union field data can be only one of the following:
  "imageBytes": string,
  "inputConfig": {
    object (InputConfig)
  }
  // End of list of possible types for union field data.
}
Fields
thumbnailUri

string

Output only. HTTP URI to the thumbnail image.

Union field data. Input only. The data representing the image. For Predict calls, image_bytes must be set, as other options are not currently supported by the prediction API. You can read the contents of an uploaded image by using the content_uri field. data can be only one of the following:
imageBytes

string (bytes format)

Image content represented as a stream of bytes. Note: As with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64.

A base64-encoded string.

inputConfig

object (InputConfig)

An input config specifying the content of the image.
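Since JSON renders bytes fields as base64, an image file can be prepared for imageBytes along these lines (cat.jpg is a placeholder path):

import base64

# Read the raw image bytes and base64-encode them for the JSON field.
with open("cat.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

payload = {"image": {"imageBytes": encoded}}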

TextSnippet

A representation of a text snippet.

JSON representation
{
  "content": string,
  "mimeType": string,
  "contentUri": string
}
Fields
content

string

Required. The content of the text snippet as a string. Up to 250,000 characters long.

mimeType

string

Optional. The format of content. Currently the only two allowed values are "text/html" and "text/plain". If left blank, the format is automatically determined from the type of the uploaded content.

contentUri

string

Output only. HTTP URI where you can download the content.

Document

A structured text document, e.g. a PDF.

JSON representation
{
  "inputConfig": {
    object (DocumentInputConfig)
  },
  "documentText": {
    object (TextSnippet)
  },
  "layout": [
    {
      object (Layout)
    }
  ],
  "documentDimensions": {
    object (DocumentDimensions)
  },
  "pageCount": integer
}
Fields
inputConfig

object (DocumentInputConfig)

An input config specifying the content of the document.

documentText

object (TextSnippet)

The plain text version of this document.

layout[]

object (Layout)

Describes the layout of the document. Sorted by pageNumber.

documentDimensions

object (DocumentDimensions)

The dimensions of the page in the document.

pageCount

integer

Number of pages in the document.

DocumentInputConfig

Input configuration of a Document.

JSON representation
{
  "gcsSource": {
    object (GcsSource)
  }
}
Fields
gcsSource

object (GcsSource)

The Google Cloud Storage location of the document file. Only a single path should be given. Max supported size: 512MB. Supported extensions: .PDF.

Layout

Describes the layout information of a textSegment in the document.

JSON representation
{
  "textSegment": {
    object (TextSegment)
  },
  "pageNumber": integer,
  "boundingPoly": {
    object (BoundingPoly)
  },
  "textSegmentType": enum (TextSegmentType)
}
Fields
textSegment

object (TextSegment)

Text Segment that represents a segment in documentText.

pageNumber

integer

Page number of the textSegment in the original document, starts from 1.

boundingPoly

object (BoundingPoly)

The position of the textSegment in the page. Contains exactly 4 normalizedVertices, which are connected by edges in the order provided and represent a rectangle parallel to the frame. The NormalizedVertex-s are relative to the page. Coordinates are based on top-left as point (0,0).

textSegmentType

enum (TextSegmentType)

The type of the textSegment in the document.

TextSegment

A contiguous part of a text (string), assuming it has a UTF-8 NFC encoding.

JSON representation
{
  "content": string,
  "startOffset": string,
  "endOffset": string
}
Fields
content

string

Output only. The content of the TextSegment.

startOffset

string (int64 format)

Required. Zero-based character index of the first character of the text segment (counting characters from the beginning of the text).

endOffset

string (int64 format)

Required. Zero-based character index of the first character past the end of the text segment (counting characters from the beginning of the text). The character at the endOffset is NOT included in the text segment.
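In other words, the two offsets behave like Python's half-open slicing. A quick sanity check of that reading:

text = "AutoML extracts entities from text."

# startOffset is inclusive and endOffset is exclusive, both zero-based,
# so the segment content is exactly text[start:end].
start, end = 7, 15
assert text[start:end] == "extracts"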

BoundingPoly

A bounding polygon of a detected object on a plane. On output both vertices and normalizedVertices are provided. The polygon is formed by connecting vertices in the order they are listed.

JSON representation
{
  "normalizedVertices": [
    {
      object (NormalizedVertex)
    }
  ]
}
Fields
normalizedVertices[]

object (NormalizedVertex)

Output only. The normalized vertices of the bounding polygon.

NormalizedVertex

A vertex represents a 2D point in the image. The normalized vertex coordinates are fractions between 0 and 1, relative to the original plane (image, video). For example, if the plane (e.g. the whole image) has size 10 x 20, then a point with normalized coordinates (0.1, 0.3) is at position (1, 6) on that plane.

JSON representation
{
  "x": number,
  "y": number
}
Fields
x

number

Required. Horizontal coordinate.

y

number

Required. Vertical coordinate.
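Following the worked example above (normalized (0.1, 0.3) on a 10 x 20 plane landing at (1, 6)), mapping a NormalizedVertex back to plane coordinates is a single multiplication per axis. A minimal sketch:

def to_plane(vertex, width, height):
    """Map a NormalizedVertex dict back onto a width x height plane."""
    return (vertex["x"] * width, vertex["y"] * height)

# The documented example: a 10 x 20 plane.
assert to_plane({"x": 0.1, "y": 0.3}, 10, 20) == (1.0, 6.0)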

TextSegmentType

The type of TextSegment in the context of the original document.

Enums
TEXT_SEGMENT_TYPE_UNSPECIFIED Should not be used.
TOKEN The text segment is a token, e.g. a word.
PARAGRAPH The text segment is a paragraph.
FORM_FIELD The text segment is a form field.
FORM_FIELD_NAME The text segment is the name part of a form field. It is treated as a child of another FORM_FIELD TextSegment if its span is a subspan of that TextSegment's span.
FORM_FIELD_CONTENTS The text segment is the text content part of a form field. It is treated as a child of another FORM_FIELD TextSegment if its span is a subspan of that TextSegment's span.
TABLE The text segment is a whole table, including headers and all rows.
TABLE_HEADER The text segment is a table's headers. It is treated as a child of another TABLE TextSegment if its span is a subspan of that TextSegment's span.
TABLE_ROW The text segment is a row in a table. It is treated as a child of another TABLE TextSegment if its span is a subspan of that TextSegment's span.
TABLE_CELL The text segment is a cell in a table. It is treated as a child of another TABLE_ROW TextSegment if its span is a subspan of that TextSegment's span.

DocumentDimensions

Message that describes the dimensions of a document.

JSON representation
{
  "unit": enum (DocumentDimensionUnit),
  "width": number,
  "height": number
}
Fields
unit

enum (DocumentDimensionUnit)

Unit of the dimension.

width

number

Width value of the document, works together with the unit.

height

number

Height value of the document, works together with the unit.

DocumentDimensionUnit

Unit of the document dimension.

Enums
DOCUMENT_DIMENSION_UNIT_UNSPECIFIED Should not be used.
INCH Document dimension is measured in inches.
CENTIMETER Document dimension is measured in centimeters.
POINT Document dimension is measured in points. 72 points = 1 inch.

Row

A representation of a row in a relational table.

JSON representation
{
  "columnSpecIds": [
    string
  ],
  "values": [
    value
  ]
}
Fields
columnSpecIds[]

string

The resource IDs of the column specs describing the columns of the row. If set, it must contain all input feature columnSpecIds of the Model this row is being passed to, though possibly in a different order. Note: The values field below must match the order of this field, if this field is set.

values[]

value (Value format)

Required. The values of the row cells, given in the same order as the columnSpecIds, or, if those are not set, then in the same order as the input feature columnSpecs of the Model this row is being passed to.
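To make the ordering contract concrete: when columnSpecIds is set, values must line up with it position by position; when it is omitted, values must follow the model's input feature columnSpecs order. A sketch with hypothetical IDs and values:

# Hypothetical column spec IDs -- real IDs come from the model's dataset.
row_with_ids = {
    "columnSpecIds": ["1234", "5678", "9012"],
    "values": ["blue", 3.5, True],  # values[i] corresponds to columnSpecIds[i]
}

# Without columnSpecIds, values must instead follow the order of the
# model's input feature columnSpecs.
row_without_ids = {"values": ["blue", 3.5, True]}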

AnnotationPayload

Contains annotation information that is relevant to AutoML.

JSON representation
{
  "annotationSpecId": string,
  "displayName": string,

  // Union field detail can be only one of the following:
  "translation": {
    object (TranslationAnnotation)
  },
  "classification": {
    object (ClassificationAnnotation)
  },
  "imageObjectDetection": {
    object (ImageObjectDetectionAnnotation)
  },
  "videoClassification": {
    object (VideoClassificationAnnotation)
  },
  "videoObjectTracking": {
    object (VideoObjectTrackingAnnotation)
  },
  "textExtraction": {
    object (TextExtractionAnnotation)
  },
  "textSentiment": {
    object (TextSentimentAnnotation)
  },
  "tables": {
    object (TablesAnnotation)
  }
  // End of list of possible types for union field detail.
}
Fields
annotationSpecId

string

Output only. The resource ID of the annotation spec that this annotation pertains to. The annotation spec comes from either an ancestor dataset, or the dataset that was used to train the model in use.

displayName

string

Output only. The value of displayName at the time the model was trained. Because this field returns a value captured at training time, different models trained using the same dataset could return different values, since the model owner could update the displayName between any two trainings.

Union field detail. Output only. Additional information about the annotation specific to the AutoML domain. detail can be only one of the following:
translation

object (TranslationAnnotation)

Annotation details for translation.

classification

object (ClassificationAnnotation)

Annotation details for content or image classification.

imageObjectDetection

object (ImageObjectDetectionAnnotation)

Annotation details for image object detection.

videoClassification

object (VideoClassificationAnnotation)

Annotation details for video classification. Returned for Video Classification predictions.

videoObjectTracking

object (VideoObjectTrackingAnnotation)

Annotation details for video object tracking.

textExtraction

object (TextExtractionAnnotation)

Annotation details for text extraction.

textSentiment

object (TextSentimentAnnotation)

Annotation details for text sentiment.

tables

object (TablesAnnotation)

Annotation details for Tables.

TranslationAnnotation

Annotation details specific to translation.

JSON representation
{
  "translatedContent": {
    object (TextSnippet)
  }
}
Fields
translatedContent

object (TextSnippet)

Output only. The translated content.

ClassificationAnnotation

Contains annotation details specific to classification.

JSON representation
{
  "score": number
}
Fields
score

number

Output only. A confidence estimate between 0.0 and 1.0. A higher value means greater confidence that the annotation is positive. If a user approves an annotation as negative or positive, the score value remains unchanged. If a user creates an annotation, the score is 0 for negative or 1 for positive.

ImageObjectDetectionAnnotation

Annotation details for image object detection.

JSON representation
{
  "boundingBox": {
    object (BoundingPoly)
  },
  "score": number
}
Fields
boundingBox

object (BoundingPoly)

Output only. The rectangle representing the object location.

score

number

Output only. The confidence that this annotation is positive for the parent example; value in [0, 1], higher means higher positivity confidence.
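Since a detection boundingBox is an axis-parallel rectangle given as four normalized vertices, its extent can be recovered with min/max. A hedged sketch that also applies a score cutoff in the spirit of the score_threshold request parameter:

def box_extent(annotation, min_score=0.5):
    """Return (x_min, y_min, x_max, y_max) in normalized coordinates,
    or None when the detection score falls below min_score."""
    if annotation.get("score", 0.0) < min_score:
        return None
    vertices = annotation["boundingBox"]["normalizedVertices"]
    xs = [v["x"] for v in vertices]
    ys = [v["y"] for v in vertices]
    return (min(xs), min(ys), max(xs), max(ys))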

VideoClassificationAnnotation

Contains annotation details specific to video classification.

JSON representation
{
  "type": string,
  "classificationAnnotation": {
    object (ClassificationAnnotation)
  },
  "timeSegment": {
    object (TimeSegment)
  }
}
Fields
type

string

Output only. Expresses the type of video classification. Possible values:

  • segment - Classification done on a user-specified time segment of a video. The AnnotationSpec is reported as present in that time segment if it is present in any part of it. The video ML model evaluations are done only for this type of classification.

  • shot - Shot-level classification. AutoML Video Intelligence determines the boundaries for each camera shot in the entire segment of the video that the user specified in the request configuration. AutoML Video Intelligence then returns labels and their confidence scores for each detected shot, along with the start and end time of the shot. WARNING: Model evaluation is not done for this classification type; its quality depends on the training data, but there are no metrics provided to describe that quality.

  • 1s_interval - AutoML Video Intelligence returns labels and their confidence scores for each second of the entire segment of the video that the user specified in the request configuration. WARNING: Model evaluation is not done for this classification type; its quality depends on the training data, but there are no metrics provided to describe that quality.

classificationAnnotation

object (ClassificationAnnotation)

Output only. The classification details of this annotation.

timeSegment

object (TimeSegment)

Output only. The time segment of the video to which the annotation applies.

TimeSegment

A time period inside of an example that has a time dimension (e.g. video).

JSON representation
{
  "startTimeOffset": string,
  "endTimeOffset": string
}
Fields
startTimeOffset

string (Duration format)

Start of the time segment (inclusive), represented as the duration since the example start.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".

endTimeOffset

string (Duration format)

End of the time segment (exclusive), represented as the duration since the example start.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
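Duration strings are just a decimal number of seconds with an 's' suffix, so converting between them and floats is a one-liner each way. A small sketch:

def parse_duration(duration):
    """Convert a Duration-format string such as '3.5s' to seconds."""
    return float(duration.rstrip("s"))

def format_duration(seconds):
    """Render seconds as a Duration-format string, e.g. 3.5 -> '3.5s'."""
    return f"{seconds}s"

assert parse_duration("3.5s") == 3.5
assert format_duration(3.5) == "3.5s"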

VideoObjectTrackingAnnotation

Annotation details for video object tracking.

JSON representation
{
  "instanceId": string,
  "timeOffset": string,
  "boundingBox": {
    object (BoundingPoly)
  },
  "score": number
}
Fields
instanceId

string

Optional. The instance of the object, expressed as a positive integer. Used to tell apart objects of the same type (i.e. AnnotationSpec) when multiple are present on a single example. NOTE: Instance ID prediction quality is not a part of model evaluation and is done on a best-effort basis. In particular, when an entity goes off-screen for a longer time (minutes), it may be given a new instance ID when it reappears.

timeOffset

string (Duration format)

Required. A time (frame) of a video to which this annotation pertains. Represented as the duration since the video's start.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".

boundingBox

object (BoundingPoly)

Required. The rectangle representing the object location on the frame (i.e. at the timeOffset of the video).

score

number

Output only. The confidence that this annotation is positive for the video at the timeOffset; value in [0, 1], higher means higher positivity confidence. For annotations created by the user, the score is 1. When a user approves an annotation, the original float score is kept (and not changed to 1).

TextExtractionAnnotation

Annotation for identifying spans of text.

JSON representation
{
  "score": number,
  "textSegment": {
    object (TextSegment)
  }
}
Fields
score

number

Output only. A confidence estimate between 0.0 and 1.0. A higher value means greater confidence in correctness of the annotation.

textSegment

object (TextSegment)

An entity annotation sets this to the part of the original text to which the annotation pertains.

TextSentimentAnnotation

Contains annotation details specific to text sentiment.

JSON representation
{
  "sentiment": integer
}
Fields
sentiment

integer

Output only. The sentiment, with the semantics given to AutoMl.ImportData when populating the dataset from which the model used for the prediction was trained. The sentiment values are between 0 and Dataset.text_sentiment_dataset_metadata.sentiment_max (inclusive), with a higher value meaning more positive sentiment. The values are completely relative, i.e. 0 means the least positive sentiment and sentimentMax means the most positive of the sentiments present in the training data. Therefore, if the training data contained only negative sentiment, sentimentMax would still be negative (although the least negative). The sentiment should not be confused with "score" or "magnitude" from the previous Natural Language Sentiment Analysis API.

TablesAnnotation

Contains annotation details specific to Tables.

JSON representation
{
  "score": number,
  "predictionInterval": {
    object (DoubleRange)
  },
  "value": value,
  "tablesModelColumnInfo": [
    {
      object (TablesModelColumnInfo)
    }
  ]
}
Fields
score

number

Output only. A confidence estimate between 0.0 and 1.0, inclusive. A higher value means greater confidence in the returned value. For a targetColumnSpec of FLOAT64 data type, the score is not populated.

predictionInterval

object (DoubleRange)

Output only. Only populated when the targetColumnSpec has FLOAT64 data type. An interval in which the exactly correct target value has a 95% chance of falling.

value

value (Value format)

The predicted value of the row's target_column. The value depends on the column's DataType:

  • CATEGORY - the predicted (with the above confidence score) CATEGORY value.

  • FLOAT64 - the predicted (with the above predictionInterval) FLOAT64 value.

tablesModelColumnInfo[]

object (TablesModelColumnInfo)

Output only. Auxiliary information for each of the model's inputFeatureColumnSpecs with respect to this particular prediction. If no fields other than columnSpecName and columnDisplayName would be populated, then this whole field is not.
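Reading a Tables prediction therefore depends on the target column's data type: a CATEGORY target carries a score next to the value, while a FLOAT64 target carries a predictionInterval instead. A hedged sketch over the tables detail of one AnnotationPayload dict from the response:

def describe_tables_annotation(annotation):
    """Summarize the tables detail of one AnnotationPayload."""
    tables = annotation["tables"]
    value = tables.get("value")
    if "predictionInterval" in tables:
        # FLOAT64 target: score is not populated; use the 95% interval.
        interval = tables["predictionInterval"]
        return f"{value} (95% interval {interval['start']}..{interval['end']})"
    # CATEGORY target: a confidence score accompanies the value.
    return f"{value} (score {tables.get('score')})"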

DoubleRange

A range between two double numbers.

JSON representation
{
  "start": number,
  "end": number
}
Fields
start

number

Start of the range, inclusive.

end

number

End of the range, exclusive.