The ML.ANNOTATE_IMAGE function

This document describes the ML.ANNOTATE_IMAGE function, which lets you annotate images that are stored in BigQuery object tables by using the Cloud Vision API.

Syntax

ML.ANNOTATE_IMAGE(
  MODEL `project_id.dataset.model_name`,
  TABLE `project_id.dataset.object_table,
  STRUCT( ['feature_name_1', 'feature_name_2', ...] AS vision_features )
)

Arguments

ML.ANNOTATE_IMAGE takes the following arguments:

Output

ML.ANNOTATE_IMAGE returns the input table plus the following columns:

  • ml_annotate_image_result: a JSON value that contains the image annotation result from the Vision API.
  • ml_annotate_image_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.

Quotas

See Cloud AI service functions quotas and limits.

Known issues

Sometimes after a query job that uses this function finishes successfully, some returned rows contain the following error message:

A retryable error occurred: RESOURCE EXHAUSTED error from <remote endpoint>

This issue occurs because BigQuery query jobs finish successfully even if the function fails for some of the rows. The function fails when the volume of API calls to the remote endpoint exceeds the quota limits for that service. This issue occurs most often when you are running multiple parallel batch queries. BigQuery retries these calls, but if the retries fail, the resource exhausted error message is returned.

Examples

Example 1

The following example performs label detection on the object table mytable in mydataset:

# Create model
CREATE OR REPLACE MODEL
`myproject.mydataset.myvisionmodel`
REMOTE WITH CONNECTION `myproject.myregion.myconnection`
OPTIONS (remote_service_type = 'cloud_ai_vision_v1');
# Annotate image
SELECT *
FROM ML.ANNOTATE_IMAGE(
  MODEL `mydataset.myvisionmodel`,
  TABLE `mydataset.mytable`,
  STRUCT(['label_detection'] AS vision_features)
);

The result is similar to the following:

ml_annotate_image_result ml_annotate_image_status uri generation content_type size md5_hash updated metadata
{"label_annotations":[{"description":"Food","mid":"/m/02wbm","score":0.97591567,"topicality":0.97591567}]} gs://my-bucket/images/Cheeseburger.jpg 1661921874516197 image/jpeg 174600 a259a5076c22696848a1bc10b7162cc2 2022-08-31 04:57:54 []

Example 2

The following example annotates images in the object table mytable, selects the rows where the detected label is food and the score is higher than 0.97, and then returns the results in separate columns:

CREATE TABLE
  `mydataset.label_score` AS (
  SELECT
    uri AS `Input image path`,
    STRING(ml_annotate_image_result.label_annotations[0].description) AS `Detected label`,
    FLOAT64(ml_annotate_image_result.label_annotations[0].score) AS Score,
    FLOAT64(ml_annotate_image_result.label_annotations[0].topicality) AS Topicality,
    ml_annotate_image_status AS Status
  FROM
    ML.ANNOTATE_IMAGE( MODEL `mydataset.myvisionmodel`,
      TABLE `mydataset.mytable`,
      STRUCT(['label_detection'] AS vision_features))
  );

SELECT
  *
FROM
  `mydataset.label_score`
WHERE
  `Detected label` ='Food'
  AND Score > 0.97;

The result is similar to the following:

Input image path Detected label Score Topicality Status
gs://my-bucket/images/Cheeseburger.jpg Food 0.97591567 0.97591567

If you get an error like query limit exceeded, you might have exceeded the quota for this function, which can leave you with unprocessed rows. Use the following query to complete processing the unprocessed rows:

CREATE TABLE
  `mydataset.label_score_next` AS (
  SELECT
    uri AS `Input image path`,
    STRING(ml_annotate_image_result.label_annotations[0].description) AS `Detected label`,
    FLOAT64(ml_annotate_image_result.label_annotations[0].score) AS Score,
    FLOAT64(ml_annotate_image_result.label_annotations[0].topicality) AS Topicality,
    ml_annotate_image_status AS Status
  FROM
    ML.ANNOTATE_IMAGE( MODEL `mydataset.myvisionmodel`,
      TABLE `mydataset.mytable`,
      STRUCT(['label_detection'] AS vision_features))
  WHERE uri NOT IN (
    SELECT `Input image path` FROM `mydataset.label_score`
    WHERE STATUS = '')
  );

SELECT * FROM `mydataset.label_score_next`;

What's next