The ML.ANNOTATE_IMAGE function
This document describes the ML.ANNOTATE_IMAGE
function, which lets you
annotate images that are stored in BigQuery
object tables by using the
Cloud Vision API.
Syntax
ML.ANNOTATE_IMAGE( MODEL `project_id.dataset.model_name`, TABLE `project_id.dataset.object_table, STRUCT( ['feature_name_1', 'feature_name_2', ...] AS vision_features ) )
Arguments
ML.ANNOTATE_IMAGE
takes the following arguments:
project_id
: Your project ID.dataset
: The BigQuery dataset that contains the model.model
: The name of a remote model with aREMOTE_SERVICE_TYPE
ofCLOUD_AI_VISION_V1
.object_table
: The name of the object table that contains URIs of the images.vision_features
: aARRAY<STRING>
value that specifies one or more feature names of supported Vision API features. The supported features are as follows:
Output
ML.ANNOTATE_IMAGE
returns the input table plus the following columns:
ml_annotate_image_result
: aJSON
value that contains the image annotation result from the Vision API.ml_annotate_image_status
: aSTRING
value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
Quotas
See Cloud AI service functions quotas and limits.
Known issues
Sometimes after a query job that uses this function finishes successfully, some returned rows contain the following error message:
A retryable error occurred: RESOURCE EXHAUSTED error from <remote endpoint>
This issue occurs because BigQuery query jobs finish successfully
even if the function fails for some of the rows. The function fails when the
volume of API calls to the remote endpoint exceeds the quota limits for that
service. This issue occurs most often when you are running multiple parallel
batch queries. BigQuery retries these calls, but if the retries
fail, the resource exhausted
error message is returned.
To iterate through inference calls until all rows are successfully processed, you can use the BigQuery remote inference SQL scripts or the BigQuery remote inference pipeline Dataform package.
Locations
ML.ANNOTATE_IMAGE
must run in the same region as the remote model that the
function references. For more information about supported locations for models
based on the Vision API, see Locations for remote models.
Examples
Example 1
The following example performs label detection on the object table mytable
in
mydataset
:
# Create model CREATE OR REPLACE MODEL `myproject.mydataset.myvisionmodel` REMOTE WITH CONNECTION `myproject.myregion.myconnection` OPTIONS (remote_service_type = 'cloud_ai_vision_v1');
# Annotate image SELECT * FROM ML.ANNOTATE_IMAGE( MODEL `mydataset.myvisionmodel`, TABLE `mydataset.mytable`, STRUCT(['label_detection'] AS vision_features) );
The result is similar to the following:
ml_annotate_image_result | ml_annotate_image_status | uri | generation | content_type | size | md5_hash | updated | metadata |
---|---|---|---|---|---|---|---|---|
{"label_annotations":[{"description":"Food","mid":"/m/02wbm","score":0.97591567,"topicality":0.97591567}]} | gs://my-bucket/images/Cheeseburger.jpg | 1661921874516197 | image/jpeg | 174600 | a259a5076c22696848a1bc10b7162cc2 | 2022-08-31 04:57:54 | [] |
Example 2
The following example annotates images in the object table mytable
, selects
the rows where the detected label is food
and the score is higher than 0.97
,
and then returns the results in separate columns:
CREATE TABLE `mydataset.label_score` AS ( SELECT uri AS `Input image path`, STRING(ml_annotate_image_result.label_annotations[0].description) AS `Detected label`, FLOAT64(ml_annotate_image_result.label_annotations[0].score) AS Score, FLOAT64(ml_annotate_image_result.label_annotations[0].topicality) AS Topicality, ml_annotate_image_status AS Status FROM ML.ANNOTATE_IMAGE( MODEL `mydataset.myvisionmodel`, TABLE `mydataset.mytable`, STRUCT(['label_detection'] AS vision_features)) ); SELECT * FROM `mydataset.label_score` WHERE `Detected label` ='Food' AND Score > 0.97;
The result is similar to the following:
Input image path | Detected label | Score | Topicality | Status |
---|---|---|---|---|
gs://my-bucket/images/Cheeseburger.jpg | Food | 0.97591567 | 0.97591567 |
If you get an error like query limit exceeded
, you might have exceeded the
quota for this function, which
can leave you with unprocessed rows. Use the following query to complete
processing the unprocessed rows:
CREATE TABLE `mydataset.label_score_next` AS ( SELECT uri AS `Input image path`, STRING(ml_annotate_image_result.label_annotations[0].description) AS `Detected label`, FLOAT64(ml_annotate_image_result.label_annotations[0].score) AS Score, FLOAT64(ml_annotate_image_result.label_annotations[0].topicality) AS Topicality, ml_annotate_image_status AS Status FROM ML.ANNOTATE_IMAGE( MODEL `mydataset.myvisionmodel`, TABLE `mydataset.mytable`, STRUCT(['label_detection'] AS vision_features)) WHERE uri NOT IN ( SELECT `Input image path` FROM `mydataset.label_score` WHERE STATUS = '') ); SELECT * FROM `mydataset.label_score_next`;
What's next
- Get step-by-step instructions on how to
annotate images in an object table
using the
ML.ANNOTATE_IMAGE
function. - Learn more about other functions you can use to analyze BigQuery data.
- For information about model inference, see Model inference overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.