Run inference on image object tables
This document describes how to use BigQuery ML to run inference on image object tables.
You can run inference on image data by using an object table as input to the
ML.PREDICT
function.
To do this, you must first choose an appropriate model, upload it to
Cloud Storage, and import it into BigQuery by running the
CREATE MODEL
statement.
You can either create your own model, or download one
from TensorFlow Hub.
Limitations
- Using BigQuery ML imported models with object tables is only supported when you use capacity-based pricing through reservations; on-demand pricing isn't supported.
- The image files associated with the object table must meet the following
requirements:
- Are less than 20 MB in size.
- Have a format of JPEG, PNG or BMP.
- The combined size of the image files associated with the object table must be less than 1 TB.
The model must be one of following:
- A TensorFlow or TensorFlow Lite model in SavedModel format.
- A PyTorch model in ONNX format.
The model must meet the input requirements and limitations described in Supported inputs.
The serialized size of the model must be less than 450 MB.
The deserialized (in-memory) size of the model must be less than 1000 MB.
The model input tensor must meet the following criteria:
- Have a data type of
tf.float32
with values in[0, 1)
or have a data type oftf.uint8
with values in[0, 255)
. - Have the shape
[batch_size, weight, height, 3]
, where:batch_size
must be-1
,None
, or1
.width
andheight
must be greater than 0.
- Have a data type of
The model must be trained with images in one of the following color spaces:
RGB
HSV
YIQ
YUV
GRAYSCALE
You can use the
ML.CONVERT_COLOR_SPACE
function to convert input images to the color space that the model was trained with.
Example models
The following models on TensorFlow Hub work with BigQuery ML and image object tables:
- ResNet 50. To try using this model, see Tutorial: Run inference on an object table by using a classification model.
- MobileNet V3. To try using this model, see Tutorial: Run inference on an object table by using a feature vector model.
Required permissions
- To upload the model to Cloud Storage, you need the
storage.objects.create
andstorage.objects.get
permissions. To load the model into BigQuery ML, you need the following permissions:
bigquery.jobs.create
bigquery.models.create
bigquery.models.getData
bigquery.models.updateData
To run inference, you need the following permissions:
bigquery.tables.getData
on the object tablebigquery.models.getData
on the modelbigquery.jobs.create
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the BigQuery and BigQuery Connection API APIs.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the BigQuery and BigQuery Connection API APIs.
Upload a model to Cloud Storage
Follow these steps to upload a model:
- If you have created your own model, save it locally. If you
are using a model from TensorFlow Hub, download it to your
local machine. If you are using TensorFlow, this should give
you a
saved_model.pb
file and avariables
folder for the model. - If necessary, create a Cloud Storage bucket.
- Upload the model artifacts to the bucket.
Load the model into BigQuery ML
Loading a model that works with image object tables is the same as loading a model that works with structured data. Follow these steps to load a model into BigQuery ML:
CREATE MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME` OPTIONS( model_type = 'MODEL_TYPE', model_path = 'BUCKET_PATH');
Replace the following:
PROJECT_ID
: your project ID.DATASET_ID
: the ID of the dataset to contain the model.MODEL_NAME
: the name of the model.MODEL_TYPE
: use one of the following values:TENSORFLOW
for a TensorFlow modelONNX
for a PyTorch model in ONNX format
BUCKET_PATH
: the path to the Cloud Storage bucket that contains the model, in the format[gs://bucket_name/[folder_name/]*]
.
The following example uses the default project and loads a TensorFlow
model to BigQuery ML as my_vision_model
, using the
saved_model.pb
file and variables
folder from
gs://my_bucket/my_model_folder
:
CREATE MODEL `my_dataset.my_vision_model` OPTIONS( model_type = 'TENSORFLOW', model_path = 'gs://my_bucket/my_model_folder/*');
Inspect the model
You can inspect the uploaded model to see what its input and output fields are. You need to reference these fields when you run inference on the object table.
Follow these steps to inspect a model:
Go to the BigQuery page.
In the Explorer pane, expand your project, expand the dataset that contains the model, and then expand the Models node.
Click the model.
In the model pane that opens, click the Schema tab.
Look at the Labels section. This identifies the fields that are output by the model.
Look at the Features section. This identifies the fields that must be input into the model. You reference them in the
SELECT
statement for theML.DECODE_IMAGE
function.
For more detailed inspection of a TensorFlow model, for example to
determine the shape of the model input,
install TensorFlow
and use the
saved_model_cli show
command.
Preprocess images
You must use the
ML.DECODE_IMAGE
function
to convert image bytes to a multi-dimensional ARRAY
representation. You can
use ML.DECODE_IMAGE
output directly in an ML.PREDICT
function,
or you can write the results from ML.DECODE_IMAGE
to a table column and
reference that column when you call ML.PREDICT
.
The following example writes the output of the ML.DECODE_IMAGE
function to
a table:
CREATE OR REPLACE TABLE mydataset.mytable AS ( SELECT ML.DECODE_IMAGE(data) AS decoded_image FROM mydataset.object_table );
Use the following functions to further process images so that they work with your model:
- The
ML.CONVERT_COLOR_SPACE
function converts images with anRGB
color space to a different color space. - The
ML.CONVERT_IMAGE_TYPE
function converts the pixel values output by theML.DECODE_IMAGE
function from floating point numbers to integers with a range of[0, 255)
. - The
ML.RESIZE_IMAGE
function resizes images.
You can use these as part of the ML.PREDICT
function, or run them on a
table column containing image data output by ML.DECODE_IMAGE
.
Run inference
Once you have an appropriate model loaded, and optionally preprocessed the image data,you can run inference on the image data.
To run inference:
SELECT * FROM ML.PREDICT( MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`, (SELECT [other columns from the object table,] IMAGE_DATA AS MODEL_INPUT FROM PROJECT_ID.DATASET_ID.TABLE_NAME) );
Replace the following:
PROJECT_ID
: the project ID of the project that contains the model and object table.DATASET_ID
: the ID of the dataset that contains the model and object table.MODEL_NAME
: the name of the model.IMAGE_DATA
: the image data, represented either by the output of theML.DECODE_IMAGE
function, or by a table column containing image data output byML.DECODE_IMAGE
or other image processing functions.MODEL_INPUT
: the name of an input field for the model.You can find this information by inspecting the model and looking at the field names in the Features section.TABLE_NAME
: the name of the object table.
Examples
Example 1
The following example uses the ML.DECODE_IMAGE
function directly in the
ML.PREDICT
function. It returns the inference results for all images in the
object table, for a model with an input field of input
and an output
field of feature
:
SELECT * FROM ML.PREDICT( MODEL `my_dataset.vision_model`, (SELECT uri, ML.RESIZE_IMAGE(ML.DECODE_IMAGE(data), 480, 480, FALSE) AS input FROM `my_dataset.object_table`) );
Example 2
The following example uses the ML.DECODE_IMAGE
function directly in the
ML.PREDICT
function, and uses the ML.CONVERT_COLOR_SPACE
function in the
ML.PREDICT
function to convert
the image color space from RBG
to YIQ
. It also shows how to
use object table fields to filter the objects included in inference.
It returns the inference results for all JPG images in the
object table, for a model with an input field of input
and an output
field of feature
:
SELECT * FROM ML.PREDICT( MODEL `my_dataset.vision_model`, (SELECT uri, ML.CONVERT_COLOR_SPACE(ML.RESIZE_IMAGE(ML.DECODE_IMAGE(data), 224, 280, TRUE), 'YIQ') AS input FROM `my_dataset.object_table` WHERE content_type = 'image/jpeg') );
Example 3
The following example uses results from ML.DECODE_IMAGE
that have been
written to a table column but not processed any further. It uses
ML.RESIZE_IMAGE
and ML.CONVERT_IMAGE_TYPE
in the ML.PREDICT
function to
process the image data. It returns the inference results for all images in the
decoded images table, for a model with an input field of input
and an output
field of feature
.
Create the decoded images table:
CREATE OR REPLACE TABLE `my_dataset.decoded_images` AS (SELECT ML.DECODE_IMAGE(data) AS decoded_image FROM `my_dataset.object_table`);
Run inference on the decoded images table:
SELECT * FROM ML.PREDICT( MODEL`my_dataset.vision_model`, (SELECT uri, ML.CONVERT_IMAGE_TYPE(ML.RESIZE_IMAGE(decoded_image, 480, 480, FALSE)) AS input FROM `my_dataset.decoded_images`) );
Example 4
The following example uses results from ML.DECODE_IMAGE
that have been
written to a table column and preprocessed using
ML.RESIZE_IMAGE
. It returns the inference results for all images in the
decoded images table, for a model with an input field of input
and an output
field of feature
.
Create the table:
CREATE OR REPLACE TABLE `my_dataset.decoded_images` AS (SELECT ML.RESIZE_IMAGE(ML.DECODE_IMAGE(data) 480, 480, FALSE) AS decoded_image FROM `my_dataset.object_table`);
Run inference on the decoded images table:
SELECT * FROM ML.PREDICT( MODEL `my_dataset.vision_model`, (SELECT uri, decoded_image AS input FROM `my_dataset.decoded_images`) );
Example 5
The following example uses the ML.DECODE_IMAGE
function directly in the
ML.PREDICT
function. In this example, the model has an output field of
embeddings
and two input fields: one that expects an
image, f_img
, and one that expects a string, f_txt
. The image
input comes from the object table and the string input comes from a
standard BigQuery table that is joined with the object table
by using the uri
column.
SELECT * FROM ML.PREDICT( MODEL `my_dataset.mixed_model`, (SELECT uri, ML.RESIZE_IMAGE(ML.DECODE_IMAGE(my_dataset.my_object_table.data), 224, 224, FALSE) AS f_img, my_dataset.image_description.description AS f_txt FROM `my_dataset.object_table` JOIN `my_dataset.image_description` ON object_table.uri = image_description.uri) );
What's next
- Learn how to analyze object tables by using remote functions.
- Try running inference on an object table by using a feature vector model.
- Try running inference on an object table by using a classification model.
- Try analyzing an object table by using a remote function.
- Try annotating an image with the
ML.ANNOTATE_IMAGE
function.