Run inference on image object tables
This document describes how to use BigQuery ML to run inference on image object tables using TensorFlow models.
Overview
You can run inference on image data by using an object table as input to the ML.PREDICT function. To do this, you must first choose an appropriate TensorFlow model, upload it to Cloud Storage, and import it into BigQuery by running the CREATE MODEL statement. You can either create your own TensorFlow model or download one from TensorFlow Hub.
Limitations
- BigQuery ML for object tables is supported only when you use flat-rate pricing through reservations; on-demand pricing isn't supported.
- The image files associated with the object table must meet the following requirements:
  - Are less than 20 MB in size.
  - Have a format of JPEG, PNG, or BMP.
  - Have a color space of sRGB.
- The combined size of the image files associated with the object table must be less than 1 TB.
- The model must be a TensorFlow 1 or TensorFlow 2 model in SavedModel format.
- The model must meet the input requirements and limitations described in Supported inputs.
- The model must be trained with images in the sRGB color space.
- The serialized size of the model must be less than 450 MB.
- The deserialized (in-memory) size of the model must be less than 1,000 MB.
- The model input must have a data type of tf.float32 and a shape of [batch_size, width, height, 3] (see the sketch after this list), where:
  - batch_size must be -1, None, or 1.
  - width and height must be greater than 0.
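You can check whether a model's serving signature meets these input requirements before importing it. The following is a minimal sketch in Python, assuming TensorFlow is installed and the model is saved locally in a hypothetical my_model directory:

import tensorflow as tf

# Load the SavedModel and look up its default serving signature.
loaded = tf.saved_model.load("my_model")  # hypothetical local path
signature = loaded.signatures["serving_default"]

# Each input must be tf.float32 with shape [batch_size, width, height, 3],
# where batch_size is None (unknown), -1, or 1, and width and height are
# greater than 0.
for name, spec in signature.structured_input_signature[1].items():
    shape = spec.shape.as_list()
    ok = (
        spec.dtype == tf.float32
        and len(shape) == 4
        and shape[0] in (None, -1, 1)
        and shape[3] == 3
        and all(d is not None and d > 0 for d in shape[1:3])
    )
    print(f"input '{name}': dtype={spec.dtype.name}, shape={shape}, ok={ok}")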
Recommended models
The following models on TensorFlow Hub work with BigQuery ML and image object tables:
- ResNet 50. To try using this model, see Tutorial: Run inference on an object table by using a classification model.
- MobileNet V3. To try using this model, see Tutorial: Run inference on an object table by using a feature vector model.
Required permissions
- To upload the model to Cloud Storage, you need the storage.objects.create and storage.objects.get permissions.
- To load the model into BigQuery ML, you need the following permissions:
  - bigquery.jobs.create
  - bigquery.models.create
  - bigquery.models.getData
  - bigquery.models.updateData
- To run inference, you need the following permissions:
  - bigquery.tables.getData on the object table
  - bigquery.models.getData on the model
  - bigquery.jobs.create
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
- Enable the BigQuery and BigQuery Connection APIs.
Upload a model to Cloud Storage
Follow these steps to upload a model:
- If you have created your own TensorFlow model, save it locally in SavedModel format. If you are using a model from TensorFlow Hub, download it to your local machine. This should give you a saved_model.pb file and a variables folder for the model.
- If necessary, create a Cloud Storage bucket.
- Upload the saved_model.pb file and the variables folder to the bucket.
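If you're starting from TensorFlow Hub, the following Python sketch illustrates these steps. It's a minimal sketch, assuming TensorFlow and tensorflow_hub are installed, your environment is authenticated to Google Cloud, and TensorFlow's file I/O can write to gs:// paths (true in standard builds); the model URL and bucket path are placeholders to replace with your own:

import tensorflow as tf
import tensorflow_hub as hub

# Download a model from TensorFlow Hub (placeholder URL; substitute the
# model you chose, such as ResNet 50 or MobileNet V3).
model = hub.load("https://tfhub.dev/tensorflow/resnet_50/classification/1")

# Re-save the model in SavedModel format directly to Cloud Storage. This
# writes the saved_model.pb file and variables folder into the bucket.
tf.saved_model.save(model, "gs://my_bucket/my_model_folder")  # placeholder path

Alternatively, save the model to a local directory and upload the files with the Google Cloud console or the gcloud storage cp command.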
Load the model into BigQuery ML
Loading a TensorFlow model that works with image object tables is similar to loading a model that works with structured data, as described in The CREATE MODEL statement for importing TensorFlow models.
To load a model into BigQuery ML, run the CREATE MODEL statement:
CREATE MODEL PROJECT_ID.DATASET_ID.MODEL_NAME
OPTIONS(
  model_type = 'TENSORFLOW',
  model_path = 'BUCKET_PATH');
Replace the following:
- PROJECT_ID: your project ID.
- DATASET_ID: the ID of the dataset to contain the model.
- MODEL_NAME: the name of the model.
- BUCKET_PATH: the path to the Cloud Storage bucket that contains the model, in the format [gs://bucket_name/[folder_name/]*].
The following example uses the default project and loads a model to BigQuery ML as my_vision_model, using the saved_model.pb file and variables folder from gs://my_bucket/my_model_folder:

CREATE MODEL my_dataset.my_vision_model
OPTIONS(
  model_type = 'TENSORFLOW',
  model_path = 'gs://my_bucket/my_model_folder/*');
Inspect the model
You can inspect the uploaded model to see what its input and output fields are. You need to reference these fields when you run inference on the object table.
Follow these steps to inspect a model:
- Go to the BigQuery page.
- In the Explorer pane, expand your project, expand the dataset that contains the model, and then expand the Models node.
- Click the model.
- In the model pane that opens, click the Schema tab.
- Look at the Labels section. This identifies the fields that are output by the model.
- Look at the Features section. This identifies the fields that must be input into the model. You reference them in the SELECT statement for the ML.DECODE_IMAGE function.
For a more detailed inspection of a model, for example to determine the shape of the model input, install TensorFlow and use the saved_model_cli show command.
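If you have TensorFlow installed, you can perform a similar inspection from Python instead of the CLI. This is a minimal sketch, assuming the SavedModel files are available locally in a hypothetical my_model directory; it prints the same input and output names that appear in the Schema tab:

import tensorflow as tf

loaded = tf.saved_model.load("my_model")  # hypothetical local path
signature = loaded.signatures["serving_default"]

# Inputs correspond to the Features section of the Schema tab.
print("Inputs:")
for name, spec in signature.structured_input_signature[1].items():
    print(f"  {name}: {spec.dtype.name} {spec.shape}")

# Outputs correspond to the Labels section of the Schema tab.
print("Outputs:")
for name, tensor in signature.structured_outputs.items():
    print(f"  {name}: {tensor.dtype.name} {tensor.shape}")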
Run inference
Once you have an appropriate model loaded, you can run inference on image data by using an object table as input to the ML.PREDICT function. You must use the ML.DECODE_IMAGE function to decode the image data so that it can be interpreted by ML.PREDICT.

To run inference:

SELECT *
FROM ML.PREDICT(
  MODEL PROJECT_ID.DATASET_ID.MODEL_NAME,
  (SELECT [other columns from the object table,]
    ML.DECODE_IMAGE(data) AS MODEL_INPUT
  FROM PROJECT_ID.DATASET_ID.TABLE_NAME)
);
Replace the following:
- PROJECT_ID: the project ID of the project that contains the model and object table.
- DATASET_ID: the ID of the dataset that contains the model and object table.
- MODEL_NAME: the name of the model.
- MODEL_INPUT: the name of an input field for the model.
- TABLE_NAME: the name of the object table.
Examples
The following example returns the inference results for all images in the object table, for a model with an input field of input and an output field of feature:

SELECT *
FROM ML.PREDICT(
  MODEL my_dataset.vision_model,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input
  FROM my_dataset.object_table)
);
This returns results similar to the following:
---------------------------------------------------------------------
| feature                | uri                 | input              |
---------------------------------------------------------------------
| 5.2563899544111337e-07 | gs://mybucket/a.png | 0.0941176563501358 |
---------------------------------------------------------------------
| 0.0076000699773430824  | gs://mybucket/b.jpg | 0.1352241039276123 |
---------------------------------------------------------------------
You can use object table fields to filter the objects included in inference. The following example runs inference only for JPG images:
SELECT *
FROM ML.PREDICT(
  MODEL my_dataset.vision_model,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input
  FROM my_dataset.object_table
  WHERE content_type = 'image/jpeg')
);
This returns results similar to the following:
----------------------------------------------------------------------
| feature                | uri                 | input               |
----------------------------------------------------------------------
| 1.145291776083468e-06  | gs://mybucket/a.jpg | 0.10889355838298798 |
----------------------------------------------------------------------
| 0.0076000699773430824  | gs://mybucket/b.jpg | 0.1352241039276123  |
----------------------------------------------------------------------
In the following example, the model has an output field of embeddings and two input fields: one that expects an image, f_img, and one that expects a string, f_txt. The image input comes from the object table, and the string input comes from a standard BigQuery table that is joined with the object table by using the uri column.
SELECT *
FROM ML.PREDICT(
  MODEL `my_dataset.mixed_model`,
  (SELECT object_table.uri,
    ML.DECODE_IMAGE(object_table.data) AS f_img,
    image_description.description AS f_txt
  FROM my_dataset.object_table
  JOIN my_dataset.image_description
  ON object_table.uri = image_description.uri)
);
This returns results similar to the following:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| embeddings                                                                                                    | uri                 | f_img              | f_txt |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ["4.920103549957275","0.2415941208600998","-1.6325242519378662","-1.1537792682647705","-0.05942607671022415"] | gs://mybucket/a.png | 0.0941176563501358 | daisy |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ["-2.3324382305145264","-0.542818725109103","-0.1122859714154053","-0.5696073174476624","2.1635284423828125"] | gs://mybucket/b.jpg | 0.1352241039276123 | rose  |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
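You can also run these queries programmatically rather than in the Google Cloud console. The following is a minimal sketch using the google-cloud-bigquery Python client library, assuming the library is installed, your environment is authenticated, and the model and table from the first example exist in your default project:

from google.cloud import bigquery

client = bigquery.Client()  # uses your default project and credentials

# The same query as the first example: decode each image and run it
# through the imported model.
sql = """
SELECT *
FROM ML.PREDICT(
  MODEL `my_dataset.vision_model`,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input
  FROM `my_dataset.object_table`)
)
"""

for row in client.query(sql).result():
    print(row["uri"], row["feature"])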
What's next
- Learn how to analyze object tables by using remote functions.
- Try running inference on an object table by using a feature vector model.
- Try running inference on an object table by using a classification model.
- Try analyzing an object table by using a remote function.