Run inference on image object tables

This document describes how to use BigQuery ML to run inference on image object tables using TensorFlow models.

Overview

You can run inference on image data by using an object table as input to the ML.PREDICT function.

To do this, you must first choose an appropriate TensorFlow model, upload it to Cloud Storage, and import it into BigQuery by running the CREATE MODEL statement. You can either create your own TensorFlow model, or download one from TensorFlow Hub.

Limitations

  • BigQuery ML for object tables is only supported when using flat-rate pricing through reservations; on-demand pricing isn't supported.
  • The image files associated with the object table must meet the following requirements:
    • Are less than 20 MB in size.
    • Have a format of JPEG, PNG or BMP.
    • Have a color space of sRGB.
  • The combined size of the image files associated with the object table must be less than 1 TB.
  • The model must be a TensorFlow 1 or TensorFlow 2 model in SavedModel format.
  • The model must meet the input requirements and limitations described in Supported inputs.
  • The model must be trained with images in the sRGB color space.
  • The serialized size of the model must be less than 450 MB.
  • The deserialized (in-memory) size of the model must be less than 1000 MB.
  • The model input must have a data type of tf.float32 and have the shape [batch_size, width, height, 3], where:
    • batch_size must be -1, None, or 1.
    • width and height must be greater than 0.
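
The file and input-shape limits above can be checked before you create the object table or import a model. The following Python sketch is illustrative only: the function names, the tuple layout, and the reading of MB and TB as decimal units (10^6 and 10^12 bytes) are assumptions, not part of BigQuery ML. It doesn't inspect the color space, which would require an image library.

```python
# Illustrative pre-flight checks for the limits listed above.
# Assumption: MB and TB are decimal units; adjust if binary units apply.
MAX_IMAGE_BYTES = 20 * 10**6
MAX_TOTAL_BYTES = 10**12
ALLOWED_CONTENT_TYPES = {"image/jpeg", "image/png", "image/bmp"}


def check_images(images):
    """Return a list of problems for (uri, size_bytes, content_type) tuples."""
    problems = []
    total = 0
    for uri, size_bytes, content_type in images:
        total += size_bytes
        if size_bytes >= MAX_IMAGE_BYTES:
            problems.append(f"{uri}: file is not smaller than 20 MB")
        if content_type not in ALLOWED_CONTENT_TYPES:
            problems.append(f"{uri}: unsupported format {content_type}")
    if total >= MAX_TOTAL_BYTES:
        problems.append("combined size is not smaller than 1 TB")
    return problems


def is_supported_input_shape(shape):
    """Check a model input shape against the batch, spatial, and channel rules."""
    if len(shape) != 4:
        return False
    batch_size, dim1, dim2, channels = shape
    if batch_size not in (-1, None, 1):
        return False
    # Assumption: the two spatial dimensions are known positive integers.
    if not all(isinstance(d, int) and d > 0 for d in (dim1, dim2)):
        return False
    return channels == 3
```

For example, `is_supported_input_shape([None, 224, 224, 3])` passes, while a shape with a fixed batch size of 8 doesn't.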

The following models on TensorFlow Hub work with BigQuery ML and image object tables:

Required permissions

  • To upload the model to Cloud Storage, you need the storage.objects.create and storage.objects.get permissions.
  • To load the model into BigQuery ML, you need the following permissions:

    • bigquery.jobs.create
    • bigquery.models.create
    • bigquery.models.getData
    • bigquery.models.updateData
  • To run inference, you need the following permissions:

    • bigquery.tables.getData on the object table
    • bigquery.models.getData on the model
    • bigquery.jobs.create
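
As a rough aid, the permission lists above can be kept as data and compared against the permissions a principal is known to hold. This Python sketch assumes you already have the granted permissions as a collection of strings; it doesn't query IAM, and the task keys (`upload_model`, `load_model`, `run_inference`) are hypothetical labels chosen for illustration.

```python
# Permissions per task, copied from the list above.
# The dictionary keys are hypothetical labels, not Google Cloud identifiers.
REQUIRED_PERMISSIONS = {
    "upload_model": {"storage.objects.create", "storage.objects.get"},
    "load_model": {
        "bigquery.jobs.create",
        "bigquery.models.create",
        "bigquery.models.getData",
        "bigquery.models.updateData",
    },
    "run_inference": {
        "bigquery.tables.getData",  # on the object table
        "bigquery.models.getData",  # on the model
        "bigquery.jobs.create",
    },
}


def missing_permissions(task, granted):
    """Return the sorted permissions that `task` needs but `granted` lacks."""
    return sorted(REQUIRED_PERMISSIONS[task] - set(granted))
```

The sketch only compares permission names. It doesn't check that a permission is granted on the right resource, such as the specific object table or model.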

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. Enable the BigQuery and BigQuery Connection APIs.

    Enable the APIs

Upload a model to Cloud Storage

Follow these steps to upload a model:

  1. If you have created your own TensorFlow model, save it locally in SavedModel format. If you are using a model from TensorFlow Hub, download it to your local machine. This should give you a saved_model.pb file and a variables folder for the model.
  2. If necessary, create a Cloud Storage bucket.
  3. Upload the saved_model.pb file and the variables folder to the bucket.
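
Before uploading, it can help to confirm that the local folder has the expected layout and to see where each file will land in the bucket. The following Python sketch is a minimal illustration: `plan_upload` is a hypothetical helper that only builds a list of source and destination pairs. It doesn't perform the upload, which you would do with the console or your usual Cloud Storage tooling.

```python
from pathlib import Path


def plan_upload(model_dir, bucket_path):
    """Map each local SavedModel file to a destination Cloud Storage URI.

    Raises FileNotFoundError if saved_model.pb or the variables folder
    is missing from model_dir.
    """
    root = Path(model_dir)
    if not (root / "saved_model.pb").is_file():
        raise FileNotFoundError("saved_model.pb not found in " + str(root))
    if not (root / "variables").is_dir():
        raise FileNotFoundError("variables folder not found in " + str(root))

    prefix = bucket_path.rstrip("/")
    files = [root / "saved_model.pb"]
    # Include every file under variables/, in a stable order.
    files += sorted(p for p in (root / "variables").rglob("*") if p.is_file())
    return [(str(f), f"{prefix}/{f.relative_to(root).as_posix()}") for f in files]
```

Calling `plan_upload("./my_model_folder", "gs://my_bucket/my_model_folder")` returns pairs such as the local `saved_model.pb` path and `gs://my_bucket/my_model_folder/saved_model.pb`.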

Load the model into BigQuery ML

Loading a TensorFlow model that works with image object tables is similar to loading a model that works with structured data, as described in The CREATE MODEL statement for importing TensorFlow models.

To load a model into BigQuery ML, run a CREATE MODEL statement of the following form:

CREATE MODEL PROJECT_ID.DATASET_ID.MODEL_NAME
OPTIONS(
  model_type = 'TENSORFLOW',
  model_path = 'BUCKET_PATH');

Replace the following:

  • PROJECT_ID: your project ID.
  • DATASET_ID: the ID of the dataset to contain the model.
  • MODEL_NAME: the name of the model.
  • BUCKET_PATH: the path to the Cloud Storage bucket that contains the model, in the format [gs://bucket_name/[folder_name/]*].

The following example uses the default project and loads a model to BigQuery ML as my_vision_model, using the saved_model.pb file and variables folder from gs://my_bucket/my_model_folder:

CREATE MODEL my_dataset.my_vision_model
OPTIONS(
  model_type = 'TENSORFLOW',
  model_path = 'gs://my_bucket/my_model_folder/*');
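
If you generate this statement from a script, a small helper can fill in the placeholders. The Python sketch below is one possible way to do that; `create_model_sql` is a hypothetical name, and the function uses plain string formatting with no escaping, so it should only be given trusted values.

```python
def create_model_sql(dataset_id, model_name, bucket_path, project_id=None):
    """Build a CREATE MODEL statement that imports a TensorFlow SavedModel."""
    if not bucket_path.startswith("gs://"):
        raise ValueError("bucket_path must start with gs://")
    parts = [dataset_id, model_name]
    if project_id:
        # Omit project_id to use the default project, as in the example above.
        parts.insert(0, project_id)
    # Backticks keep project IDs that contain hyphens valid as identifiers.
    model_ref = "`" + ".".join(parts) + "`"
    return (
        f"CREATE MODEL {model_ref}\n"
        "OPTIONS(\n"
        "  model_type = 'TENSORFLOW',\n"
        f"  model_path = '{bucket_path}');"
    )
```

For instance, `create_model_sql("my_dataset", "my_vision_model", "gs://my_bucket/my_model_folder/*")` produces the same statement as the preceding example, with the model name wrapped in backticks.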

Inspect the model

You can inspect the uploaded model to see what its input and output fields are. You need to reference these fields when you run inference on the object table.

Follow these steps to inspect a model:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, expand your project, expand the dataset that contains the model, and then expand the Models node.

  3. Click the model.

  4. In the model pane that opens, click the Schema tab.

  5. Look at the Labels section. This identifies the fields that are output by the model.

  6. Look at the Features section. This identifies the fields that must be input into the model. You reference them in the SELECT statement for the ML.DECODE_IMAGE function.

For more detailed inspection of a model, for example to determine the shape of the model input, install TensorFlow and use the saved_model_cli show command.
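
As a sketch of that last step, the following Python helper builds the argument list for `saved_model_cli show`. The helper name is hypothetical, and the default tag set (`serve`) and signature (`serving_default`) are assumptions that match many exported models but aren't guaranteed for yours.

```python
def saved_model_cli_command(model_dir, tag_set="serve",
                            signature_def="serving_default"):
    """Build the argument list for `saved_model_cli show` on a SavedModel."""
    return [
        "saved_model_cli", "show",
        "--dir", model_dir,
        "--tag_set", tag_set,              # assumed default tag set
        "--signature_def", signature_def,  # assumed default signature
    ]

# To run it (requires TensorFlow to be installed locally):
#   import subprocess
#   subprocess.run(saved_model_cli_command("./my_model_folder"), check=True)
```

The command output lists each input and output tensor with its dtype and shape, which you can compare against the input requirements in Limitations.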

Run inference

Once you have an appropriate model loaded, you can run inference on image data by using an object table as input to the ML.PREDICT function. You must use the ML.DECODE_IMAGE function to decode the image data so that it can be interpreted by ML.PREDICT.

To run inference:

SELECT *
FROM ML.PREDICT(
  MODEL PROJECT_ID.DATASET_ID.MODEL_NAME,
  (SELECT [other columns from the object table,] ML.DECODE_IMAGE(data) AS MODEL_INPUT
  FROM PROJECT_ID.DATASET_ID.TABLE_NAME)
);

Replace the following:

  • PROJECT_ID: the project ID of the project that contains the model and object table.
  • DATASET_ID: the ID of the dataset that contains the model and object table.
  • MODEL_NAME: the name of the model.
  • MODEL_INPUT: the name of an input field for the model.
  • TABLE_NAME: the name of the object table.
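
The query template above can also be assembled programmatically. The following Python sketch is one way to do so under stated assumptions: `predict_sql` is a hypothetical helper, `model` and `table` are dataset-qualified names, and values are inserted with plain string formatting, so only trusted input should be passed.

```python
def predict_sql(model, table, model_input, extra_columns=(), where=None):
    """Build an ML.PREDICT query over an object table.

    model_input is the name of the model's image input field;
    extra_columns are object table columns to pass through to the output;
    where is an optional filter on the object table.
    """
    columns = list(extra_columns) + [f"ML.DECODE_IMAGE(data) AS {model_input}"]
    lines = [
        "SELECT *",
        "FROM ML.PREDICT(",
        f"  MODEL `{model}`,",
        f"  (SELECT {', '.join(columns)}",
        f"  FROM `{table}`",
    ]
    if where:
        lines.append(f"  WHERE {where}")
    lines[-1] += ")"  # close the subquery on its last line
    lines.append(");")
    return "\n".join(lines)
```

Passing `where="content_type = 'image/jpeg'"` restricts inference to objects with that content type.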

Examples

The following example returns the inference results for all images in the object table, for a model with an input field of input and an output field of feature:

SELECT * FROM
ML.PREDICT(
  MODEL my_dataset.vision_model,
  (SELECT uri, ML.DECODE_IMAGE(data) AS input
  FROM my_dataset.object_table)
);

This returns results similar to the following:

-----------------------------------------------------------------------
| feature                |  uri                 | input               |
-----------------------------------------------------------------------
| 5.2563899544111337e-07 | gs://mybucket/a.png  | 0.0941176563501358  |
-----------------------------------------------------------------------
| 0.0076000699773430824  | gs://mybucket/b.jpg  | 0.1352241039276123  |
-----------------------------------------------------------------------

You can use object table fields to filter the objects included in inference. The following example runs inference only for JPEG images:

SELECT * FROM
  ML.PREDICT(
    MODEL my_dataset.vision_model,
    (SELECT uri, ML.DECODE_IMAGE(data) AS input
    FROM my_dataset.object_table
    WHERE content_type = 'image/jpeg')
  );

This returns results similar to the following:

-----------------------------------------------------------------------
| feature                |  uri                 | input               |
-----------------------------------------------------------------------
| 1.145291776083468e-06  | gs://mybucket/a.jpg  | 0.10889355838298798 |
-----------------------------------------------------------------------
| 0.0076000699773430824  | gs://mybucket/b.jpg  | 0.1352241039276123  |
-----------------------------------------------------------------------

In the following example, the model has an output field of embeddings and two input fields: one that expects an image, f_img, and one that expects a string, f_txt. The image input comes from the object table and the string input comes from a standard BigQuery table that is joined with the object table by using the uri column.

SELECT * FROM
  ML.PREDICT(
    MODEL `my_dataset.mixed_model`,
    (SELECT t.uri, ML.DECODE_IMAGE(t.data) AS f_img,
      d.description AS f_txt
    FROM my_dataset.object_table AS t
    JOIN my_dataset.image_description AS d
    ON t.uri = d.uri)
  );

This returns results similar to the following:

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| embeddings                                                                                                     |  uri                 | f_img               | f_txt |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ["4.920103549957275","0.2415941208600998","-1.6325242519378662","-1.1537792682647705","-0.05942607671022415"] | gs://mybucket/a.png  | 0.0941176563501358  | daisy |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ["-2.3324382305145264","-0.542818725109103","-0.1122859714154053","-0.5696073174476624","2.1635284423828125"] | gs://mybucket/b.jpg  | 0.1352241039276123  | rose  |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------

What's next