The AI.IF function
This document describes the AI.IF function, which uses a
Vertex AI Gemini model to
evaluate a condition described in natural language and returns a BOOL.
You can use the AI.IF function to filter and join data based on conditions
described in natural language or multimodal input.
The following are common use cases:
- Sentiment analysis: Find customer reviews with negative sentiment.
- Topic analysis: Identify news articles related to a specific subject.
- Image analysis: Select images that contain a specific item.
- Security: Identify suspicious emails.
Input
AI.IF accepts the following types of input:
- Text data from standard tables.
- ObjectRefRuntimevalues that are generated by the- OBJ.GET_ACCESS_URLfunction. You can use- ObjectRefvalues from standard tables as input to the- OBJ.GET_ACCESS_URLfunction. (Preview)
When you analyze unstructured data, that data must meet the following requirements:
- Content must be in one of the supported formats that are
described in the Gemini API model
mimeTypeparameter.
- For more information about accepted multimodal input, see the technical specifications for Gemini.
This function passes your input to a Gemini model and incurs charges in Vertex AI each time it's called.
Syntax
AI.IF( [ prompt => ] 'PROMPT', connection_id => 'CONNECTION' )
Arguments
AI.IF takes the following arguments:
- PROMPT: a- STRINGor- STRUCTvalue that specifies the prompt to send to the model. The prompt must be the first argument that you specify. You can provide the prompt value in the following ways:- Specify a STRINGvalue. For example,('Is Seattle a US city?').
- Specify a - STRUCTvalue that contains one or more fields. You can use the following types of fields within the- STRUCTvalue:- Field type - Description - Examples - STRING- A string literal, or the name of a - STRINGcolumn.- String literal: - 'Is Seattle a US city?'
 String column name:- my_string_column- ARRAY<STRING>- You can only use string literals in the array. - Array of string literals: - ['Is ', 'Seattle', ' a US city?']- ObjectRefRuntime- An - ObjectRefRuntimevalue returned by the- OBJ.GET_ACCESS_URLfunction. The- OBJ.GET_ACCESS_URLfunction takes an- ObjectRefvalue as input, which you can provide by either specifying the name of a column that contains- ObjectRefvalues, or by constructing an- ObjectRefvalue.- ObjectRefRuntimevalues must have the- access_url.read_urland- details.gcs_metadata.content_typeelements of the JSON value populated.- Function call with - ObjectRefcolumn:- OBJ.GET_ACCESS_URL(my_objectref_column, 'r')
 Function call with constructed- ObjectRefvalue:- OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image.jpg', 'myconnection'), 'r')- ARRAY<ObjectRefRuntime>- ObjectRefRuntimevalues returned from multiple calls to the- OBJ.GET_ACCESS_URLfunction. The- OBJ.GET_ACCESS_URLfunction takes an- ObjectRefvalue as input, which you can provide by either specifying the name of a column that contains- ObjectRefvalues, or by constructing an- ObjectRefvalue.- ObjectRefRuntimevalues must have the- access_url.read_urland- details.gcs_metadata.content_typeelements of the JSON value populated.- Function calls with - ObjectRefcolumns:- [OBJ.GET_ACCESS_URL(my_objectref_column1, 'r'), OBJ.GET_ACCESS_URL(my_objectref_column2, 'r')]
 Function calls with constructed- ObjectRefvalues:- [OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image1.jpg', 'myconnection'), 'r'), OBJ.GET_ACCESS_URL(OBJ.MAKE_REF('gs://image2.jpg', 'myconnection'), 'r')]- The function combines - STRUCTfields similarly to a- CONCAToperation and concatenates the fields in their specified order. The same is true for the elements of any arrays used within the struct. The following table shows some examples of- STRUCTprompt values and how they are interpreted:- Struct field types - Struct value - Semantic equivalent - STRUCT<STRING>- ('Is Seattle a US city?')- 'Is Seattle a US city?' - STRUCT<STRING, STRING, STRING>- ('Is the city of ', my_city_column, ' in India?')- 'Is the city of my_city_column_value in India?' - STRUCT<STRING, ARRAY<STRING>>- ('Is the city of ', ['Oslo', ' in India?'])- 'Is the city of Oslo in India?' - STRUCT<STRING, ObjectRefRuntime>- ('Is the city in the following image in Canada?', OBJ.GET_ACCESS_URL(image_objectref_column, 'r'))- 'Is the city in the following image in Canada?' image - STRUCT<STRING, ObjectRefRuntime, ObjectRefRuntime>- ('Is the city in the first image within the country of the second image?',
 OBJ.GET_ACCESS_URL(city_image_objectref_column, 'r'),
 OBJ.GET_ACCESS_URL(country_image_objectref_column, 'r'))- 'Is the city in the first image within the country of the second image?' city_image country_image 
 
- Specify a 
- CONNECTION: a- STRINGvalue specifying the Cloud resource connection to use. The following forms are accepted:- Connection name: - [PROJECT_ID].LOCATION.CONNECTION_ID- For example, - myproject.us.myconnection.
- Fully qualified connection ID: - projects/PROJECT_ID/locations/LOCATION/connections/CONNECTION_ID- For example, - projects/myproject/locations/us/connections/myconnection.
 - Replace the following: - PROJECT_ID: the project ID of the project that contains the connection.
- LOCATION: the location used by the connection.
- CONNECTION_ID: the connection ID—for example,- myconnection.- You can get this value by viewing the connection details in the Google Cloud console and copying the value in the last section of the fully qualified connection ID that is shown in Connection ID. For example, - projects/myproject/locations/connection_location/connections/myconnection.
 
Output
AI.IF returns a BOOL based on evaluation of the condition in the
input prompt.
If the call to Vertex AI is unsuccessful for any reason,
such as exceeding quota or model unavailability, then the function returns
NULL.
Examples
The following examples show how to use the AI.IF function to filter text
and join multimodal data.
Filter text by topic
The following query uses the AI.IF function to filter news
stories to those that cover a natural disaster:
SELECT
  title, body
FROM
  `bigquery-public-data.bbc_news.fulltext`
WHERE
  AI.IF((
    'The following news story is about a natural disaster: ', body),
    connection_id => 'us.example_connection');
Combine filters
BigQuery optimizes queries to reduce the number of calls to
Gemini. The following query first filters by the
equality operator, and then filters using the AI.IF function:
SELECT
  title, body 
FROM
  `bigquery-public-data.bbc_news.fulltext`
WHERE
  AI.IF(
    ('This news is related to Google: ', body),
    connection_id => 'us.example_connection')
  AND category = 'tech';    -- Non-AI filters are evaluated first
Filter images
The following query creates an external table from images of pet products stored in a publicly available Cloud Storage bucket. Then, it filters the results to images that contain a ball.
-- Create a dataset
CREATE SCHEMA IF NOT EXISTS cymbal_pets;
-- Create an object table
CREATE OR REPLACE EXTERNAL TABLE cymbal_pets.product_images
WITH CONNECTION us.example_connection 
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/images/*.png']
);
-- Filter images in the object table
SELECT
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS signed_url,
FROM
  `cymbal_pets.product_images`
WHERE
  AI.IF(('The image contains a ball.', OBJ.GET_ACCESS_URL(ref, 'r')),
        connection_id => 'us.example_connection');
Join tables based on image content
The following queries create a table of product data and a table of product images. The tables are joined based on whether the image is of the product.
-- Create a dataset
CREATE SCHEMA IF NOT EXISTS cymbal_pets;
-- Load a non-object table
LOAD DATA OVERWRITE cymbal_pets.products
FROM
  FILES(
    format = 'avro',
    uris = [
      'gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/tables/products/products_*.avro']);
-- Create an object table
CREATE OR REPLACE EXTERNAL TABLE cymbal_pets.product_images
  WITH CONNECTION us.example_connection
  OPTIONS (
    object_metadata = 'SIMPLE',
    uris = ['gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/images/*.png']);
-- Join the standard table and object table
SELECT product_name, brand, signed_url
FROM
  cymbal_pets.products INNER JOIN
  EXTERNAL_OBJECT_TRANSFORM(TABLE `cymbal_pets.product_images`, ['SIGNED_URL']) as images
ON
  AI.IF(
    (
      """You will be provided an image of a pet product.
      Determine if the image is of the following pet toy: """,
      products.product_name,
      images.ref
    ),
    connection_id => 'us.example_connection')
WHERE
  products.category = "Toys" AND
  products.brand = "Fluffy Buns";
Filter audio by speech topic
The following queries create a table of audio data stored in a publicly available Cloud Storage bucket. The query filters the audio samples to those that contain speech discussing a large language model.
-- Create a dataset
CREATE SCHEMA IF NOT EXISTS audio_repo;
-- Create an object table with audios
CREATE OR REPLACE EXTERNAL TABLE audio_repo.prompt_audio
WITH CONNECTION us.test_connection 
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://cloud-samples-data/generative-ai/audio/*.mp3']
);
-- Filter audios in the object table
SELECT
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS signed_url,
FROM
  `audio_repo.prompt_audio`
WHERE
  AI.IF(('Does the audio talk about large language models? ', OBJ.GET_ACCESS_URL(ref, 'r')),
        connection_id => 'us.test_connection');
Related functions
The AI.IF and
AI.GENERATE_BOOL
functions both use models to generate a
boolean value in response to a prompt. The following differences can help you
choose which function to use:
- Prompt Optimization: AI.IFautomatically structures your prompts to improve the quality of the output.
- Input: AI.IFautomatically selects a Gemini model.AI.GENERATE_BOOLlets you specify a specific model endpoint and model parameters to use.
- Output: AI.IFreturns aBOOLvalue, which makes it easier to work with in queries.AI.GENERATE_BOOLreturns aSTRUCTthat contains aBOOLvalue, as well as additional information about the model call, which is useful if you need to view details such as the safety rating or API response status.
- Error handling: If AI.IFproduces an error for any input, then the function returnsNULL.AI.GENERATE_BOOLrecords details about the errors in its output.
Locations
You can run AI.IF in all of the
regions
that support Gemini models, and also in the US and EU
multi-regions.
Quotas
See Generative AI functions quotas and limits.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.
- For more information about supported SQL statements and functions for generative AI models, see End-to-end user journeys for generative AI models.