Generative AI overview

This document describes the generative artificial intelligence (AI) features that BigQuery ML supports. These features let you perform AI tasks in BigQuery ML by using pre-trained Vertex AI foundation models. Supported tasks include the following:

Generative AI
Embedding

You access a Vertex AI model to perform one of these tasks by creating a remote model in BigQuery ML that represents the Vertex AI model's endpoint. After you create a remote model over the Vertex AI model that you want to use, you access that model's capabilities by running a BigQuery ML function against the remote model.

This approach lets you use the capabilities of these Vertex AI models to analyze BigQuery data by using SQL.
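
As a minimal sketch of the first step, the following Python helper assembles a CREATE MODEL statement for a remote model. The project, dataset, model, and connection names are hypothetical placeholders, and the sketch assumes that a Cloud resource connection with access to Vertex AI already exists.

```python
def create_remote_model_sql(model_path: str, connection_path: str, endpoint: str) -> str:
    """Build a CREATE MODEL statement for a remote model over a Vertex AI model.

    model_path:      hypothetical `project.dataset.model` name for the remote model
    connection_path: hypothetical `project.region.connection_id` of an existing connection
    endpoint:        name of the Vertex AI model, for example "gemini-1.5-flash"
    """
    return (
        f"CREATE OR REPLACE MODEL `{model_path}`\n"
        f"  REMOTE WITH CONNECTION `{connection_path}`\n"
        f"  OPTIONS (ENDPOINT = '{endpoint}');"
    )


# All three names below are placeholders chosen for illustration only.
create_model_statement = create_remote_model_sql(
    "my_project.my_dataset.gemini_model",
    "my_project.us.my_connection",
    "gemini-1.5-flash",
)
print(create_model_statement)
```

You can run the printed statement in the BigQuery query editor or submit it through a client library. Building the statement as a string keeps the sketch independent of any particular client.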

Workflow

You can use remote models over Vertex AI models and remote models over Cloud AI services together with BigQuery ML functions to accomplish complex data analysis and generative AI tasks.

The following diagram shows some typical workflows where you might use these capabilities together:

[Diagram: common workflows for remote models that use Vertex AI models or Cloud AI services.]

Generative AI

You can use large language models (LLMs) to perform tasks such as text summarization and generation. For example, you could summarize a long report, or analyze sentiment in customer feedback. You can use vision language models (VLMs) to analyze visual content such as images and videos for tasks like visual captioning and visual Q&A. You can use multimodal models to perform the same tasks as LLMs and VLMs, plus additional tasks like audio transcription and document analysis.

To perform generative AI tasks, you can create a reference to a pre-trained Vertex AI model by creating a remote model and specifying the model name for the ENDPOINT value. The following Vertex AI models are supported:

gemini-1.5-flash
gemini-1.5-pro
gemini-1.0-pro
gemini-1.0-pro-vision (Preview)
text-bison
text-bison-32k
text-unicorn

Anthropic Claude models (Preview) are also supported.

To provide feedback or request support for the models in preview, send an email to bqml-feedback@google.com.

When you create a remote model that references any of the following models, you can optionally configure supervised tuning at the same time:

gemini-1.5-pro-002
gemini-1.5-flash-002
gemini-1.0-pro-002 (Preview)

After you create the model, you can use the ML.GENERATE_TEXT function to interact with that model:

For remote models based on gemini-1.0-pro, text-bison, text-bison-32k, or text-unicorn models, you can use the ML.GENERATE_TEXT function with a prompt you provide in a query or from a column in a standard table.
For remote models based on the gemini-1.0-pro-vision model, you can use the ML.GENERATE_TEXT function to analyze image or video content from an object table with a prompt you provide as a function argument.
For remote models based on the gemini-1.5-flash or gemini-1.5-pro models, you can use the ML.GENERATE_TEXT function to analyze text, image, audio, video, or PDF content from an object table with a prompt you provide as a function argument, or you can generate text from a prompt you provide in a query or from a column in a standard table.

You can use grounding and safety attributes when you use Gemini models with the ML.GENERATE_TEXT function, provided that you are using a standard table for input. Grounding lets the Gemini model use additional information from the internet to generate more specific and factual responses. Safety attributes let the Gemini model filter the responses it returns based on the attributes you specify.

All inference occurs in Vertex AI. The results are stored in BigQuery.
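
As a rough sketch of the standard-table case, the following Python helper builds an ML.GENERATE_TEXT query that prepends an instruction to a text column and passes the result as the prompt column. The model, table, and column names are hypothetical, and the max_output_tokens value of 256 is an arbitrary choice for illustration.

```python
def generate_text_sql(
    model_path: str,
    table_path: str,
    text_column: str,
    instruction: str,
    max_output_tokens: int = 256,
) -> str:
    """Build an ML.GENERATE_TEXT query over a column in a standard table.

    The subquery must expose the input under the column name `prompt`.
    All identifiers passed in are hypothetical placeholders.
    """
    # Escape backslashes and single quotes so the instruction is a valid SQL string literal.
    safe_instruction = instruction.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model_path}`,\n"
        f"  (SELECT CONCAT('{safe_instruction}', {text_column}) AS prompt\n"
        f"   FROM `{table_path}`),\n"
        f"  STRUCT({max_output_tokens} AS max_output_tokens,\n"
        "         TRUE AS flatten_json_output)\n"
        ");"
    )


# Placeholder names: a remote Gemini model and a table of customer reviews.
generate_text_query = generate_text_sql(
    "my_project.my_dataset.gemini_model",
    "my_project.my_dataset.customer_reviews",
    "review_text",
    "Summarize this review in one sentence: ",
)
print(generate_text_query)
```

The query returns one row per input row, so the generated text lands alongside the source data in BigQuery.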

To learn more, try generating text with the ML.GENERATE_TEXT function.

Embedding

You can use embedding to identify semantically similar items. For example, you can use text embedding to identify how similar two pieces of text are. If the pieces of text are semantically similar, then their respective embeddings are located near each other in the embedding vector space.
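
To make "near each other" concrete, the following sketch compares toy vectors by using cosine similarity. The three-dimensional vectors are invented for illustration and are not output from any real embedding model. Cosine similarity is only one common way to measure closeness and is an assumption here, not something that this document prescribes.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings. Real text embeddings have hundreds of dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

similar_score = cosine_similarity(cat, kitten)
dissimilar_score = cosine_similarity(cat, invoice)

# The semantically related pair scores higher than the unrelated pair.
print(similar_score > dissimilar_score)
```

A score close to 1 indicates that the vectors point in nearly the same direction, which is how semantically similar items are expected to appear in the embedding space.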

You can use BigQuery ML models to create the following types of embeddings:

To create text embeddings, you can create a reference to one of the Vertex AI text-embedding or text-multilingual-embedding models by creating a remote model and specifying the model name for the ENDPOINT value.
To create multimodal embeddings, which can embed text, images, and videos into the same semantic space, you can create a reference to the Vertex AI multimodalembedding model by creating a remote model and specifying the model name for the ENDPOINT value.
To create embeddings for structured independent and identically distributed (IID) data, you can use a principal component analysis (PCA) model or an autoencoder model.
To create embeddings for user or item data, you can use a matrix factorization model.

After you create the model, you can use the ML.GENERATE_EMBEDDING function to interact with it. For all types of supported models, ML.GENERATE_EMBEDDING works with data in standard tables. For multimodal embedding models, ML.GENERATE_EMBEDDING also works with visual content in object tables. For remote models, all inference occurs in Vertex AI. For other model types, all inference occurs in BigQuery. The results are stored in BigQuery.
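
As a minimal sketch of the standard-table case with a remote text embedding model, the following Python helper builds an ML.GENERATE_EMBEDDING query. The model, table, and column names are hypothetical placeholders.

```python
def generate_embedding_sql(model_path: str, table_path: str, text_column: str) -> str:
    """Build an ML.GENERATE_EMBEDDING query over a text column in a standard table.

    For remote text embedding models, the subquery must expose the input
    under the column name `content`. All identifiers are placeholders.
    """
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_EMBEDDING(\n"
        f"  MODEL `{model_path}`,\n"
        f"  (SELECT {text_column} AS content FROM `{table_path}`),\n"
        "  STRUCT(TRUE AS flatten_json_output)\n"
        ");"
    )


# Placeholder names: a remote text embedding model and a table of customer reviews.
generate_embedding_query = generate_embedding_sql(
    "my_project.my_dataset.text_embedding_model",
    "my_project.my_dataset.customer_reviews",
    "review_text",
)
print(generate_embedding_query)
```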

To learn more, try creating text embeddings, image embeddings, and video embeddings with the ML.GENERATE_EMBEDDING function.

For a smaller, lightweight text embedding, try using a pre-trained TensorFlow model, such as NNLM, SWIVEL, or BERT.

For information about choosing the best model for your embedding use case, see Choosing a text embedding model.

What's next

Generate text by using a text-bison model and the ML.GENERATE_TEXT function.
Generate text by using a Gemini model and the ML.GENERATE_TEXT function.
Generate text by using the ML.GENERATE_TEXT function with your data.
Tune a model using your data.
Analyze images with a Gemini vision model.
For more information about performing inference over machine learning models, see Model inference overview.