The CREATE MODEL statement for remote models over LLMs
This document describes the CREATE MODEL
statement for creating remote models
in BigQuery over Vertex AI
models.
CREATE MODEL
syntax
{CREATE MODEL | CREATE MODEL IF NOT EXISTS | CREATE OR REPLACE MODEL} `project_id.dataset.model_name`
REMOTE WITH CONNECTION `project_id.region.connection_id`
OPTIONS(
  ENDPOINT = 'vertex_ai_llm_endpoint'
  [, PROMPT_COL = 'prompt_col']
  [, INPUT_LABEL_COLS = input_label_cols]
  [, MAX_ITERATIONS = max_iterations]
  [, LEARNING_RATE_MULTIPLIER = learning_rate_multiplier]
  [, EARLY_STOP = early_stop]
  [, ACCELERATOR_TYPE = 'accelerator_type']
  [, DATA_SPLIT_METHOD = 'data_split_method']
  [, DATA_SPLIT_EVAL_FRACTION = data_split_eval_fraction]
  [, DATA_SPLIT_COL = 'data_split_col']
  [, EVALUATION_TASK = 'evaluation_task'])
[AS SELECT prompt_column, label_column FROM `project_id.dataset.table_name`]
CREATE MODEL
Creates and trains a new model in the specified dataset. If the model name
exists, CREATE MODEL
returns an error.
CREATE MODEL IF NOT EXISTS
Creates and trains a new model only if the model doesn't exist in the specified dataset.
CREATE OR REPLACE MODEL
Creates and trains a model and replaces an existing model with the same name in the specified dataset.
model_name
The name of the model you're creating or replacing. The model name must be unique in the dataset: no other model or table can have the same name. The model name must follow the same naming rules as a BigQuery table. A model name can:
- Contain up to 1,024 characters
- Contain letters (upper or lower case), numbers, and underscores
model_name
is not case-sensitive.
If you don't have a default project configured, then you must prepend the project ID to the model name in the following format, including backticks:
`[PROJECT_ID].[DATASET].[MODEL]`
For example, `myproject.mydataset.mymodel`.
REMOTE WITH CONNECTION
Syntax
`[PROJECT_ID].[LOCATION].[CONNECTION_ID]`
BigQuery uses a Cloud resource connection to interact with the Vertex AI endpoint.
The connection elements are as follows:
- PROJECT_ID: the project ID of the project that contains the connection.
- LOCATION: the location used by the connection. The connection must be in the same location as the dataset that contains the model.
- CONNECTION_ID: the connection ID, for example myconnection. To find your connection ID, view the connection details in the Google Cloud console. The connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example projects/myproject/locations/connection_location/connections/myconnection.
Example
`myproject.us.my_connection`
ENDPOINT
Syntax
ENDPOINT = 'vertex_ai_llm_endpoint'
Description
The model name of the target Vertex AI LLM.
BigQuery ML identifies and uses the full endpoint of the hosted
LLM based on the model name that you specify. For example,
https://us-central1-aiplatform.googleapis.com/v1/projects/your_project_id/locations/us-central1/publishers/google/models/text-embedding:predict.
The resource portion of the URI is saved as the remote endpoint value in the
model metadata. For example,
projects/your_project_id/locations/us-central1/publishers/google/models/text-embedding.
Arguments
A STRING
value that contains the model name of the target
Vertex AI LLM. The following LLMs are supported:
Gemini API multimodal models
The following Gemini API multimodal models are supported:
- gemini-1.5-flash-001 and gemini-1.5-flash-002 (Preview): you can process text, image, audio, video, and PDF input by using these models.
- gemini-1.5-pro-001 and gemini-1.5-pro-002 (Preview): you can process text, image, audio, video, and PDF input by using these models.
- gemini-1.0-pro-vision (Preview): you can process image and video input by using this model. You can specify the model name or the alias gemini-pro-vision as the endpoint value.
To always use the latest version of the model, specify the model name
without any version information. For example, gemini-1.5-flash
or gemini-1.5-pro
.
You can't specify a version of the gemini-1.0-pro-vision
model. The
latest stable version is always used.
For more information on Gemini API model versioning, see Gemini model versions and lifecycle.
After you create a remote model based on the gemini-1.5-flash-001
or gemini-1.5-pro-001
models, you can use the model with the
ML.GENERATE_TEXT
function
to analyze content in a BigQuery
object table using a prompt you
provide as a function argument, or to generate text based on a prompt you
provide in a query or in a column in a
standard table.
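As an illustrative sketch, prompts built from a standard table can be passed to the model like this. The model name `mydataset.flash_model`, the table `mydataset.articles`, and its `body` column are assumptions, not part of this reference:

```sql
-- Hypothetical example: mydataset.flash_model is a remote model created over
-- gemini-1.5-flash-001, and mydataset.articles is a standard table with a
-- STRING column named body.
SELECT ml_generate_text_llm_result
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.flash_model`,
    (
      SELECT
        CONCAT('Summarize the following article in one sentence: ', body) AS prompt
      FROM `mydataset.articles`
    ),
    STRUCT(
      0.2 AS temperature,
      256 AS max_output_tokens,
      -- Return the generated text as a flat STRING column.
      TRUE AS flatten_json_output));
```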
After you create a remote model based on a gemini-1.0-pro-vision
model, you can use the model with the
ML.GENERATE_TEXT
function
to analyze content in a BigQuery
object table using a prompt you
provide as a function argument.
Gemini API text models
The gemini-1.0-pro
Vertex AI Gemini API text model
is supported.
To specify a version for the model, do the following:
- To use the latest stable release of the model, specify either gemini-1.0-pro or the alias gemini-pro.
- To use a specific stable release of the model, specify gemini-1.0-pro and append -version to the model name. For example, gemini-1.0-pro-001.
- To use the preview release of the model, specify gemini-1.0-pro and append -preview-release date to the model name. For example, gemini-1.0-pro-preview-0409. Preview versions of the model aren't generally available (GA) and aren't recommended for production use.
For more information on Vertex AI Gemini API model versioning, see Gemini model versions and lifecycle.
After you create a remote model based on a gemini-1.0-pro
model, you can use the model with the
ML.GENERATE_TEXT
function
to generate text based on a prompt you provide in a query or from a column in a
standard table.
Claude models
The following Anthropic Claude models are supported:
claude-3-5-sonnet@20240620
claude-3-sonnet@20240229
claude-3-haiku@20240307
claude-3-opus@20240229
Although Claude models are multimodal, you can only use text input with Claude models in BigQuery ML.
After you create a remote model based on a Claude model, you can use the
model with the
ML.GENERATE_TEXT
function
to generate text based on a prompt you provide in a query or from a column in a
standard table.
multimodalembedding
embedding models
The multimodalembedding
embedding model
is supported. You must specify the 001
version of the model,
multimodalembedding@001
.
After you create a remote model based on a multimodalembedding
embedding
model, you can use the model with the
ML.GENERATE_EMBEDDING
function
to generate embeddings from text data in a BigQuery table
or from visual content in a BigQuery
object table.
Text embedding models
The following embedding models are supported:
- text-embedding or the alias textembedding-gecko
- text-multilingual-embedding or the alias textembedding-gecko-multilingual
You must specify a supported model version.
After you create a remote model based on an embedding
model, you can use the model with the
ML.GENERATE_EMBEDDING
function
to generate embeddings from text data in a BigQuery table.
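As a sketch of that workflow, the function expects its input in a column named (or aliased as) content. The model name `mydataset.embedding_model`, the table `mydataset.reviews`, and its `review_text` column are assumptions:

```sql
-- Hypothetical example: mydataset.embedding_model is a remote model created
-- over a supported text-embedding model version, and mydataset.reviews is a
-- standard table with a STRING column named review_text.
SELECT *
FROM
  ML.GENERATE_EMBEDDING(
    MODEL `mydataset.embedding_model`,
    -- Alias the input text column as content, which the function reads from.
    (SELECT review_text AS content FROM `mydataset.reviews`),
    STRUCT(TRUE AS flatten_json_output));
```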
PaLM API text models
The following Vertex AI PaLM API text models are supported:
text-bison
text-bison-32k
text-unicorn
You can specify a particular
version
of a text model by appending @version
to the model
name. For example, text-bison@version
. Set the model version to the
stable
or
latest
version that is most appropriate for your use case. We recommend using the
most recent stable version of a model for best performance. For information
about text model version defaults, see
Model versions.
After you create a remote model based on a PaLM API text
model, you can use the
ML.GENERATE_TEXT
function
to generate text based on a prompt you provide in a query or from a column in a
standard table.
For information that can help you choose between the supported models, see Model information.
PROMPT_COL
Syntax
PROMPT_COL = 'prompt_col'
Description
The name of the prompt column in the training data table to use when performing
supervised tuning. If you don't specify a value for this option, you must have
a column named or aliased as prompt
in your input data.
You can only use this option with a remote model that targets a
Vertex AI gemini-1.0-pro-002
or text-bison
model. If you
specify this option, you must also specify the AS SELECT
clause.
Arguments
A STRING
value. The default value is prompt
.
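For illustration, a hedged sketch of tuning with a prompt column that isn't named prompt. The connection, table, and column names (`question`, `mydataset.qa_train`) are hypothetical:

```sql
-- Hypothetical example: the prompt text in mydataset.qa_train lives in a
-- column named question, so PROMPT_COL maps it. The label column uses the
-- default name, label.
CREATE OR REPLACE MODEL `mydataset.tuned_qa_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (
    ENDPOINT = 'gemini-1.0-pro-002',
    PROMPT_COL = 'question')
AS SELECT question, label FROM `mydataset.qa_train`;
```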
INPUT_LABEL_COLS
Syntax
INPUT_LABEL_COLS = input_label_cols
Description
The name of the label column in the training data table to use when performing
supervised tuning. If you don't specify a value for this option, you must have
a column named or aliased as label
in your input data.
You can only use this option with a remote model that targets a
Vertex AI gemini-1.0-pro-002
or text-bison
model. If you
specify this option, you must also specify the AS SELECT
clause.
Arguments
A one-element ARRAY<STRING>
value. The default value is an empty array.
MAX_ITERATIONS
Syntax
MAX_ITERATIONS = max_iterations
Description
The number of steps to run when performing supervised tuning.
You can only use this option with a remote model
that targets a Vertex AI gemini-1.0-pro-002
or text-bison
model. If you specify this option, you must also specify the
AS SELECT
clause.
When you use a gemini-1.0-pro-002
model, BigQuery ML
automatically converts the MAX_ITERATIONS
value to epochs, which is
what the gemini-1.0-pro-002
model uses for training. The default value for
MAX_ITERATIONS
is the number of rows in the input data, which is equivalent
to one epoch. To use multiple epochs, specify a multiple of the number of rows
in your training data. For example, if you have 100 rows of input data and you
want to use two epochs, specify 200
for the argument value. If you provide a
value that isn't a multiple of the number of rows in the input data,
BigQuery ML rounds up to the nearest epoch.
For example, if you have 100 rows of input data and you specify 101
for the
MAX_ITERATIONS
value, training is performed with two epochs.
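Following the epoch conversion described above, a minimal sketch for sizing the value (the table name `mydataset.tuning_data` is an assumption); multiply the row count by the number of epochs you want:

```sql
-- Hypothetical example: with 100 rows in mydataset.tuning_data, this returns
-- 200, the MAX_ITERATIONS value for two epochs on a gemini-1.0-pro-002 model.
SELECT COUNT(*) * 2 AS max_iterations_for_two_epochs
FROM `mydataset.tuning_data`;
```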
For more information about the parameters that are used to tune
gemini-1.0-pro-002
models, see
Create a tuning job.
For more guidance on choosing the number of epochs for a
gemini-1.0-pro-002
model, see
Recommended configurations.
When you use a text-bison
model, the batch size varies by tuning location,
which is determined by the ACCELERATOR_TYPE
option. If you choose to use a GPU
accelerator, the batch size is 8. If you choose to use a TPU accelerator, the
batch size is 24.
For example:
- If you are using a TPU accelerator, and there are 240 examples in a training dataset, and you set steps to 20, then the number of training examples is the product of 20 steps and the batch size of 24, or 480 training examples. In this case, there are two epochs in the training process because it goes through the examples twice.
- If you are using a GPU accelerator, and there are 240 examples in a training dataset, and you set steps to 15, then the number of training examples is the product of 15 steps and the batch size of 8, or 120 training examples. In this case, there are 0.5 epochs because there are half as many training examples as there are examples in the dataset.
For more guidance on choosing the number of training steps for a
text-bison
model, see
Recommended configurations.
Arguments
An INT64
value between 1
and ∞. Typically, 100 steps takes about an
hour to complete. The default value is 300
.
LEARNING_RATE_MULTIPLIER
Syntax
LEARNING_RATE_MULTIPLIER = learning_rate_multiplier
Description
A multiplier to apply to the recommended learning rate when performing
supervised tuning. You can only use this option with a remote model that targets
a Vertex AI gemini-1.0-pro-002
or text-bison
model. If you
specify this option, you must also specify the AS SELECT
clause.
Arguments
A positive FLOAT64
value. The default value is 1.0
.
EARLY_STOP
Syntax
EARLY_STOP = early_stop
Description
When performing supervised tuning, this option determines whether to stop supervised tuning
before completing all of the tuning
steps. You can only use this option with a remote model that targets a
Vertex AI text-bison
model. If you
specify this option, you must also specify the AS SELECT
clause.
When set to TRUE
, this option stops supervised tuning if model performance, as measured
by the accuracy of predicted tokens, doesn't improve enough between evaluation
runs. When set to FALSE
, supervised tuning continues until all the tuning steps are
complete.
Arguments
A BOOL
value. The default value is TRUE
.
ACCELERATOR_TYPE
Syntax
ACCELERATOR_TYPE = 'accelerator_type'
Description
The type of accelerator to use when performing supervised tuning. You can only
use this option with a remote model that targets a Vertex AI
text-bison
model. gemini-1.0-pro-002
models always use TPUs.
If you specify this option, you
must also specify the AS SELECT
clause.
Arguments
A STRING
value. The valid options are the following:
- GPU: uses eight A100 80 GB GPUs for supervised tuning.
- TPU: uses 64 cores of the TPU v3 pod for supervised tuning. If you choose this option, you must run your query in the europe-west4 region. For information about changing query options such as the location, see Run an interactive query.
The default value is GPU
. Whichever option you use, make sure you have enough
quota.
DATA_SPLIT_METHOD
Syntax
DATA_SPLIT_METHOD = { 'AUTO_SPLIT' | 'RANDOM' | 'CUSTOM' | 'SEQ' | 'NO_SPLIT' }
Description
The method used to split input data into training and evaluation sets when
performing supervised tuning. You can only
use this option with a remote model that targets a Vertex AI
gemini-1.0-pro-002
or text-bison
model. If you specify this option, you
must also specify the AS SELECT
clause.
Training data is used to train the model. Evaluation data is used to avoid overfitting by using early stopping.
The percentage sizes of the data sets produced by the various arguments for this option are approximate. Larger input data sets come closer to the percentages described than smaller input data sets do.
You can see the model's data split information in the following ways:
- The data split method and percentage are shown in the Training Options section of the model's Details page on the BigQuery page of the Google Cloud console.
- Links to temporary tables that contain the split data are available in the
Model Details section of the model's Details page on the
BigQuery page of the Google Cloud console. You can also return
this information from the
DataSplitResult
field in the BigQuery API. These tables are saved for 48 hours. If you need this information for more than 48 hours, then you should export this data or copy it to permanent tables.
Arguments
This option accepts the following values:
- AUTO_SPLIT: this is the default value. This option splits the data as follows:
  - If there are fewer than 500 rows in the input data, then all rows are used as training data.
  - If there are more than 500 rows in the input data, then data is randomized and split as follows:
    - If there are between 500 and 50,000 rows in the input data, then 20% of the data is used as evaluation data and 80% is used as training data.
    - If there are more than 50,000 rows, then 10,000 rows are used as evaluation data and the remaining rows are used as training data.
- RANDOM: data is randomized before being split into sets. To customize the data split, you can use this option with the DATA_SPLIT_EVAL_FRACTION option. If you don't specify that option, data is split in the same way as for the AUTO_SPLIT option. A random split is deterministic: different training runs produce the same split results if the same underlying training data is used.
- CUSTOM: split data using the value provided in the DATA_SPLIT_COL option. The DATA_SPLIT_COL value must be the name of a column of type BOOL. Rows with a value of TRUE or NULL are used as evaluation data, and rows with a value of FALSE are used as training data.
- SEQ: split data sequentially by using the value in a specified column of one of the following types: NUMERIC, BIGNUMERIC, STRING, or TIMESTAMP. The data is sorted smallest to largest based on the specified column. The last n rows are used as evaluation data, where n is determined by the value specified for DATA_SPLIT_EVAL_FRACTION. All rows with split values smaller than the threshold are used as training data. The remaining rows, including those with NULL values, are used as evaluation data. Use the DATA_SPLIT_COL option to identify the column that contains the data split information.
- NO_SPLIT: no data split; all input data is used as training data.
DATA_SPLIT_EVAL_FRACTION
Syntax
DATA_SPLIT_EVAL_FRACTION = data_split_eval_fraction
Description
The fraction of the data to use as evaluation data when performing supervised
tuning. Use when you specify RANDOM
or SEQ
as the value for the
DATA_SPLIT_METHOD
option. You can only
use this option with a remote model that targets a Vertex AI
gemini-1.0-pro-002
or text-bison
model. If you specify this option, you
must also specify the AS SELECT
clause.
Arguments
A FLOAT64
value in the range [0, 1.0]
. The default is 0.2
. The service
maintains the accuracy of the input value to two decimal places.
DATA_SPLIT_COL
Syntax
DATA_SPLIT_COL = 'data_split_col'
Description
The name of the column to use to sort input data into the training or
evaluation set when performing supervised tuning. Use when you are specifying
CUSTOM
or SEQ
as the value for DATA_SPLIT_METHOD
. You can only
use this option with a remote model that targets a Vertex AI
gemini-1.0-pro-002
or text-bison
model. If you specify this option, you
must also specify the AS SELECT
clause.
If you are specifying SEQ
as the value for DATA_SPLIT_METHOD
, then the data
is first sorted smallest to largest based on the specified column. The last
n rows are used as evaluation data, where n is the value
specified for DATA_SPLIT_EVAL_FRACTION
. The
remaining rows are used as training data.
If you are specifying CUSTOM as the value for DATA_SPLIT_METHOD, then you must
provide the name of a column of type BOOL. Rows with a value of TRUE or
NULL are used as evaluation data, and rows with a value of FALSE are used as
training data.
The column you specify for DATA_SPLIT_COL
can't be used as a feature or
label, and the column is excluded from features automatically.
Arguments
A STRING
value.
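As a hedged sketch of a custom split, assuming a hypothetical BOOL column named holdout in a hypothetical mydataset.tuning_data table, and assuming the split column is included in the AS SELECT input:

```sql
-- Hypothetical example: holdout is a BOOL column. Rows where holdout is TRUE
-- or NULL become evaluation data; rows where it is FALSE become training
-- data. The holdout column is excluded from features automatically.
CREATE OR REPLACE MODEL `mydataset.tuned_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (
    ENDPOINT = 'gemini-1.0-pro-002',
    DATA_SPLIT_METHOD = 'CUSTOM',
    DATA_SPLIT_COL = 'holdout')
AS SELECT prompt, label, holdout FROM `mydataset.tuning_data`;
```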
EVALUATION_TASK
Syntax
EVALUATION_TASK = 'evaluation_task'
Description
When performing supervised tuning, the type of task that you want to tune
the model to perform. You can only use
this option with a remote model that targets a Vertex AI
gemini-1.0-pro-002
or text-bison
model. If you specify this option, you must
also specify the AS SELECT
clause.
Arguments
A STRING
value. The valid options are the following:
TEXT_GENERATION
CLASSIFICATION
SUMMARIZATION
QUESTION_ANSWERING
UNSPECIFIED
The default value is UNSPECIFIED
.
AS SELECT
Syntax
AS SELECT prompt_column, label_column FROM `project_id.dataset.table_name`
Description
Provides the training data to use when performing supervised tuning.
You can only use this option with a remote model that targets a
Vertex AI gemini-1.0-pro-002
or text-bison
model.
Arguments
- prompt_column: the name of the column in the training data table that contains the prompt for evaluating the content in the label_column column. This column must be of STRING type or be cast to STRING. If you specify a value for the PROMPT_COL option, you must specify the same value for prompt_column. Otherwise this value must be prompt. If your table does not have a prompt column, use an alias to specify an existing table column. For example, SELECT hint AS prompt, label FROM mydataset.mytable.
- label_column: the name of the column in the training data table that contains the examples to train the model with. This column must be of STRING type or be cast to STRING. If you specify a value for the INPUT_LABEL_COLS option, you must specify the same value for label_column. Otherwise this value must be label. If your table does not have a label column, use an alias to specify an existing table column. For example, SELECT prompt, feature AS label FROM mydataset.mytable.
- project_id: the project ID of the project that contains the training data table.
- dataset: the name of the dataset that contains the training data table. After optional data splitting, the training dataset must contain at least 10 rows.
- table_name: the name of the training data table.
Supervised tuning
If you create a remote model that references either the gemini-1.0-pro-002
model or any version of the text-bison
model, you can optionally configure
supervised tuning
when you create the remote model. To configure supervised tuning, specify
the AS SELECT
clause, and optionally some of the other CREATE MODEL
arguments that affect supervised tuning. Supervised tuning lets you train the
model on your own data to make it better suited for your use case.
For more information on supervised tuning use cases, see
Use cases for using supervised tuning on text models.
After you create a tuned model, use the
EVALUATE
function
to ensure that the tuned model performs well for your use case. To learn more,
try the
Use tuning and evaluation to improve LLM performance
tutorial.
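A minimal evaluation sketch, assuming a tuned model named mydataset.tuned_model and an evaluation table mydataset.eval_data with prompt and label columns (all hypothetical names), and assuming a task_type that mirrors the EVALUATION_TASK values:

```sql
-- Hypothetical example: evaluate a tuned model against held-out data.
SELECT *
FROM
  ML.EVALUATE(
    MODEL `mydataset.tuned_model`,
    (SELECT prompt, label FROM `mydataset.eval_data`),
    STRUCT('CLASSIFICATION' AS task_type));
```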
Locations
For information about supported locations for supervised tuning with remote models over Vertex AI models, see Supported pipeline job and model upload regions.
If you choose to use a TPU accelerator when performing supervised tuning of a
Vertex AI model, you must run your query in the europe-west4
region. Your supervised tuning computations also occur in the europe-west4
region, because that's where TPU resources are located.
Limitations
You can't perform supervised tuning with remote models over Vertex AI models in projects that use VPC Service Controls.
Costs
When using supervised tuning with remote models over Vertex AI LLMs, costs are calculated based on the following:
- The bytes processed from the training data table specified in the
AS SELECT
clause. These charges are billed from BigQuery to your project. For more information, see BigQuery pricing. - The GPU or TPU usage to tune the LLM. These charges are billed from Vertex AI to your project. For more information, see Vertex AI pricing.
Pipeline jobs
When you create a remote model that uses supervised tuning, a pre-configured Vertex AI managed pipeline job is automatically created. You can see the managed pipeline job on the Pipelines page of the Google Cloud console. The managed pipeline job consists of multiple components and tasks. The tasks are executed top-down in the runtime graph for the job, which you can view by clicking on the run in the Pipelines page. To diagnose an error in a task, click that task and look at the error information in the Node Info tab.
Locations
You can create remote models over Vertex AI models in all of the
regions
that support Generative AI APIs, and also in the US
and EU
multi-regions.
You can create remote models over Claude models in all of the supported regions for Claude models.
Examples
The following examples create BigQuery ML remote models.
Create a model without tuning
The following example creates a BigQuery ML remote model over a
Vertex AI gemini-1.5-flash-001
model:
CREATE OR REPLACE MODEL `mydataset.flash_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (ENDPOINT = 'gemini-1.5-flash-001');
Create a tuned model
The following example creates a BigQuery ML remote model over a
tuned version of the Vertex AI gemini-1.0-pro-002
model:
CREATE OR REPLACE MODEL `mydataset.tuned_model`
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS (
    endpoint = 'gemini-1.0-pro-002',
    max_iterations = 500,
    prompt_col = 'prompt',
    input_label_cols = ['label'])
AS
SELECT
  CONCAT(
    'Please do sentiment analysis on the following text and only output a number from 0 to 5 where 0 means sadness, 1 means joy, 2 means love, 3 means anger, 4 means fear, and 5 means surprise. Text: ',
    sentiment_column) AS prompt,
  text_column AS label
FROM `mydataset.emotion_classification_train`;
What's next
- For more information about using Vertex AI models with BigQuery ML, see Generative AI overview.
- Try generating text from BigQuery data.
- Try customizing a model by using supervised fine tuning.
- Try generating embeddings from BigQuery data.