The ML.GENERATE_TEXT function
This document describes the ML.GENERATE_TEXT function, which lets you perform generative natural language tasks on text that's stored in BigQuery tables. Example language tasks include the following:
- Classification
- Sentiment analysis
- Entity extraction
- Extractive question answering
- Summarization
- Rewriting text in a different style
- Ad copy generation
- Concept ideation
The function works by sending requests to a BigQuery ML remote model that represents one of the Vertex AI text-bison* natural language foundation models (LLMs), and then returning the LLM's response. Several of the function's arguments provide the parameters that shape the LLM's response.

The prompt parameter provides the text for the model to analyze. Prompt design can strongly affect the responses returned by the LLM. For more information, see Design text prompts.
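Because each input row supplies one prompt, you can compare the effect of different prompt wordings in a single call. The following is a minimal sketch, assuming a remote model named `mydataset.llm_model` (a placeholder name):

  SELECT prompt, ml_generate_text_llm_result
  FROM
    ML.GENERATE_TEXT(
      MODEL `mydataset.llm_model`,
      (
        -- Two phrasings of the same task; each row becomes one request.
        SELECT 'Summarize in one sentence: BigQuery is a serverless data warehouse.' AS prompt
        UNION ALL
        SELECT 'Explain in one sentence, for a new analyst, what BigQuery is.' AS prompt
      ),
      STRUCT(128 AS max_output_tokens, TRUE AS flatten_json_output));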
Syntax
ML.GENERATE_TEXT(
  MODEL `project_id.dataset.model`,
  { TABLE `project_id.dataset.table` | (query_statement) },
  STRUCT(
    [number_of_output_tokens AS max_output_tokens]
    [, top_k_value AS top_k]
    [, top_p_value AS top_p]
    [, temperature AS temperature]
    [, flatten_json_output AS flatten_json_output])
)
Arguments
ML.GENERATE_TEXT takes the following arguments:
- project_id: a STRING value that specifies your project ID.
- dataset: a STRING value that specifies the BigQuery dataset that contains the model.
- model: a STRING value that specifies the name of a remote model that uses one of the text-bison* Vertex AI LLMs. For more information about how to create this type of remote model, see ENDPOINT; a rough CREATE MODEL sketch follows this list.
- table: a STRING value that specifies the name of the BigQuery table that contains the prompt data. The text in the column that's named prompt is sent to the model. If your table does not have a prompt column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no prompt column is available.
- query_statement: a STRING value that specifies the GoogleSQL query that is used to generate the prompt data.
- max_output_tokens: an INT64 value in the range [1,1024] that sets the maximum number of tokens that the model outputs. Specify a lower value for shorter responses and a higher value for longer responses. The default is 50.
  A token might be smaller than a word and is approximately four characters. 100 tokens correspond to approximately 60-80 words.
- temperature: a FLOAT64 value in the range [0.0,1.0] that is used for sampling during response generation, which occurs when top_k and top_p are applied. It controls the degree of randomness in token selection. Lower temperature values are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperature values can lead to more diverse or creative results. A temperature value of 0 is deterministic, meaning that the highest probability response is always selected. The default is 1.0.
- top_k: an INT64 value in the range [1,40] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 40.
  A top_k value of 1 means the next selected token is the most probable among all tokens in the model's vocabulary, while a top_k value of 3 means that the next token is selected from among the three most probable tokens by using the temperature value.
  For each token selection step, the top_k tokens with the highest probabilities are sampled. Then tokens are further filtered based on the top_p value, with the final token selected using temperature sampling.
- top_p: a FLOAT64 value in the range [0.0,1.0] that changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. The default is 1.0.
  Tokens are selected from the most probable (based on the top_k value) to the least probable until the sum of their probabilities equals the top_p value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top_p value is 0.5, then the model selects either A or B as the next token by using the temperature value and doesn't consider C.
- flatten_json_output: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is FALSE.
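The model argument refers to a remote model that's connected to a Vertex AI endpoint. The following is a rough sketch of what creating such a model can look like; the dataset name, connection ID, and endpoint value are placeholders, so see the ENDPOINT documentation for the exact statement, connection requirements, and supported endpoint names:

  -- Hypothetical names; adjust the dataset, connection, and endpoint to your environment.
  CREATE OR REPLACE MODEL `mydataset.llm_model`
    REMOTE WITH CONNECTION `us.my_connection`
    OPTIONS (ENDPOINT = 'text-bison');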
Output
ML.GENERATE_TEXT returns the input table plus the following columns:
- ml_generate_text_result: the JSON response from the projects.locations.endpoints.predict call to the model. The generated text is in the content element. The safety attributes are in the safetyAttributes element. This column is returned when flatten_json_output is FALSE. (See the sketch after this list for one way to extract these elements.)
- ml_generate_text_llm_result: a STRING value that contains the generated text returned by the projects.locations.endpoints.predict call to the model. This column is returned when flatten_json_output is TRUE.
- ml_generate_text_rai_result: a STRING value that contains the safety attributes returned by the projects.locations.endpoints.predict call to the model. This column is returned when flatten_json_output is TRUE.
- ml_generate_text_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
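When flatten_json_output is FALSE, the generated text and safety attributes stay inside the ml_generate_text_result JSON column. The following is a minimal sketch of pulling them out with BigQuery JSON functions, assuming the response layout described above; the JSON paths, model name, and prompt are illustrative, not guaranteed:

  SELECT
    -- Assumed path to the generated text in the predict response.
    JSON_VALUE(ml_generate_text_result, '$.predictions[0].content') AS generated_text,
    -- Assumed path to the safety attributes.
    JSON_QUERY(ml_generate_text_result, '$.predictions[0].safetyAttributes') AS safety_attributes,
    ml_generate_text_status
  FROM
    ML.GENERATE_TEXT(
      MODEL `mydataset.llm_model`,
      (SELECT 'Summarize the benefits of serverless data warehouses.' AS prompt),
      STRUCT(256 AS max_output_tokens, FALSE AS flatten_json_output));

Setting flatten_json_output to TRUE avoids this parsing step because the function returns ml_generate_text_llm_result and ml_generate_text_rai_result directly.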
Locations
ML.GENERATE_TEXT must run in the same region as the remote model that the function references. You can only create remote models based on text-bison* models in the following locations:
- asia-northeast3
- asia-southeast1
- eu
- europe-west1
- europe-west2
- europe-west3
- europe-west4
- europe-west9
- us
- us-central1
- us-west4
Quotas
See Cloud AI service table-valued functions quotas and limits.
Examples
Example 1
This example shows a request with the following characteristics:
- Provides a single prompt.
- Returns a longer generated text response.
- Returns a less probable generated text response.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.llm_model`,
    (SELECT 'What is the purpose of dreams?' AS prompt),
    STRUCT(
      0.8 AS temperature,
      1024 AS max_output_tokens,
      0.95 AS top_p,
      40 AS top_k));
Example 2
This example shows a request with the following characteristics:
- Provides prompt data from a table column that's named prompt.
- Returns a shorter generated text response.
- Returns a more probable generated text response.
- Flattens the JSON response into separate columns.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.llm_model`,
    TABLE `mydataset.prompt_table`,
    STRUCT(
      0.2 AS temperature,
      75 AS max_output_tokens,
      0.3 AS top_p,
      15 AS top_k,
      TRUE AS flatten_json_output));
Example 3
This example shows a request with the following characteristics:
- Provides prompt data from a table column named question that is aliased as prompt.
- Returns a moderately long generated text response.
- Returns a moderately probable generated text response.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.llm_model`,
    (SELECT question AS prompt FROM `mydataset.prompt_table`),
    STRUCT(
      0.4 AS temperature,
      750 AS max_output_tokens,
      0.5 AS top_p,
      30 AS top_k));
Example 4
This example shows a request with the following characteristics:
- Concatenates strings and a table column to provide the prompt data.
- Returns a longer generated text response.
- Returns a more probable generated text response.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.llm_model`,
    (
      SELECT
        CONCAT(
          'Classify the sentiment of the following text as positive or negative. Text: ',
          input_column,
          ' Sentiment:') AS prompt
      FROM `mydataset.input_table`
    ),
    STRUCT(
      0.1 AS temperature,
      1000 AS max_output_tokens,
      0.1 AS top_p,
      10 AS top_k));
What's next
- Try a tutorial on generating text using a public dataset.
- Get step-by-step instructions on how to generate text using your own data.
- Learn more about AI functions you can use to analyze BigQuery data.