To get started with model grounding in Generative AI on Vertex AI, you need to complete some prerequisites. These include creating a Vertex AI Search data source, enabling Enterprise edition for your data store, and linking your data store to your app in Vertex AI Search. The data source serves as the foundation for grounding text-bison and chat-bison in Vertex AI.
Vertex AI Search helps you get started with high-quality search or recommendations based on data that you provide. To learn more about Vertex AI Search, see the Introduction to Vertex AI Search.
Enable Vertex AI Search
- In the Google Cloud console, go to the Search & Conversation page.
- Read and agree to the terms of service, then click Continue and activate the API.
Create a datastore in Vertex AI Search
To ground your models in your source data, your data must be prepared and saved to Vertex AI Search. To do this, you need to create a data store in Vertex AI Search.
If you're starting from scratch, you need to prepare your data for ingestion into Vertex AI Search. See Prepare data for ingesting to get started. Depending on the size of your data, ingestion can take several minutes to several hours. Only unstructured data stores are supported for grounding. After you've prepared your data for ingestion, you can Create a search data store. After you've successfully created a data store, Create a search app to link to it and Turn Enterprise edition on.
Ground the text-bison model
Grounding is available for the text-bison and chat-bison models. The following examples use the text-bison foundation model.
If you're using the API, you ground text-bison when calling predict. To do this, you add the optional groundingConfig field and reference your data store location and data store ID.
If you don't know your datastore ID, follow these steps:
- In the Google Cloud console, go to the Vertex AI Search page, and in the navigation menu, click Data stores.
- Click the name of your data store.
- On the Data page for your data store, get the data store ID.
REST
To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- PROMPT: A prompt is a natural language request submitted to a language model to receive a response back. Prompts can contain questions, instructions, contextual information, examples, and text for the model to complete or continue. (Don't add quotes around the prompt here.)
- TEMPERATURE: The temperature is used for sampling during response generation, which occurs when topP and topK are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 means that the highest probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible. If the model returns a response that's too generic, too short, or the model gives a fallback response, try increasing the temperature.
- MAX_OUTPUT_TOKENS:
Maximum number of tokens that can be generated in the response. A token is
approximately four characters. 100 tokens correspond to roughly 60-80 words.
Specify a lower value for shorter responses and a higher value for potentially longer responses.
- TOP_P: Top-P changes how the model selects tokens for output. Tokens are selected from the most (see top-K) to least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the top-P value is 0.5, then the model will select either A or B as the next token by using temperature and excludes C as a candidate. Specify a lower value for less random responses and a higher value for more random responses.
- TOP_K: Top-K changes how the model selects tokens for output. A top-K of 1 means the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the three most probable tokens by using temperature. For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P, with the final token selected using temperature sampling; see the sketch after this list. Specify a lower value for less random responses and a higher value for more random responses.
- SOURCE_TYPE: The data source type that the model grounds to. Only Vertex AI Search is supported.
- VERTEX_AI_SEARCH_DATA_STORE: The Vertex AI Search data store ID path.
The VERTEX_AI_SEARCH_DATA_STORE must use the following format. Use the provided values for locations and collections:
projects/{project_id}/locations/global/collections/default_collection/dataStores/{data_store_id}
Note: The project ID in this data store ID path is your Vertex AI Search project ID.
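To make the decoding parameters above concrete, here is a minimal, illustrative Python sketch of the pipeline they describe: top-K filtering, then top-P filtering, then temperature sampling. This is not the service's actual implementation; the sample_token helper and the token probabilities are hypothetical.
import random

def sample_token(probs, top_k, top_p, temperature):
    # Illustrative decoding only; not the actual Vertex AI implementation.
    # 1. Keep the top_k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 2. Keep the most probable tokens until their cumulative
    #    probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # 3. A temperature of 0 degenerates to greedy decoding.
    if temperature == 0:
        return kept[0][0]
    # Otherwise, rescale the surviving probabilities by temperature
    # (p ** (1 / T)) and sample.
    weights = [p ** (1.0 / temperature) for _, p in kept]
    return random.choices([t for t, _ in kept], weights=weights, k=1)[0]

# The top-P example above: with top_k=3 and top_p=0.5, only A and B
# survive the filters, so C is never selected.
print(sample_token({"A": 0.3, "B": 0.2, "C": 0.1}, top_k=3, top_p=0.5, temperature=0.8))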
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict
Request JSON body:
{ "instances": [ { "prompt": "PROMPT"} ], "parameters": { "temperature": TEMPERATURE, "maxOutputTokens": MAX_OUTPUT_TOKENS, "topP": TOP_P, "topK": TOP_K, "groundingConfig": { "sources": [ { "type": "VERTEX_AI_SEARCH", "vertexAiSearchDatastore": "VERTEX_AI_SEARCH_DATA_STORE" } ] } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison:predict" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
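The exact fields depend on the model version; as a rough illustration (not an actual service response), a grounded text-bison prediction has this general shape, with placeholder values:
{
  "predictions": [
    {
      "content": "GENERATED_MODEL_RESPONSE",
      "groundingMetadata": {},
      "citationMetadata": {},
      "safetyAttributes": {}
    }
  ]
}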
Console
To ground a model from Vertex AI Studio, follow these instructions.
- Select the PaLM 2 for Text Bison or PaLM 2 for Chat Bison model card in the Model Garden.
- From the model card, click Open prompt design. The Vertex AI Studio opens.
- From the parameters panel, select Advanced.
- Toggle the Enable Grounding option and select Customize.
- From the grounding source dropdown, select Vertex AI Search.
- Enter the Vertex AI Search data store path to your content. The path should follow this format: projects/{project_id}/locations/global/collections/default_collection/dataStores/{data_store_id}.
- Enter your prompt and click Submit.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
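The following is a minimal sketch using the Vertex AI SDK for Python, assuming an SDK version whose vertexai.language_models module exposes GroundingSource for Vertex AI Search grounding. The project ID, data store ID, prompt, and parameter values are placeholders; adjust them for your environment.
import vertexai
from vertexai.language_models import GroundingSource, TextGenerationModel

# Placeholder project and region; replace with your own values.
vertexai.init(project="PROJECT_ID", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison")

# Ground the model in a Vertex AI Search data store. The "global"
# location matches the data store path format shown earlier.
grounding_source = GroundingSource.VertexAISearch(
    data_store_id="DATA_STORE_ID",
    location="global",
)

response = model.predict(
    "PROMPT",
    grounding_source=grounding_source,
    temperature=0.2,
    max_output_tokens=256,
    top_p=0.8,
    top_k=40,
)
print(response.text)
print(response.grounding_metadata)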
What's next
- Learn how to send chat prompt requests.
- Learn about responsible AI best practices and Vertex AI's safety filters.
- To learn how to ground Gemini models, see Ground responses for Gemini models.