Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google's generative AI models for multiple modalities (text, code, images, speech). You can test and tune these large language models (LLMs), and then deploy them for use in your AI-powered applications. For more information, see the Overview of Generative AI on Vertex AI.
Vertex AI has a variety of generative AI foundation models that are accessible through an API, including the models used in the following examples:
- Gemini Pro is designed to handle natural language tasks, multiturn text and code chat, and code generation.
- Gemini Pro Vision supports multimodal prompts. You can include text, images, and video in your prompt requests and get text or code responses.
- Pathways Language Model 2 (PaLM 2) for text is fine-tuned for language tasks such as classification, summarization, and entity extraction.
Each model is exposed through a publisher endpoint that's specific to your Google Cloud project, so there's no need to deploy the foundation model unless you need to tune it for a specific use case. You can send a prompt to the publisher endpoint. A prompt is a natural language request sent to an LLM to elicit a response.
This tutorial demonstrates workflows that generate responses from
Vertex AI models by sending text prompts to the publisher
endpoints using either a Workflows connector or an HTTP POST
request. For more information, see the
Vertex AI API connector overview
and Make an HTTP request.
Note that you can deploy and run each workflow independently of each other.
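To make the two request styles concrete, the following is a rough sketch of a direct HTTP POST to a Gemini publisher endpoint, outside of Workflows. The model name (gemini-pro), the PROJECT_ID placeholder, and the prompt are illustrative assumptions; the request body follows the generateContent schema described in the Gemini API reference.

```bash
# Sketch only: replace PROJECT_ID, and verify that the gemini-pro model name
# is still available in your project before sending the request.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-pro:generateContent" \
  -d '{
        "contents": [{
          "role": "user",
          "parts": [{"text": "Give me a two-sentence history of Denmark."}]
        }]
      }'
```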
Objectives
In this tutorial, you will do the following:
- Enable the Vertex AI and Workflows APIs, and grant the Vertex AI User (roles/aiplatform.user) role to your service account. This role allows access to most Vertex AI capabilities. For more information about setting up Vertex AI, see Get set up on Google Cloud.
- Deploy and run a workflow that prompts a Vertex AI model (Gemini Pro Vision) to describe an image that is publicly available through Cloud Storage. For more information, see Make data public.
- Deploy and run a workflow that loops through a list of countries in parallel and prompts a Vertex AI model (Gemini Pro) to generate and return the histories of the countries. Using parallel branches allows you to reduce the total execution time by starting the calls to the LLM at the same time and waiting for all of them to complete before combining the results; a minimal sketch of this pattern follows this list. For more information, see Execute workflow steps in parallel.
- Deploy a workflow similar to the preceding one; however, prompt a Vertex AI model (PaLM 2 for text) to generate and return the histories of the countries. For more information about how to choose a model, see Model information.
- Deploy a workflow that can summarize a large document. Because the model's context window limits how much text it can process in a single request, the workflow divides the document into smaller parts and then prompts a Vertex AI model (Gemini Pro) to summarize each part in parallel. For more information, see Summarization prompts and Forecast horizon, context window, and forecast window.
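The parallel pattern referenced above relies on a Workflows parallel step with a shared variable. The following minimal sketch shows only the shape of that pattern; the step names and the placeholder work step are illustrative, and the full workflows later in this tutorial call Vertex AI publisher endpoints at that point instead.

```yaml
# Minimal sketch of a parallel fan-out over a list with a shared result map.
# The "do_work" step is a placeholder for the Vertex AI calls made by the
# workflows deployed later in this tutorial.
main:
    params: [args]
    steps:
    - init:
        assign:
        - results: {}  # shared map that collects one entry per input item
    - fan_out:
        parallel:
            shared: [results]
            for:
                value: item
                in: ${args.items}
                steps:
                - do_work:
                    assign:
                    - results[item]: ${"placeholder result for " + item}
    - return_results:
        return: ${results}
```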
Costs
In this document, you use the following billable components of Google Cloud:
- Vertex AI
- Workflows
To generate a cost estimate based on your projected usage,
use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
Before trying out the examples in this tutorial, ensure that you have completed the following.
Console
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Vertex AI and Workflows APIs.
- Create a service account:
  - In the Google Cloud console, go to the Create service account page.
  - Select your project.
  - In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
  - In the Service account description field, enter a description. For example, Service account for quickstart.
  - Click Create and continue.
  - Grant the Vertex AI > Vertex AI User role to the service account: in the Select a role list, select Vertex AI > Vertex AI User.
  - Click Continue.
  - Click Done to finish creating the service account.
gcloud
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
  gcloud init
- Create or select a Google Cloud project.
  - Create a Google Cloud project:
    gcloud projects create PROJECT_ID
    Replace PROJECT_ID with a name for the Google Cloud project you are creating.
  - Select the Google Cloud project that you created:
    gcloud config set project PROJECT_ID
    Replace PROJECT_ID with your Google Cloud project name.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Vertex AI and Workflows APIs:
  gcloud services enable aiplatform.googleapis.com workflows.googleapis.com
- Set up authentication:
  - Create the service account:
    gcloud iam service-accounts create SERVICE_ACCOUNT_NAME
    Replace SERVICE_ACCOUNT_NAME with a name for the service account.
  - Grant the roles/aiplatform.user IAM role to the service account:
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" \
        --role=roles/aiplatform.user
    Replace the following:
    - SERVICE_ACCOUNT_NAME: the name of the service account
    - PROJECT_ID: the project ID where you created the service account
Deploy a workflow that describes an image (Gemini Pro Vision)
Deploy a workflow that uses a connector method (generateContent) to make a request to a Gemini Pro Vision publisher endpoint. The method provides support for content generation with multimodal inputs.
The workflow provides a text prompt and the URI of an image that is publicly available in a Cloud Storage bucket. You can view the image and, in the Google Cloud console, you can view the object details.
The workflow returns a description of the image from the model's generated response.
For more information about the HTTP request body parameters used when prompting the LLM, and the response body elements, see the Gemini API reference.
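The workflow definition that the following Console and gcloud steps refer to isn't reproduced in this section. As a rough sketch only, it might look similar to the following; the connector arguments, the gemini-1.0-pro-vision model name, and the request body fields are assumptions based on the Gemini API and the Vertex AI API connector overview, so verify them against the connector reference before deploying.

```yaml
# Hedged sketch of describe-image.yaml. The model name, connector arguments,
# and body fields are assumptions; check the Vertex AI API connector
# reference for the exact signature.
main:
    params: [args]
    steps:
    - init:
        assign:
        - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        - location: "us-central1"
        - model: "gemini-1.0-pro-vision"
    - ask_llm:
        call: googleapis.aiplatform.v1.projects.locations.publishers.models.generateContent
        args:
            model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
            region: ${location}
            body:
                contents:
                - role: "user"
                  parts:
                  - fileData:
                      mimeType: "image/jpeg"
                      fileUri: ${args.image_url}
                  - text: "Describe this picture in detail."
                generation_config:
                    temperature: 0.4
                    max_output_tokens: 2048
        result: llm_response
    - return_result:
        return:
            image_description: ${llm_response.candidates[0].content.parts[0].text}
            image_url: ${args.image_url}
```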
Console
In the Google Cloud console, go to the Workflows page.
Click Create.
Enter a name for the new workflow: describe-image.
In the Region list, select us-central1 (Iowa).
For the Service account, select the service account you previously created.
Click Next.
In the workflow editor, enter the following definition for your workflow:
Click Deploy.
gcloud
Create a source code file for your workflow:
touch describe-image.yaml
In a text editor, copy the following workflow to your source code file:
Deploy the workflow by entering the following command:
gcloud workflows deploy describe-image \
    --source=describe-image.yaml \
    --location=us-central1 \
    --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
Execute the workflow
Executing a workflow runs the current workflow definition associated with the workflow.
Console
In the Google Cloud console, go to the Workflows page.
On the Workflows page, select the describe-image workflow to go to its details page.
On the Workflow details page, click Execute.
For the Input, enter the following:
{"image_url":"gs://generativeai-downloads/images/scones.jpg"}
Click Execute again.
View the results of the workflow in the Output pane.
The output should be similar to the following:
{ "image_description": "There are three pink peony flowers on the right side of the picture[]...]There is a white napkin on the table.", "image_url": "gs://generativeai-downloads/images/scones.jpg" }
gcloud
Open a terminal.
Execute the workflow:
gcloud workflows run describe-image \
    --data='{"image_url":"gs://generativeai-downloads/images/scones.jpg"}'
The execution results should be similar to the following:
Waiting for execution [258b530e-a093-46d7-a4ff-cbf5392273c0] to complete...done.
argument: '{"image_url":"gs://generativeai-downloads/images/scones.jpg"}'
createTime: '2024-02-09T13:59:32.166409938Z'
duration: 4.174708484s
endTime: '2024-02-09T13:59:36.341118422Z'
name: projects/1051295516635/locations/us-central1/workflows/describe-image/executions/258b530e-a093-46d7-a4ff-cbf5392273c0
result: "{\"image_description\":\"The picture shows a rustic table with a white surface, on which there are several scones with blueberries, as well as two cups of coffee [...] on the table. The background of the table is a dark blue color.\",\"image_url\":\"gs://generativeai-downloads/images/scones.jpg\"}"
startTime: '2024-02-09T13:59:32.166409938Z'
state: SUCCEEDED
Deploy a workflow that generates country histories (Gemini Pro)
Deploy a workflow that loops through an input list of countries in parallel and uses a connector method (generateContent) to make a request to a Gemini Pro publisher endpoint. The method provides support for content generation with multimodal inputs.
The workflow returns the country histories generated by the model, combining them in a map.
For more information about the HTTP request body parameters used when prompting the LLM, and the response body elements, see the Gemini API reference.
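As with the previous workflow, the definition itself isn't reproduced here. A hedged sketch of the parallel loop and connector call, with an assumed model name, prompt text, and request fields, might look like this:

```yaml
# Hedged sketch of gemini-pro-country-histories.yaml. The connector method
# arguments, the gemini-1.0-pro model name, and the prompt text are
# assumptions; verify against the Vertex AI API connector reference.
main:
    params: [args]
    steps:
    - init:
        assign:
        - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        - location: "us-central1"
        - model: "gemini-1.0-pro"
        - histories: {}
    - loop_over_countries:
        parallel:
            shared: [histories]
            for:
                value: country
                in: ${args.countries}
                steps:
                - ask_llm:
                    call: googleapis.aiplatform.v1.projects.locations.publishers.models.generateContent
                    args:
                        model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                        region: ${location}
                        body:
                            contents:
                            - role: "user"
                              parts:
                              - text: ${"Can you tell me about the history of " + country + "?"}
                    result: llm_response
                - add_to_histories:
                    assign:
                    - histories[country]: ${llm_response.candidates[0].content.parts[0].text}
    - return_result:
        return: ${histories}
```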
Console
In the Google Cloud console, go to the Workflows page.
Click Create.
Enter a name for the new workflow: gemini-pro-country-histories.
In the Region list, select us-central1 (Iowa).
For the Service account, select the service account you previously created.
Click Next.
In the workflow editor, enter the following definition for your workflow:
Click Deploy.
gcloud
Create a source code file for your workflow:
touch gemini-pro-country-histories.yaml
In a text editor, copy the following workflow to your source code file:
Deploy the workflow by entering the following command:
gcloud workflows deploy gemini-pro-country-histories \
    --source=gemini-pro-country-histories.yaml \
    --location=us-central1 \
    --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
Execute the workflow
Executing a workflow runs the current workflow definition associated with the workflow.
Console
In the Google Cloud console, go to the Workflows page.
On the Workflows page, select the gemini-pro-country-histories workflow to go to its details page.
On the Workflow details page, click Execute.
For the Input, enter the following:
{"countries":["Argentina", "Bhutan", "Cyprus", "Denmark", "Ethiopia"]}
Click Execute again.
View the results of the workflow in the Output pane.
The output should be similar to the following:
{ "Argentina": "The history of Argentina is a complex and fascinating one, marked by periods of prosperity and decline, political [...] "Bhutan": "The history of Bhutan is a rich and fascinating one, dating back to the 7th century AD. Here is a brief overview: [...] "Cyprus": "The history of Cyprus is a long and complex one, spanning over 10,000 years. The island has been ruled by a succession [...] "Denmark": "1. **Prehistory and Early History (c. 12,000 BC - 800 AD)**\\n - The earliest evidence of human habitation in Denmark [...] "Ethiopia": "The history of Ethiopia is a long and complex one, stretching back to the earliest human civilizations. The country is [...] }
gcloud
Open a terminal.
Execute the workflow:
gcloud workflows run gemini-pro-country-histories \
    --data='{"countries":["Argentina", "Bhutan", "Cyprus", "Denmark", "Ethiopia"]}' \
    --location=us-central1
The execution results should be similar to the following:
Waiting for execution [7ae1ccf1-29b7-4c2c-99ec-7a12ae289391] to complete...done.
argument: '{"countries":["Argentina","Bhutan","Cyprus","Denmark","Ethiopia"]}'
createTime: '2024-02-09T16:25:16.742349156Z'
duration: 12.075968673s
endTime: '2024-02-09T16:25:28.818317829Z'
name: projects/1051295516635/locations/us-central1/workflows/gemini-pro-country-histories/executions/7ae1ccf1-29b7-4c2c-99ec-7a12ae289391
result: "{\"Argentina\":\"The history of Argentina can be traced back to the arrival [...] 2015: Argentina elects Mauricio Macri as president.\",\"Bhutan\":\"The history [...] natural beauty, ancient monasteries, and friendly people.\",\"Cyprus\":\"The history [...],\"Denmark\":\"The history of Denmark can be traced back to the Stone Age, with [...] a high standard of living.\",\"Ethiopia\":\"The history of Ethiopia is long and [...]"
startTime: '2024-02-09T16:25:16.742349156Z'
state: SUCCEEDED
Deploy a workflow that generates country histories (PaLM 2 for text)
You might not want to use Gemini Pro as your model. The following example uses a workflow similar to the preceding one; however, it uses a connector method (predict) to make a request to a PaLM 2 for text publisher endpoint. The method performs an online prediction.
For more information about the HTTP request body parameters used when prompting the LLM, and the response body elements, see the PaLM 2 for text API reference.
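Again, the workflow definition referenced in the following steps isn't reproduced here. A hedged sketch, assuming the predict connector method, the text-bison model name, and the standard instances/parameters request shape for PaLM 2 for text, might look like this:

```yaml
# Hedged sketch of text-bison-country-histories.yaml. The predict connector
# arguments, the instances/parameters schema, and the text-bison model name
# are assumptions based on the PaLM 2 for text API reference.
main:
    params: [args]
    steps:
    - init:
        assign:
        - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        - location: "us-central1"
        - model: "text-bison"
        - histories: {}
    - loop_over_countries:
        parallel:
            shared: [histories]
            for:
                value: country
                in: ${args.countries}
                steps:
                - ask_llm:
                    call: googleapis.aiplatform.v1.projects.locations.publishers.models.predict
                    args:
                        endpoint: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                        region: ${location}
                        body:
                            instances:
                            - prompt: ${"Can you tell me about the history of " + country + "?"}
                            parameters:
                                temperature: 0.5
                                maxOutputTokens: 1024
                    result: llm_response
                - add_to_histories:
                    assign:
                    # Depending on the model, you might need to strip leading or
                    # trailing whitespace from this response (see the note in
                    # the deployment steps below).
                    - histories[country]: ${llm_response.predictions[0].content}
    - return_result:
        return: ${histories}
```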
Console
In the Google Cloud console, go to the Workflows page.
Click Create.
Enter a name for the new workflow: text-bison-country-histories.
In the Region list, select us-central1 (Iowa).
For the Service account, select the service account you previously created.
Click Next.
In the workflow editor, enter the following definition for your workflow:
Note that depending on the model used, you might need to remove any unnecessary whitespace from the response.
Click Deploy.
gcloud
Create a source code file for your workflow:
touch text-bison-country-histories.yaml
In a text editor, copy the following workflow to your source code file:
Note that depending on the model used, you might need to remove any unnecessary whitespace from the response.
Deploy the workflow by entering the following command:
gcloud workflows deploy text-bison-country-histories \
    --source=text-bison-country-histories.yaml \
    --location=us-central1 \
    --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
Deploy a workflow that summarizes a large document (Gemini Pro)
Deploy a workflow that divides a large document into smaller parts, making http.post requests to a Gemini Pro publisher endpoint in parallel so that the model can summarize each part simultaneously. Finally, the workflow combines all the partial summaries into a complete one.
For more information about the HTTP request body parameters used when prompting the LLM, and the response body elements, see the Gemini API reference.
The workflow definition assumes that you have created a Cloud Storage bucket to which you can upload a text file. For more information about the Workflows connector (googleapis.storage.v1.objects.get) used to retrieve objects from the Cloud Storage bucket, see the Connectors reference.
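The workflow definition referenced below isn't reproduced here either. The following hedged sketch shows one way the pieces could fit together: reading the uploaded object through the Cloud Storage connector, chunking it by character count, summarizing the chunks in parallel with http.post calls, and concatenating the partial summaries. The event payload fields, chunk size, model name, and request body are assumptions, and the tutorial's original definition may chunk and combine differently (for example, by asking the model to merge the partial summaries in a final call).

```yaml
# Hedged sketch of gemini-pro-summaries.yaml. Event fields, chunking strategy,
# model name, and request fields are assumptions; adapt before using.
main:
    params: [event]
    steps:
    - init:
        assign:
        - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        - location: "us-central1"
        - model: "gemini-1.0-pro"
        - summaries: {}
    - get_document:
        call: googleapis.storage.v1.objects.get
        args:
            bucket: ${event.data.bucket}   # assumed Eventarc event payload fields
            object: ${event.data.name}
            alt: "media"
        result: document
    - init_chunking:
        assign:
        - chunk_size: 64000                # characters per chunk (assumption)
        - doc_length: ${len(document)}
        - offset: 0
        - chunks: []
    - check_remaining:
        switch:
        - condition: ${offset + chunk_size < doc_length}
          next: add_full_chunk
        - condition: ${offset < doc_length}
          next: add_last_chunk
        next: summarize_chunks
    - add_full_chunk:
        assign:
        - chunks: ${list.concat(chunks, text.substring(document, offset, offset + chunk_size))}
        - offset: ${offset + chunk_size}
        next: check_remaining
    - add_last_chunk:
        assign:
        - chunks: ${list.concat(chunks, text.substring(document, offset, doc_length))}
        next: summarize_chunks
    - summarize_chunks:
        parallel:
            shared: [summaries]
            for:
                value: i
                range: ${[0, len(chunks) - 1]}
                steps:
                - summarize_chunk:
                    call: http.post
                    args:
                        url: ${"https://" + location + "-aiplatform.googleapis.com/v1/projects/" + project + "/locations/" + location + "/publishers/google/models/" + model + ":generateContent"}
                        auth:
                            type: OAuth2
                        body:
                            contents:
                            - role: "user"
                              parts:
                              - text: ${"Summarize the following text: " + chunks[i]}
                    result: llm_response
                - store_summary:
                    assign:
                    - summaries[string(i)]: ${llm_response.body.candidates[0].content.parts[0].text}
    - combine_summaries:
        assign:
        - combined: ""
    - concat_loop:
        for:
            value: i
            range: ${[0, len(chunks) - 1]}
            steps:
            - append_summary:
                assign:
                - combined: ${combined + " " + summaries[string(i)]}
    - return_result:
        return: ${combined}
```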
After deploying the workflow, you can execute it by creating an appropriate
Eventarc trigger and then by uploading a file to the bucket. For
more information, see
Route Cloud Storage events to Workflows.
Note that additional APIs must be enabled and additional roles must be granted, including granting your service account the Storage Object User (roles/storage.objectUser) role, which supports using Cloud Storage objects. For more information, see the Prepare to create a trigger section.
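For example, the role grant and the Eventarc trigger could be created with commands like the following sketch. The trigger name and BUCKET_NAME are placeholders, and the linked section describes further setup (such as enabling the Eventarc API and granting the roles that let the trigger invoke the workflow).

```bash
# Sketch only: grant the Storage Object User role to the service account
# created earlier so the workflow can read uploaded objects.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" \
    --role=roles/storage.objectUser

# Sketch only: route object-finalized events from BUCKET_NAME (a placeholder)
# to the gemini-pro-summaries workflow.
gcloud eventarc triggers create gemini-pro-summaries-trigger \
    --location=us-central1 \
    --destination-workflow=gemini-pro-summaries \
    --destination-workflow-location=us-central1 \
    --event-filters="type=google.cloud.storage.object.v1.finalized" \
    --event-filters="bucket=BUCKET_NAME" \
    --service-account="SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com"
```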
Console
In the Google Cloud console, go to the Workflows page.
Click Create.
Enter a name for the new workflow: gemini-pro-summaries.
In the Region list, select us-central1 (Iowa).
For the Service account, select the service account you previously created.
Click Next.
In the workflow editor, enter the following definition for your workflow:
Click Deploy.
gcloud
Create a source code file for your workflow:
touch gemini-pro-summaries.yaml
In a text editor, copy the following workflow to your source code file:
Deploy the workflow by entering the following command:
gcloud workflows deploy gemini-pro-summaries \
    --source=gemini-pro-summaries.yaml \
    --location=us-central1 \
    --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
Console
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
gcloud
Delete a Google Cloud project:
gcloud projects delete PROJECT_ID
Delete individual resources
Delete the workflows that you created in this tutorial.
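For example, using the names and region from this tutorial:

```bash
# Delete the workflows created in this tutorial.
gcloud workflows delete describe-image --location=us-central1 --quiet
gcloud workflows delete gemini-pro-country-histories --location=us-central1 --quiet
gcloud workflows delete text-bison-country-histories --location=us-central1 --quiet
gcloud workflows delete gemini-pro-summaries --location=us-central1 --quiet
```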
What's next
- Learn more about Workflows connectors.
- Learn more about the Vertex AI generateContent method.
- Learn more about the Vertex AI predict method.