从工作流访问 Vertex AI 模型


借助 Vertex AI 上的生成式 AI(也称为genAI生成式 AI 或genAI生成式 AI),您可以使用 Google 适用于多种模态(文本、代码、图片、语音)的生成式 AI 模型。您可以测试和调整这些大语言模型 (LLM),然后部署它们,以便在依托 AI 技术的应用中使用。如需了解详情,请参阅 Vertex AI 上的生成式 AI 概览

Vertex AI 提供各种可通过 API 访问的生成式 AI 基础模型,包括以下示例中使用的模型:

  • Gemini Pro 旨在处理自然语言任务、多轮文本和代码聊天以及代码生成。
  • Gemini Pro Vision 支持多模态提示。您可以在提示请求中包含文本、图片和视频,并获取文本或代码回答。
  • 适用于文本的 Pathways Language Model 2 (PaLM 2) 针对分类、摘要和实体提取等语言任务进行了优化。

每个模型都通过特定于您的 Google Cloud 项目的发布者端点公开,因此无需部署基础模型,除非您需要针对特定用例对其进行调整。您可以向发布者端点发送提示。提示是发送到 LLM 以触发响应的自然语言请求。

本教程演示了如下工作流:使用 Workflows 连接器或 HTTP POST 请求向发布商端点发送文本提示,从而通过 Vertex AI 模型生成响应。如需了解详情,请参阅 Vertex AI API 连接器概览发出 HTTP 请求

请注意,您可以单独部署和运行每个工作流。

目标

在本教程中,您将执行以下操作:

  1. 启用 Vertex AI 和 Workflows API,并向您的服务帐号授予 Vertex AI User (roles/aiplatform.user) 角色。此角色允许使用大多数 Vertex AI 功能。如需详细了解如何设置 Vertex AI,请参阅在 Google Cloud 上设置
  2. 部署并运行一个工作流,提示 Vertex AI 模型 (Gemini Pro Vision) 描述通过 Cloud Storage 公开提供的图片。如需了解详情,请参阅公开数据
  3. 部署并运行一个工作流,该工作流会并行遍历一系列国家/地区,并提示 Vertex AI 模型 (Gemini Pro) 生成并返回国家/地区的历史记录。使用并行分支可以同时启动对 LLM 的调用并等待所有调用完成后再合并结果,从而缩短总执行时间。如需了解详情,请参阅并行执行工作流步骤
  4. 部署与上述工作流类似的工作流;不过,您需要提示 Vertex AI 模型(用于文本的 PaLM 2)生成并返回国家/地区的历史记录。如需详细了解如何选择模型,请参阅模型信息
  5. 部署可汇总大型文档的工作流。由于上下文窗口存在限制,该限制决定了模型在训练(以及预测)期间回溯多长时间,因此工作流将文档分成了较小的部分,然后提示 Vertex AI 模型 (Gemini Pro) 并行总结每个部分。如需了解详情,请参阅摘要提示预测范围、上下文窗口和预测窗口

费用

在本文档中,您将使用 Google Cloud 的以下收费组件:

您可使用价格计算器根据您的预计使用情况来估算费用。 Google Cloud 新用户可能有资格申请免费试用

完成本文档中描述的任务后,您可以通过删除所创建的资源来避免继续计费。如需了解详情,请参阅清理

准备工作

在尝试本教程中的示例之前,请确保您已完成以下操作。

控制台

  1. 登录您的 Google Cloud 账号。如果您是 Google Cloud 新手,请创建一个账号来评估我们的产品在实际场景中的表现。新客户还可获享 $300 赠金,用于运行、测试和部署工作负载。
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. 确保您的 Google Cloud 项目已启用结算功能

  4. 启用 Vertex AI and Workflows API。

    启用 API

  5. Create a service account:

    1. In the Google Cloud console, go to the Create service account page.

      Go to Create service account
    2. Select your project.
    3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create and continue.
    5. Grant the Vertex AI > Vertex AI User role to the service account.

      To grant the role, find the Select a role list, then select Vertex AI > Vertex AI User.

    6. Click Continue.
    7. Click Done to finish creating the service account.

  6. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  7. 确保您的 Google Cloud 项目已启用结算功能

  8. 启用 Vertex AI and Workflows API。

    启用 API

  9. Create a service account:

    1. In the Google Cloud console, go to the Create service account page.

      Go to Create service account
    2. Select your project.
    3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create and continue.
    5. Grant the Vertex AI > Vertex AI User role to the service account.

      To grant the role, find the Select a role list, then select Vertex AI > Vertex AI User.

    6. Click Continue.
    7. Click Done to finish creating the service account.

gcloud

  1. 登录您的 Google Cloud 账号。如果您是 Google Cloud 新手,请创建一个账号来评估我们的产品在实际场景中的表现。新客户还可获享 $300 赠金,用于运行、测试和部署工作负载。
  2. 安装 Google Cloud CLI。
  3. 如需初始化 gcloud CLI,请运行以下命令:

    gcloud init
  4. Create or select a Google Cloud project.

    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  5. 确保您的 Google Cloud 项目已启用结算功能

  6. Enable the Vertex AI and Workflows APIs:

    gcloud services enable aiplatform.googleapis.com workflows.googleapis.com
  7. Set up authentication:

    1. Create the service account:

      gcloud iam service-accounts create SERVICE_ACCOUNT_NAME

      Replace SERVICE_ACCOUNT_NAME with a name for the service account.

    2. Grant the roles/aiplatform.user IAM role to the service account:

      gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" --role=roles/aiplatform.user

      Replace the following:

      • SERVICE_ACCOUNT_NAME: the name of the service account
      • PROJECT_ID: the project ID where you created the service account
  8. 安装 Google Cloud CLI。
  9. 如需初始化 gcloud CLI,请运行以下命令:

    gcloud init
  10. Create or select a Google Cloud project.

    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  11. 确保您的 Google Cloud 项目已启用结算功能

  12. Enable the Vertex AI and Workflows APIs:

    gcloud services enable aiplatform.googleapis.com workflows.googleapis.com
  13. Set up authentication:

    1. Create the service account:

      gcloud iam service-accounts create SERVICE_ACCOUNT_NAME

      Replace SERVICE_ACCOUNT_NAME with a name for the service account.

    2. Grant the roles/aiplatform.user IAM role to the service account:

      gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" --role=roles/aiplatform.user

      Replace the following:

      • SERVICE_ACCOUNT_NAME: the name of the service account
      • PROJECT_ID: the project ID where you created the service account

部署用于描述图片的工作流 (Gemini Pro Vision)

部署一个使用连接器方法 (generateContent) 向 Gemini Pro Vision 发布商端点发出请求的工作流。该方法可支持使用多模态输入生成内容。

工作流提供文本提示和 Cloud Storage 存储桶中公开提供的图片的 URI。您可以查看映像,也可以在 Google Cloud 控制台中查看对象详细信息

工作流会根据模型生成的回答返回图片说明。

如需详细了解提示 LLM 时使用的 HTTP 请求正文参数以及响应正文元素,请参阅 Gemini API 参考文档

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. 点击 创建

  3. 输入新工作流的名称:describe-image

  4. 区域列表中,选择 us-central1(爱荷华)

  5. 对于服务帐号,选择您之前创建的服务帐号。

  6. 点击下一步

  7. 在工作流编辑器中,为工作流输入以下定义:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "gemini-1.0-pro-vision"
                - text_combined: ""
        - ask_llm:
            call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
            args:
                model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                region: ${location}
                body:
                    contents:
                        role: user
                        parts:
                        - fileData:
                            mimeType: image/jpeg
                            fileUri: ${args.image_url}
                        - text: Describe this picture in detail
                    generation_config:
                        temperature: 0.4
                        max_output_tokens: 2048
                        top_p: 1
                        top_k: 32
            result: llm_response
        - return_result:
            return:
                image_url: ${args.image_url}
                image_description: ${llm_response.candidates[0].content.parts[0].text}
  8. 点击部署

gcloud

  1. 为工作流创建源代码文件:

    touch describe-image.yaml
    
  2. 在文本编辑器中,将以下工作流复制到源代码文件中:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "gemini-1.0-pro-vision"
                - text_combined: ""
        - ask_llm:
            call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
            args:
                model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                region: ${location}
                body:
                    contents:
                        role: user
                        parts:
                        - fileData:
                            mimeType: image/jpeg
                            fileUri: ${args.image_url}
                        - text: Describe this picture in detail
                    generation_config:
                        temperature: 0.4
                        max_output_tokens: 2048
                        top_p: 1
                        top_k: 32
            result: llm_response
        - return_result:
            return:
                image_url: ${args.image_url}
                image_description: ${llm_response.candidates[0].content.parts[0].text}
  3. 输入以下命令以部署工作流:

    gcloud workflows deploy describe-image \
        --source=describe-image.yaml \
        --location=us-central1 \
        --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com

执行工作流

执行某个工作流会运行与该工作流关联的当前工作流定义。

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. Workflows 页面上,选择 describe-image 工作流以转到其详情页面。

  3. 工作流详情页面上,点击 执行

  4. 对于 Input,输入以下内容:

    {"image_url":"gs://generativeai-downloads/images/scones.jpg"}
  5. 再次点击执行

  6. 输出窗格中查看工作流的结果。

    输出应类似如下所示:

    {
      "image_description": "There are three pink peony flowers on the right side of the picture[]...]There is a white napkin on the table.",
      "image_url": "gs://generativeai-downloads/images/scones.jpg"
    }

gcloud

  1. 打开终端。

  2. 执行工作流:

    gcloud workflows run describe-image \
        --data='{"image_url":"gs://generativeai-downloads/images/scones.jpg"}'

    执行结果应类似于以下内容:

      Waiting for execution [258b530e-a093-46d7-a4ff-cbf5392273c0] to complete...done.
      argument: '{"image_url":"gs://generativeai-downloads/images/scones.jpg"}'
      createTime: '2024-02-09T13:59:32.166409938Z'
      duration: 4.174708484s
      endTime: '2024-02-09T13:59:36.341118422Z'
      name: projects/1051295516635/locations/us-central1/workflows/describe-image/executions/258b530e-a093-46d7-a4ff-cbf5392273c0
      result: "{\"image_description\":\"The picture shows a rustic table with a white surface,\
        \ on which there are several scones with blueberries, as well as two cups of coffee\
        [...]
        \ on the table. The background of the table is a dark blue color.\",\"image_url\"\
        :\"gs://generativeai-downloads/images/scones.jpg\"}"
      startTime: '2024-02-09T13:59:32.166409938Z'
      state: SUCCEEDED

部署可生成国家/地区历史记录的工作流 (Gemini Pro)

部署一个工作流,该工作流会parallel遍历国家/地区的输入列表,并使用连接器方法 (generateContent) 向 Gemini Pro 发布商端点发出请求。该方法可支持使用多模态输入生成内容。

该工作流会返回此模型生成的国家/地区历史记录,并将其合并到一个地图中。

如需详细了解提示 LLM 时使用的 HTTP 请求正文参数以及响应正文元素,请参阅 Gemini API 参考文档

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. 点击 创建

  3. 输入新工作流的名称:gemini-pro-country-histories

  4. 区域列表中,选择 us-central1(爱荷华)

  5. 对于服务帐号,选择您之前创建的服务帐号。

  6. 点击下一步

  7. 在工作流编辑器中,为工作流输入以下定义:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "gemini-1.0-pro"
                - histories: {}
        - loop_over_countries:
            parallel:
                shared: [histories]
                for:
                    value: country
                    in: ${args.countries}
                    steps:
                        - ask_llm:
                            call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
                            args:
                                model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                                region: ${location}
                                body:
                                    contents:
                                        role: "USER"
                                        parts:
                                            text: ${"Can you tell me about the history of " + country}
                                    generation_config:
                                        temperature: 0.5
                                        max_output_tokens: 2048
                                        top_p: 0.8
                                        top_k: 40
                            result: llm_response
                        - add_to_histories:
                            assign:
                                - histories[country]: ${llm_response.candidates[0].content.parts[0].text}
        - return_result:
            return: ${histories}
  8. 点击部署

gcloud

  1. 为工作流创建源代码文件:

    touch gemini-pro-country-histories.yaml
    
  2. 在文本编辑器中,将以下工作流复制到源代码文件中:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "gemini-1.0-pro"
                - histories: {}
        - loop_over_countries:
            parallel:
                shared: [histories]
                for:
                    value: country
                    in: ${args.countries}
                    steps:
                        - ask_llm:
                            call: googleapis.aiplatform.v1.projects.locations.endpoints.generateContent
                            args:
                                model: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model}
                                region: ${location}
                                body:
                                    contents:
                                        role: "USER"
                                        parts:
                                            text: ${"Can you tell me about the history of " + country}
                                    generation_config:
                                        temperature: 0.5
                                        max_output_tokens: 2048
                                        top_p: 0.8
                                        top_k: 40
                            result: llm_response
                        - add_to_histories:
                            assign:
                                - histories[country]: ${llm_response.candidates[0].content.parts[0].text}
        - return_result:
            return: ${histories}
  3. 输入以下命令以部署工作流:

    gcloud workflows deploy gemini-pro-country-histories \
        --source=gemini-pro-country-histories.yaml \
        --location=us-central1 \
        --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com

执行工作流

执行某个工作流会运行与该工作流关联的当前工作流定义。

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. Workflows 页面上,选择 gemini-pro-country-hi 试试 工作流,进入其详情页面。

  3. 工作流详情页面上,点击 执行

  4. 对于 Input,输入以下内容:

    {"countries":["Argentina", "Bhutan", "Cyprus", "Denmark", "Ethiopia"]}
  5. 再次点击执行

  6. 输出窗格中查看工作流的结果。

    输出应类似如下所示:

    {
      "Argentina": "The history of Argentina is a complex and fascinating one, marked by periods of prosperity and decline, political [...]
      "Bhutan": "The history of Bhutan is a rich and fascinating one, dating back to the 7th century AD. Here is a brief overview: [...]
      "Cyprus": "The history of Cyprus is a long and complex one, spanning over 10,000 years. The island has been ruled by a succession [...]
      "Denmark": "1. **Prehistory and Early History (c. 12,000 BC - 800 AD)**\\n   - The earliest evidence of human habitation in Denmark [...]
      "Ethiopia": "The history of Ethiopia is a long and complex one, stretching back to the earliest human civilizations. The country is [...]
    }

gcloud

  1. 打开终端。

  2. 执行工作流:

    gcloud workflows run gemini-pro-country-histories \
        --data='{"countries":["Argentina", "Bhutan", "Cyprus", "Denmark", "Ethiopia"]}' \
        --location=us-central1

    执行结果应类似于以下内容:

      Waiting for execution [7ae1ccf1-29b7-4c2c-99ec-7a12ae289391] to complete...done.
      argument: '{"countries":["Argentina","Bhutan","Cyprus","Denmark","Ethiopia"]}'
      createTime: '2024-02-09T16:25:16.742349156Z'
      duration: 12.075968673s
      endTime: '2024-02-09T16:25:28.818317829Z'
      name: projects/1051295516635/locations/us-central1/workflows/gemini-pro-country-histories/executions/7ae1ccf1-29b7-4c2c-99ec-7a12ae289391
      result: "{\"Argentina\":\"The history of Argentina can be traced back to the arrival\
        [...]
        n* 2015: Argentina elects Mauricio Macri as president.\",\"Bhutan\":\"The history\
        [...]
        \ natural beauty, ancient monasteries, and friendly people.\",\"Cyprus\":\"The history\
        [...]
        ,\"Denmark\":\"The history of Denmark can be traced back to the Stone Age, with\
        [...]
        \ a high standard of living.\",\"Ethiopia\":\"The history of Ethiopia is long and\
        [...]
      startTime: '2024-02-09T16:25:16.742349156Z'
      state: SUCCEEDED

部署可生成国家/地区历史记录的工作流(用于文本的 PaLM 2)

您可能不想使用 Gemini Pro 作为模型。以下示例使用与上例类似的工作流;不过,它使用连接器方法 (predict) 向 PaLM 2 发出文本发布器端点的请求。该方法会执行在线预测。

如需详细了解提示 LLM 时使用的 HTTP 请求正文参数以及响应正文元素,请参阅 PaLM 2 for Text API 参考文档

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. 点击 创建

  3. 输入新工作流的名称:text-bison-country-histories

  4. 区域列表中,选择 us-central1(爱荷华)

  5. 对于服务帐号,选择您之前创建的服务帐号。

  6. 点击下一步

  7. 在工作流编辑器中,为工作流输入以下定义:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "text-bison"
                - histories: {}
        - loop_over_countries:
            parallel:
                shared: [histories]
                for:
                    value: country
                    in: ${args.countries}
                    steps:
                        - ask_llm:
                            call: googleapis.aiplatform.v1.projects.locations.endpoints.predict
                            args:
                                endpoint: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model }
                                region: ${location}
                                body:
                                    instances:
                                        - prompt: '${"Can you  tell me about the history of " + country}'
                                    parameters:
                                        temperature: 0.5
                                        maxOutputTokens: 2048
                                        topP: 0.8
                                        topK: 40
                            result: llm_response
                        - add_to_histories:
                            assign:
                                - history: ${llm_response.predictions[0].content}
                                # Remove leading whitespace from start of text
                                - history: ${text.substring(history, 1, len(history))}
                                - histories[country]: ${history}
        - return_result:
            return: ${histories}

    请注意,根据所使用的模型,您可能需要从响应中移除任何不必要的空格。

  8. 点击部署

gcloud

  1. 为工作流创建源代码文件:

    touch text-bison-country-histories.yaml
    
  2. 在文本编辑器中,将以下工作流复制到源代码文件中:

    main:
        params: [args]
        steps:
        - init:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                - location: "us-central1"
                - model: "text-bison"
                - histories: {}
        - loop_over_countries:
            parallel:
                shared: [histories]
                for:
                    value: country
                    in: ${args.countries}
                    steps:
                        - ask_llm:
                            call: googleapis.aiplatform.v1.projects.locations.endpoints.predict
                            args:
                                endpoint: ${"projects/" + project + "/locations/" + location + "/publishers/google/models/" + model }
                                region: ${location}
                                body:
                                    instances:
                                        - prompt: '${"Can you  tell me about the history of " + country}'
                                    parameters:
                                        temperature: 0.5
                                        maxOutputTokens: 2048
                                        topP: 0.8
                                        topK: 40
                            result: llm_response
                        - add_to_histories:
                            assign:
                                - history: ${llm_response.predictions[0].content}
                                # Remove leading whitespace from start of text
                                - history: ${text.substring(history, 1, len(history))}
                                - histories[country]: ${history}
        - return_result:
            return: ${histories}

    请注意,根据所使用的模型,您可能需要从响应中移除任何不必要的空格。

  3. 输入以下命令以部署工作流:

    gcloud workflows deploy text-bison-country-histories \
        --source=text-bison-country-histories.yaml \
        --location=us-central1 \
        --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com

部署汇总大型文档的工作流 (Gemini Pro)

部署一个工作流,将大型文档分成更小的部分,parallel向 Gemini Pro 发布商端点发出 http.post 请求,以便模型可以同时汇总每个部分。工作流最终会将所有部分摘要合并成一个完整的摘要。

如需详细了解提示 LLM 时使用的 HTTP 请求正文参数以及响应正文元素,请参阅 Gemini API 参考文档

工作流定义假设您已经创建了一个 Cloud Storage 存储桶,可以向其中上传文本文件。如需详细了解用于从 Cloud Storage 存储桶中检索对象的 Workflows 连接器 (googleapis.storage.v1.objects.get),请参阅连接器参考文档

部署工作流后,您可以通过以下方式执行工作流:创建适当的 Eventarc 触发器,然后将文件上传到存储桶。如需了解详情,请参阅将 Cloud Storage 事件路由到 Workflows。请注意,必须启用其他 API,并且必须授予其他角色,包括向您的服务帐号授予支持使用 Cloud Storage 对象的 Storage Object User (roles/storage.objectUser) 角色。如需了解详情,请参阅准备创建触发器部分。

控制台

  1. 在 Google Cloud 控制台中,前往工作流页面。

    进入 Workflows

  2. 点击 创建

  3. 输入新工作流的名称:gemini-pro-summaries

  4. 区域列表中,选择 us-central1(爱荷华)

  5. 对于服务帐号,选择您之前创建的服务帐号。

  6. 点击下一步

  7. 在工作流编辑器中,为工作流输入以下定义:

    main:
        params: [input]
        steps:
        - assign_file_vars:
            assign:
                - file_size: ${int(input.data.size)}
                - chunk_size: 64000
                - n_chunks: ${int(file_size / chunk_size)}
                - summaries: []
                - all_summaries_concatenated: ""
        - loop_over_chunks:
            parallel:
                shared: [summaries]
                for:
                    value: chunk_idx
                    range: ${[0, n_chunks]}
                    steps:
                        - assign_bounds:
                            assign:
                                - lower_bound: ${chunk_idx * chunk_size}
                                - upper_bound: ${(chunk_idx + 1) * chunk_size}
                                - summaries: ${list.concat(summaries, "")}
                        - dump_file_content:
                            call: http.get
                            args:
                                url: ${"https://storage.googleapis.com/storage/v1/b/" + input.data.bucket + "/o/" + input.data.name + "?alt=media"}
                                auth:
                                    type: OAuth2
                                headers:
                                    Range: ${"bytes=" + lower_bound + "-" + upper_bound}
                            result: file_content
                        - assign_chunk:
                            assign:
                                - chunk: ${file_content.body}
                        - generate_chunk_summary:
                            call: ask_gemini_for_summary
                            args:
                                textToSummarize: ${chunk}
                            result: summary
                        - assign_summary:
                            assign:
                                - summaries[chunk_idx]: ${summary}
        - concat_summaries:
            for:
                value: summary
                in: ${summaries}
                steps:
                    - append_summaries:
                        assign:
                            - all_summaries_concatenated: ${all_summaries_concatenated + "\n" + summary}
        - reduce_summary:
            call: ask_gemini_for_summary
            args:
                textToSummarize: ${all_summaries_concatenated}
            result: final_summary
        - return_result:
            return:
                - summaries: ${summaries}
                - final_summary: ${final_summary}
    
    ask_gemini_for_summary:
        params: [textToSummarize]
        steps:
            - init:
                assign:
                    - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    - location: "us-central1"
                    - model: "gemini-pro"
                    - summary: ""
            - call_gemini:
                call: http.post
                args:
                    url: ${"https://" + location + "-aiplatform.googleapis.com" + "/v1/projects/" + project + "/locations/" + location + "/publishers/google/models/" + model + ":generateContent"}
                    auth:
                        type: OAuth2
                    body:
                        contents:
                            role: user
                            parts:
                                - text: '${"Make a summary of the following text:\n\n" + textToSummarize}'
                        generation_config:
                            temperature: 0.2
                            maxOutputTokens: 2000
                            topK: 10
                            topP: 0.9
                result: gemini_response
            # Sometimes, there's no text, for example, due to safety settings
            - check_text_exists:
                switch:
                - condition: ${not("parts" in gemini_response.body.candidates[0].content)}
                  next: return_summary
            - extract_text:
                assign:
                    - summary: ${gemini_response.body.candidates[0].content.parts[0].text}
            - return_summary:
                return: ${summary}
  8. 点击部署

gcloud

  1. 为工作流创建源代码文件:

    touch gemini-pro-summaries.yaml
    
  2. 在文本编辑器中,将以下工作流复制到源代码文件中:

    main:
        params: [input]
        steps:
        - assign_file_vars:
            assign:
                - file_size: ${int(input.data.size)}
                - chunk_size: 64000
                - n_chunks: ${int(file_size / chunk_size)}
                - summaries: []
                - all_summaries_concatenated: ""
        - loop_over_chunks:
            parallel:
                shared: [summaries]
                for:
                    value: chunk_idx
                    range: ${[0, n_chunks]}
                    steps:
                        - assign_bounds:
                            assign:
                                - lower_bound: ${chunk_idx * chunk_size}
                                - upper_bound: ${(chunk_idx + 1) * chunk_size}
                                - summaries: ${list.concat(summaries, "")}
                        - dump_file_content:
                            call: http.get
                            args:
                                url: ${"https://storage.googleapis.com/storage/v1/b/" + input.data.bucket + "/o/" + input.data.name + "?alt=media"}
                                auth:
                                    type: OAuth2
                                headers:
                                    Range: ${"bytes=" + lower_bound + "-" + upper_bound}
                            result: file_content
                        - assign_chunk:
                            assign:
                                - chunk: ${file_content.body}
                        - generate_chunk_summary:
                            call: ask_gemini_for_summary
                            args:
                                textToSummarize: ${chunk}
                            result: summary
                        - assign_summary:
                            assign:
                                - summaries[chunk_idx]: ${summary}
        - concat_summaries:
            for:
                value: summary
                in: ${summaries}
                steps:
                    - append_summaries:
                        assign:
                            - all_summaries_concatenated: ${all_summaries_concatenated + "\n" + summary}
        - reduce_summary:
            call: ask_gemini_for_summary
            args:
                textToSummarize: ${all_summaries_concatenated}
            result: final_summary
        - return_result:
            return:
                - summaries: ${summaries}
                - final_summary: ${final_summary}
    
    ask_gemini_for_summary:
        params: [textToSummarize]
        steps:
            - init:
                assign:
                    - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    - location: "us-central1"
                    - model: "gemini-pro"
                    - summary: ""
            - call_gemini:
                call: http.post
                args:
                    url: ${"https://" + location + "-aiplatform.googleapis.com" + "/v1/projects/" + project + "/locations/" + location + "/publishers/google/models/" + model + ":generateContent"}
                    auth:
                        type: OAuth2
                    body:
                        contents:
                            role: user
                            parts:
                                - text: '${"Make a summary of the following text:\n\n" + textToSummarize}'
                        generation_config:
                            temperature: 0.2
                            maxOutputTokens: 2000
                            topK: 10
                            topP: 0.9
                result: gemini_response
            # Sometimes, there's no text, for example, due to safety settings
            - check_text_exists:
                switch:
                - condition: ${not("parts" in gemini_response.body.candidates[0].content)}
                  next: return_summary
            - extract_text:
                assign:
                    - summary: ${gemini_response.body.candidates[0].content.parts[0].text}
            - return_summary:
                return: ${summary}
  3. 输入以下命令以部署工作流:

    gcloud workflows deploy gemini-pro-summaries \
        --source=gemini-pro-summaries.yaml \
        --location=us-central1 \
        --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com

清理

为避免因本教程中使用的资源导致您的 Google Cloud 账号产生费用,请删除包含这些资源的项目,或者保留项目但删除各个资源。

删除项目

控制台

  1. 在 Google Cloud 控制台中,进入管理资源页面。

    转到“管理资源”

  2. 在项目列表中,选择要删除的项目,然后点击删除
  3. 在对话框中输入项目 ID,然后点击关闭以删除项目。

gcloud

删除 Google Cloud 项目:

gcloud projects delete PROJECT_ID

删除各个资源

删除您在本教程中创建的工作流

后续步骤