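Update a schema

You can update the schema for any data that supports a schema, such as structured data, website data with structured data, or other unstructured data with metadata.

You can update the schema in the Google Cloud console or by using the schemas.patch API method. Schemas for websites can be updated only through the REST API.

To update the schema, you can add new fields, change the indexable, searchable, and retrievable annotations for a field, or mark a field as a key property, such as title, uri, and description.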
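As a rough illustration of the three kinds of change listed above, the following Python sketch edits a schema held as a plain dictionary. The keyPropertyMapping key appears in the JSON examples in this document; the annotation keys indexable and retrievable are assumed here to mirror the annotation names and should be checked against Configure field settings. The field name rating and the helper name build_schema are hypothetical.

```python
import json


def build_schema():
    """Return a JSON-schema-style dict that shows the three kinds of update."""
    schema = {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
            "title": {"type": "string"},
        },
    }
    props = schema["properties"]

    # 1. Add a new field (hypothetical field name).
    props["rating"] = {"type": "number"}

    # 2. Change field annotations. ASSUMPTION: these key names mirror the
    #    annotation names; verify them before relying on this.
    props["rating"].update({"indexable": True, "retrievable": True})

    # 3. Mark an existing field as a key property.
    props["title"]["keyPropertyMapping"] = "title"
    return schema


if __name__ == "__main__":
    print(json.dumps(build_schema(), indent=2))
```

The sketch only builds the schema object locally; it doesn't send anything to the service.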

Update your schema

You can update your schema in the Google Cloud console or by using the API.

Console

To update a schema in the Google Cloud console, follow these steps:

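  1. Review the Requirements and limitations section to check that your schema update is valid.

  2. If you're updating field annotations (setting fields as indexable, retrievable, dynamic facetable, searchable, or completable), review Configure field settings for the limitations and requirements of each annotation type.

  3. Check that you've completed data ingestion. Otherwise, the schema might not be available for editing yet.

  4. In the Google Cloud console, go to the Agent Builder page.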

    Agent Builder

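  5. In the navigation menu, click Data Stores.

  6. In the Name column, click the data store with the schema that you want to update.

  7. Click the Schema tab to view the schema for your data.

    This tab might be empty if this is the first time you're editing the fields.

  8. Click the Edit button.

  9. Update your schema: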

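    • Map key properties: In the Key properties column of the schema, select the key property to map a field to. For example, if a field called details always contains the description of a document, map that field to the key property Description.

    • Update number of dimensions (Advanced): You can update this setting if you're using custom vector embeddings with Vertex AI Search. See Advanced: Use custom embeddings.

    • Update field annotations: To update the annotations for a field, select or deselect the annotation settings for that field. The available annotations are Retrievable, Indexable, Dynamic Facetable, Searchable, and Completable. Some field settings have limitations. For descriptions and requirements for each annotation type, see Configure field settings.

    • Add a new field: Adding new fields to your schema before you import new documents that contain those fields can reduce the time that Vertex AI Agent Builder needs to reindex your data after the import.

      1. Click Add new fields to expand that section.

      2. Click Add node, and then specify the settings for the new field.

        To indicate an array, set Array to Yes. For example, to add an array of strings, set type to string and Array to Yes.

        For a website data store, all fields that you add are indexed by default.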

  10. Click Save to apply your schema changes.

    Changing the schema triggers reindexing. For large data stores, reindexing can take hours.
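For readers who also work with the JSON form of the schema, the console's Array setting appears to correspond to the array shape used in the JSON examples in this document (a field of type array with an items object). A minimal Python sketch of that shape follows; the helper name array_field is hypothetical.

```python
def array_field(item_type):
    """Build a JSON-schema field definition for an array of `item_type`."""
    return {"type": "array", "items": {"type": item_type}}


if __name__ == "__main__":
    # Equivalent of setting type to string and Array to Yes in the console.
    print(array_field("string"))
```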

REST

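To use the API to update your schema, follow these steps:

  1. Review the Requirements and limitations and Limitation examples (REST only) sections to check that your schema changes are valid.

    To update the schema for a data store with website data or unstructured data, skip to step 5 to call the schemas.patch method.

  2. If you're updating field annotations (setting fields as indexable, retrievable, dynamic facetable, or searchable), review Configure field settings for the limitations and requirements of each annotation type.

  3. If you're editing an auto-detected schema, make sure that you've completed data ingestion. Otherwise, the schema might not be available for editing yet.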

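  4. Find your data store ID. If you already have your data store ID, skip to the next step.

    1. In the Google Cloud console, go to the Agent Builder page, and then click Data Stores in the navigation menu.

      Go to the Data Stores page

    2. Click the name of your data store.

    3. On the Data page for your data store, get the data store ID.

  5. Use the schemas.patch API method to provide your new JSON schema as a JSON object.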

    curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://discoveryengine.googleapis.com/v1beta/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/schemas/default_schema" \
    -d '{
      "structSchema": JSON_SCHEMA_OBJECT
    }'
    

    Replace the following:

    • PROJECT_ID: the ID of your Google Cloud project.
    • DATA_STORE_ID: the ID of the Vertex AI Search data store.
    • JSON_SCHEMA_OBJECT: your new JSON schema as a JSON object. For example:

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          "title": {
            "type": "string",
            "keyPropertyMapping": "title"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          },
          "uri": {
            "type": "string",
            "keyPropertyMapping": "uri"
          }
        }
      }
  6. Optional: Review the schema by following the steps in View a schema definition.
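The curl call in step 5 can also be sketched with the Python standard library. The example below only builds the PATCH request, using the same URL, headers, and structSchema body as the curl command; it doesn't send it. The helper name build_patch_request and the placeholder values are hypothetical, and obtaining the access token (for example with gcloud auth print-access-token) is left to the caller.

```python
import json
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1beta"


def build_patch_request(project_id, data_store_id, json_schema, access_token):
    """Build (but do not send) the schemas.patch request from the curl example."""
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/global"
        f"/collections/default_collection/dataStores/{data_store_id}"
        "/schemas/default_schema"
    )
    # The new JSON schema is passed as an object under "structSchema".
    body = json.dumps({"structSchema": json_schema}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    schema = {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
            "title": {"type": "string", "keyPropertyMapping": "title"},
        },
    }
    # Placeholder values; replace them before sending the request.
    req = build_patch_request("PROJECT_ID", "DATA_STORE_ID", schema, "ACCESS_TOKEN")
    print(req.get_method(), req.full_url)
```

To actually send the request, pass the returned object to urllib.request.urlopen with a real access token.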

C#

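For more information, see the Vertex AI Agent Builder C# API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.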

using Google.Cloud.DiscoveryEngine.V1;
using Google.LongRunning;

public sealed partial class GeneratedSchemaServiceClientSnippets
{
    /// <summary>Snippet for UpdateSchema</summary>
    /// <remarks>
    /// This snippet has been automatically generated and should be regarded as a code template only.
    /// It will require modifications to work:
    /// - It may require correct/in-range values for request initialization.
    /// - It may require specifying regional endpoints when creating the service client as shown in
    ///   https://cloud.google.com/dotnet/docs/reference/help/client-configuration#endpoint.
    /// </remarks>
    public void UpdateSchemaRequestObject()
    {
        // Create client
        SchemaServiceClient schemaServiceClient = SchemaServiceClient.Create();
        // Initialize request argument(s)
        UpdateSchemaRequest request = new UpdateSchemaRequest
        {
            Schema = new Schema(),
            AllowMissing = false,
        };
        // Make the request
        Operation<Schema, UpdateSchemaMetadata> response = schemaServiceClient.UpdateSchema(request);

        // Poll until the returned long-running operation is complete
        Operation<Schema, UpdateSchemaMetadata> completedResponse = response.PollUntilCompleted();
        // Retrieve the operation result
        Schema result = completedResponse.Result;

        // Or get the name of the operation
        string operationName = response.Name;
        // This name can be stored, then the long-running operation retrieved later by name
        Operation<Schema, UpdateSchemaMetadata> retrievedResponse = schemaServiceClient.PollOnceUpdateSchema(operationName);
        // Check if the retrieved long-running operation has completed
        if (retrievedResponse.IsCompleted)
        {
            // If it has completed, then access the result
            Schema retrievedResult = retrievedResponse.Result;
        }
    }
}

Go

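For more information, see the Vertex AI Agent Builder Go API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.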


package main

import (
	"context"

	discoveryengine "cloud.google.com/go/discoveryengine/apiv1"
	discoveryenginepb "cloud.google.com/go/discoveryengine/apiv1/discoveryenginepb"
)

func main() {
	ctx := context.Background()
	// This snippet has been automatically generated and should be regarded as a code template only.
	// It will require modifications to work:
	// - It may require correct/in-range values for request initialization.
	// - It may require specifying regional endpoints when creating the service client as shown in:
	//   https://pkg.go.dev/cloud.google.com/go#hdr-Client_Options
	c, err := discoveryengine.NewSchemaClient(ctx)
	if err != nil {
		// TODO: Handle error.
	}
	defer c.Close()

	req := &discoveryenginepb.UpdateSchemaRequest{
		// TODO: Fill request struct fields.
		// See https://pkg.go.dev/cloud.google.com/go/discoveryengine/apiv1/discoveryenginepb#UpdateSchemaRequest.
	}
	op, err := c.UpdateSchema(ctx, req)
	if err != nil {
		// TODO: Handle error.
	}

	resp, err := op.Wait(ctx)
	if err != nil {
		// TODO: Handle error.
	}
	// TODO: Use resp.
	_ = resp
}

Java

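For more information, see the Vertex AI Agent Builder Java API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.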

import com.google.cloud.discoveryengine.v1.Schema;
import com.google.cloud.discoveryengine.v1.SchemaServiceClient;
import com.google.cloud.discoveryengine.v1.UpdateSchemaRequest;

public class SyncUpdateSchema {

  public static void main(String[] args) throws Exception {
    syncUpdateSchema();
  }

  public static void syncUpdateSchema() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (SchemaServiceClient schemaServiceClient = SchemaServiceClient.create()) {
      UpdateSchemaRequest request =
          UpdateSchemaRequest.newBuilder()
              .setSchema(Schema.newBuilder().build())
              .setAllowMissing(true)
              .build();
      Schema response = schemaServiceClient.updateSchemaAsync(request).get();
    }
  }
}

Python

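For more information, see the Vertex AI Agent Builder Python API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.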

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import discoveryengine_v1


def sample_update_schema():
    # Create a client
    client = discoveryengine_v1.SchemaServiceClient()

    # Initialize request argument(s)
    request = discoveryengine_v1.UpdateSchemaRequest(
    )

    # Make the request
    operation = client.update_schema(request=request)

    print("Waiting for operation to complete...")

    response = operation.result()

    # Handle the response
    print(response)

Ruby

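For more information, see the Vertex AI Agent Builder Ruby API reference documentation.

To authenticate to Vertex AI Agent Builder, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.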

require "google/cloud/discovery_engine/v1"

##
# Snippet for the update_schema call in the SchemaService service
#
# This snippet has been automatically generated and should be regarded as a code
# template only. It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
# client as shown in https://cloud.google.com/ruby/docs/reference.
#
# This is an auto-generated example demonstrating basic usage of
# Google::Cloud::DiscoveryEngine::V1::SchemaService::Client#update_schema.
#
def update_schema
  # Create a client object. The client can be reused for multiple calls.
  client = Google::Cloud::DiscoveryEngine::V1::SchemaService::Client.new

  # Create a request. To set request fields, pass in keyword arguments.
  request = Google::Cloud::DiscoveryEngine::V1::UpdateSchemaRequest.new

  # Call the update_schema method.
  result = client.update_schema request

  # The returned object is of type Gapic::Operation. You can use it to
  # check the status of an operation, cancel it, or wait for results.
  # Here is how to wait for a response.
  result.wait_until_done! timeout: 60
  if result.response?
    p result.response
  else
    puts "No response received."
  end
end

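Requirements and limitations

When you update a schema, make sure that the new schema is backward compatible with the schema that you're updating. To update a schema with a new schema that isn't backward compatible, you need to delete all the documents in the data store and then create a new schema.

Updating a schema triggers reindexing of all documents. This can take time and incur additional costs:

  • Time. Reindexing a large data store can take hours or days.

  • Cost. Reindexing can incur costs, depending on the parser. For example, reindexing a data store that uses the OCR parser or the layout parser incurs costs. For more information, see Document AI feature pricing.

Schema updates don't support the following operations:

  • Changing a field type. A schema update can't change the type of a field. For example, a field mapped to integer can't be changed to string.
  • Removing a field. Once defined, a field can't be removed. You can continue to add new fields, but you can't remove an existing field.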
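The two unsupported operations above can be checked locally before a schema is submitted. The following Python sketch compares an old and a new schema, both held as dictionaries, and reports removed fields and changed types. It covers only the two rules named in this section; the service might enforce others. The function names are hypothetical.

```python
def find_incompatibilities(old, new, path="properties"):
    """Return a list of problems; an empty list means that no field removal
    or type change was detected between the `old` and `new` schemas."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name, old_field in old_props.items():
        field_path = f"{path}.{name}"
        if name not in new_props:
            problems.append(f"{field_path}: field removed")
            continue
        problems.extend(_compare_field(old_field, new_props[name], field_path))
    return problems


def _compare_field(old_field, new_field, field_path):
    """Compare one field definition, descending into arrays and objects."""
    old_type = old_field.get("type")
    new_type = new_field.get("type")
    if old_type != new_type:
        return [f"{field_path}: type changed from {old_type!r} to {new_type!r}"]
    if old_type == "array":
        return _compare_field(
            old_field.get("items", {}), new_field.get("items", {}), f"{field_path}.items"
        )
    if old_type == "object":
        return find_incompatibilities(old_field, new_field, f"{field_path}.properties")
    return []


if __name__ == "__main__":
    old = {"type": "object", "properties": {"title": {"type": "string"}}}
    new = {"type": "object", "properties": {"title": {"type": "number"}}}
    for problem in find_incompatibilities(old, new):
        print(problem)
```

Fields that exist only in the new schema are ignored, because adding fields is supported.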

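Limitation examples (REST only)

This section shows examples of valid and invalid types of schema updates. These examples use the following example JSON schema: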

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "title": {
      "type": "string"
    },
    "description": {
      "type": "string",
      "keyPropertyMapping": "description"
    },
    "categories": {
      "type": "array",
      "items": {
        "type": "string",
        "keyPropertyMapping": "category"
      }
    }
  }
}

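Examples of supported updates

The following updates to the example schema are supported.

  • Adding a field. In this example, the field properties.uri has been added to the schema.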

    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "object",
      "properties": {
        "title": {
          "type": "string"
        },
        "description": {
          "type": "string",
          "keyPropertyMapping": "description"
        },
        "uri": { // Added field. This is supported.
          "type": "string",
          "keyPropertyMapping": "uri"
        },
        "categories": {
          "type": "array",
          "items": {
            "type": "string",
            "keyPropertyMapping": "category"
          }
        }
      }
    }
    
  • Adding or removing key property annotations for title, description, or uri. In this example, keyPropertyMapping has been added to the title field.

    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "object",
      "properties": {
        "title": {
          "type": "string",
          "keyPropertyMapping": "title" // Added "keyPropertyMapping". This is supported.
        },
        "description": {
          "type": "string",
          "keyPropertyMapping": "description"
        },
        "categories": {
          "type": "array",
          "items": {
            "type": "string",
            "keyPropertyMapping": "category"
          }
        }
      }
    }
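The two supported updates shown above can also be produced programmatically. The following Python sketch starts from the example schema as a dictionary and returns modified copies, leaving the original untouched so that the old and new versions can still be compared. The helper names add_field and set_key_property are hypothetical.

```python
import copy

EXAMPLE_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string", "keyPropertyMapping": "description"},
        "categories": {
            "type": "array",
            "items": {"type": "string", "keyPropertyMapping": "category"},
        },
    },
}


def add_field(schema, name, definition):
    """Return a copy of `schema` with a new top-level field added."""
    if name in schema["properties"]:
        raise ValueError(f"field {name!r} already exists")
    updated = copy.deepcopy(schema)
    updated["properties"][name] = definition
    return updated


def set_key_property(schema, name, key_property):
    """Return a copy of `schema` with `name` mapped to a key property."""
    updated = copy.deepcopy(schema)
    updated["properties"][name]["keyPropertyMapping"] = key_property
    return updated


if __name__ == "__main__":
    with_uri = add_field(
        EXAMPLE_SCHEMA, "uri", {"type": "string", "keyPropertyMapping": "uri"}
    )
    with_title = set_key_property(EXAMPLE_SCHEMA, "title", "title")
    print(sorted(with_uri["properties"]))
    print(with_title["properties"]["title"])
```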
    

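Examples of invalid schema updates

The following updates to the example schema aren't supported.

  • Changing a field type. In this example, the type of the title field has been changed from string to number. This isn't supported.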

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          "title": {
            "type": "number" // Changed from string. Not allowed.
          },
          "description": {
            "type": "string",
            "keyPropertyMapping": "description"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          }
        }
      }
    
  • Removing a field. In this example, the title field has been removed. This isn't supported.

      {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
          // "title" is removed. Not allowed.
          "description": {
            "type": "string",
            "keyPropertyMapping": "description"
          },
          "uri": {
            "type": "string",
            "keyPropertyMapping": "uri"
          },
          "categories": {
            "type": "array",
            "items": {
              "type": "string",
              "keyPropertyMapping": "category"
            }
          }
        }
      }
    

What's next