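Deploy and manage index endpoints in a VPC network

Deploying an index to an endpoint involves the following three tasks:

Create an IndexEndpoint if needed, or reuse an existing IndexEndpoint.
Get the IndexEndpoint ID.
Deploy the index to the IndexEndpoint.

Create an `IndexEndpoint` in a VPC network

If you are deploying an Index to an existing IndexEndpoint, you can skip this step.

Before you can use an index to serve online vector matching queries, you must deploy the Index to an IndexEndpoint in your VPC Network Peering network. The first step is to create an IndexEndpoint. You can deploy more than one index to an IndexEndpoint that shares the same VPC network.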

gcloud

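The following example uses the gcloud ai index-endpoints create command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_NAME: Display name for the index endpoint.
VPC_NETWORK_NAME: Name of the Google Compute Engine network to which the index endpoint should be peered.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell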

gcloud ai index-endpoints create \
    --display-name=INDEX_ENDPOINT_NAME \
    --network=VPC_NETWORK_NAME \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints create `
    --display-name=INDEX_ENDPOINT_NAME `
    --network=VPC_NETWORK_NAME `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints create ^
    --display-name=INDEX_ENDPOINT_NAME ^
    --network=VPC_NETWORK_NAME ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.

REST

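Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_NAME: Display name for the index endpoint.
VPC_NETWORK_NAME: Name of the Google Compute Engine network to which the index endpoint should be peered.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: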

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

Request JSON body:

{
  "display_name": "INDEX_ENDPOINT_NAME",
  "network": "VPC_NETWORK_NAME"
}
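The URL and request body above can also be assembled programmatically. The following is a minimal sketch, not part of the official samples: the helper name build_create_index_endpoint_request and the sample values (my-project, project number 123456789, sample-network) are hypothetical, and the helper only builds the request without sending it.

```python
import json
from typing import Dict, Tuple


def build_create_index_endpoint_request(
    project_id: str, location: str, display_name: str, network: str
) -> Tuple[str, Dict[str, str]]:
    """Return the URL and JSON body for the indexEndpoints create request."""
    # The regional service endpoint and resource path follow the
    # "HTTP method and URL" shown above.
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/indexEndpoints"
    )
    # Field names match the request JSON body shown above.
    body = {"display_name": display_name, "network": network}
    return url, body


# Hypothetical values, for illustration only.
url, body = build_create_index_endpoint_request(
    project_id="my-project",
    location="us-central1",
    display_name="sample-endpoint",
    network="projects/123456789/global/networks/sample-network",
)
print(url)
print(json.dumps(body, indent=2))  # Content you could save as request.json
```

Keeping URL construction in one helper makes it harder to mix up the two places where LOCATION appears (the host name and the resource path).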

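To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: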

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints"

PowerShell (Windows)

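Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: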

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateIndexEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-01-13T04:09:56.641107Z",
      "updateTime": "2022-01-13T04:09:56.641107Z"
    }
  }
}

You can poll for the status of the operation until the response includes "done": true.
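The polling loop can be sketched as follows. This is an illustrative helper under stated assumptions, not an official client: wait_for_operation is a hypothetical name, and fetch_operation stands in for whatever code retrieves the operation JSON (for example, an authenticated GET on the operation name). The sketch relies only on the standard long-running operation fields "done" and "error".

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[], Dict[str, Any]],
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_operation until the operation reports "done": true."""
    for attempt in range(max_attempts):
        operation = fetch_operation()
        if operation.get("done"):
            # A finished operation carries either "response" or "error".
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        # Wait before the next poll, except after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"Operation not done after {max_attempts} attempts")


# Stubbed demonstration: the third poll reports completion.
_fake_responses = iter(
    [{"name": "op"}, {"name": "op"}, {"name": "op", "done": True, "response": {}}]
)
finished = wait_for_operation(
    lambda: next(_fake_responses), interval_seconds=0.0
)
print(finished["done"])
```

Passing the fetch function and the sleep function as parameters keeps the loop independent of any particular HTTP library and easy to test.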

Terraform

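The following example uses the google_vertex_ai_index_endpoint Terraform resource to create an index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.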

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

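To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_create_index_endpoint_vpc(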
    project: str, location: str, display_name: str, network: str
) -> aiplatform.MatchingEngineIndexEndpoint:
    """Create a vector search index endpoint within a VPC network.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index endpoint display name
        network (str): Required. The VPC network name, in the format of
            projects/{project number}/global/networks/{network name}.

    Returns:
        aiplatform.MatchingEngineIndexEndpoint - The created index endpoint.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index Endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        network=network,
        description="Matching Engine VPC Index Endpoint",
    )

    return index_endpoint

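Console

Use these instructions to create an index endpoint.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.
Select the Index endpoints tab at the top of the page. Your index endpoints are displayed.
Click Create new index endpoint. The Create a new index endpoint panel opens.
Enter a display name for the index endpoint.
In the Region field, select a region from the drop-down.
In the Access field, select Private.
Enter the details for the peered VPC network. Enter the full name of the Compute Engine network to which the index endpoint should be peered. The format should be projects/{project_num}/global/networks/{network_id}.
Click Create.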
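The full network name format used in the last steps can be built and checked with a small helper. This is a sketch only: the function names are hypothetical, the sample project number and network ID are made up, and the check assumes that a project number consists solely of digits.

```python
import re

# projects/{project_num}/global/networks/{network_id}
_NETWORK_NAME = re.compile(r"projects/(\d+)/global/networks/([^/]+)")


def build_network_name(project_number: int, network_id: str) -> str:
    """Build the full Compute Engine network name for VPC peering."""
    return f"projects/{project_number}/global/networks/{network_id}"


def is_full_network_name(value: str) -> bool:
    """Return True if value matches the full network name format."""
    return _NETWORK_NAME.fullmatch(value) is not None


# Hypothetical values, for illustration only.
name = build_network_name(123456789, "sample-network")
print(name, is_full_network_name(name))
```

A check like this catches a common mistake early: passing the project ID or the bare network name where the project number form is expected.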

Deploy an index

gcloud

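The following example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
INDEX_ID: The ID of the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.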
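The DEPLOYED_INDEX_ID rule stated above (starts with a letter; only letters, numbers, or underscores) can be checked locally before you call the API. The following sketch encodes only that stated rule; the function name is hypothetical, and DeployedIndex.id may impose further constraints (such as a length limit) that this check does not cover.

```python
import re

# Starts with a letter, followed by letters, digits, or underscores.
_DEPLOYED_INDEX_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_deployed_index_id(deployed_index_id: str) -> bool:
    """Check a candidate ID against the rule described in this guide."""
    return _DEPLOYED_INDEX_ID.fullmatch(deployed_index_id) is not None


for candidate in ["deployed_index_for_vpc", "1st_index", "my-index"]:
    print(candidate, is_valid_deployed_index_id(candidate))
```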

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --display-name=DEPLOYED_INDEX_ENDPOINT_NAME \
    --index=INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --display-name=DEPLOYED_INDEX_ENDPOINT_NAME `
    --index=INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --display-name=DEPLOYED_INDEX_ENDPOINT_NAME ^
    --index=INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

You should receive a response similar to the following:

The Google Cloud CLI tool might take a few minutes to create the IndexEndpoint.

REST

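Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
DEPLOYED_INDEX_ENDPOINT_NAME: Display name of the deployed index endpoint.
INDEX_ID: The ID of the index.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: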

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
 "deployedIndex": {
   "id": "DEPLOYED_INDEX_ID",
   "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
   "displayName": "DEPLOYED_INDEX_ENDPOINT_NAME"
 }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex"

PowerShell (Windows)

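Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: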

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
 "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
 "metadata": {
   "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
   "genericMetadata": {
     "createTime": "2022-10-19T17:53:16.502088Z",
     "updateTime": "2022-10-19T17:53:16.502088Z"
   },
   "deployedIndexId": "DEPLOYED_INDEX_ID"
 }
}
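The "name" field in the response above is a slash-separated resource path, so the endpoint ID and operation ID can be recovered from it. The following sketch shows one way to split it; parse_operation_name is a hypothetical helper, and the sample IDs are made up.

```python
from typing import Dict


def parse_operation_name(name: str) -> Dict[str, str]:
    """Split an operation name into {collection: id} pairs."""
    parts = name.split("/")
    # A well-formed name alternates collection names and IDs.
    if len(parts) % 2 != 0 or "" in parts:
        raise ValueError(f"Unexpected operation name: {name!r}")
    return dict(zip(parts[0::2], parts[1::2]))


# Hypothetical IDs, for illustration only.
fields = parse_operation_name(
    "projects/123456789/locations/us-central1/indexEndpoints/456/operations/789"
)
print(fields["indexEndpoints"], fields["operations"])
```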

Terraform

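The following example uses the google_vertex_ai_index_endpoint_deployed_index Terraform resource to create a deployed index endpoint.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.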

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_vpc"
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

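To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_index(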
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified index
          resource name or an index ID. Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index
          to.
        deployed_index_id (str): Required. The user specified ID of the
          DeployedIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint
    index_endpoint = index_endpoint.deploy_index(
        index=index, deployed_index_id=deployed_index_id
    )

    print(index_endpoint.deployed_indexes)

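Console

Use these instructions to deploy your index to an endpoint.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.
Select the name of the index you want to deploy. The index details page opens.
From the index details page, click Deploy to endpoint. The index deployment panel opens.
Enter a display name. This name acts as an ID and can't be updated.
From the Endpoint drop-down, select the endpoint to deploy this index to. Note: An endpoint is unavailable if the index is already deployed to it.
Optional: In the Machine type field, select either standard or high-memory.
Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. If autoscaling is disabled, the default number of replicas is 2.
Click Deploy to deploy your index to the endpoint. Note: Deployment takes about 30 minutes.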

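Enable autoscaling

Vector Search supports autoscaling, which can automatically resize the number of nodes based on the demands of your workloads. When demand is high, nodes are added to the node pool, up to the maximum size you designate. When demand is low, the node pool scales back down to the minimum size you designate. You can check the actual nodes in use and the changes by monitoring the current replicas.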
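The effect of the two bounds can be pictured as a clamp on whatever replica count the workload would otherwise call for. The following is a toy model of the bounds only, not Vector Search's actual scaling policy; the function name and the notion of a "desired" replica count are assumptions made for illustration.

```python
def clamp_replica_count(desired: int, min_replicas: int, max_replicas: int) -> int:
    """Keep a desired replica count within the configured bounds."""
    if min_replicas < 1:
        raise ValueError("min_replicas must be at least 1")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be at least min_replicas")
    # Never drop below the minimum or grow past the maximum.
    return max(min_replicas, min(desired, max_replicas))


for desired in (1, 7, 25):
    print(desired, "->", clamp_replica_count(desired, min_replicas=2, max_replicas=10))
```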

To enable autoscaling, specify maxReplicaCount and minReplicaCount when you deploy your index:

gcloud

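The following example uses the gcloud ai index-endpoints deploy-index command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
DEPLOYED_INDEX_NAME: Display name of the deployed index.
INDEX_ID: The ID of the index.
MIN_REPLICA_COUNT: Minimum number of machine replicas on which the deployed index is always deployed. If specified, the value must be equal to or greater than 1.
MAX_REPLICA_COUNT: Maximum number of machine replicas on which the deployed index can be deployed.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell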

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --display-name=DEPLOYED_INDEX_NAME \
    --index=INDEX_ID \
    --min-replica-count=MIN_REPLICA_COUNT \
    --max-replica-count=MAX_REPLICA_COUNT \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --display-name=DEPLOYED_INDEX_NAME `
    --index=INDEX_ID `
    --min-replica-count=MIN_REPLICA_COUNT `
    --max-replica-count=MAX_REPLICA_COUNT `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints deploy-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --display-name=DEPLOYED_INDEX_NAME ^
    --index=INDEX_ID ^
    --min-replica-count=MIN_REPLICA_COUNT ^
    --max-replica-count=MAX_REPLICA_COUNT ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

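Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
DEPLOYED_INDEX_NAME: Display name of the deployed index.
INDEX_ID: The ID of the index.
MIN_REPLICA_COUNT: Minimum number of machine replicas on which the deployed index is always deployed. If specified, the value must be equal to or greater than 1.
MAX_REPLICA_COUNT: Maximum number of machine replicas on which the deployed index can be deployed.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: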

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex

Request JSON body:

{
 "deployedIndex": {
   "id": "DEPLOYED_INDEX_ID",
   "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
   "displayName": "DEPLOYED_INDEX_NAME",
   "automaticResources": {
     "minReplicaCount": MIN_REPLICA_COUNT,
     "maxReplicaCount": MAX_REPLICA_COUNT
   }
 }
}
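The request body above can be generated from variables so that the replica counts are emitted as JSON numbers rather than strings. This is a minimal sketch: build_deploy_index_body is a hypothetical helper, the sample IDs are made up, and the field names are taken from the JSON shown above.

```python
import json
from typing import Any, Dict, Optional


def build_deploy_index_body(
    deployed_index_id: str,
    index_resource_name: str,
    display_name: str,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the deployIndex request body, adding autoscaling bounds if given."""
    deployed_index: Dict[str, Any] = {
        "id": deployed_index_id,
        "index": index_resource_name,
        "displayName": display_name,
    }
    resources: Dict[str, int] = {}
    if min_replica_count is not None:
        resources["minReplicaCount"] = min_replica_count
    if max_replica_count is not None:
        resources["maxReplicaCount"] = max_replica_count
    # Only include automaticResources when at least one bound is set.
    if resources:
        deployed_index["automaticResources"] = resources
    return {"deployedIndex": deployed_index}


# Hypothetical values, for illustration only.
body = build_deploy_index_body(
    "deployed_index_for_vpc",
    "projects/123456789/locations/us-central1/indexes/456",
    "sample-deployed-index",
    min_replica_count=2,
    max_replica_count=10,
)
print(json.dumps(body, indent=1))  # Content you could save as request.json
```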

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex"

PowerShell (Windows)

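Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: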

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:deployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
 "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
 "metadata": {
   "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
   "genericMetadata": {
     "createTime": "2023-10-19T17:53:16.502088Z",
     "updateTime": "2023-10-19T17:53:16.502088Z"
   },
   "deployedIndexId": "DEPLOYED_INDEX_ID"
 }
}

Python

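To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_deploy_autoscaling_index(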
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to deploy. A fully-qualified index
          resource name or an index ID. Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index
          to.
        deployed_index_id (str): Required. The user specified ID of the
          DeployedIndex.
        min_replica_count (int): Required. The minimum number of replicas to
          deploy.
        max_replica_count (int): Required. The maximum number of replicas to
          deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint. Specifying min and max replica counts will
    # enable autoscaling.
    index_endpoint.deploy_index(
        index=index,
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )

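Console

You can only enable autoscaling from the console during index deployment.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.
Select the name of the index you want to deploy. The index details page opens.
From the index details page, click Deploy to endpoint. The index deployment panel opens.
Enter a display name. This name acts as an ID and can't be updated.
From the Endpoint drop-down, select the endpoint to deploy this index to. Note: An endpoint is unavailable if the index is already deployed to it.
Optional: In the Machine type field, select either standard or high-memory.
Optional: Select Enable autoscaling to automatically resize the number of nodes based on the demands of your workloads. If autoscaling is disabled, the default number of replicas is 2.

If neither minReplicaCount nor maxReplicaCount is set, both default to 2.
If only maxReplicaCount is set, minReplicaCount defaults to 2.
If only minReplicaCount is set, maxReplicaCount is set to equal minReplicaCount.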
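The three defaulting rules above can be written out as a small function. This sketch only mirrors the rules as stated in this guide; resolve_replica_counts is a hypothetical name, and the service itself applies the defaults, so you don't need to run code like this to deploy.

```python
from typing import Optional, Tuple

DEFAULT_REPLICA_COUNT = 2


def resolve_replica_counts(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the effective (min, max) replica counts after defaulting."""
    if min_replica_count is None and max_replica_count is None:
        # Neither set: both default to 2.
        return DEFAULT_REPLICA_COUNT, DEFAULT_REPLICA_COUNT
    if min_replica_count is None:
        # Only max set: min defaults to 2.
        return DEFAULT_REPLICA_COUNT, max_replica_count
    if max_replica_count is None:
        # Only min set: max equals min.
        return min_replica_count, min_replica_count
    return min_replica_count, max_replica_count


print(resolve_replica_counts())
print(resolve_replica_counts(max_replica_count=5))
print(resolve_replica_counts(min_replica_count=3))
```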

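Mutate a `DeployedIndex`

You can use the MutateDeployedIndex API to update the deployment resources (for example, minReplicaCount and maxReplicaCount) of an already deployed index.

Users can't change the machineType after the index is deployed.
If maxReplicaCount isn't specified in the request, the DeployedIndex keeps using the existing maxReplicaCount.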
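The rule about an omitted maxReplicaCount can be illustrated with a local model of the update. This is a sketch of the stated rule only: apply_replica_mutation is a hypothetical helper that operates on a plain dictionary, and the min-not-greater-than-max check is an assumption added for illustration rather than documented service behavior.

```python
from typing import Dict, Optional


def apply_replica_mutation(
    current: Dict[str, int],
    min_replica_count: int,
    max_replica_count: Optional[int] = None,
) -> Dict[str, int]:
    """Return the replica settings after a mutate request is applied."""
    # An omitted maxReplicaCount keeps the existing value.
    new_max = (
        current["maxReplicaCount"] if max_replica_count is None else max_replica_count
    )
    # Assumption for this sketch: the bounds must stay ordered.
    if min_replica_count > new_max:
        raise ValueError("min_replica_count exceeds the effective maxReplicaCount")
    return {"minReplicaCount": min_replica_count, "maxReplicaCount": new_max}


existing = {"minReplicaCount": 2, "maxReplicaCount": 10}
print(apply_replica_mutation(existing, min_replica_count=3))
print(apply_replica_mutation(existing, min_replica_count=3, max_replica_count=5))
```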

gcloud

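The following example uses the gcloud ai index-endpoints mutate-deployed-index command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
MIN_REPLICA_COUNT: Minimum number of machine replicas on which the deployed index is always deployed. If specified, the value must be equal to or greater than 1.
MAX_REPLICA_COUNT: Maximum number of machine replicas on which the deployed index can be deployed.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell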

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --min-replica-count=MIN_REPLICA_COUNT \
    --max-replica-count=MAX_REPLICA_COUNT \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --min-replica-count=MIN_REPLICA_COUNT `
    --max-replica-count=MAX_REPLICA_COUNT `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints mutate-deployed-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --min-replica-count=MIN_REPLICA_COUNT ^
    --max-replica-count=MAX_REPLICA_COUNT ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

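Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. See DeployedIndex.id for format guidelines.
MIN_REPLICA_COUNT: Minimum number of machine replicas on which the deployed index is always deployed. If specified, the value must be equal to or greater than 1.
MAX_REPLICA_COUNT: Maximum number of machine replicas on which the deployed index can be deployed.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: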

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex

Request JSON body:

{
  "deployedIndex": {
    "id": "DEPLOYED_INDEX_ID",
    "index": "projects/PROJECT_ID/locations/LOCATION/indexes/INDEX_ID",
    "displayName": "DEPLOYED_INDEX_NAME",
    "automaticResources": {
      "minReplicaCount": MIN_REPLICA_COUNT,
      "maxReplicaCount": MAX_REPLICA_COUNT
    }
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex"

PowerShell (Windows)

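Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: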

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:mutateDeployedIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
"name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
"metadata": {
  "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeployIndexOperationMetadata",
  "genericMetadata": {
    "createTime": "2020-10-19T17:53:16.502088Z",
    "updateTime": "2020-10-19T17:53:16.502088Z"
  },
  "deployedIndexId": "DEPLOYED_INDEX_ID"
}
}

Terraform

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands. For more information, see the Terraform provider reference documentation.

provider "google" {
  region = "us-central1"
}

resource "google_vertex_ai_index_endpoint_deployed_index" "default" {
  depends_on        = [google_vertex_ai_index_endpoint.default]
  index_endpoint    = google_vertex_ai_index_endpoint.default.id
  index             = google_vertex_ai_index.default.id
  deployed_index_id = "deployed_index_for_mutate_vpc"
  # This example assumes the deployed index endpoint's resources configuration
  # differs from the values specified below. Terraform will mutate the deployed
  # index endpoint's resource configuration to match.
  automatic_resources {
    min_replica_count = 3
    max_replica_count = 5
  }
}

resource "google_vertex_ai_index_endpoint" "default" {
  display_name = "sample-endpoint"
  description  = "A sample index endpoint within a VPC network"
  region       = "us-central1"
  network      = "projects/${data.google_project.project.number}/global/networks/${google_compute_network.default.name}"
  depends_on = [
    google_service_networking_connection.default
  ]
}

resource "google_service_networking_connection" "default" {
  network                 = google_compute_network.default.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.default.name]
  # Workaround to allow `terraform destroy`, see https://github.com/hashicorp/terraform-provider-google/issues/18729
  deletion_policy = "ABANDON"
}

resource "google_compute_global_address" "default" {
  name          = "sample-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.default.id
}

resource "google_compute_network" "default" {
  name = "sample-network"
}

data "google_project" "project" {}

# Cloud Storage bucket name must be unique
resource "random_id" "default" {
  byte_length = 8
}

# Create a Cloud Storage bucket
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-bucket-${random_id.default.hex}"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# Create index content
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "default" {
  region       = "us-central1"
  display_name = "sample-index-batch-update"
  description  = "A sample index for batch update"
  labels = {
    foo = "bar"
  }

  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"

  timeouts {
    create = "2h"
    update = "1h"
  }
}

Python

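To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_mutate_deployed_index(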
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    min_replica_count: int,
    max_replica_count: int,
) -> None:
    """Mutate the deployment resources of an already deployed index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint that hosts the
          deployed index.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          mutate.
        min_replica_count (int): Required. The minimum number of replicas to
          deploy.
        max_replica_count (int): Required. The maximum number of replicas to
          deploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Mutate the deployed index
    index_endpoint.mutate_deployed_index(
        deployed_index_id=deployed_index_id,
        min_replica_count=min_replica_count,
        max_replica_count=max_replica_count,
    )

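Deployment settings that impact performance

The following deployment settings can affect latency, availability, and cost when using Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.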

Setting	Performance impact
Machine type	Hardware selection interacts directly with the selected shard size. Depending on the shard choices you specified at index creation time, each machine type offers a tradeoff between performance and cost. To determine the available hardware and pricing, see the pricing page. In general, performance increases in the following order: E2 standard, E2 highmem, N1 standard, N2D standard.
Minimum replica count	`minReplicaCount` reserves a minimum capacity for availability and latency, so that the system doesn't have cold-start issues when traffic scales up quickly from low levels. If your workload drops to low traffic and then quickly increases, consider setting `minReplicaCount` to a number that can accommodate the initial bursts of traffic.
Maximum replica count	`maxReplicaCount` primarily lets you control usage cost. You can choose to prevent costs from rising beyond a certain threshold, with the tradeoff of increased latency and reduced availability.
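One way to pick a starting value for minReplicaCount is simple capacity arithmetic. The following sketch assumes that you have measured, through your own load tests, how many queries per second a single replica sustains at an acceptable latency; both inputs and the function name are hypothetical, and the result is a starting point for experimentation rather than a recommendation.

```python
import math


def estimate_min_replica_count(burst_qps: float, qps_per_replica: float) -> int:
    """Estimate replicas needed to absorb an initial traffic burst."""
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    if burst_qps < 0:
        raise ValueError("burst_qps must not be negative")
    # Round up so the burst fits, and never go below one replica.
    return max(1, math.ceil(burst_qps / qps_per_replica))


# Hypothetical measurements: a 900 QPS burst, 400 QPS per replica.
print(estimate_min_replica_count(burst_qps=900, qps_per_replica=400))
```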

List `IndexEndpoints`

To list your IndexEndpoint resources and view the information for any associated DeployedIndex instances, run the following code:

gcloud

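The following example uses the gcloud ai index-endpoints list command.

Before using any of the command data below, make the following replacements:

LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell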

gcloud ai index-endpoints list \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints list `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints list ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

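Before using any of the request data, make the following replacements:

LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL: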

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints"

PowerShell (Windows)

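Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command: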

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints" | Select-Object -Expand Content
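
The same GET request can also be prepared from Python with only the standard library. This is a minimal sketch, not an official client: `build_list_request` is a hypothetical helper, and the access token is assumed to come from `gcloud auth print-access-token` or an equivalent credential source.

```python
import urllib.request


def build_list_request(
    location: str, project_id: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) the request that lists index endpoints."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/indexEndpoints"
    )
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the response body as UTF-8 JSON.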

You should receive a JSON response similar to the following:

{
 "indexEndpoints": [
   {
     "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID",
     "displayName": "INDEX_ENDPOINT_DISPLAY_NAME",
     "deployedIndexes": [
       {
         "id": "DEPLOYED_INDEX_ID",
         "index": "projects/PROJECT_NUMBER/locations/LOCATION/indexes/INDEX_ID",
         "displayName": "DEPLOYED_INDEX_DISPLAY_NAME",
         "createTime": "2021-06-04T02:23:40.178286Z",
         "privateEndpoints": {
           "matchGrpcAddress": "GRPC_ADDRESS"
         },
         "indexSyncTime": "2022-01-13T04:22:00.151916Z",
         "automaticResources": {
           "minReplicaCount": 2,
           "maxReplicaCount": 10
         }
       }
     ],
     "etag": "AMEw9yP367UitPkLo-khZ1OQvqIK8Q0vLAzZVF7QjdZ5O3l7Zow-mzBo2l6xmiuuMljV",
     "createTime": "2021-03-17T04:47:28.460373Z",
     "updateTime": "2021-06-04T02:23:40.930513Z",
     "network": "VPC_NETWORK_NAME"
   }
 ]
}
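
A response of this shape can be reduced to the information you usually need before undeploying or deleting: which deployed indexes live on which endpoint. The following sketch assumes the field names shown in the sample response above; `deployed_indexes_by_endpoint` is a hypothetical helper name.

```python
import json
from typing import Dict, List


def deployed_indexes_by_endpoint(response_text: str) -> Dict[str, List[str]]:
    """Map each index endpoint ID to the IDs of its deployed indexes."""
    payload = json.loads(response_text)
    result: Dict[str, List[str]] = {}
    for endpoint in payload.get("indexEndpoints", []):
        # The endpoint ID is the last path segment of the resource name.
        endpoint_id = endpoint["name"].rsplit("/", 1)[-1]
        result[endpoint_id] = [
            deployed["id"] for deployed in endpoint.get("deployedIndexes", [])
        ]
    return result
```

An endpoint that maps to an empty list has no deployed indexes.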

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import List

from google.cloud import aiplatform


def vector_search_list_index_endpoint(
    project: str, location: str
) -> List[aiplatform.MatchingEngineIndexEndpoint]:
    """List vector search index endpoints.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name

    Returns:
        List of aiplatform.MatchingEngineIndexEndpoint
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # List Index Endpoints
    return aiplatform.MatchingEngineIndexEndpoint.list()
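
The list returned by this function can be turned into a short, readable summary. The sketch below is duck-typed and never calls the API itself: it assumes that each endpoint object exposes `display_name`, `resource_name`, and `deployed_indexes` attributes, and that each deployed index exposes an `id`. `describe_index_endpoints` is a hypothetical helper name.

```python
from typing import Iterable, List


def describe_index_endpoints(endpoints: Iterable) -> List[str]:
    """Return one summary line per index endpoint."""
    lines = []
    for endpoint in endpoints:
        # Collect the IDs of the indexes deployed to this endpoint, if any.
        deployed_ids = [deployed.id for deployed in endpoint.deployed_indexes]
        summary = ", ".join(deployed_ids) or "no deployed indexes"
        lines.append(f"{endpoint.display_name} ({endpoint.resource_name}): {summary}")
    return lines
```

You could pass the result of `vector_search_list_index_endpoint(project, location)` directly to this helper and print each line.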

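Console

Use these instructions to view a list of your index endpoints.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Select the Index endpoints tab at the top of the page.
All existing index endpoints are displayed.

For more information, see the reference documentation for IndexEndpoint.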

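Undeploy an index

To undeploy an index, run the following code:

gcloud

The following example uses the gcloud ai index-endpoints undeploy-index command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. For format guidelines, see DeployedIndex.id.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell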

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID \
    --deployed-index-id=DEPLOYED_INDEX_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID `
    --deployed-index-id=DEPLOYED_INDEX_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints undeploy-index INDEX_ENDPOINT_ID ^
    --deployed-index-id=DEPLOYED_INDEX_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

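Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
DEPLOYED_INDEX_ID: A user-specified string that uniquely identifies the deployed index. It must start with a letter and contain only letters, numbers, or underscores. For format guidelines, see DeployedIndex.id.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex

Request JSON body: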

{
 "deployed_index_id": "DEPLOYED_INDEX_ID"
}
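
Because DEPLOYED_INDEX_ID must start with a letter and contain only letters, numbers, or underscores, it can be worth checking the value before you send the request. The following sketch builds the undeploy URL and JSON body from the placeholders above and rejects malformed IDs early. `build_undeploy_request` is a hypothetical helper, and the regular expression is only an interpretation of the stated rule; see DeployedIndex.id for the authoritative format.

```python
import json
import re
from typing import Tuple

# Starts with a letter, then letters, digits, or underscores only.
_DEPLOYED_INDEX_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def build_undeploy_request(
    location: str,
    project_id: str,
    index_endpoint_id: str,
    deployed_index_id: str,
) -> Tuple[str, str]:
    """Return the (url, json_body) pair for an undeployIndex request."""
    if not _DEPLOYED_INDEX_ID.fullmatch(deployed_index_id):
        raise ValueError(f"Invalid deployed index ID: {deployed_index_id!r}")
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"indexEndpoints/{index_endpoint_id}:undeployIndex"
    )
    body = json.dumps({"deployed_index_id": deployed_index_id})
    return url, body
```

The returned body can be written to request.json and sent with any of the options that follow.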

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex"

PowerShell (Windows)

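Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: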

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID:undeployIndex" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
 "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
 "metadata": {
   "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UndeployIndexOperationMetadata",
   "genericMetadata": {
     "createTime": "2022-01-13T04:09:56.641107Z",
     "updateTime": "2022-01-13T04:09:56.641107Z"
   }
 }
}
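
This response describes a long-running operation that has not finished yet, so a caller typically polls it until `done` is true. The sketch below shows one way to do that without tying the logic to a particular HTTP client: `fetch` is a caller-supplied function that returns the operation as a dictionary. Both helper names are hypothetical, and the operation URL assumes the standard pattern of issuing a GET on the operation's resource name.

```python
import time
from typing import Callable, Dict


def operation_url(location: str, operation_name: str) -> str:
    """URL to GET the current state of an operation (assumed standard pattern)."""
    return f"https://{location}-aiplatform.googleapis.com/v1/{operation_name}"


def wait_for_operation(
    fetch: Callable[[str], Dict],
    operation_name: str,
    timeout_seconds: float = 1800.0,
    poll_interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the operation reports done, then return it.

    Raises RuntimeError if the finished operation carries an error, and
    TimeoutError if it does not finish within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch(operation_name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Timed out waiting for {operation_name}")
        sleep(poll_interval_seconds)
```

The default timeout of 1800 seconds is a choice made for this sketch; adjust it to how long you are willing to wait for the undeploy operation.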

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_undeploy_index(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Undeploy an index from an index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to undeploy the
          index from.
        deployed_index_id (str): Required. The ID of the DeployedIndex to
          undeploy.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Undeploy the index
    index_endpoint.undeploy_index(
        deployed_index_id=deployed_index_id,
    )

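Console

Use these instructions to undeploy your index.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
A list of your active indexes is displayed.
Select the index you want to undeploy. The index details page opens.
Under the Deployed indexes section, identify the index endpoint you want to undeploy.
Click the options menu in the same row as the index endpoint, and then select Undeploy.
A confirmation screen opens. Click Undeploy. Note: Undeploying can take up to 30 minutes.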

Delete an `IndexEndpoint`

Before you delete an IndexEndpoint, you must undeploy all indexes that are deployed to it.
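
The required ordering (undeploy everything, then delete) can be sketched as a small helper. It is written against any object that offers a `deployed_indexes` attribute plus `undeploy_index` and `delete` methods, which is how this sketch assumes an `aiplatform.MatchingEngineIndexEndpoint` instance behaves. The helper name `undeploy_all_then_delete` is hypothetical.

```python
from typing import List


def undeploy_all_then_delete(index_endpoint) -> List[str]:
    """Undeploy every deployed index, delete the endpoint, return the IDs."""
    undeployed = []
    # Snapshot the deployed indexes first so the loop does not depend on
    # how the endpoint object refreshes this attribute during undeployment.
    for deployed in list(index_endpoint.deployed_indexes):
        index_endpoint.undeploy_index(deployed_index_id=deployed.id)
        undeployed.append(deployed.id)
    # Only delete once nothing is deployed to the endpoint anymore.
    index_endpoint.delete()
    return undeployed
```

Each undeploy call can take a long time, so a helper like this suits scripts and cleanup jobs better than interactive use.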

gcloud

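The following example uses the gcloud ai index-endpoints delete command.

Before using any of the command data below, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.

Execute the following command:

Linux, macOS, or Cloud Shell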

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID \
    --region=LOCATION \
    --project=PROJECT_ID

Windows (PowerShell)

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID `
    --region=LOCATION `
    --project=PROJECT_ID

Windows (cmd.exe)

gcloud ai index-endpoints delete INDEX_ENDPOINT_ID ^
    --region=LOCATION ^
    --project=PROJECT_ID

REST

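Before using any of the request data, make the following replacements:

INDEX_ENDPOINT_ID: The ID of the index endpoint.
LOCATION: The region where you are using Vertex AI.
PROJECT_ID: Your Google Cloud project ID.
PROJECT_NUMBER: Your project's automatically generated project number.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Execute the following command: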

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID"

PowerShell (Windows)

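Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Execute the following command: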

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
 "name": "projects/PROJECT_NUMBER/locations/LOCATION/indexEndpoints/INDEX_ENDPOINT_ID/operations/OPERATION_ID",
 "metadata": {
   "@type": "type.googleapis.com/google.cloud.aiplatform.v1.DeleteOperationMetadata",
   "genericMetadata": {
     "createTime": "2022-01-13T04:36:19.142203Z",
     "updateTime": "2022-01-13T04:36:19.142203Z"
   }
 },
 "done": true,
 "response": {
   "@type": "type.googleapis.com/google.protobuf.Empty"
 }
}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def vector_search_delete_index_endpoint(
    project: str, location: str, index_endpoint_name: str, force: bool = False
) -> None:
    """Delete a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to delete.
        force (bool): Optional. If true, undeploy any deployed indexes on this
          endpoint before deletion. Defaults to False.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Delete the index endpoint
    index_endpoint.delete(force=force)

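Console

Use these instructions to delete an index endpoint.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Select the Index endpoints tab at the top of the page.
All existing index endpoints are displayed.
Click the options menu in the same row as the index endpoint you want to delete, and then select Delete.
A confirmation screen opens. Click Delete. The index endpoint is now deleted.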
