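Query a public index to get nearest neighbors

After you create and deploy an index, you can run queries to get the nearest neighbors.

The following are some examples of match queries that use the k-nearest neighbors algorithm (k-NN) to find the nearest neighbors.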
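As a point of reference for what a k-NN query computes, here is a minimal brute-force sketch in plain Python. It is not how Vector Search is implemented; the service queries a deployed index instead of scanning every vector. The function name `brute_force_knn`, the sample datapoints, and the choice of squared Euclidean distance are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def brute_force_knn(
    query: Sequence[float],
    datapoints: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k datapoint IDs closest to the query, nearest first.

    Distance is squared Euclidean distance (an assumption for this sketch;
    a real index is configured with its own distance measure).
    """
    scored = [
        (dp_id, sum((q - v) ** 2 for q, v in zip(query, vector)))
        for dp_id, vector in datapoints.items()
    ]
    # Sort by distance, smallest first, and keep the first k entries.
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical datapoints, keyed by datapoint ID.
datapoints = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
neighbors = brute_force_knn([0.9, 0.9], datapoints, k=2)
print([dp_id for dp_id, _ in neighbors])
```

Scanning every datapoint like this is exact but scales linearly with the number of vectors, which is why a deployed index is queried in practice.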

Example queries for a public endpoint

Python

from typing import List

from google.cloud import aiplatform


def vector_search_find_neighbors(
    project: str,
    location: str,
    index_endpoint_name: str,
    deployed_index_id: str,
    queries: List[List[float]],
    num_neighbors: int,
) -> None:
    """Query the vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_endpoint_name (str): Required. Index endpoint to run the query
        against.
        deployed_index_id (str): Required. The ID of the DeployedIndex to run
        the queries against.
        queries (List[List[float]]): Required. A list of queries. Each query is
        a list of floats, representing a single embedding.
        num_neighbors (int): Required. The number of neighbors to return.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index endpoint instance from an existing endpoint.
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Query the index endpoint for the nearest neighbors.
    resp = my_index_endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=queries,
        num_neighbors=num_neighbors,
    )
    print(resp)

    # Query hybrid datapoints, sparse-only datapoints, and dense-only datapoints.
    hybrid_queries = [
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3],
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[1.0, 1.0, 1.0],
            rrf_ranking_alpha=0.5,
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3],
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[0.1, 0.2, 0.3],
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            sparse_embedding_dimensions=[10, 20, 30],
            sparse_embedding_values=[0.1, 0.2, 0.3],
        ),
        aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery(
            dense_embedding=[1, 2, 3]
        ),
    ]

    hybrid_resp = my_index_endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=hybrid_queries,
        num_neighbors=num_neighbors,
    )
    print(hybrid_resp)

Curl

You can find the publicEndpointDomainName listed below in your deployment. It has the format <number>.<region>-<number>.vdb.vertexai.goog.


  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/181224308459/locations/us-central1/indexEndpoints/3370566089086861312:findNeighbors -d '{deployed_index_id: "test_index_public1", queries: [{datapoint: {datapoint_id: "0", feature_vector: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}, neighbor_count: 5}]}'
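The request URL in the curl command above follows a fixed pattern, which can be assembled as in the following Python sketch. The helper name `find_neighbors_url` and the regular expression used to check the domain format are assumptions for illustration; only the URL layout itself is taken from the curl example.

```python
import re

# Rough check for <number>.<region>-<number>.vdb.vertexai.goog.
# The exact character set allowed in a region name is an assumption.
_DOMAIN_RE = re.compile(r"^\d+\.[a-z0-9-]+-\d+\.vdb\.vertexai\.goog$")


def find_neighbors_url(
    public_endpoint_domain: str,
    project: str,
    location: str,
    index_endpoint_id: str,
) -> str:
    """Build the findNeighbors URL for a public endpoint (hypothetical helper)."""
    if not _DOMAIN_RE.match(public_endpoint_domain):
        raise ValueError(
            f"unexpected public endpoint domain: {public_endpoint_domain!r}"
        )
    return (
        f"https://{public_endpoint_domain}/v1/projects/{project}"
        f"/locations/{location}/indexEndpoints/{index_endpoint_id}:findNeighbors"
    )


# Values copied from the curl example above.
url = find_neighbors_url(
    public_endpoint_domain="1957880287.us-central1-181224308459.vdb.vertexai.goog",
    project="181224308459",
    location="us-central1",
    index_endpoint_id="3370566089086861312",
)
print(url)
```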

The following curl example shows how to call the endpoint from an HTTP(S) client, although the public endpoint supports both protocols: RESTful and grpc_cli.


  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`" https://1957880287.us-central1-181224308459.vdb.vertexai.goog/v1/projects/${PROJECT_ID}/locations/us-central1/indexEndpoints/${INDEX_ENDPOINT_ID}:readIndexDatapoints -d '{deployed_index_id:"test_index_public1", ids: ["606431", "896688"]}'

The following curl example shows how to query with token and numeric restricts.


  $ curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer `gcloud auth print-access-token`"  https://${PUBLIC_ENDPOINT_DOMAIN}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/indexEndpoints/${INDEX_ENDPOINT_ID}:findNeighbors -d '{deployed_index_id:"${DEPLOYED_INDEX_ID}", queries: [{datapoint: {datapoint_id:"x", feature_vector: [1, 1], "sparse_embedding": {"values": [111.0,111.1,111.2], "dimensions": [10,20,30]}, numeric_restricts: [{namespace: "int-ns", value_int: -2, op: "GREATER"}, {namespace: "int-ns", value_int: 4, op: "LESS_EQUAL"}, {namespace: "int-ns", value_int: 0, op: "NOT_EQUAL"}], restricts: [{namespace: "color", allow_list: ["red"]}]}}]}'
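The request body used in these curl commands can also be built programmatically. The following Python sketch assembles a findNeighbors body with token and numeric restricts, reusing the field names that appear in the curl examples. The helper `build_find_neighbors_body` and its parameters are hypothetical and are not part of any SDK.

```python
import json
from typing import Any, Dict, List, Optional, Sequence


def build_find_neighbors_body(
    deployed_index_id: str,
    feature_vector: Sequence[float],
    neighbor_count: int,
    restricts: Optional[List[Dict[str, Any]]] = None,
    numeric_restricts: Optional[List[Dict[str, Any]]] = None,
    datapoint_id: str = "0",
) -> Dict[str, Any]:
    """Assemble a single-query findNeighbors request body (hypothetical helper)."""
    datapoint: Dict[str, Any] = {
        "datapoint_id": datapoint_id,
        "feature_vector": list(feature_vector),
    }
    # Only include restrict fields when the caller supplies them.
    if restricts:
        datapoint["restricts"] = restricts
    if numeric_restricts:
        datapoint["numeric_restricts"] = numeric_restricts
    return {
        "deployed_index_id": deployed_index_id,
        "queries": [{"datapoint": datapoint, "neighbor_count": neighbor_count}],
    }


body = build_find_neighbors_body(
    deployed_index_id="test_index_public1",
    feature_vector=[1, 1],
    neighbor_count=5,
    restricts=[{"namespace": "color", "allow_list": ["red"]}],
    numeric_restricts=[
        {"namespace": "int-ns", "value_int": -2, "op": "GREATER"},
        {"namespace": "int-ns", "value_int": 4, "op": "LESS_EQUAL"},
    ],
)
payload = json.dumps(body)
print(payload)
```

Serializing with `json.dumps` produces strictly quoted JSON and avoids shell quoting pitfalls: inside a single-quoted `-d '...'` argument, the shell does not expand variables such as `${DEPLOYED_INDEX_ID}`, so the value must be substituted before the command is run.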

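Console

Use these instructions to query an index deployed to a public endpoint from the console.

In the Vertex AI section of the Google Cloud console, go to the Deploy and Use section. Select Vector Search.
Go to Vector Search
Select the index that you want to query. The Index info page opens.
Scroll down to the Deployed indexes section and select the deployed index that you want to query. The Deployed index info page opens.
In the Query index section, select whether to query by a dense embedding value, a sparse embedding value, a hybrid embedding value (dense and sparse embeddings), or a specific datapoint.
Enter the query parameters for the type of query that you selected. For example, if you are querying by a dense embedding, enter the embedding vector to query by.
Execute the query by using the provided curl command, or by running it with Cloud Shell.
If you use Cloud Shell, select Run in Cloud Shell.
Run the command in Cloud Shell.
The results return the nearest neighbors.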

To see an end-to-end example of how to create an index, deploy it to a public endpoint, and query it, see the official notebook: Using Vector Search and Vertex AI Embeddings for Text for StackOverflow Questions.

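Query-time settings that impact performance

The following query-time parameters can affect latency, availability, and cost when you use Vector Search. This guidance applies to most cases. However, always experiment with your configurations to make sure that they work for your use case.

For parameter definitions, see Index configuration parameters.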
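One way to run such experiments is to measure recall for each candidate configuration. The following sketch assumes that you already have two ID lists for the same query: the IDs returned under the configuration being tested, and exact nearest-neighbor IDs obtained some other way (for example, by an exhaustive scan over a sample). The function name `recall_at_k` and the sample IDs are hypothetical.

```python
from typing import Sequence


def recall_at_k(
    approximate_ids: Sequence[str],
    exact_ids: Sequence[str],
    k: int,
) -> float:
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = set(approximate_ids[:k]) & truth
    return len(found) / len(truth)


# Hypothetical results for one query: two of the three true neighbors were found.
score = recall_at_k(["a", "b", "x"], ["a", "b", "c"], k=3)
print(round(score, 3))
```

Averaging this score over a representative set of queries, alongside measured latency, gives a basis for comparing settings.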

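approximateNeighborsCount

Tells the algorithm the number of approximate results to retrieve from each shard.

The value of approximateNeighborsCount should always be greater than the value of setNeighborCount. If the value of setNeighborCount is small, 10 times that value is recommended for approximateNeighborsCount. For larger setNeighborCount values, a smaller multiplier can be used.

Increasing the value of approximateNeighborsCount affects performance in the following ways:

Recall: Increased
Latency: Potentially increased
Availability: No impact
Cost: Can increase because more data is processed during a search

Decreasing the value of approximateNeighborsCount affects performance in the following ways:

Recall: Decreased
Latency: Potentially decreased
Availability: No impact
Cost: Can decrease because less data is processed during a search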
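The sizing guidance for approximateNeighborsCount can be expressed as a small helper. This is a sketch only: the function name `suggest_approximate_neighbors_count` is hypothetical, and because the guidance does not define where "small" ends, the multiplier is left as a parameter that defaults to 10.

```python
def suggest_approximate_neighbors_count(
    neighbor_count: int, multiplier: int = 10
) -> int:
    """Suggest an approximateNeighborsCount for a given neighbor count.

    Uses neighbor_count * multiplier, and always returns a value strictly
    greater than neighbor_count, as the guidance requires.
    """
    if neighbor_count < 1:
        raise ValueError("neighbor_count must be at least 1")
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return max(neighbor_count * multiplier, neighbor_count + 1)


# Small neighbor count: use the recommended 10x multiplier.
print(suggest_approximate_neighbors_count(10))
# Larger neighbor count: a smaller multiplier (here 2, chosen arbitrarily).
print(suggest_approximate_neighbors_count(200, multiplier=2))
```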

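setNeighborCount

Specifies the number of results that you want the query to return.

In most use cases, values less than or equal to 300 remain performant. For larger values, test for your specific use case.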

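fractionLeafNodesToSearch

Controls the percentage of leaf nodes to visit when searching for nearest neighbors. This is related to leafNodeEmbeddingCount, because the more embeddings there are per leaf node, the more data is examined per leaf.

Increasing the value of fractionLeafNodesToSearch affects performance in the following ways:

Recall: Increased
Latency: Increased
Availability: No impact
Cost: Can increase because higher latency occupies more machine resources

Decreasing the value of fractionLeafNodesToSearch affects performance in the following ways:

Recall: Decreased
Latency: Decreased
Availability: No impact
Cost: Can decrease because lower latency occupies fewer machine resources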
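The recall and latency trade-off described above can be illustrated with a toy model. The following sketch is not the Vector Search algorithm: it groups vectors into leaves, ranks leaves by the distance from the query to each leaf's centroid (an assumption), and scans only a fraction of them. All names and data here are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of the points in one leaf."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]


def search_leaves(
    query: Vector, leaves: List[List[Vector]], fraction: float, k: int
) -> List[Vector]:
    """Scan only the closest `fraction` of leaves and return the k nearest points."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Rank leaves by how close their centroid is to the query.
    ranked = sorted(leaves, key=lambda leaf: squared_distance(query, centroid(leaf)))
    leaves_to_visit = max(1, math.ceil(fraction * len(ranked)))
    # Every point in a visited leaf is examined, so larger leaves cost more.
    candidates = [p for leaf in ranked[:leaves_to_visit] for p in leaf]
    return sorted(candidates, key=lambda p: squared_distance(query, p))[:k]


# Two leaves; the true nearest neighbor sits in the leaf with the farther centroid.
leaves = [
    [[-3.0, 0.0], [3.0, 0.0]],  # centroid (0, 0)
    [[2.5, 0.0], [7.5, 0.0]],   # centroid (5, 0)
]
query = [2.4, 0.0]
partial = search_leaves(query, leaves, fraction=0.5, k=1)
full = search_leaves(query, leaves, fraction=1.0, k=1)
print(partial, full)
```

With half the leaves visited, the search misses the true nearest neighbor; visiting all leaves finds it but examines twice as many points. This mirrors the recall and latency effects listed above.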

What's next

Learn how to update and rebuild your index
Learn how to filter vector matches
Learn how to monitor an index