Filter searches by document-level relevance
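Note: This feature is a Preview offering, subject to the "Pre-GA Offerings Terms" of the [GCP Service Specific Terms](https://cloud.google.com/terms/service-terms). Pre-GA products and features may have limited support, and changes to pre-GA products and features may not be compatible with other pre-GA versions. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages). Further, by using this feature, you agree to the [Generative AI Preview terms and conditions](https://cloud.google.com/trustedtester/aitos) ("Preview Terms"). For this feature, you can process personal data as outlined in the [Cloud Data Processing Addendum](https://cloud.google.com/terms/data-processing-terms), subject to applicable restrictions and obligations in the Agreement (as defined in the Preview Terms).
When you search in your Vertex AI Search app, you can apply a relevance threshold so that only documents that meet the threshold are returned as results. This page explains how to specify a relevance threshold to reduce the number of documents returned by queries.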
About filtering by document-level relevance
Each document returned by a search query is assigned a relevance level, which indicates how relevant the document is to the query. When you make a query through an API call, you can set a relevance threshold. Setting a high relevance threshold can reduce the number of documents that a query returns.
For example, if you find that search returns too many documents that are not relevant enough for your users, set the relevance threshold to high to narrow the results to the most relevant ones. If the high setting is too restrictive, try the medium setting.
Note: This document-level relevance filtering feature is different from, and less precise than, the [document-relevance score](/generative-ai-app-builder/docs/preview-search-results#relevance-scores) that can be returned for search results.
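The tuning advice above (start with high, then relax to medium if it is too restrictive) can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the Vertex AI Search API: `search_with_fallback` and `fake_search` are hypothetical names, and the search call is passed in as a plain function so that the sketch runs without network access. Only the threshold names `HIGH`, `MEDIUM`, `LOW`, and `LOWEST` come from the API.

```python
from typing import Callable, List, Tuple

# Values accepted by the relevanceThreshold field, strictest first.
THRESHOLDS = ["HIGH", "MEDIUM", "LOW", "LOWEST"]


def search_with_fallback(
    search: Callable[[str, str], List[dict]],
    query: str,
    min_results: int = 3,
) -> Tuple[str, List[dict]]:
    """Try the strictest threshold first and relax it until enough results return.

    `search` is a hypothetical callable that takes (query, threshold) and
    returns a list of result dicts. In practice it would wrap the API call.
    """
    used = THRESHOLDS[0]
    results: List[dict] = []
    for threshold in THRESHOLDS:
        used = threshold
        results = search(query, threshold)
        if len(results) >= min_results:
            break
    return used, results


def fake_search(query: str, threshold: str) -> List[dict]:
    """Stand-in for a real search call: looser thresholds return more documents."""
    counts = {"HIGH": 1, "MEDIUM": 4, "LOW": 9, "LOWEST": 20}
    return [{"id": f"doc-{i}"} for i in range(counts[threshold])]


threshold, docs = search_with_fallback(fake_search, "check grounding", min_results=3)
print(threshold, len(docs))  # MEDIUM 4
```

Whether an automatic fallback like this is appropriate depends on the application. For some use cases, returning one highly relevant document is better than padding the list with weaker matches.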
Data types and apps supported for the document-level relevance filter
The document-level relevance filter can be applied to data stores with the following kinds of data:
- Website data with advanced website indexing
- Custom unstructured data
- Custom structured data
The document-level relevance filter doesn't work with data stores that have basic website indexing, media data, or healthcare data.
In addition, the document-level relevance filter can't be used with blended search apps. Blended search apps are apps that are connected to multiple data stores.
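The support rules above can be written down as a small pre-flight check. This is only a sketch of the rules as stated on this page. The function name and the string labels for data store kinds are hypothetical, and the API may identify these kinds differently.

```python
from typing import List

# Data store kinds listed above as supported by the relevance filter.
# These string labels are hypothetical, chosen for this sketch only.
SUPPORTED_KINDS = {
    "website_advanced_indexing",
    "custom_unstructured",
    "custom_structured",
}


def relevance_filter_supported(data_store_kinds: List[str]) -> bool:
    """Return True if an app with these data stores can use the relevance filter."""
    # Blended search apps (connected to more than one data store) are excluded.
    if len(data_store_kinds) != 1:
        return False
    return data_store_kinds[0] in SUPPORTED_KINDS


print(relevance_filter_supported(["custom_structured"]))  # True
print(relevance_filter_supported(["website_basic_indexing"]))  # False
print(relevance_filter_supported(["custom_structured", "custom_unstructured"]))  # False
```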
Other kinds of filters
The document-level relevance filter is not the only way to filter the data returned by queries. You can also use filter expressions to filter results based on metadata (in data stores with advanced website indexing and in unstructured data stores with metadata) and on field values (in structured data stores).
For more information, see:
- [Filter expressions with advanced website indexing](/generative-ai-app-builder/docs/filter-website-search#filter-expressions-advanced-indexing)
- [Filter custom search for structured or unstructured data](/generative-ai-app-builder/docs/filter-search-metadata)
If you use both a filter expression and the document-level relevance filter, the filter expression is applied to the results first, and then the document-level relevance filter is applied.
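To illustrate combining the two filters, the following Python sketch builds a search request body that carries both a filter expression and a relevance threshold. It assumes the `filter` request field and the `ANY(...)` syntax described in the filter expression documentation linked above. The helper name `build_search_body` and the `category` field are hypothetical. The sketch only builds the JSON body and does not call the API.

```python
import json
from typing import Optional

ALLOWED_THRESHOLDS = {"HIGH", "MEDIUM", "LOW", "LOWEST"}


def build_search_body(
    query: str,
    relevance_threshold: str,
    filter_expression: Optional[str] = None,
) -> dict:
    """Build a search request body (hypothetical helper, no network call)."""
    if relevance_threshold not in ALLOWED_THRESHOLDS:
        raise ValueError(f"Unsupported relevance threshold: {relevance_threshold}")
    body = {"query": query, "relevanceThreshold": relevance_threshold}
    if filter_expression:
        # Per the text above, the service applies this expression first,
        # then applies the relevance threshold to what remains.
        body["filter"] = filter_expression
    return body


# "category" is a hypothetical field in a structured data store.
body = build_search_body("return policy", "HIGH", 'category: ANY("support")')
print(json.dumps(body, indent=2))
```

Because the filter expression runs first, a strict expression combined with a `HIGH` threshold can leave very few results, so it may help to test each filter separately before combining them.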
Before you begin
Make sure that you have created an app and a data store, and that you have ingested data into the data store. For more information, see [Create a search app](/generative-ai-app-builder/docs/create-engine-es). See also the section "Data types and apps supported for the document-level relevance filter" on this page.

Search and filter results by document-level relevance
To filter by relevance, follow these steps:

Note: You can search over an app using the [`engines.servingConfigs.search`](/generative-ai-app-builder/docs/reference/rest/v1/projects.locations.collections.engines.servingConfigs/search) method, and you can search over a data store using the [`dataStores.servingConfigs.search`](/generative-ai-app-builder/docs/reference/rest/v1/projects.locations.collections.dataStores.servingConfigs/search) method. For the following procedure, Google recommends searching using the `engines.servingConfigs.search` method.

1. Find your app ID. If you already have your app ID, skip to the next step.

   1. In the Google Cloud console, go to the AI Applications page: https://console.cloud.google.com/gen-app-builder/engines
   2. On the Apps page, find the name of your app and get the app's ID from the ID column.

2. To filter search by document-level relevance, use the `relevanceThreshold` field with the [`engines.servingConfigs.search`](/generative-ai-app-builder/docs/reference/rest/v1alpha/projects.locations.collections.engines.servingConfigs/search) method.

   Key Term: In Vertex AI Search, the term app can be used interchangeably with the term engine in the context of APIs.

   ```
   curl -X POST -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
     -H "Content-Type: application/json" \
     "https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/collections/default_collection/engines/APP_ID/servingConfigs/default_search:search" \
     -d '{
       "servingConfig": "projects/PROJECT_ID/locations/global/collections/default_collection/engines/APP_ID/servingConfigs/default_search",
       "query": "QUERY",
       "relevanceThreshold": "RELEVANCE_THRESHOLD"
     }'
   ```

   Replace the following:
   - PROJECT_ID: the ID of your Google Cloud project.
   - APP_ID: the ID of the Vertex AI Search app that you want to query.
   - QUERY: the query text to search.
   - RELEVANCE_THRESHOLD: one of the following: `HIGH`, `MEDIUM`, `LOW`, `LOWEST`.

   Example command and result

   ```
   curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     "https://discoveryengine.googleapis.com/v1alpha/projects/my-project-123/locations/global/collections/default_collection/engines/my-search-app/servingConfigs/default_search:search" \
     -d '{
       "servingConfig": "projects/my-project-123/locations/global/collections/default_collection/engines/my-search-app/servingConfigs/default_search",
       "query": "What is the check grounding API",
       "relevanceThreshold": "HIGH"
     }'

   {
     "results": [
       {
         "id": "a082e70352c073a4443502477255bd2a",
         "document": {
           "name": "projects/123456/locations/global/collections/default_collection/dataStores/my-data-store/branches/0/documents/a082e70352c073a4443502477255bd2a",
           "id": "a082e70352c073a4443502477255bd2a",
           "derivedStructData": {
             "displayLink": "cloud.google.com",
             "link": "https://cloud.google.com/generative-ai-app-builder/docs/check-grounding",
             "htmlTitle": "Check grounding",
             "title": "Check grounding"
           }
         }
       }
     ],
     "totalSize": 1,
     "attributionToken": "f_B-CgwIidzwswYQyue15gESJDY2N2M1NmJkLTAwMDAtMjk3Ni1iMGI4LTg4M2QyNGZmNTZhOCIHR0VORVJJQypAjr6dFavEii3b7Ygt3o-aIoCymiLC8J4Vo4CXIra3jC3Usp0V24-aIt7tiC3n7YgtrsSKLeTtiC2DspoixsvzFw",
     "guidedSearchResult": {},
     "summary": {}
   }
   ```

   Here, the relevance threshold is set to `HIGH`, so only the most relevant results are returned. In this example, only one document was determined to be highly relevant.

3. Test multiple queries with different thresholds to determine the best threshold setting for your data and application.

Last updated: 2025-08-21 (UTC).
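The curl command in the procedure can also be expressed with the Python standard library. The following is a sketch under stated assumptions, not an official client: `build_search_request` and `result_links` are hypothetical helper names, the access token is a placeholder that you would obtain separately (for example, from the `gcloud` command used in the curl example), and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_ROOT = "https://discoveryengine.googleapis.com/v1alpha"


def build_search_request(project_id, app_id, query, relevance_threshold, access_token):
    """Build (but do not send) the same POST request as the curl example."""
    serving_config = (
        f"projects/{project_id}/locations/global/collections/default_collection"
        f"/engines/{app_id}/servingConfigs/default_search"
    )
    body = {
        "servingConfig": serving_config,
        "query": query,
        "relevanceThreshold": relevance_threshold,
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/{serving_config}:search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def result_links(response):
    """Collect (title, link) pairs from a parsed search response."""
    pairs = []
    for result in response.get("results", []):
        data = result.get("document", {}).get("derivedStructData", {})
        pairs.append((data.get("title"), data.get("link")))
    return pairs


request = build_search_request(
    "my-project-123",
    "my-search-app",
    "What is the check grounding API",
    "HIGH",
    "ACCESS_TOKEN",  # placeholder, replace with a real token
)
print(request.full_url)

# To send the request and read the links (requires valid credentials):
# with urllib.request.urlopen(request) as resp:
#     print(result_links(json.load(resp)))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body for different thresholds before making any billable calls.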