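Get search summaries

This page describes how to use the API to get search summaries with your search results, and describes the options available for search summaries. Search summaries are available only for unstructured data and website data.

To learn how to get generative AI answers for healthcare data queries, see Search using natural-language query with generative AI answer.

Before you begin

Depending on the type of app you have, meet the following requirements: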

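Get search summaries

A search summary is a short summary of the top one or more search results returned in a search response. The summary itself is derived from the extractive answers returned in the response. Therefore, to get summaries, you must also get extractive answers with your search results. For more information, see Get extractive answers (Preview).

The following image shows a summary for a query over PDFs in a data store, with summaryResultCount set to 5. Summary content can vary depending on your app configuration.

The query is "define operating expenses". The Search summary section shows a summary drawn from the top results.
Figure 1: Example widget with a search summary.

Search summaries can contain Markdown-formatted text. Consider using a Markdown parser in your app to render that text.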

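To get a search summary, follow these steps:

  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies values for summaryResultCount and maxExtractiveAnswerCount. For more information about submitting search requests, see Get search results.

    In the following example, summarySpec indicates that you want a search summary and that the summary should be generated from the top three search results.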

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
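    • summaryResultCount: The number of top search results to generate the search summary from. If fewer results are returned than summaryResultCount, the summary is generated from all of the results.

    • maxExtractiveAnswerCount: The number of extractive answers to return for each search result. The default value is 0 and the maximum value is 1.

  2. Get the summary from the search response. A summary property is returned with each response.

    The following is an example of a summary returned at the end of a search response: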

    "summary":
    {
      "summaryText": "BigQuery is Google Cloud's fully managed and completely
      serverless enterprise data warehouse. BigQuery supports all data types,
      works across clouds, and has built-in machine learning and business
      intelligence, all within a unified platform."
    }
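The two steps above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names build_summary_request and extract_summary_text are hypothetical, and the response is assumed to be already parsed into a dict. Only the field names come from the examples on this page.

```python
import json


def build_summary_request(query, summary_result_count=3, max_extractive_answer_count=1):
    """Build a search request body that asks for a search summary.

    Hypothetical helper; the field names mirror the JSON example above.
    """
    return {
        "query": query,
        "contentSearchSpec": {
            "summarySpec": {"summaryResultCount": summary_result_count},
            "extractiveContentSpec": {
                "maxExtractiveAnswerCount": max_extractive_answer_count
            },
        },
    }


def extract_summary_text(response):
    """Return summary.summaryText from a parsed response, or None if it is absent."""
    return response.get("summary", {}).get("summaryText")


if __name__ == "__main__":
    # Print the request body that would be sent as the POST payload.
    print(json.dumps(build_summary_request("what is bigquery"), indent=2))
```

Returning None for a missing summary lets the caller decide whether to fall back to showing only the search results.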
    

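Generate summaries from semantic chunks

You can turn on useSemanticChunks to generate summaries from the most relevant document segments. Compared to the default behavior of using extractive answers, generating summaries from semantic chunks can improve recall and retrieval.

When semantic chunking is turned on for summaries, the response returns the summary along with the content of each chunk that the summary used.

To generate summaries from semantic chunks, follow these steps:

  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies "useSemanticChunks": true. For more information about submitting search requests, see Get search results.

    The following summarySpec example indicates that you want the search summary to use semantic chunks, how many results to include, and whether to include citations.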

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "useSemanticChunks": SEMANTIC_CHUNK_BOOLEAN,
         "summaryResultCount": SUMMARY_RESULT_COUNT,
         "includeCitations": CITATIONS_BOOLEAN
       }
     }
    
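    • SEMANTIC_CHUNK_BOOLEAN: A boolean that specifies whether to use semantic chunks to generate the search summary. If set to true, semantic chunks are used.
    • SUMMARY_RESULT_COUNT: The number of top results to generate the search summary from. The maximum value is 10.
    • CITATIONS_BOOLEAN: A boolean that specifies whether to return citations. If you turned on chunking mode when you created the data store, citations refer to chunks. Otherwise, citations refer to source documents. For more information about chunking mode, see Parse and chunk documents.
  2. Get the summary from the search response.

    The following is an example of a search response that contains a summary generated from chunks and includes citations. The references section of the response contains the content of the chunks that the summary was generated from.

    Response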

    {
      "results": [
        {
          "id": "123xyz",
          "document": {
            "name": "projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/branches/0/documents/123xyz",
            "id": "123xyz",
            "derivedStructData": {
              "link": "gs://examplebucket/alphabet-investor-pdfs/2004_google_annual_report.pdf"
            }
          }
        }
      ],
      "totalSize": 8375,
      "attributionToken": "abcdefg",
      "nextPageToken": "hijklmnop",
      "guidedSearchResult": {},
      "summary": {
        "summaryText": "Google's search technology uses a combination of techniques to determine the importance of a web page independent of a particular search query and to determine the relevance of that page to a particular search query. [1]",
        "summaryWithMetadata": {
          "summary": "Google's search technology uses a combination of techniques to determine the importance of a web page independent of a particular search query and to determine the relevance of that page to a particular search query.",
          "citationMetadata": {
            "citations": [
              {
                "endIndex": "216",
                "sources": [
                  {}
                ]
              }
            ]
          },
          "references": [
            {
              "document": "projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/branches/0/documents/123xyz",
              "chunkContents": [
                {
                  "content": "Groups contains more than 1 billion messages from Usenet Internet discussion groups dating back to 1981.The\ndiscussions in these groups cover a broad range of discourse and provide a comprehensive look at evolving\nviewpoints, debate and advice on many subjects.The new Google Groups adds in the ability to create your own\ngroups for you and your friends and an improved user interface.Google Mobile.Google Mobile offers people the ability to search and view both the \"mobile web,\"\nconsisting of pages created specifically for wireless devices, and the entire Google index of more than 8 billion\nweb pages.Google Mobile works on devices that support WAP, WAP 2.0, i-mode or j-sky mobile Internet\nprotocols.In addition, users can access a variety of information using Google SMS by typing a query to the\nGoogle shortcode.Google Mobile is available through many wireless and mobile phone services worldwide.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Google Labs is our playground for our engineers and for adventurous Google users.On Google\nLabs, we post product prototypes and solicit feedback on how the technology could be used or improved.Current Google Labs examples include:Google Personalized Search—provides customized search results based on an individual user's interests.Froogle Wireless—gives people the ability to search for product information from their mobile phones\nand other wireless devices.Google Maps—enables users to see maps, get directions, and find local businesses and services quickly\nand easily.Google Maps has several unique features, including draggable maps, integrated local search\nfrom Google Local, and keyboard shortcuts.Google Scholar—enables users to search specifically for scholarly literature, including peer-reviewed\npapers, theses, books, preprints, abstracts and technical reports from all broad areas of research.Google\nScholar can be used to find articles from a wide variety of academic publishers, professional societies,\npreprint repositories and universities, as well as scholarly articles available across the web.Google Suggest—guesses what you're typing and offers suggestions in real time.This is similar to\nGoogle's \"Did you mean?\"feature that offers alternative spellings for your query after you search, except\nthat it works in real time.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Groups contains more than 1 billion messages from Usenet Internet discussion groups dating back to 1981.The\ndiscussions in these groups cover a broad range of discourse and provide a comprehensive look at evolving\nviewpoints, debate and advice on many subjects.The new Google Groups adds in the ability to create your own\ngroups for you and your friends and an improved user interface.Google Mobile.Google Mobile offers people the ability to search and view both the \"mobile web,\"\nconsisting of pages created specifically for wireless devices, and the entire Google index of more than 8 billion\nweb pages.Google Mobile works on devices that support WAP, WAP 2.0, i-mode or j-sky mobile Internet\nprotocols.In addition, users can access a variety of information using Google SMS by typing a query to the\nGoogle shortcode.Google Mobile is available through many wireless and mobile phone services worldwide.\n\nGoogle Local.Google Local enables users to find relevant local businesses near a city, postal code, or specific\naddress.This service combines Yellow Page listings with information found on web pages, and plots their\nlocations on interactive maps.Google Print.Google Print brings information online that had previously not been available to web\nsearchers.Under this program, we enable a number of publishers to host their content and show their\npublications at the top of our search results.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Votes cast by important web pages with high PageRank weigh more heavily and are\nmore influential in deciding the PageRank of pages on the web.Text-Matching Techniques.Our technology employs text-matching techniques that compare search queries\nwith the content of web pages to help determine relevance.Our text-based scoring techniques do far more than\ncount the number of times a search term appears on a web page.For example, our technology determines the\nproximity of individual search terms to each other on a given web page, and prioritizes results that have the\nsearch terms near each other.Many other aspects of a page's content are factored into the equation, as is the\ncontent of pages that link to the page in question.By combining query independent measures such as PageRank\nwith our text-matching techniques, we are able to deliver search results that are relevant to what people are\ntrying to find.\n\nAdvertising Technology\nOur advertising program serves millions of relevant, targeted ads each day based on search terms people\n\nenter or content they view on the web.The key elements of our advertising technology include:\n\nGoogle AdWords Auction System.We use the Google AdWords auction system to enable advertisers to\nautomatically deliver relevant, targeted advertising.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Votes cast by important web pages with high PageRank weigh more heavily and are\nmore influential in deciding the PageRank of pages on the web.Text-Matching Techniques.Our technology employs text-matching techniques that compare search queries\nwith the content of web pages to help determine relevance.Our text-based scoring techniques do far more than\ncount the number of times a search term appears on a web page.For example, our technology determines the\nproximity of individual search terms to each other on a given web page, and prioritizes results that have the\nsearch terms near each other.Many other aspects of a page's content are factored into the equation, as is the\ncontent of pages that link to the page in question.By combining query independent measures such as PageRank\nwith our text-matching techniques, we are able to deliver search results that are relevant to what people are\ntrying to find.\n\nAdvertising Technology\nOur advertising program serves millions of relevant, targeted ads each day based on search terms people\n\nenter or content they view on the web.The key elements of our advertising technology include:",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Google Maps—enables users to see maps, get directions, and find local businesses and services quickly\nand easily.Google Maps has several unique features, including draggable maps, integrated local search\nfrom Google Local, and keyboard shortcuts.Google Scholar—enables users to search specifically for scholarly literature, including peer-reviewed\npapers, theses, books, preprints, abstracts and technical reports from all broad areas of research.Google\nScholar can be used to find articles from a wide variety of academic publishers, professional societies,\npreprint repositories and universities, as well as scholarly articles available across the web.Google Suggest—guesses what you're typing and offers suggestions in real time.This is similar to\nGoogle's \"Did you mean?\"feature that offers alternative spellings for your query after you search, except\nthat it works in real time.Google Video—includes thousands of programs that play on our TVs every day.Google Video enables\nyou to search a growing archive of televised content—everything from sports to dinosaur\ndocumentaries to news shows.\n\n6",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Every search query we process involves the automated\nexecution of an auction, resulting in our advertising system often processing hundreds of millions of auctions per\nday.To determine whether an ad is relevant to a particular query, this system weighs an advertiser's willingness\nto pay for prominence in the ad listings (the CPC) and interest from users in the ad as measured by the click\nthrough rate and other factors.If an ad does not attract user clicks, it moves to a less prominent position on the\npage, even if the advertiser offers to pay a high amount.This prevents advertisers with irrelevant ads from\n\"squatting\" in top positions to gain exposure.Conversely, more relevant, well-targeted ads that are clicked on\nfrequently move up in ranking, with no need for advertisers to increase their bids.Because we are paid only\nwhen users click on ads, the AdWords ranking system aligns our interests equally with those of our advertisers\nand our users.The more relevant and useful the ad, the better for our users, for our advertisers and for us.\n\nThe AdWords auction system also incorporates our AdWords discounter, which automatically lowers the\namount advertisers actually pay to the minimum needed to maintain their ad position.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Web Search Technology\nOur web search technology uses a combination of techniques to determine the importance of a web page\nindependent of a particular search query and to determine the relevance of that page to a particular search\nquery.We do not explain how we do ranking in great detail because some people try to manipulate our search\nresults for their own gain, rather than in an attempt to provide high-quality information to users.\n\nRanking Technology.One element of our technology for ranking web pages is called PageRank.While we\ndeveloped much of our ranking technology after Google was formed, PageRank was developed at Stanford\nUniversity with the involvement of our founders, and was therefore published as research.Most of our current\nranking technology is protected as trade-secret.PageRank is a query-independent technique for determining the\nimportance of web pages by looking at the link structure of the web.PageRank treats a link from web page A to\nweb page B as a \"vote\" by page A in favor of page B.The PageRank of a page is the sum of the PageRank of the\npages that link to it.The PageRank of a web page also depends on the importance (or PageRank) of the other\nweb pages casting the votes.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "The Company recognizes as revenue the fees charged advertisers each time a user clicks on one of the text\nbased ads that are displayed next to the search results on Google web sites.Effective January 1, 2004, the\nCompany offered a single pricing structure to all of its advertisers based on the AdWords cost per click model.\n\nGoogle AdSense is the program through which the Company distributes its advertisers' text-based ads for\ndisplay on the web sites of the Google Network members.In accordance with Emerging Issues Task Force\n(\"EITF\") Issue No. 99 19, Reporting Revenue Gross as a Principal Versus Net as an Agent, the Company recognizes\nas revenues the fees it receives from its advertisers.This revenue is reported gross primarily because the\nCompany is the primary obligor to its advertisers.\n\nThe Company generates fees from search services through a variety of contractual arrangements, which\ninclude per-query search fees and search service hosting fees.Revenues from set up and support fees and search\nservice hosting fees are recognized on a straight-line basis over the term of the contract, which is the expected\nperiod during which these services will be provided.The Company's policy is to recognize revenues from per\nquery search fees in the period queries are made and results are delivered.\n\nThe Company provides search services pursuant to certain AdSense agreements.",
                  "pageIdentifier": "85"
                },
                {
                  "content": "On Google Print pages, we provide links to book sellers that may\noffer the full versions of these publications for sale, and we show content-targeted ads that are served through\nthe Google AdSense program.Google Desktop Search.Google Desktop Search enables our users to perform a full text search on the\ncontents of their own computer, including email, files, instant messenger chats and web browser history.Users\ncan use this service to view web pages they have visited even when they are not online.Google Alerts.Google Alerts are email updates of the latest relevant Google results (web, news, etc.) based\non the user's choice of query or topic.Typical uses include monitoring a developing news story, keeping current\non a competitor or industry, getting the latest on a celebrity or event, or keeping tabs on a favorite sports team.Google Labs.Google Labs is our playground for our engineers and for adventurous Google users.On Google\nLabs, we post product prototypes and solicit feedback on how the technology could be used or improved.Current Google Labs examples include:Google Personalized Search—provides customized search results based on an individual user's interests.Froogle Wireless—gives people the ability to search for product information from their mobile phones\nand other wireless devices.",
                  "pageIdentifier": "17"
                }
              ]
            }
          ]
        }
      }
    }
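Because a response like this one can return many chunks, some of them overlapping, it can help to group the chunk text before displaying it. The following Python sketch is one possible approach. It assumes the response is already parsed into a dict shaped like the example above; chunks_by_page is a hypothetical helper name.

```python
from collections import OrderedDict


def chunks_by_page(response):
    """Group chunk text from summary.summaryWithMetadata.references by pageIdentifier.

    Hypothetical helper: assumes the parsed response has the same shape as
    the example response above. Verbatim duplicate chunks are skipped.
    """
    pages = OrderedDict()
    metadata = response.get("summary", {}).get("summaryWithMetadata", {})
    for reference in metadata.get("references", []):
        for chunk in reference.get("chunkContents", []):
            page = chunk.get("pageIdentifier", "")
            content = chunk.get("content", "")
            bucket = pages.setdefault(page, [])
            if content not in bucket:
                bucket.append(content)
    return pages
```

Only exact duplicates are dropped. Chunks that partially overlap, as several do in the example, are kept as separate entries.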

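Get citations

Citations, if specified, are numbers placed inline in the search summary. These numbers indicate which search results specific sentences in the summary are sourced from.

To get citations, follow these steps:

  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies "includeCitations": true. For more information about submitting search requests, see Get search results.

    In the following example, summarySpec indicates that you want a search summary, that the summary should be generated from the top three search results, and that the summary should include citations.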

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "includeCitations": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
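    • summaryResultCount: The number of top search results to generate the search summary from. If fewer results are returned than summaryResultCount, the summary is generated from all of the results. The maximum value is 5.
    • includeCitations: A boolean that specifies whether to return citations.
    • maxExtractiveAnswerCount: The number of extractive answers to return for each search result. The default value is 0 and the maximum value is 1.
  2. Get the summary, including citations, from the search response. A summary property is returned with each response.

    The following is an example of a summary returned at the end of a search response, with citations and citation metadata: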

    "summary": {
     "summaryText": "BigQuery is Google Cloud's fully managed and completely
      serverless enterprise data warehouse [1]. BigQuery supports all data types,
      works across clouds, and has built-in machine learning and business
      intelligence, all within a unified platform [2, 3].",
     "summaryWithMetadata": {
       "summary": "BigQuery is Google Cloud's fully managed and completely
       serverless enterprise data warehouse. BigQuery supports all data types,
       works across clouds, and has built-in machine learning and business
       intelligence, all within a unified platform.",
       "citationMetadata": {
         "citations": [
           {
             "startIndex": "0",
             "endIndex": "101",
             "sources": [
               {
                 "uri": "gs://example-dataset/html/6344007140738632642.html",
                 "title": "About BigQuery",
                 "id": "b6344007140738632642",
                 "referenceIndex": "0"
               },
               {
                 "uri": "gs://example-dataset/html/1365490014946172719.html",
                 "title": "Google Cloud article",
                 "id": "b1365490014946172719",
                 "referenceIndex": "1"
               },
               {
                 "uri": "gs://example-dataset/html/2687910668117268120.html",
                 "title": "BigQuery document",
                 "id": "a2687910668117268120",
                 "referenceIndex": "2"
               }
             ]
           },
           {
             "startIndex": "103",
             "endIndex": "230",
             "sources": [
               {
                 "referenceIndex": "0"
               },
               {
                 "referenceIndex": "1"
               },
               {
                 "referenceIndex": "2"
               }
             ]
           }
         ]
       },
       "references": [
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/b6344007140738632642",
         "uri": "https://example.com/bigqueryA"
       },
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/b1365490014946172719",
         "uri": "https://example.com/bigqueryB"
       },
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/a268791066811726812",
         "uri": "https://example.com/bigqueryC"
       }
     ]
    }
    }
    
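    • summaryText: The search summary, including citation numbers. Citation numbers refer to the returned search results and are numbered starting at 1. For example, [1] means that the sentence is attributed to the first search result. [2, 3] means that the sentence is attributed to both the second and third search results.
    • citations: Lists the citation metadata for each sentence in the summary that has a citation.
    • startIndex: Indicates the start of the sentence, measured in Unicode bytes.
    • endIndex: Indicates the end of the sentence, measured in Unicode bytes.
    • sources: Lists the referenceIndex of each source cited for the sentence. referenceIndex is the number assigned to a source. The referenceIndex of the first source isn't always returned explicitly in the response. Because referenceIndex numbering starts at 0, the referenceIndex of the first source is always 0.
    • references: Lists the metadata for each reference cited in the summary. The metadata includes title, docName, and uri.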
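The fields described above can be combined to pair each cited sentence with its references. The following Python sketch shows one way to do that, under several assumptions that this page doesn't confirm: the indexes are treated as UTF-8 byte offsets, endIndex is treated as exclusive, and a missing startIndex or referenceIndex is treated as 0. The function name resolve_citations is hypothetical.

```python
def resolve_citations(summary):
    """Pair each cited segment of the summary with the references it cites.

    `summary` is the parsed "summary" object from a search response.
    Assumptions (not confirmed by this page): indexes are UTF-8 byte
    offsets, endIndex is exclusive, and a missing startIndex or
    referenceIndex means 0.
    """
    metadata = summary.get("summaryWithMetadata", {})
    text_bytes = metadata.get("summary", "").encode("utf-8")
    references = metadata.get("references", [])
    resolved = []
    for citation in metadata.get("citationMetadata", {}).get("citations", []):
        start = int(citation.get("startIndex", 0))
        end = int(citation.get("endIndex", len(text_bytes)))
        segment = text_bytes[start:end].decode("utf-8", errors="replace")
        indexes = [int(s.get("referenceIndex", 0)) for s in citation.get("sources", [])]
        # Ignore indexes that don't point at a returned reference.
        cited = [references[i] for i in indexes if 0 <= i < len(references)]
        resolved.append({"segment": segment, "references": cited})
    return resolved
```

Slicing the encoded bytes rather than the string keeps the offsets consistent when the summary contains non-ASCII characters. If the offsets turn out to be character-based in your responses, slice the string instead.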

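Ignore adversarial queries

Adversarial queries contain negative comments or are designed to generate unsafe, policy-violating output. You can specify that no search summary should be returned for adversarial queries. When an adversarial query is ignored, the summaryText property contains boilerplate text indicating that no search summary is returned. Search documents are still returned for adversarial queries, but a search summary isn't.

To specify that no search summary should be returned for adversarial queries, follow these steps:

  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies "ignoreAdversarialQuery": true. For more information about submitting search requests, see Get search results.

    In the following example, summarySpec indicates that you want a search summary, that the summary should be generated from the top three search results, but that no summary should be returned for adversarial queries.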

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "ignoreAdversarialQuery": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
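    • summaryResultCount: The number of top search results to generate the search summary from. If fewer results are returned than summaryResultCount, the summary is generated from all of the results. The maximum value is 5.
    • ignoreAdversarialQuery: A boolean that specifies that no search summary should be returned for adversarial queries.
    • maxExtractiveAnswerCount: The number of extractive answers to return for each search result. The default value is 0 and the maximum value is 1.
  2. Review the summary property returned for an adversarial search request.

    For example: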

    "summary":
    {
      "summaryText": "We do not have a summary for your query. Here are some
      search results.",
      "summarySkippedReasons": [
        "ADVERSARIAL_QUERY_IGNORED"
      ]
    }
    
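    • summaryText: Boilerplate text indicating that no search summary is returned.
    • summarySkippedReasons: An enumeration of values that give the reasons why the summary was skipped.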

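Ignore non-summary seeking queries

Non-summary seeking queries return results that aren't suitable for summarization. For example, "Why is the sky blue?" and "Who is the best soccer player in the world?" are summary-seeking queries, but "SFO airport" and "world cup 2026" aren't. Such queries are most likely navigational queries. You can specify that no search summary should be returned for non-summary seeking queries. Search documents are still returned for non-summary seeking queries, but a search summary isn't.

To specify that no search summary should be returned for non-summary seeking queries, follow these steps:

  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies "ignoreNonSummarySeekingQuery": true. For more information about submitting search requests, see Get search results.

    In the following example, summarySpec indicates that you want a search summary, that the summary should be generated from the top three search results, but that no summary should be returned for non-summary seeking queries.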

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "ignoreNonSummarySeekingQuery": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
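    • summaryResultCount: The number of top search results to generate the search summary from. If fewer results are returned than summaryResultCount, the summary is generated from all of the results. The maximum value is 5.
    • ignoreNonSummarySeekingQuery: A boolean that specifies that no search summary should be returned for non-summary seeking queries.
    • maxExtractiveAnswerCount: The number of extractive answers to return for each search result. The default value is 0 and the maximum value is 1.
  2. Review the summary property returned for a non-summary seeking search request.

    For example: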

    "summary":
    {
      "summaryText": "We do not have a summary for your query. Here are some
      search results.",
      "summarySkippedReasons": [
        "NON_SUMMARY_SEEKING_QUERY_IGNORED"
      ]
    }
    
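    • summaryText: Boilerplate text indicating that no search summary is returned.
    • summarySkippedReasons: An enumeration of values that give the reasons why the summary was skipped.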
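When a summary is skipped, summaryText still holds boilerplate text, so an app may prefer to check summarySkippedReasons before rendering anything. The following Python sketch shows one way to do that for a response parsed into a dict. The helper name usable_summary is hypothetical, and the only enum values used in the example are the two shown on this page.

```python
def usable_summary(response):
    """Return (summary_text, skipped_reasons) for a parsed search response.

    Hypothetical helper: summary_text is None when summarySkippedReasons
    is non-empty, so boilerplate text isn't shown as a real summary.
    """
    summary = response.get("summary", {})
    reasons = list(summary.get("summarySkippedReasons", []))
    if reasons:
        return None, reasons
    return summary.get("summaryText"), []


if __name__ == "__main__":
    skipped = {
        "summary": {
            "summaryText": "We do not have a summary for your query. "
            "Here are some search results.",
            "summarySkippedReasons": ["NON_SUMMARY_SEEKING_QUERY_IGNORED"],
        }
    }
    text, reasons = usable_summary(skipped)
    print(text, reasons)
```

Returning the reasons alongside the text lets the caller log or branch on ADVERSARIAL_QUERY_IGNORED and NON_SUMMARY_SEEKING_QUERY_IGNORED separately.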

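Get custom summaries

You can get customized summaries by providing natural-language instructions. You can enable this feature for the search widget by using the console, or for the API by entering instructions in the modelPromptSpec.preamble field.

You can request customizations such as length, level of detail, output style (such as "simple"), output language, focus of the answer, and format (such as tables, bullets, and XML).

For example, for the query What are the benefits of bigquery?, you can enter custom instructions like the following examples:

  • Example instruction: please show the answer format in an ordered list

    Example answer: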

    The benefits of BigQuery are:
    
    1. Increased business productivity
    2. IT staff productivity
    3. Reduced infrastructure costs
    4. Loading and exporting data are free
    
  • Example instruction: give a very detailed answer

    Example answer:

    BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse that enables businesses to analyze all their data very quickly. It is a fully managed service that takes care of provisioning, scaling, and managing the underlying infrastructure, so you can focus on analyzing your data. BigQuery is also integrated with other Google Cloud services, such as Google Kubernetes Engine, Cloud Data Fusion, and Cloud Dataproc, making it easy to build and deploy data pipelines.
    
    Here are some of the benefits of using BigQuery:
    
    * **Fast and scalable:** BigQuery can process petabytes of data very quickly, and it can scale to handle even the most demanding workloads.
    * **Cost-effective:** BigQuery is a very cost-effective way to store and analyze data. You only pay for the data that you use, and there are no upfront costs or commitments.
    * **Secure:** BigQuery is a secure platform that meets the needs of even the most security-conscious organizations.
    * **Easy to use:** BigQuery is easy to use, even for non-technical users. It has a simple and intuitive user interface, and it supports a variety of data sources.
    * **Integrated with other Google Cloud services:** BigQuery is integrated with other Google Cloud services, making it easy to build and deploy data pipelines.
    
    If you are looking for a fast, scalable, and cost-effective way to analyze your data, then BigQuery is a great option.
    

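Best practices for custom summaries

If you plan to use this feature, do the following:

  • Request only one customization at a time. Don't combine customizations, such as asking for an HTML table in French.
  • Google recommends that you limit the customizations that end users can request, for example, by offering a selector with a set of predefined customizations.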
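One way to follow both practices is to map a small set of preset names to fixed instructions, so that end users pick a preset rather than typing free-form text. The following Python sketch illustrates the idea. The preset keys and the function name summary_spec_for_preset are hypothetical; the instruction strings are the examples used on this page, and the output is a summarySpec object that carries the instruction in modelPromptSpec.preamble.

```python
# Hypothetical preset names mapped to one instruction each, so a request
# never combines several customizations.
PRESET_PREAMBLES = {
    "ordered_list": "please show the answer format in an ordered list",
    "detailed": "give a very detailed answer",
    "simple": "explain like you would to a ten year old",
}


def summary_spec_for_preset(preset, summary_result_count=3):
    """Build a summarySpec object for one of the predefined customizations."""
    if preset not in PRESET_PREAMBLES:
        raise ValueError(f"Unknown preset: {preset!r}")
    return {
        "summaryResultCount": summary_result_count,
        "modelPromptSpec": {"preamble": PRESET_PREAMBLES[preset]},
    }
```

Rejecting unknown presets with an error keeps arbitrary user text out of the preamble, which is the point of offering a selector.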

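Customize summaries

You can use the console to get customized summaries for the search widget only, or use the API to get customized summaries for any search request.

To get customized summaries, follow these steps:

Console

  1. In the Google Cloud console, go to the Agent Builder page.

    Agent Builder

  2. Click the name of the app that you want to edit.

  3. Go to Configurations > UI.

  4. Make sure that the search type for the search widget is set to Search with an answer or Search with follow-ups. If Search is selected, this feature isn't available.

  5. Turn on Enable summary customization.

  6. To enter summary instructions, do one of the following:

    • Enter free-form instructions: Enter your own natural-language instructions in the Preamble field.
    • Use template instructions: Click Replace with template and select one of the predefined template instructions. After you select a predefined template, it appears in the Preamble field.
  7. Run a search in the Preview pane to test how customized summaries are generated for your app.

  8. To reset to the last saved set of instructions, click Reset preamble.

  9. To save the settings to the widget, click Save and publish.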

REST

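  1. Submit a search request that includes contentSearchSpec.summarySpec and specifies the custom instructions in modelPromptSpec.preamble. For more information about submitting search requests, see Get search results.

    In the following example, summarySpec indicates that you want a search summary, that the summary should be generated from the top three search results, and that the summary should be customized to explain as you would to a ten-year-old.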

    "contentSearchSpec":
      {
        "summarySpec":
        {
          "summaryResultCount": 3,
          "modelPromptSpec":
          {
            "preamble": "explain like you would to a ten year old"
          }
        }
      }
    
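    • summaryResultCount: The number of top search results to generate the search summary from. If fewer results are returned than summaryResultCount, the summary is generated from all of the results. The maximum value is 5.
    • preamble: The custom instructions.
  2. Get the customized summary from the search response.

    The following is an example of a returned customized summary: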

    "summary":
    {
      "summaryText": "BigQuery is a serverless data warehouse that helps you
      analyze all your data very quickly. It's very easy to use and you don't
      need to worry about managing servers or infrastructure. BigQuery is also
      very scalable, so you can analyze large datasets without any problems."
    }
    
    • summaryText: The customized search summary.

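Specify the summarization model

You can specify the model to use for generating summaries.

You can specify stable, preview, or a specific model version by name. For the available model versions, see Answer generation model versions and lifecycle.

To change the model version, do the following:

  1. Submit a search request that includes ContentSearchSpec.SummarySpec.ModelSpec to specify the model version.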

    "contentSearchSpec": {
      "summarySpec": {
        "modelSpec": {
          "version": "MODEL_VERSION"
         }
       }
     }
    
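    • MODEL_VERSION: Specifies which model to use to generate the summary. The supported values are as follows:

      • stable: String. This is the default if no value is specified. stable points to a GA model version that is fine-tuned for answer generation. The model that stable points to changes as new GA model versions are released and older model versions are retired. For the version that stable currently points to, see Answer generation model versions and lifecycle.
      • preview: String. preview points to the latest Gemini model for question answering. For more information about Gemini, see Model overview.
      • To specify a particular model version, enter the version name, for example, gemini-1.5-flash-002/answer_gen/v1. For the supported versions, see Answer generation model versions and lifecycle.

For example, the following search request specifies preview as the model version: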

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1/projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/servingConfigs/default_search:search" \
-d '{
  "query": "what is bigquery",
  "contentSearchSpec": {
    "summarySpec": {
      "modelSpec": {
        "version": "preview"
      }
    }
  }
}'
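The same request can be assembled in Python with only the standard library. The following sketch builds the request without sending it. The function name build_search_request and its parameters are hypothetical, the URL pattern is copied from the curl command above, and the access token is assumed to come from somewhere such as gcloud auth print-access-token.

```python
import json
import urllib.request


def build_search_request(project, data_store, access_token, query, model_version="preview"):
    """Build (but don't send) a POST request that selects the summary model version."""
    url = (
        "https://discoveryengine.googleapis.com/v1/"
        f"projects/{project}/locations/global/collections/default_collection/"
        f"dataStores/{data_store}/servingConfigs/default_search:search"
    )
    body = {
        "query": query,
        "contentSearchSpec": {
            "summarySpec": {"modelSpec": {"version": model_version}}
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and parse the response body with json.loads. That step needs a valid token and network access, so it isn't shown here.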

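Search summary limitations

When you use search summaries, you might encounter the following limitations:

  • Because LLMs are used to generate search summaries and citations, the limitations of LLMs also apply to Vertex AI Search summaries.

    For general information about these LLM limitations, see PaLM API limitations in the Vertex AI documentation.

  • If a search query requires complex logical or analytical reasoning, or an understanding of the world, the search summary might contain incorrect information (hallucinations) or information that isn't present in your unstructured data or website data.

  • Some statements in a search summary might not have citations:

    • If the system determines that a statement doesn't need grounding, no citation is added. Sentences such as "Here is what I found" or "There are several approaches you can take" don't have citations.

    • A missing citation can also mean that no valid reference was found. Facts without citations might be unreliable.

  • In rare cases, a citation might be attributed to the wrong statement.

  • LLMs might parse complex documents incorrectly. In that case, the summary might be incomplete or incorrect.

  • Because custom instructions are written in natural language, there's no guarantee that the instructions are followed for every request.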