获取搜索摘要

本页介绍了如何使用 API 在搜索结果中获取搜索摘要。本文还介绍了搜索摘要提供的选项。 仅适用于非结构化数据和网站数据。

如需了解如何针对医疗保健数据查询获取生成式 AI 回答,请参阅使用自然语言查询并获取生成式 AI 回答

准备工作

根据您拥有的应用类型,完成以下要求:

获取搜索摘要

搜索摘要是指搜索响应中返回的排在前面的一个或多个搜索结果的简短摘要。摘要本身是从响应中返回的提取式答案中提取的。因此,如需获取摘要,您还必须在搜索结果中获取提取式答案。如需了解详情,请参阅获取提取式答案(预览版)

下图显示了当数据存储区中的 PDF 通过将 summaryResultCount 设置为 5 进行查询时,系统显示的摘要。摘要内容可能会因应用配置而异。

查询是“定义运营费用”。搜索摘要部分会显示从热门结果中提取的摘要。
图 1:包含搜索摘要的示例 widget。

搜索摘要可以包含 Markdown 格式的文本和 Markdown 解析器通常可以理解的简单 HTML 标记。因此,请考虑在应用中使用 Markdown 解析器来渲染 Markdown 文本。

如需获取搜索摘要,请按以下步骤操作:

  1. 提交包含 contentSearchSpec.summarySpec 的搜索请求,并为 summaryResultCountmaxExtractiveAnswerCount 指定值。 如需详细了解如何提交搜索请求,请参阅获取搜索结果

    在以下示例中,summarySpec 表示您需要搜索摘要,并且该摘要应根据前三条搜索结果生成。

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
    • summaryResultCount:用于生成搜索摘要的热门结果数量。如果返回的结果数量少于 summaryResultCount,则系统会根据所有结果生成摘要。

    • maxExtractiveAnswerCount:要为每个搜索结果返回的提取式回答的数量。默认值为 0,最大值为 1。

  2. 从搜索响应中获取摘要。每个响应中都会返回一个 summary 属性。

    以下是搜索响应末尾返回的摘要示例:

    "summary":
    {
      "summaryText": "BigQuery is Google Cloud's fully managed and completely
      serverless enterprise data warehouse. BigQuery supports all data types,
      works across clouds, and has built-in machine learning and business
      intelligence, all within a unified platform."
    }
    

根据语义块生成摘要

您可以开启 use_semantic_chunks,以便根据最相关的文档块生成摘要。与使用提取式答案的默认行为相比,使用语义块生成摘要可提高召回率和检索率。

如果为摘要开启了语义分块,则响应会返回摘要以及摘要所使用的每个块的内容。

如需使用语义块生成摘要,请按以下步骤操作:

  1. 提交包含 contentSearchSpec.summarySpec 并指定 "use_semantic_chunks": true 的搜索请求。如需详细了解如何提交搜索请求,请参阅获取搜索结果

    以下 summarySpec 示例表明,您希望搜索摘要使用语义块,并指定了要包含的结果数量以及是否包含引用。

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "useSemanticChunks": SEMANTIC_CHUNK_BOOLEAN,
         "summaryResultCount": SUMMARY_RESULT_COUNT,
         "includeCitations": CITATIONS_BOOLEAN,
       }
     }
    
    • SEMANTIC_CHUNK_BOOLEAN:一个布尔值,用于指定是否使用语义块来生成搜索摘要。如果设置为 true,则使用语义块。
    • SUMMARY_RESULT_COUNT:用于生成搜索摘要的热门结果数量。最大值为 10
    • CITATIONS_BOOLEAN:一个布尔值,用于指定是否返回引用。如果您在创建数据存储区时开启了分块模式,那么引用是指分块。否则,引用是指源文档。如需详细了解分块模式,请参阅解析文档并将其分块
  2. 从搜索响应中获取摘要。

    以下是一个搜索响应示例,其中包含根据块生成的摘要和引用。响应的 references 部分包含生成摘要所依据的块的内容。

    响应

    {
      "results": [
        {
          "id": "123xyz",
          "document": {
            "name": "projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/branches/0/documents/123xyz",
            "id": "123xyz",
            "derivedStructData": {
              "link": "gs://examplebucket/alphabet-investor-pdfs/2004_google_annual_report.pdf"
            }
          }
        }
      ],
      "totalSize": 8375,
      "attributionToken": "abcdefg",
      "nextPageToken": "hijklmnop",
      "guidedSearchResult": {},
      "summary": {
        "summaryText": "Google's search technology uses a combination of techniques to determine the importance of a web page independent of a particular search query and to determine the relevance of that page to a particular search query. [1]",
        "summaryWithMetadata": {
          "summary": "Google's search technology uses a combination of techniques to determine the importance of a web page independent of a particular search query and to determine the relevance of that page to a particular search query.",
          "citationMetadata": {
            "citations": [
              {
                "endIndex": "216",
                "sources": [
                  {}
                ]
              }
            ]
          },
          "references": [
            {
              "document": "projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/branches/0/documents/123xyz",
              "chunkContents": [
                {
                  "content": "Groups contains more than 1 billion messages from Usenet Internet discussion groups dating back to 1981.The\ndiscussions in these groups cover a broad range of discourse and provide a comprehensive look at evolving\nviewpoints, debate and advice on many subjects.The new Google Groups adds in the ability to create your own\ngroups for you and your friends and an improved user interface.Google Mobile.Google Mobile offers people the ability to search and view both the "mobile web,"\nconsisting of pages created specifically for wireless devices, and the entire Google index of more than 8 billion\nweb pages.Google Mobile works on devices that support WAP, WAP 2.0, i-mode or j-sky mobile Internet\nprotocols.In addition, users can access a variety of information using Google SMS by typing a query to the\nGoogle shortcode.Google Mobile is available through many wireless and mobile phone services worldwide.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Google Labs is our playground for our engineers and for adventurous Google users.On Google\nLabs, we post product prototypes and solicit feedback on how the technology could be used or improved.Current Google Labs examples include:Google Personalized Search—provides customized search results based on an individual user's interests.Froogle Wireless—gives people the ability to search for product information from their mobile phones\nand other wireless devices.Google Maps—enables users to see maps, get directions, and find local businesses and services quickly\nand easily.Google Maps has several unique features, including draggable maps, integrated local search\nfrom Google Local, and keyboard shortcuts.Google Scholar—enables users to search specifically for scholarly literature, including peer-reviewed\npapers, theses, books, preprints, abstracts and technical reports from all broad areas of research.Google\nScholar can be used to find articles from a wide variety of academic publishers, professional societies,\npreprint repositories and universities, as well as scholarly articles available across the web.Google Suggest—guesses what you're typing and offers suggestions in real time.This is similar to\nGoogle's "Did you mean?"feature that offers alternative spellings for your query after you search, except\nthat it works in real time.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Groups contains more than 1 billion messages from Usenet Internet discussion groups dating back to 1981.The\ndiscussions in these groups cover a broad range of discourse and provide a comprehensive look at evolving\nviewpoints, debate and advice on many subjects.The new Google Groups adds in the ability to create your own\ngroups for you and your friends and an improved user interface.Google Mobile.Google Mobile offers people the ability to search and view both the "mobile web,"\nconsisting of pages created specifically for wireless devices, and the entire Google index of more than 8 billion\nweb pages.Google Mobile works on devices that support WAP, WAP 2.0, i-mode or j-sky mobile Internet\nprotocols.In addition, users can access a variety of information using Google SMS by typing a query to the\nGoogle shortcode.Google Mobile is available through many wireless and mobile phone services worldwide.\n\nGoogle Local.Google Local enables users to find relevant local businesses near a city, postal code, or specific\naddress.This service combines Yellow Page listings with information found on web pages, and plots their\nlocations on interactive maps.Google Print.Google Print brings information online that had previously not been available to web\nsearchers.Under this program, we enable a number of publishers to host their content and show their\npublications at the top of our search results.",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Votes cast by important web pages with high PageRank weigh more heavily and are\nmore influential in deciding the PageRank of pages on the web.Text-Matching Techniques.Our technology employs text-matching techniques that compare search queries\nwith the content of web pages to help determine relevance.Our text-based scoring techniques do far more than\ncount the number of times a search term appears on a web page.For example, our technology determines the\nproximity of individual search terms to each other on a given web page, and prioritizes results that have the\nsearch terms near each other.Many other aspects of a page's content are factored into the equation, as is the\ncontent of pages that link to the page in question.By combining query independent measures such as PageRank\nwith our text-matching techniques, we are able to deliver search results that are relevant to what people are\ntrying to find.\n\nAdvertising Technology\nOur advertising program serves millions of relevant, targeted ads each day based on search terms people\n\nenter or content they view on the web.The key elements of our advertising technology include:\n\nGoogle AdWords Auction System.We use the Google AdWords auction system to enable advertisers to\nautomatically deliver relevant, targeted advertising.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Votes cast by important web pages with high PageRank weigh more heavily and are\nmore influential in deciding the PageRank of pages on the web.Text-Matching Techniques.Our technology employs text-matching techniques that compare search queries\nwith the content of web pages to help determine relevance.Our text-based scoring techniques do far more than\ncount the number of times a search term appears on a web page.For example, our technology determines the\nproximity of individual search terms to each other on a given web page, and prioritizes results that have the\nsearch terms near each other.Many other aspects of a page's content are factored into the equation, as is the\ncontent of pages that link to the page in question.By combining query independent measures such as PageRank\nwith our text-matching techniques, we are able to deliver search results that are relevant to what people are\ntrying to find.\n\nAdvertising Technology\nOur advertising program serves millions of relevant, targeted ads each day based on search terms people\n\nenter or content they view on the web.The key elements of our advertising technology include:",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Google Maps—enables users to see maps, get directions, and find local businesses and services quickly\nand easily.Google Maps has several unique features, including draggable maps, integrated local search\nfrom Google Local, and keyboard shortcuts.Google Scholar—enables users to search specifically for scholarly literature, including peer-reviewed\npapers, theses, books, preprints, abstracts and technical reports from all broad areas of research.Google\nScholar can be used to find articles from a wide variety of academic publishers, professional societies,\npreprint repositories and universities, as well as scholarly articles available across the web.Google Suggest—guesses what you're typing and offers suggestions in real time.This is similar to\nGoogle's "Did you mean?"feature that offers alternative spellings for your query after you search, except\nthat it works in real time.Google Video—includes thousands of programs that play on our TVs every day.Google Video enables\nyou to search a growing archive of televised content—everything from sports to dinosaur\ndocumentaries to news shows.\n\n6",
                  "pageIdentifier": "17"
                },
                {
                  "content": "Every search query we process involves the automated\nexecution of an auction, resulting in our advertising system often processing hundreds of millions of auctions per\nday.To determine whether an ad is relevant to a particular query, this system weighs an advertiser's willingness\nto pay for prominence in the ad listings (the CPC) and interest from users in the ad as measured by the click\nthrough rate and other factors.If an ad does not attract user clicks, it moves to a less prominent position on the\npage, even if the advertiser offers to pay a high amount.This prevents advertisers with irrelevant ads from\n"squatting" in top positions to gain exposure.Conversely, more relevant, well-targeted ads that are clicked on\nfrequently move up in ranking, with no need for advertisers to increase their bids.Because we are paid only\nwhen users click on ads, the AdWords ranking system aligns our interests equally with those of our advertisers\nand our users.The more relevant and useful the ad, the better for our users, for our advertisers and for us.\n\nThe AdWords auction system also incorporates our AdWords discounter, which automatically lowers the\namount advertisers actually pay to the minimum needed to maintain their ad position.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "Web Search Technology\nOur web search technology uses a combination of techniques to determine the importance of a web page\nindependent of a particular search query and to determine the relevance of that page to a particular search\nquery.We do not explain how we do ranking in great detail because some people try to manipulate our search\nresults for their own gain, rather than in an attempt to provide high-quality information to users.\n\nRanking Technology.One element of our technology for ranking web pages is called PageRank.While we\ndeveloped much of our ranking technology after Google was formed, PageRank was developed at Stanford\nUniversity with the involvement of our founders, and was therefore published as research.Most of our current\nranking technology is protected as trade-secret.PageRank is a query-independent technique for determining the\nimportance of web pages by looking at the link structure of the web.PageRank treats a link from web page A to\nweb page B as a "vote" by page A in favor of page B.The PageRank of a page is the sum of the PageRank of the\npages that link to it.The PageRank of a web page also depends on the importance (or PageRank) of the other\nweb pages casting the votes.",
                  "pageIdentifier": "21"
                },
                {
                  "content": "The Company recognizes as revenue the fees charged advertisers each time a user clicks on one of the text\nbased ads that are displayed next to the search results on Google web sites.Effective January 1, 2004, the\nCompany offered a single pricing structure to all of its advertisers based on the AdWords cost per click model.\n\nGoogle AdSense is the program through which the Company distributes its advertisers' text-based ads for\ndisplay on the web sites of the Google Network members.In accordance with Emerging Issues Task Force\n("EITF") Issue No. 99 19, Reporting Revenue Gross as a Principal Versus Net as an Agent, the Company recognizes\nas revenues the fees it receives from its advertisers.This revenue is reported gross primarily because the\nCompany is the primary obligor to its advertisers.\n\nThe Company generates fees from search services through a variety of contractual arrangements, which\ninclude per-query search fees and search service hosting fees.Revenues from set up and support fees and search\nservice hosting fees are recognized on a straight-line basis over the term of the contract, which is the expected\nperiod during which these services will be provided.The Company's policy is to recognize revenues from per\nquery search fees in the period queries are made and results are delivered.\n\nThe Company provides search services pursuant to certain AdSense agreements.",
                  "pageIdentifier": "85"
                },
                {
                  "content": "On Google Print pages, we provide links to book sellers that may\noffer the full versions of these publications for sale, and we show content-targeted ads that are served through\nthe Google AdSense program.Google Desktop Search.Google Desktop Search enables our users to perform a full text search on the\ncontents of their own computer, including email, files, instant messenger chats and web browser history.Users\ncan use this service to view web pages they have visited even when they are not online.Google Alerts.Google Alerts are email updates of the latest relevant Google results (web, news, etc.) based\non the user's choice of query or topic.Typical uses include monitoring a developing news story, keeping current\non a competitor or industry, getting the latest on a celebrity or event, or keeping tabs on a favorite sports team.Google Labs.Google Labs is our playground for our engineers and for adventurous Google users.On Google\nLabs, we post product prototypes and solicit feedback on how the technology could be used or improved.Current Google Labs examples include:Google Personalized Search—provides customized search results based on an individual user's interests.Froogle Wireless—gives people the ability to search for product information from their mobile phones\nand other wireless devices.",
                  "pageIdentifier": "17"
                }
              ]
            }
          ]
        }
      }
    }

获取引用

引用(如果指定)是指放置在搜索摘要中的内嵌数字。这些数字表示摘要中的特定句子来自哪些搜索结果。

如需获取引用,请按以下步骤操作:

  1. 提交包含 contentSearchSpec.summarySpec 并指定 "includeCitations": true 的搜索请求。如需详细了解如何提交搜索请求,请参阅获取搜索结果

    在以下示例中,summarySpec 表示您需要搜索摘要,该摘要应根据前三条搜索结果生成,并且应在摘要中包含引用。

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "includeCitations": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
    • summaryResultCount:用于生成搜索摘要的热门结果数量。如果返回的结果数量少于 summaryResultCount,则系统会根据所有结果生成摘要。 最大值为 5
    • includeCitations:一个布尔值,用于指定是否返回引用。
    • maxExtractiveAnswerCount:要为每个搜索结果返回的提取式回答的数量。默认值为 0,最大值为 1。
  2. 从搜索响应中获取带有引用的摘要。每个响应中都会返回一个 summary 属性。

    以下是搜索响应末尾返回的摘要示例,其中包含引用和引用元数据:

    "summary": {
     "summaryText": "BigQuery is Google Cloud's fully managed and completely
      serverless enterprise data warehouse [1]. BigQuery supports all data types,
      works across clouds, and has built-in machine learning and business
      intelligence, all within a unified platform [2, 3].",
     "summaryWithMetadata": {
       "summary": "BigQuery is Google Cloud's fully managed and completely
       serverless enterprise data warehouse. BigQuery supports all data types,
       works across clouds, and has built-in machine learning and business
       intelligence, all within a unified platform.",
       "citationMetadata": {
         "citations": [
           {
             "startIndex": "0",
             "endIndex": "101",
             "sources": [
               {
                 "uri": "gs://example-dataset/html/6344007140738632642.html",
                 "title": "About BigQuery",
                 "id": "b6344007140738632642",
                 "referenceIndex": "0"
               },
               {
                 "uri": "gs://example-dataset/html/1365490014946172719.html",
                 "title": "Google Cloud article",
                 "id": "b1365490014946172719",
                 "referenceIndex": "1"
               },
               {
                 "uri": "gs://example-dataset/html/2687910668117268120.html",
                 "title": "BigQuery document",
                 "id": "a2687910668117268120",
                 "referenceIndex": "2"
               }
             ]
           },
           {
             "startIndex": "103",
             "endIndex": "230",
             "sources": [
               {
                 "referenceIndex": "0"
                },
               {
                 "referenceIndex": "1"
               },
               {
                 "referenceIndex": "2",
               }
             ]
           }
         ]
       },
       "references": [
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/b6344007140738632642",
         "uri": "https://example.com/bigqueryA"
       },
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/b1365490014946172719",
         "uri": "https://example.com/bigqueryB"
       },
       {
         "title": "Sports in the United States",
         "docName": "projects/123/locations/global/collections/default_collection/dataStores/ds-123/branches/0/documents/a268791066811726812",
         "uri": "https://example.com/bigqueryC"
       }
     ]
    }
    }
    
    • summaryText:包含引用编号的搜索摘要。引用编号是指返回的搜索结果,从 1 开始编制索引。例如,[1] 表示相应句子归因于第一个搜索结果。[2, 3] 表示相应句子归因于第二个和第三个搜索结果。
    • citations:对于摘要中包含引用的每个句子,列出相应引用的元数据。
    • startIndex:表示句子的开头,以 Unicode 字节为单位。
    • endIndex:表示句子的结尾(以 Unicode 字节为单位)。
    • sources:列出句子引用中包含的每个来源的 referenceIndexreferenceIndex 是分配给来源的索引编号。第一个来源的 referenceIndex 并不总是明确返回在响应中。由于 referenceIndex 是从 0 开始编制索引的,因此第一个来源的 referenceIndex 始终为 0。
    • references:列出摘要中引用的每个参考资料的元数据。元数据包括 titledocNameuri

忽略对抗性查询

对抗性查询包括负面评论,或旨在生成不安全、违反政策的输出。您可以指定不针对对抗性查询返回任何搜索摘要。当对抗性查询被忽略时,summaryText 属性包含样板文字,表明未返回任何搜索摘要。即使不返回搜索摘要,也会针对对抗性查询返回搜索文档。

如需指定不应针对对抗性查询返回任何搜索摘要,请按以下步骤操作:

  1. 提交包含 contentSearchSpec.summarySpec 并指定 "ignoreAdversarialQuery": true 的搜索请求。如需详细了解如何提交搜索请求,请参阅获取搜索结果

    在以下示例中,summarySpec 表示您需要搜索摘要,该摘要应根据前三条搜索结果生成,但对于对抗性查询,不应返回摘要。

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "ignoreAdversarialQuery": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
    • summaryResultCount:用于生成搜索摘要的热门结果数量。如果返回的结果数量少于 summaryResultCount,则系统会根据所有结果生成摘要。 最大值为 5
    • ignoreAdversarialQuery:一个布尔值,用于指定不应针对对抗性查询返回任何搜索摘要。
    • maxExtractiveAnswerCount:要为每个搜索结果返回的提取式回答的数量。默认值为 0,最大值为 1。
  2. 请参阅针对对抗性搜索请求返回的 summary 属性。

    示例如下:

    "summary":
    {
      "summaryText": "We do not have a summary for your query. Here are some
      search results.",
      "summarySkippedReasons": [
       "ADVERSARIAL_QUERY_IGNORED"
     ]
    }
    
    • summaryText:表示未返回任何搜索摘要的样板文本。
    • summarySkippedReasons:一个枚举,其中包含摘要跳过原因的值。

忽略非摘要寻求查询

非总结性查询会返回不适合总结的结果。例如,“为什么天空是蓝色的”和“谁是世界上最好的足球运动员?”是寻求总结的查询,但“旧金山国际机场”和“2026 年世界杯”不是。它们很可能是导航查询。您可以指定不应为非摘要查询返回任何搜索摘要。即使不返回搜索摘要,也会针对非总结寻求查询返回搜索文档。

如需指定不应针对非总结查询返回任何搜索摘要,请按以下步骤操作:

  1. 提交包含 contentSearchSpec.summarySpec 并指定 "ignoreNonSummarySeekingQuery": true 的搜索请求。如需详细了解如何提交搜索请求,请参阅获取搜索结果

    在以下示例中,summarySpec 表示您需要搜索摘要,摘要应根据前三条搜索结果生成,但对于非摘要寻求查询,不应返回摘要。

    "contentSearchSpec":
     {
       "summarySpec":
       {
         "summaryResultCount": 3,
         "ignoreNonSummarySeekingQuery": true
       },
       "extractiveContentSpec": { "maxExtractiveAnswerCount" : 1}
     }
    
    • summaryResultCount:用于生成搜索摘要的热门结果数量。如果返回的结果数量少于 summaryResultCount,则系统会根据所有结果生成摘要。 最大值为 5
    • ignoreNonSummarySeekingQuery:一个布尔值,用于指定对于非总结寻求查询,不应返回任何搜索总结。
    • maxExtractiveAnswerCount:要为每个搜索结果返回的提取式回答的数量。默认值为 0,最大值为 1。
  2. 请参阅针对非总结性搜索请求返回的 summary 属性。

    示例如下:

    "summary":
    {
      "summaryText": "We do not have a summary for your query. Here are some
      search results.",
      "summarySkippedReasons": [
        "NON_SUMMARY_SEEKING_QUERY_IGNORED"
     ]
    }
    
    • summaryText:表示未返回任何搜索摘要的样板文本。
    • summarySkippedReasons:一个枚举,其中包含摘要跳过原因的值。

获取自定义摘要

您可以通过提供自然语言指令来获取自定义摘要。 您可以使用控制台来使用此功能,也可以在 modelPromptSpec.preamble 字段中输入指令来使用此功能。

您可以请求自定义,例如输出的长度、详细程度、样式(例如“简单”)、语言、答案重点和格式(例如表格、项目符号和 XML)。

例如,对于查询 What are the benefits of bigquery?,您可以输入如下示例所示的自定义说明:

  • 指令示例:please show the answer format in an ordered list

    回答示例:

    The benefits of BigQuery are:
    
    1. Increased business productivity
    2. IT staff productivity
    3. Reduced infrastructure costs
    4. Loading and exporting data are free
    
  • 指令示例:give a very detailed answer

    回答示例:

    BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse that enables businesses to analyze all their data very quickly. It is a fully managed service that takes care of provisioning, scaling, and managing the underlying infrastructure, so you can focus on analyzing your data. BigQuery is also integrated with other Google Cloud services, such as Google Kubernetes Engine, Cloud Data Fusion, and Cloud Dataproc, making it easy to build and deploy data pipelines.
    
    Here are some of the benefits of using BigQuery:
    
    * **Fast and scalable:** BigQuery can process petabytes of data very quickly, and it can scale to handle even the most demanding workloads. * **Cost-effective:** BigQuery is a very cost-effective way to store and analyze data. You only pay for the data that you use, and there are no upfront costs or commitments. * **Secure:** BigQuery is a secure platform that meets the needs of even the most security-conscious organizations. * **Easy to use:** BigQuery is easy to use, even for non-technical users. It has a simple and intuitive user interface, and it supports a variety of data sources. * **Integrated with other Google Cloud services:** BigQuery is integrated with other Google Cloud services, making it easy to build and deploy data pipelines.
    
    If you are looking for a fast, scalable, and cost-effective way to analyze your data, then BigQuery is a great option.
    

自定义摘要的最佳实践

如果您计划使用此功能,请执行以下操作:

  • 一次只能请求一项自定义设置。请勿组合使用自定义设置,例如请求以法语显示 HTML 表格。
  • Google 建议您限制最终用户可以请求的自定义项,例如,通过提供包含一组预定义自定义项的选择器来实现。

自定义摘要

您可以使用控制台获取仅针对搜索 widget 的自定义摘要,也可以使用 API 获取针对任何搜索请求的自定义摘要。

如需获取自定义摘要,请按以下步骤操作:

控制台

  1. 在 Google Cloud 控制台中,前往 AI Applications 页面。

    AI Applications

  2. 点击要修改的应用的名称。

  3. 依次前往配置 > 界面

  4. 确保搜索 widget 的搜索类型设置为搜索并提供回答搜索并进行后续对话。如果选择了搜索,则无法使用此功能。

  5. 开启启用摘要自定义功能

  6. 如需输入总结指令,请执行以下任一操作:

    • 输入自由格式的指令:在Preamble字段中输入您自己的自然语言指令。
    • 使用模板指令:点击替换为模板,然后选择一个预定义的模板指令。选择预定义模板后,该模板会显示在Preamble字段中。
  7. 预览窗格中进行搜索,测试应用自定义摘要生成功能。

  8. 如需重置为上次保存的指令集,请点击重置序言

  9. 如需将设置保存到 widget,请点击保存并发布

REST

  1. 提交包含 contentSearchSpec.summarySpec 并在 modelPromptSpec.preamble 中指定自定义指令的搜索请求。如需详细了解如何提交搜索请求,请参阅获取搜索结果

    在以下示例中,summarySpec 表示您需要搜索摘要,摘要应根据前三条搜索结果生成,并且应进行自定义,以便向 10 岁儿童进行解释。

    "contentSearchSpec":
      {
        "summarySpec":
        {
          "summaryResultCount": 3,
          "modelPromptSpec":
          {
            "preamble": "explain like you would to a ten year old"
          }
        }
      }
    
    • summaryResultCount:用于生成搜索摘要的热门结果数量。如果返回的结果数量少于 summaryResultCount,则系统会根据所有结果生成摘要。 最大值为 5
    • preamble:用于自定义的指令。
  2. 从搜索响应中获取自定义摘要。

    以下是返回的自定义摘要示例:

    "summary":
    {
      "summaryText": "BigQuery is a serverless data warehouse that helps you
      analyze all your data very quickly. It's very easy to use and you don't
      need to worry about managing servers or infrastructure. BigQuery is also
      very scalable, so you can analyze large datasets without any problems."
    }
    
    • summaryText:自定义的搜索摘要。

指定总结模型

您可以指定要用于生成摘要的模型。

您可以指定 stablepreview,也可以按名称指定特定模型版本。 如需了解可用的模型版本,请参阅回答生成模型版本和生命周期

如需更改模型版本,请执行以下操作:

  1. 提交包含 ContentSearchSpec.SummarySpec.ModelSpec 的搜索请求,以指定模型版本。

    "contentSearchSpec": {
      "summarySpec": {
        "modelSpec": {
          "version": "MODEL_VERSION"
         }
       }
     }
    
    • MODEL_VERSION:指定用于生成摘要的模型。支持的值包括:

      • stable:字符串。未指定值时的默认规范。 stable 指向经过微调以生成答案的 GA 模型版本。随着新的正式版模型发布和旧模型版本停用,stable 所指向的模型会发生变化。如需了解 stable 指向的最新版本,请参阅回答生成模型版本和生命周期
      • preview:字符串。preview 指向最新的 Gemini 问答模型。如需详细了解 Gemini,请参阅模型概览
      • 如需指定特定模型版本,请输入版本名称,例如 gemini-1.5-flash-002/answer_gen/v1。如需了解受支持的版本,请参阅回答生成模型版本和生命周期

例如,以下搜索请求将 preview 指定为模型版本:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1/projects/exampleproject/locations/global/collections/default_collection/dataStores/exampledatastore/servingConfigs/default_search:search" \
-d '{
  "query": "what is bigquery",
  "contentSearchSpec": {
    "summarySpec": {
      "modelSpec": {
        "version": "preview"
      }
    }
  }
}'

搜索摘要的限制

使用搜索摘要时,您可能会遇到以下限制:

  • 由于 LLM 用于生成搜索摘要和引用,因此 LLM 的限制也适用于 Vertex AI Search 摘要。

    如需了解有关这些 LLM 限制的一般信息,请参阅 Vertex AI 文档中的 PaLM API 限制

  • 如果搜索查询需要复杂的逻辑或分析推理,或者需要了解世界,则生成的搜索摘要可能包含不正确的信息(幻觉)或非结构化数据或网站数据中不存在的信息。

  • 搜索摘要中的某些陈述可能不包含引用:

    • 如果系统确定某个陈述不需要依据,则不会添加引用。“以下是我找到的内容”或“您可以按照多种方法操作”等句子缺少引用。

    • 缺少引用也可能表示未找到有效参考资料。 没有引用来源的事实可能不可靠。

  • 在极少数情况下,引用可能被错误地归因于某个陈述。

  • LLM 可能会错误地解析复杂文档。在这种情况下,摘要可能不完整或不正确。

  • 由于自定义说明采用自然语言,因此无法保证所有请求都能遵循说明。