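The "OR" disjunction is expensive in both billable operations and computation time. Suppose you want to search for 'cuisine:Japanese OR cuisine:Korean'. An alternative is to index documents under a broader cuisine category. The query can then be simplified to 'cuisine:Asian'.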
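A minimal sketch of this idea, using plain Python dictionaries rather than the Search API itself. The mapping `CUISINE_CATEGORY`, the field name `cuisine_category`, and the sample restaurants are all hypothetical and chosen only for illustration; the point is that the broader category is computed once at index time, so the query needs a single term.

```python
# Hypothetical mapping from specific cuisines to broader categories.
CUISINE_CATEGORY = {
    "Japanese": "Asian",
    "Korean": "Asian",
    "Italian": "European",
}


def fields_for_restaurant(name, cuisine):
    """Return the field values to index for one restaurant."""
    return {
        "name": name,
        "cuisine": cuisine,
        # Broader category derived at index time; "Other" is a fallback.
        "cuisine_category": CUISINE_CATEGORY.get(cuisine, "Other"),
    }


docs = [
    fields_for_restaurant("Sakura", "Japanese"),
    fields_for_restaurant("Seoul Kitchen", "Korean"),
    fields_for_restaurant("Trattoria", "Italian"),
]

# Equivalent of 'cuisine:Japanese OR cuisine:Korean' (two terms plus a union).
disjunction = [d["name"] for d in docs
               if d["cuisine"] == "Japanese" or d["cuisine"] == "Korean"]

# Equivalent of a single-term query on the broader category.
single_term = [d["name"] for d in docs if d["cuisine_category"] == "Asian"]
```

Note that the two queries return the same documents only when the broader category covers exactly the cuisines in the original disjunction. If the index also holds, say, Thai restaurants mapped to "Asian", the single-term query returns them as well.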
Eliminate tautologies from your queries
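Suppose you want to find all restaurants in Toronto. Assuming that each document has only a single field named "city", the query 'city:toronto AND NOT city:montreal' returns the same results as 'city:toronto', because a document whose city is set to "toronto" cannot also have it set to "montreal". The second query runs faster because it involves only one term. The first query performs three steps: first, it finds the list of documents where city is set to "toronto"; then it finds the list of all documents where city is not set to "montreal"; finally, it computes the intersection of the two lists.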
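The three steps can be illustrated with a small in-memory model. This is only a sketch with made-up document IDs and cities, not how the Search API is implemented; it assumes, as the text does, that every document has exactly one city value.

```python
# Hypothetical index: document ID -> value of the single "city" field.
docs = {1: "toronto", 2: "montreal", 3: "ottawa", 4: "toronto"}


def ids_where(predicate):
    """Return the set of document IDs whose city satisfies the predicate."""
    return {doc_id for doc_id, city in docs.items() if predicate(city)}


# 'city:toronto AND NOT city:montreal' evaluated in three steps.
in_toronto = ids_where(lambda city: city == "toronto")        # step 1
not_montreal = ids_where(lambda city: city != "montreal")     # step 2
three_step_result = in_toronto & not_montreal                 # step 3

# 'city:toronto' needs only the first lookup.
one_step_result = ids_where(lambda city: city == "toronto")
```

Both results contain the same document IDs, so the negated term adds work without changing the answer.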
Last updated (UTC): 2025-09-04.