Google Developer Relations
The previous class covered the basics of defining, submitting, and processing a search query. The Search API supports more complex queries, including specification of the point in the index at which the query should start, how the results should be sorted and formatted, and what information about the documents should be returned from the query. It also supports Geosearch (location-based queries).
In this lesson, we'll look in more detail at some of these features. You'll learn how to:
- Define and process the results of complex queries
- Control which document fields are returned from a query
- Use offsets and limits to control where a query starts and how many results are returned
- Construct and use location-based queries (Geosearch)
See the Search API documentation for more detail on the features described in this lesson, as well as some additional capabilities that we won't cover here.
Learn how to perform complex Search API search queries
The precursor to this class, Getting Started with the Python Search API
You should also have:
- Python 2.7 and the Google App Engine SDK for Python
- Familiarity with Python and the basics of App Engine applications
search_query = search.Query(
    query_string=query.strip(),
    options=search.QueryOptions(...))
Consider one of the QueryOptions configurations used in the example product search application:
search_query = search.Query(
    query_string=query.strip(),
    options=search.QueryOptions(
        limit=doc_limit,
        offset=offsetval,
        sort_options=sortopts,
        snippeted_fields=[docs.Product.DESCRIPTION],
        returned_expressions=[search.FieldExpression(
            name='adjusted_price',
            expression='max(price, 14.99)')],
        returned_fields=[
            docs.Product.PID, docs.Product.DESCRIPTION,
            docs.Product.CATEGORY, docs.Product.AVG_RATING,
            docs.Product.PRICE, docs.Product.PRODUCT_NAME]))
This specifies an offset (where to start the query) and a limit (the maximum number of results to return), some sort options (discussed in the next lesson), a list of snippeted fields, a list of returned expressions (computed fields), and a list of returned fields. Let's look at what each of these options does.
Query offsets, limits, and cursors
To control the number of results a query returns, use the limit parameter. The example product search application uses limit to return a maximum of three results per page.
The example above also shows the use of the offset parameter. The offset specifies the number of matched documents to skip before beginning to return results:
search.QueryOptions( limit=doc_limit, offset=offsetval, ...)
One common use for the offset and limit parameters is to paginate the query results. To implement pagination, you need to know the total number of matches the query found and how many have been returned so far. You can get that information from the returned search results:
number_found = search_results.number_found
returned_count = len(search_results.results)
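The pagination bookkeeping described above can be sketched in plain Python, without calling the Search API. The helper name `page_links` and the keys it returns are hypothetical and are not taken from the example application; the sketch only assumes that you know the current offset, the page limit, and the `number_found` value reported with the search results.

```python
def page_links(offset, limit, number_found):
    """Return the offsets of the previous and next pages, or None.

    Hypothetical helper for illustration (not from the example application).
    """
    if limit <= 0:
        raise ValueError('limit must be positive')
    # A previous page exists whenever at least one result was skipped.
    prev_offset = max(offset - limit, 0) if offset > 0 else None
    # A next page exists if matches remain beyond the current page.
    next_offset = offset + limit if offset + limit < number_found else None
    return {'prev_offset': prev_offset, 'next_offset': next_offset}


# Seven matches shown three per page: pages start at offsets 0, 3, and 6.
first = page_links(offset=0, limit=3, number_found=7)
middle = page_links(offset=3, limit=3, number_found=7)
last = page_links(offset=6, limit=3, number_found=7)
```

Each returned offset could then be passed back as the `offset` value of the next query to render "previous" and "next" links.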
The Search API also supports the use of query cursors. Cursors are another way to indicate the point from which to begin a query, allowing you to continue a search from the end of the previous result set. Using a cursor is generally more efficient than using offsets. However, the Search API doesn't currently support a "reverse cursor" as the Datastore API does, making it more difficult to implement backward paging. For this reason, the example application uses offsets rather than cursors to paginate its query results. You can find an example using cursors here.
Snippeted fields
Snippeted fields allow you to return an abbreviated portion of a field instead of its full content. The returned snippet will include the fragment of the field on which the match occurred, with the matched search terms highlighted in bold.
In the product search application (with default data), a search on the query stories returns three matches, in the documents' description fields. Because we requested that description be snippeted, the snippet expressions in the results have the word "stories" highlighted.
You specify the snippeting that should occur by providing an iterable of field names to snippet. The QueryOptions constructor above requests snippeting of the description field:
search.QueryOptions( snippeted_fields=[docs.Product.DESCRIPTION], ...)
Then, when processing your query results, you access the generated snippets via a returned document's expressions property:
for doc in search_results:
    ...
    for expr in doc.expressions:  # iterate over the computed fields
        if expr.name == docs.Product.DESCRIPTION:
            description_snippet = expr.value
            break
    # ... do something with the document ...
The expressions property holds a list of computed fields that are the results of expressions requested in the query. The code above grabs the snippet generated for the DESCRIPTION field, where doc is a scored document. Scored documents are returned from a search. In addition to document content, they include the document score, as well as computed fields (discussed below) and other information.
Returned expressions and expression functions
The returned_expressions query option allows you to define computed fields, based on your document fields, that will be returned as part of a scored document in the search results.
Suppose you want to compute and display a price for each product that includes an 8% sales tax. You create a field expression with the name adjusted_price, whose value is the string price * 1.08:
search.QueryOptions( returned_expressions=[search.FieldExpression(name='adjusted_price', expression='price * 1.08')], ...)
This expression tells the Search API to return, as the value of adjusted_price, the value of the price field multiplied by 1.08. The Search API provides a variety of built-in expression functions that you can use in such expressions. For example, you can define expressions such as max(price, 14.99), as in the QueryOptions configuration shown earlier.
After including a returned_expressions list in your QueryOptions object, you can access that computed field in the documents returned from the search query, again via the expressions property:
for doc in search_results:
    ...
    for expr in doc.expressions:  # iterate over the computed fields
        if expr.name == docs.Product.DESCRIPTION:
            # get the description snippet
            description_snippet = expr.value
        elif expr.name == 'adjusted_price':
            # get the adjusted price
            price = expr.value
    # ... do something with the document ...
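As a small sketch of the same loop, the computed fields of a document can be gathered into a dictionary keyed by expression name, which avoids a growing if/elif chain as more expressions are requested. The `Expr` and `Doc` namedtuples below are stand-ins so that the example runs without the App Engine SDK; they only mimic the `name`, `value`, and `expressions` attributes used above. The helper name `computed_fields`, the literal field name 'description', and the sample values are assumptions made for illustration.

```python
from collections import namedtuple

# Stand-ins for the SDK's objects (assumption: only these attributes matter).
Expr = namedtuple('Expr', ['name', 'value'])
Doc = namedtuple('Doc', ['expressions'])


def computed_fields(doc):
    """Map each computed field's name to its value for one document."""
    return {expr.name: expr.value for expr in doc.expressions}


# Made-up sample data standing in for one scored document.
doc = Doc(expressions=[
    Expr(name='description', value='A book of short <b>stories</b>.'),
    Expr(name='adjusted_price', value=21.6),
])
fields = computed_fields(doc)
description_snippet = fields.get('description')  # None if not requested
price = fields.get('adjusted_price')
```

Using `dict.get` means a field that was not requested in the query simply yields `None` instead of requiring another branch.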
Returned fields
The QueryOptions constructor also accepts a returned_fields parameter, which you can use to make your queries more efficient by requesting only the specific document fields you intend to use. For example, the QueryOptions object shown earlier requests all the "core" product fields except for the last-update date, which we've decided not to show in our result summary. It also doesn't request any of the category-specific fields, such as those defined only for book documents:
search.QueryOptions(
    returned_fields=[
        docs.Product.PID, docs.Product.DESCRIPTION,
        docs.Product.CATEGORY, docs.Product.AVG_RATING,
        docs.Product.PRICE, docs.Product.PRODUCT_NAME],
    ...)
The returned_fields argument should be an iterable over the names of fields to return in search results. The documents returned in the search results will include only the specified fields, even though the indexed documents can include other fields.
Location-based queries (Geosearch)
The Search API's support for Geosearch allows you to make location-based queries. These allow you, for example, to find nearby stores or restaurants, or nearby activity stream updates.
To execute a location-based query, you need three pieces of information:
- A location, in latitude and longitude coordinates, from which to measure distances.
- The radius within which to search (such as 45 kilometers).
- The set of points to which to measure distances.
The first two of these items are often supplied by the user. The last comes from the indexed documents themselves: in our example product search application, it consists of the locations of our stores, taken from the store location documents we built in the previous Getting Started class.
To search for store locations near the user, the example application obtains the user's location via the browser, and the user inputs the distance within which to search. The distance is converted to meters, the unit of distance used by the Search API. Suppose the user's location is (-33.857, 151.215), and they specify a search radius of 45 kilometers. The application would construct a query string like
"distance(store_location, geopoint(-33.857, 151.215)) < 45000"
and pass it to the index's search method:
from google.appengine.api import search
...
# a query string like this comes from the client
query = "distance(store_location, geopoint(-33.857, 151.215)) < 45000"
try:
    index = search.Index(config.STORE_INDEX_NAME)
    search_results = index.search(query)
    for doc in search_results:
        # process doc
        ...
except search.Error:
    # ...
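Because the location and radius typically come from the user, it can help to assemble the query string in one place. The following is a minimal sketch in plain Python; `build_distance_query` is a hypothetical helper that is not part of the example application, and the range checks are an added assumption about what input you would want to reject.

```python
def build_distance_query(field_name, latitude, longitude, radius_km):
    """Build a query string matching documents within radius_km of a point.

    Hypothetical helper for illustration only.
    """
    if not -90 <= latitude <= 90:
        raise ValueError('latitude out of range')
    if not -180 <= longitude <= 180:
        raise ValueError('longitude out of range')
    # The distance comparison is expressed in meters.
    radius_m = int(round(radius_km * 1000))
    return 'distance(%s, geopoint(%s, %s)) < %d' % (
        field_name, latitude, longitude, radius_m)


query = build_distance_query('store_location', -33.857, 151.215, 45)
```

The resulting string is the same as the one shown above and could be passed to the index's search method in the same way.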
Summary and review
In this lesson, we've learned how to specify a search query using a QueryOptions object, and we've looked at some useful options, including limit, offset, snippeted_fields, returned_expressions, and returned_fields. We've also described how to construct a Geosearch query.
Another option, sort_options, has enough features to merit its own lesson, so we'll discuss it next. See the Search API documentation for additional options not covered in this lesson.
To check your understanding, try playing with some of the QueryOptions properties described here. For instance, change the value of DOC_LIMIT in the config.py file to a larger value. This is the value passed as the limit parameter.
Try playing with the returned_expressions option. The adjusted_price field expression should have been defined in _buildQuery() like this:
search.FieldExpression(name='adjusted_price', expression='price * 1.08')
Look for the lines in handlers.py, in class ProductSearchHandler, that say:
# uncomment to use 'adjusted price', which should be
# defined in returned_expressions in _buildQuery() below, as the
# displayed price.
Uncomment the lines below them:
# elif expr.name == 'adjusted_price':
#   price = expr.value
When you redeploy the application, you should see the adjusted price in the search results instead of the actual price. That is, the price displayed will include the sales tax. The View product details link in the search results will still show you the actual price. (The adjusted_price field will be populated only for a deployed application.)
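To get a feel for the values to expect, the two field expressions used in this lesson can be mimicked in plain Python. This only illustrates the arithmetic and says nothing about how the Search API evaluates expressions internally; the function names and sample prices are made up for the example.

```python
def with_sales_tax(price, rate=0.08):
    """Mimic the expression 'price * 1.08' for the default 8% rate."""
    return price * (1 + rate)


def with_price_floor(price, floor=14.99):
    """Mimic the expression 'max(price, 14.99)'."""
    return max(price, floor)


# Made-up sample prices.
taxed = round(with_sales_tax(20.00), 2)   # rounded for display
floored_low = with_price_floor(9.99)      # below the floor
floored_high = with_price_floor(25.00)    # above the floor
```

With these inputs, a 20.00 product would display as roughly 21.60 under the sales-tax expression, while the `max` expression raises a 9.99 price to 14.99 and leaves 25.00 unchanged.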
In the next lesson, you'll learn how to sort the results of a search query in the order you want.