When a query call completes normally, it returns the result as a SearchResults object. The results object tells you how many matching documents were found in the index, and how many matched documents were returned. It also includes a list of matching ScoredDocuments. The list usually contains a portion of all the matching documents found, since search returns a limited number of documents each time it's called. By using an offset or a cursor you can retrieve all the matching documents, a subset at a time.
    def query_results(index, query_string):
        result = index.search(query_string)
        total_matches = result.number_found
        list_of_docs = result.results
        number_of_docs_returned = len(list_of_docs)
        return total_matches, list_of_docs, number_of_docs_returned
Depending on the value of the limit query option, the number of matching documents returned in the result may be less than the number found. Remember that the number found is only an estimate when it is greater than the value of the number_found_accuracy query option. No matter how you configure the search options, a search() call will find no more than 10,000 matching documents.
If more documents were found than returned, and you want to retrieve all of them, you need to repeat the search using either an offset or a cursor, as explained below.
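As a rough illustration of the arithmetic, the sketch below estimates how many search calls it would take to page through a result set. The helper name `searches_needed` and its parameters are hypothetical and not part of the Search API; the 10,000 cap comes from the text above, and the limit of 20 is the default query limit. Because the number found may itself be an estimate, treat the answer as approximate.

```python
MAX_FOUND = 10000  # search() finds no more than 10,000 matching documents


def searches_needed(number_found, limit=20):
    """Estimate how many search calls are needed to page through all matches.

    Hypothetical helper for illustration; not part of the Search API.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    retrievable = min(number_found, MAX_FOUND)
    # Integer ceiling division: a partial last page still costs one call.
    return (retrievable + limit - 1) // limit
```

For example, 45 matches with the default limit would need three calls: two full pages and one partial page.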
The search results will include a list of ScoredDocuments that match the query. You can iterate over the list to process each document in turn:
    for scored_document in results:
        print(scored_document)
By default, a scored document contains all the fields of the original document that was indexed. If your query options specified returned_fields, only those fields appear in the fields property of the document. If you created any computed fields by specifying returned_expressions or snippeted_fields, they appear separately in the expressions property of the document.
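The sketch below shows one way to read both properties of a scored document. To stay runnable on its own it uses namedtuple stand-ins rather than real search results; the `summarize` helper is hypothetical, and it assumes that each entry in fields and expressions exposes a `name` and a `value`.

```python
from collections import namedtuple

# Stand-ins for the API's field and scored-document objects, used only so the
# sketch runs by itself. Real scored documents come from index.search().
Field = namedtuple("Field", ["name", "value"])
ScoredDocument = namedtuple("ScoredDocument", ["doc_id", "fields", "expressions"])


def summarize(scored_document):
    """Collect stored fields and computed fields into separate dicts.

    Note: if a document repeats a field name, only the last value is kept.
    """
    stored = {field.name: field.value for field in scored_document.fields}
    computed = {expr.name: expr.value for expr in scored_document.expressions}
    return {"doc_id": scored_document.doc_id, "fields": stored, "computed": computed}


doc = ScoredDocument(
    doc_id="doc1",
    fields=[Field("title", "Moby Dick")],
    expressions=[Field("content", "...the <b>whale</b>...")],
)
summary = summarize(doc)
```

Keeping the two dicts separate mirrors the split between the fields and expressions properties, so a snippet never overwrites the stored field it was computed from.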
If your search finds more documents than you can return at once, use an offset to index into the list of matching documents. For example, the default query limit is 20 documents. After you've executed a search the first time (with offset 0) and retrieved the first 20 documents, retrieve the next 20 documents by setting the offset to 20 and running the same search again. Keep repeating the search, incrementing the offset each time by the number of documents returned:
    from google.appengine.api import search


    def query_offset(index, query_string):
        offset = 0

        while True:
            # Build the query using the current offset.
            options = search.QueryOptions(offset=offset)
            query = search.Query(query_string=query_string, options=options)

            # Get the results.
            results = index.search(query)

            number_retrieved = len(results.results)
            if number_retrieved == 0:
                break

            # Add the number of documents found to the offset, so that the next
            # iteration will grab the next page of documents.
            offset += number_retrieved

            # Process the matched documents.
            for document in results:
                print(document)
Offsets can be inefficient when iterating over a very large result set.
You can also use cursors to retrieve a subrange of results. Cursors are useful when you intend to present your search results in consecutive pages and you want to be sure you do not skip any documents in the case where an index could be modified between queries. Cursors are also more efficient when iterating across a very large result set.
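To make the skipped-document risk concrete, here is a toy model using plain Python lists in place of an index. It is only an illustration: the function names are hypothetical, and a real cursor is an opaque value whose internals are not described here. The point is that a numeric offset shifts when the index changes, while a position that follows the last document seen does not.

```python
def page_by_offset(docs, offset, limit):
    """Return one page of results, addressed by numeric offset."""
    return docs[offset:offset + limit]


def page_after(docs, last_seen, limit):
    """Return one page of results that follow the last document seen."""
    start = 0 if last_seen is None else docs.index(last_seen) + 1
    return docs[start:start + limit]


index = ["a", "b", "c", "d", "e", "f"]
# Between the first and second query, document "a" is removed from the index.
modified = [d for d in index if d != "a"]

# Offset paging: the second page starts at position 3 of the modified list.
offset_first = page_by_offset(index, 0, 3)        # a, b, c
offset_second = page_by_offset(modified, 3, 3)    # e, f ("d" is skipped)
seen_by_offset = offset_first + offset_second

# Position-based paging: the second page resumes after "c".
position_first = page_after(index, None, 3)                    # a, b, c
position_second = page_after(modified, position_first[-1], 3)  # d, e, f
seen_by_position = position_first + position_second
```

In this model the offset-based run never sees "d", because the deletion moved it into a range that had already been read.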
In order to use cursors, you must create an initial cursor and include it in the query options. There are two kinds of cursors, per-query and per-result. A per-query cursor causes a separate cursor to be associated with the results object returned by the search call. A per-result cursor causes a cursor to be associated with every scored document in the results.
Using a per-query cursor
By default, a newly constructed cursor is a per-query cursor. This cursor holds the position of the last document returned in the search's results. It is updated with each search. To enumerate all matching documents in an index, execute the same search until the result returns a cursor of None:
    def query_cursor(index, query_string):
        cursor = search.Cursor()

        while cursor:
            # Build the query using the cursor.
            options = search.QueryOptions(cursor=cursor)
            query = search.Query(query_string=query_string, options=options)

            # Get the results and the next cursor.
            results = index.search(query)
            cursor = results.cursor

            for document in results:
                print(document)
Using a per-result cursor
To create per-result cursors, you must set the cursor's per_result property to True when you create the initial cursor. When the search returns, every document will have a cursor associated with it. You can use that cursor to specify a new search with results that begin with a specific document. Note that when you pass a per-result cursor to search, there will be no per-query cursor associated with the result itself; results.cursor will be None, so you can't use it to test whether you've retrieved all the matches.
    def query_per_document_cursor(index, query_string):
        cursor = search.Cursor(per_result=True)

        # Build the query using the cursor.
        options = search.QueryOptions(cursor=cursor)
        query = search.Query(query_string=query_string, options=options)

        # Get the results.
        results = index.search(query)

        document_cursor = None
        for document in results:
            # Discover some document of interest and grab its cursor. For this
            # sample we'll just use the first document.
            document_cursor = document.cursor
            break

        # Start the next search from the document of interest.
        if document_cursor is None:
            return

        options = search.QueryOptions(cursor=document_cursor)
        query = search.Query(query_string=query_string, options=options)
        results = index.search(query)

        for document in results:
            print(document)
Saving and restoring cursors
A cursor can be serialized as a web-safe string, saved, and then restored for later use:
    def saving_and_restoring_cursor(cursor):
        # Convert the cursor to a web-safe string.
        cursor_string = cursor.web_safe_string

        # Restore the cursor from a web-safe string.
        cursor = search.Cursor(web_safe_string=cursor_string)
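A common reason to serialize a cursor is to carry it in a "next page" link. The sketch below shows one possible round trip through a URL using only the standard library. The function names and the `q` and `cursor` parameter names are hypothetical, the cursor string is a made-up placeholder, and the imports assume Python 3's urllib.parse.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def next_page_url(base_url, query_string, cursor_string):
    """Build a 'next page' link that carries the serialized cursor."""
    params = {"q": query_string, "cursor": cursor_string}
    return base_url + "?" + urlencode(params)


def cursor_string_from_url(url):
    """Extract the serialized cursor from a URL, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("cursor")
    return values[0] if values else None


# "abc-123_XYZ" is a placeholder, not a real cursor value.
url = next_page_url("/search", "good story", "abc-123_XYZ")
restored = cursor_string_from_url(url)
```

The restored string could then be passed to search.Cursor(web_safe_string=...) as in the function above, and a missing parameter (None) would signal a request for the first page.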