If you have a media search app, you can use metadata to filter your search queries. This page explains how to use metadata fields to restrict your search to a specific set of documents.
Before you begin
Make sure you have created a media app and data store and ingested data. For more information, see Create a media data store and Create a media app.
Example documents
Review these example media documents. You can refer back to them as you read through this page.
{"id":"172851","schemaId":"default_schema","jsonData":"{\"title\":\"Avatar: Creating the World of Pandora (2010)\",\"categories\":[\"Documentary\"],\"uri\":\"http://mytestdomain.movie/content/172851\",\"available_time\":\"2023-01-01T00:00:00Z\",\"media_type\":\"movie\"}"}
{"id":"243308","schemaId":"default_schema","jsonData":"{\"title\":\"Capturing Avatar (2010)\",\"categories\":[\"Documentary\"],\"uri\":\"http://mytestdomain.movie/content/243308\",\"available_time\":\"2023-01-01T00:00:00Z\",\"media_type\":\"movie\"}"}
{"id":"280218","schemaId":"default_schema","jsonData":"{\"title\":\"Avatar: The Way of Water (2022)\",\"categories\":[\"Action\",\"Adventure\",\"Sci-Fi\"],\"uri\":\"http://mytestdomain.movie/content/280218\",\"available_time\":\"2023-01-01T00:00:00Z\",\"media_type\":\"movie\"}"}
{"id":"72998","schemaId":"default_schema","jsonData":"{\"title\":\"Avatar (2009)\",\"categories\":[\"Action\",\"Adventure\",\"Sci-Fi\",\"IMAX\"],\"uri\":\"http://mytestdomain.movie/content/72998\",\"available_time\":\"2023-01-01T00:00:00Z\",\"media_type\":\"movie\"}"}
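In these documents, jsonData is itself a JSON-encoded string, so reading the metadata takes two decoding passes. The following is a minimal Python sketch using only the standard library and the first example document; the variable names are illustrative.

```python
import json

# The first example document, copied verbatim (raw string keeps the \" escapes).
line = r'''{"id":"172851","schemaId":"default_schema","jsonData":"{\"title\":\"Avatar: Creating the World of Pandora (2010)\",\"categories\":[\"Documentary\"],\"uri\":\"http://mytestdomain.movie/content/172851\",\"available_time\":\"2023-01-01T00:00:00Z\",\"media_type\":\"movie\"}"}'''

# First pass: the outer document (id, schemaId, jsonData).
document = json.loads(line)

# Second pass: jsonData holds the metadata as a string, so decode it again.
metadata = json.loads(document["jsonData"])

print(document["id"])
print(metadata["categories"])
print(metadata["available_time"])
```

The fields inside metadata (categories, available_time, media_type, and so on) are the ones that a filter expression can refer to, provided they are indexable.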
Filter expression syntax
Make sure you understand the filter expression syntax that you'll use to define your search filter. The filter expression syntax can be summarized by the following Extended Backus–Naur form:
# A single expression or multiple expressions that are joined by "AND" or "OR".
filter = expression, { " AND " | "OR", expression };

# Expressions can be prefixed with "-" or "NOT" to express a negation.
expression = [ "-" | "NOT " ],
  # A parenthetical expression.
  | "(", expression, ")"
  # A simple expression applying to a text field.
  # Function "ANY" returns true if the field contains any of the literals.
  ( text_field, ":", "ANY", "(", literal, { ",", literal }, ")"
  # A simple expression applying to a numerical field. Function "IN" returns true
  # if a field value is within the range. By default, lower_bound is inclusive and
  # upper_bound is exclusive.
  | numerical_field, ":", "IN", "(", lower_bound, ",", upper_bound, ")"
  # A simple expression that applies to a numerical field and compares with a double value.
  | numerical_field, comparison, double );
  # Datetime field
  | datetime_field, comparison, literal_iso_8601_datetime_format);

# A lower_bound is either a double or "*", which represents negative infinity.
# Explicitly specify inclusive bound with the character 'i' or exclusive bound
# with the character 'e'.
lower_bound = ( double, [ "e" | "i" ] ) | "*";

# An upper_bound is either a double or "*", which represents infinity.
# Explicitly specify inclusive bound with the character 'i' or exclusive bound
# with the character 'e'.
upper_bound = ( double, [ "e" | "i" ] ) | "*";

# Supported comparison operators.
comparison = "<=" | "<" | ">=" | ">" | "=";

# A literal is any double quoted string. You must escape backslash (\) and
# quote (") characters.
literal = double quoted string;

text_field = text field - for example, category;

numerical_field = numerical field - for example, score;

datetime_field = field of datetime data type - for example available_time;

literal_iso_8601_datetime_format = either a double quoted string representing ISO 8601 datetime or a numerical field representing microseconds from unix epoch.
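To make the grammar concrete, here is a minimal Python sketch of helpers that assemble filter strings. The helper names (quote_literal, any_of, in_range) are hypothetical and not part of any Google library. The score field is the grammar's own example of a numerical field and does not exist in the example documents. The spacing after ":" and "," follows the categories example used on this page and is otherwise an assumption.

```python
from typing import Optional, Sequence


def quote_literal(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes first."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def any_of(field: str, values: Sequence[str]) -> str:
    """Text-field expression: true if the field contains any of the literals."""
    literals = ", ".join(quote_literal(v) for v in values)
    return f"{field}: ANY({literals})"


def in_range(field: str, lower: Optional[float], upper: Optional[float]) -> str:
    """Numerical range expression; None means an unbounded side, written as "*"."""
    lower_bound = "*" if lower is None else f"{lower}i"  # 'i' = inclusive bound
    upper_bound = "*" if upper is None else f"{upper}e"  # 'e' = exclusive bound
    return f"{field}: IN({lower_bound}, {upper_bound})"


# Text field taken from the example documents.
documentaries = any_of("categories", ["Documentary"])

# Datetime comparison against a double-quoted ISO 8601 literal.
available = "available_time >= " + quote_literal("2023-01-01T00:00:00Z")

# Two expressions joined with AND.
combined = f"{documentaries} AND {available}"

print(documentaries)
print(in_range("score", None, 4.5))
print(combined)
```

Keeping the escaping in one place (quote_literal) avoids hand-writing backslashes for values that contain quotes or backslashes.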
Filter media search
To filter media search using metadata, follow these steps:
Find your data store ID. If you already have your data store ID, skip to the next step.
In the Google Cloud console, go to the Agent Builder page and in the navigation menu, click Data Stores.
Click the name of your data store.
On the Data page for your data store, get the data store ID.
Determine the document field or fields that you want to filter on. For example, for the documents in Example documents, you could use the categories field as a filter.
You can only use indexable fields in filter expressions. To determine if a field is indexable, do the following:
In the Google Cloud console, go to the Agent Builder page and in the navigation menu, click Data Stores.
In the Name column, click the name of your data store.
Click the Schema tab to view the schema for your data store. If Indexable for the field is:
- Selected, then that field is ready to be filtered on for search; skip the step that makes the field filterable.
- Not selected, then follow the step that makes the field filterable to enable the field for indexing.
- Not available, then the field can't be indexed.
To make a field, such as the categories field, filterable, do the following:
In the Google Cloud console, go to the Agent Builder page, and in the navigation menu, click Apps.
Click your media search app.
In the navigation menu, click Data.
Click the Schema tab. This tab shows current field settings.
Click Edit.
If it's not already selected, select the Indexable checkbox in the categories row, and then click Save.
Wait six hours to allow time for your schema edit to propagate. After six hours, you can proceed to the following step.
Get search results.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://discoveryengine.googleapis.com/v1beta/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/servingConfigs/default_search:search" \
  -d '{
    "query": "QUERY",
    "filter": "FILTER"
  }'
- PROJECT_ID: The ID of your project.
- DATA_STORE_ID: The ID of your data store.
- QUERY: The query text to search.
- FILTER: A text field for filtering your search using a filter expression.
For example, suppose you want to search through the movies in the Example documents section, and you want search results only for movies that (1) contain the word "avatar" and (2) are in the "Documentary" category. You would do that by including the following statements with your call:
"query": "avatar",
"filter": "categories: ANY(\"Documentary\")"
For more information, see the search method.
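The same request can be assembled in Python, which takes care of escaping the quotes inside the filter when the body is serialized. The following sketch uses only the standard library. The function name build_search_request and the ACCESS_TOKEN placeholder are illustrative, and the request is built but not sent.

```python
import json
import urllib.request

# Placeholders from the procedure above; replace them with your own values.
PROJECT_ID = "PROJECT_ID"
DATA_STORE_ID = "DATA_STORE_ID"


def build_search_request(query: str, filter_expression: str,
                         access_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the search method."""
    url = (
        "https://discoveryengine.googleapis.com/v1beta/"
        f"projects/{PROJECT_ID}/locations/global/collections/default_collection/"
        f"dataStores/{DATA_STORE_ID}/servingConfigs/default_search:search"
    )
    # json.dumps escapes the double quotes inside the filter expression.
    body = json.dumps({"query": query, "filter": filter_expression}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request(
    query="avatar",
    filter_expression='categories: ANY("Documentary")',
    access_token="ACCESS_TOKEN",  # for example, from `gcloud auth print-access-token`
)
print(request.data.decode("utf-8"))
# prints {"query": "avatar", "filter": "categories: ANY(\"Documentary\")"}
```

To send the request, pass it to urllib.request.urlopen with a real access token and real project and data store IDs.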
If you perform a search like the one in the preceding procedure, you can expect to get a response similar to the following. Notice that the response includes only the Avatar documentaries.
{
  "results": [
    {
      "id": "243308",
      "document": {
        "name": "projects/431678329718/locations/global/collections/default_collection/dataStores/rdds3_1698205785399/branches/0/documents/243308",
        "id": "243308",
        "structData": {
          "categories": [
            "Documentary"
          ],
          "title": "Capturing Avatar (2010)",
          "uri": "http://mytestdomain.movie/content/243308",
          "media_type": "movie"
        }
      }
    },
    {
      "id": "172851",
      "document": {
        "name": "projects/431678329718/locations/global/collections/default_collection/dataStores/rdds3_1698205785399/branches/0/documents/172851",
        "id": "172851",
        "structData": {
          "categories": [
            "Documentary"
          ],
          "uri": "http://mytestdomain.movie/content/172851",
          "media_type": "movie",
          "title": "Avatar: Creating the World of Pandora (2010)"
        }
      }
    }
  ],
  "totalSize": 2,
  "attributionToken": "XfBcCgwIvIzJqwYQ2_qNxwMSJDY1NzEzNmY1LTAwMDAtMmFhMy05YWU3LTE0MjIzYmIwOGVkMiIFTUVESUEqII6-nRXFy_MXnIaOIsLwnhXUsp0VpovvF6OAlyKiho4i",
  "guidedSearchResult": {},
  "summary": {}
}
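To work with such a response in code, you can walk the results array and read each document's structData. The following Python sketch uses a trimmed copy of the response above that keeps only the fields it reads; the variable names are illustrative.

```python
import json

# Trimmed copy of the example response: only the fields used below are kept.
response_text = """
{
  "results": [
    {"id": "243308", "document": {"id": "243308", "structData": {
      "categories": ["Documentary"], "title": "Capturing Avatar (2010)"}}},
    {"id": "172851", "document": {"id": "172851", "structData": {
      "categories": ["Documentary"],
      "title": "Avatar: Creating the World of Pandora (2010)"}}}
  ],
  "totalSize": 2
}
"""

response = json.loads(response_text)

# Collect (id, title) pairs; .get() guards against a response with no results.
hits = [
    (result["id"], result["document"]["structData"]["title"])
    for result in response.get("results", [])
]

for doc_id, title in hits:
    print(doc_id, title)

# Check that every returned document actually satisfies the filter.
all_documentaries = all(
    "Documentary" in result["document"]["structData"]["categories"]
    for result in response.get("results", [])
)
print(all_documentaries)
```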