If advanced website indexing is enabled in your data store, you can add metadata to the schema to enrich your indexing.
Example use case
Suppose you have a large number of web pages that are relevant to various
departments in your organization. You can use meta tags to label the pages
that are relevant for each department. You can then use the indexed tags as
filters in your queries. This lets you restrict search results to web pages
containing a label that matches any of the specified departments.
This process can be summarized as follows:
- Add the following meta tags to a subset of your web pages:
  - Relevant to engineering and IT departments:
    <meta name="department" content="eng, infotech">
  - Relevant to finance and HR departments:
    <meta name="department" content="finance, human resources">
- Recrawl the updated pages.
- Add department to your data store schema as an indexable array as described in the Add metadata to the data store schema section.
After updating your schema, your data store is automatically reindexed.
After the reindexing is complete, you can use the department filter in a
filter expression to reorder or filter search results. For example, when
users from the finance department issue queries, the search results can be
made more relevant for them with the department filter set to finance.
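As a rough illustration of the finance example, the following Python sketch builds a filter string for the department field. The `field: ANY("value")` form is an assumption about the filter expression syntax and should be checked against the filter expression documentation; `build_any_filter` is a hypothetical helper, not part of any client library.

```python
def build_any_filter(field: str, values: list[str]) -> str:
    """Build a filter string such as: department: ANY("finance").

    Assumption: the filter syntax is `field: ANY("v1", "v2")`.
    """
    if not values:
        raise ValueError("at least one value is required")
    for value in values:
        # Escaping rules are not covered here, so quoted values are rejected.
        if '"' in value:
            raise ValueError(f"value must not contain a double quote: {value!r}")
    quoted = ", ".join(f'"{value}"' for value in values)
    return f"{field}: ANY({quoted})"


finance_filter = build_any_filter("department", ["finance"])
print(finance_filter)  # department: ANY("finance")
```

Passing several values, for example `["finance", "human resources"]`, produces a single expression intended to match pages labeled with any of those departments.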
Before you begin
Before you update the data store's schema, do the following:
- Turn on advanced website indexing for the data store. For more information, see Turn on advanced website indexing.
- Understand that after you add meta tags to your web pages, you must recrawl the pages. This might take several hours.
- Understand that after you add metadata and update the data store schema, the website in your data store is reindexed automatically. Reindexing is a long-running operation that might take several hours.
- Ensure that you don't use any excluded or unsupported meta tags.
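Before you request a recrawl, it can help to check locally which meta tags a page actually exposes. The following is a minimal sketch that uses Python's standard `html.parser` module; `MetaTagCollector` and `collect_meta_tags` are hypothetical names, and trimming whitespace around each comma-separated value is an assumption about how the values should be read, not documented indexing behavior.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if not name or content is None:
            return
        # Split the comma-separated content; trimming whitespace is an assumption.
        values = [v.strip() for v in content.split(",") if v.strip()]
        self.tags.setdefault(name, []).extend(values)


def collect_meta_tags(html: str) -> dict[str, list[str]]:
    """Return a mapping of meta tag name to its list of values."""
    collector = MetaTagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


page = '<html><head><meta name="department" content="eng, infotech"></head></html>'
print(collect_meta_tags(page))  # {'department': ['eng', 'infotech']}
```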
Add metadata to the data store schema
To add metadata to the data store schema:
- Add meta tags to all the pages in your website that you want to enrich with metadata indexing.
  Each meta tag must have its name attribute set to the field you want to index and its content attribute set to a string comprising one or more comma-separated values.
  Vertex AI Search supports all meta tags with names that match the pattern [a-zA-Z0-9][a-zA-Z0-9-_]*. Ensure that you don't use any excluded or unsupported meta tags.
- Recrawl the updated web pages.
- View the schema definition for your data store over REST API.
- Update the data store schema over REST API by adding the META_TAG_NAME field that has its type set to array. For more information, see About providing your own schema as a JSON object. The following is an example of a schema update for a website:
  {
    "type": "object",
    "properties": {
      "META_TAG_NAME": {
        "type": "array",
        "items": {
          "type": "string",
          "searchable": true,
          "retrievable": true,
          "indexable": true
        }
      }
    },
    "$schema": "https://json-schema.org/draft/2020-12/schema"
  }
  Replace META_TAG_NAME with the exact value of the meta tag's name attribute.
After you update the website schema, the website is reindexed automatically. This is a long-running operation that can take multiple hours.
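The schema update body from the procedure can also be generated programmatically. The following Python sketch checks each meta tag name against the supported name pattern and then builds the same JSON structure as the example. `build_schema_update` is a hypothetical helper; it does not check for excluded or unsupported tags, and it only produces the request body, so sending it over REST is left out.

```python
import json
import re

# Name pattern quoted from the procedure; fullmatch applies it to the whole name.
META_TAG_NAME_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9\-_]*")


def build_schema_update(meta_tag_names: list[str]) -> dict:
    """Build a schema object with one indexable string array per meta tag name."""
    properties = {}
    for name in meta_tag_names:
        if not META_TAG_NAME_PATTERN.fullmatch(name):
            raise ValueError(f"unsupported meta tag name: {name!r}")
        properties[name] = {
            "type": "array",
            "items": {
                "type": "string",
                "searchable": True,
                "retrievable": True,
                "indexable": True,
            },
        }
    return {
        "type": "object",
        "properties": properties,
        "$schema": "https://json-schema.org/draft/2020-12/schema",
    }


# Example: the department field from the use case earlier on this page.
print(json.dumps(build_schema_update(["department"]), indent=2))
```

Validating names before building the body catches typos such as a leading hyphen early, before a long-running reindexing operation is triggered.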
What's next
Use the indexed metadata for the following:
- Serving controls, such as boost, bury, and filter
- Surfacing as facets in search results
- Filter search results
- Boost search results