This page provides best practices for creating your catalog information and populating your catalog data.
Overview
The catalog data you import into Vertex AI Search for commerce has a direct effect on the quality of the resulting model, and therefore on the quality of search and recommendation results. In general, the more accurate and specific the catalog information you provide, the higher the quality of your model.
Your catalog should be kept up to date. You can upload catalog changes as often as needed; ideally, every day for catalogs with a high rate of change. You can upload (patch) existing product items; only the changed fields will be updated. There is no charge for uploading catalog information. For more information, see Keeping your catalog up to date.
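As a rough illustration of a partial update, the following sketch compares the stored version of a product record with an edited version and lists the top-level fields that differ. The function name changed_fields and the sample records are hypothetical. The resulting list is only what you might supply as the set of fields to update; see the API reference for the actual patch call.

```python
from typing import Any


def changed_fields(current: dict[str, Any], updated: dict[str, Any]) -> list[str]:
    """Return the sorted top-level field names whose values differ."""
    keys = set(current) | set(updated)
    return sorted(k for k in keys if current.get(k) != updated.get(k))


# Hypothetical records: only the availability changed since the last upload.
stored = {"id": "sku-123", "title": "Trail running shoe", "availability": "IN_STOCK"}
edited = {"id": "sku-123", "title": "Trail running shoe", "availability": "OUT_OF_STOCK"}

print(changed_fields(stored, edited))
```

Sending only the differing fields keeps daily updates small for catalogs with a high rate of change.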
Catalog branches
If you use search, you can use catalog branches to test newly uploaded data offline before making it live on your site.
You can use up to three branches, identified as 0, 1, and 2. Your live site points to default_branch for its catalog data. Specify which branch is your live default_branch (the default is branch 0) using either setDefaultBranch or the Data tab in the Search for commerce console. Your site then uses the catalog data provided by the branch that default_branch points to.
As an example, say default_branch is set to branch ID 0, so your site is using the catalog data that you've uploaded to that branch. You can upload new catalog data to branch 1 and preview it. After you've confirmed that the catalog has been uploaded correctly, you can switch to branch 1 as the live default_branch.
The catalog cache can take up to 30 minutes to update after branch switching.
If you use recommendations, we recommend using only the default branch due to the update delay during branch switching. If the data difference between branches is large, update delay can negatively impact prediction results.
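The branch workflow described above can be sketched with two small helpers: one that picks a staging branch different from the live one, and one that builds a branch resource name. The helper names are hypothetical, and the resource name pattern (including the default_catalog catalog ID and the global location) is an assumption to verify against the API reference.

```python
VALID_BRANCH_IDS = ("0", "1", "2")


def branch_name(project: str, branch_id: str, catalog: str = "default_catalog") -> str:
    """Build a branch resource name (assumed pattern; verify before use)."""
    if branch_id not in VALID_BRANCH_IDS:
        raise ValueError(f"branch_id must be one of {VALID_BRANCH_IDS}, got {branch_id!r}")
    return f"projects/{project}/locations/global/catalogs/{catalog}/branches/{branch_id}"


def pick_staging_branch(live_branch_id: str) -> str:
    """Return the first branch ID that is not currently live."""
    return next(b for b in VALID_BRANCH_IDS if b != live_branch_id)


live = "0"
staging = pick_staging_branch(live)
print(branch_name("my-project", staging))  # upload and preview here, then switch
```

After switching default_branch to the staging branch, allow for the cache update delay mentioned above before judging results.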
Products
The catalog is a collection of product objects.
Required product information
The following fields are required; you must provide values for them when you create product items in your catalog. They should also correspond with the values used in your internal product database, and should accurately reflect the product represented, because they are included in training your models.
In some cases, other fields are also required. Refer to the complete list of all product fields on the Product reference page.
All product information you provide can be used to improve the quality of recommendations and search results. Be sure to provide as many fields as possible.
Field | Notes |
---|---|
name | The full, unique resource name of the product. Required for all Product methods except for import. During import, the name is automatically generated and does not need to be manually provided. |
id | The product ID used by your product database. The ID field must be unique across your entire catalog. The same value is used when you record a user event, and is also returned by the predict and search methods. |
title | Product title from your product database. A UTF-8 encoded string. Limited to 1250 characters. |
categories | Product categories. Every product must be assigned to at least one category. If a product belongs to more than one category, repeat the field for each category. The value must be a non-empty UTF-8 encoded string with a length limit of 5,000 characters. Always specify the full category path, for example: ["Sports & Fitness > Athletic Clothing > Shoes"]. |
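Before importing, it can help to check each record against the limits in the table above. The following is a minimal sketch, assuming products are held as plain dictionaries keyed by the field names shown; the function name validate_product is hypothetical, and len() counts Unicode code points, which may not match exactly how the service counts characters.

```python
TITLE_MAX_CHARS = 1250
CATEGORY_MAX_CHARS = 5000


def validate_product(product: dict) -> list[str]:
    """Return a list of problems found in the required fields (empty if none)."""
    problems = []
    if not product.get("id"):
        problems.append("id is missing")

    title = product.get("title") or ""
    if not title:
        problems.append("title is missing")
    elif len(title) > TITLE_MAX_CHARS:
        problems.append(f"title exceeds {TITLE_MAX_CHARS} characters")

    categories = product.get("categories") or []
    if not categories:
        problems.append("at least one category is required")
    for category in categories:
        if not category:
            problems.append("category must be a non-empty string")
        elif len(category) > CATEGORY_MAX_CHARS:
            problems.append(f"category exceeds {CATEGORY_MAX_CHARS} characters")
    return problems


good = {
    "id": "sku-123",
    "title": "Trail running shoe",
    "categories": ["Sports & Fitness > Athletic Clothing > Shoes"],
}
print(validate_product(good))
print(validate_product({"id": "sku-124", "title": "", "categories": []}))
```

The name field is left out of the check because, as noted in the table, it is generated automatically during import.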
Product attributes
Providing values for predefined system attributes in Product, such as brand, color, and size, is highly encouraged. You can also include custom attributes that you define with Product.attributes.
Retrievable fields
If you are using search, attributes are included with a product in the search response if you mark them as retrievable in Product.retrievableFields. They can then be used for other search features such as filtering and facets.
Exact-searchable option
The exact-searchable option is a setting on a catalog attribute field. When it is enabled for a field and a query string exactly matches that field's value on a product in your catalog, search returns exactly that product. This option works well for serial numbers, where customers expect a targeted search experience.
The ExactSearchableOption field is useful for product attributes with an exact value (like ModelId or ManufacturerId) and is usually set on custom attributes. Attributes like product_id are primary index fields and are exact-searchable by default. The item_id field is always on for exact match and can't be disabled.
- To avoid returning unrelated items in searches, never enable the exact-searchable option for fields that hold generic values such as battery.
- To avoid under-serving search queries, don't set special fields like tag, which could have "iphone" as one of its string values, to exact-searchable. Doing so could limit the results of those queries when they are meant to cover all iphones in the product catalog.
For more information, see About product attributes.
Product levels
Product SKU designations determine the hierarchy in your catalog.
Product designation types
There are three product designation types:
Primary items are returned in recommendation or search results. Primaries can be individual (SKU-level) items and groups of similar items (SKU groups).
Variant items are versions of a SKU-group primary product. Variants can only be individual (SKU-level) items. For example, if the primary product is "V-neck shirt", variants could be "Brown V-neck shirt, size XL" and "White V-neck shirt, size S". Primaries and variants are sometimes described as parent and child items.
Collection items are bundles of primary products or variant products, such as a jewelry set with a necklace, earrings, and a ring. Collections are hierarchical structures, similar to primaries and variants, that group related primary products. Customers can't buy them directly, they're not widely used, and they're only available in search.
Product examples
As an example, according to these product designation types, grocery items are better cataloged as primary products, each consisting of a single SKU product, such as "bananas, fresh".
On the other hand, tee-shirts would be better structured hierarchically, as primaries with their corresponding set of variants. Each variant represents an individual SKU (for each size) and each primary item represents a group of SKUs, where each SKU is a different size for one overarching tee-shirt style. This organization by SKU structure allows the search results and recommendation panels to show a range of tee-shirt styles. It allows the shopper to drill down on a particular primary (style) to select the variant (size) to purchase.
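The two arrangements above can be sketched as plain records: a tee-shirt style with size variants, and a grocery item that is a SKU-level primary with no variants. The "type" and "primaryProductId" field names are assumed to match the Product reference, and the IDs and helper name are hypothetical.

```python
# Hypothetical catalog entries; verify field names against the Product reference.
catalog = [
    {"id": "tee-classic", "type": "PRIMARY", "title": "Classic tee-shirt"},
    {"id": "tee-classic-s", "type": "VARIANT", "primaryProductId": "tee-classic",
     "title": "Classic tee-shirt, size S"},
    {"id": "tee-classic-xl", "type": "VARIANT", "primaryProductId": "tee-classic",
     "title": "Classic tee-shirt, size XL"},
    {"id": "bananas-fresh", "type": "PRIMARY", "title": "Bananas, fresh"},
]


def variants_by_primary(products: list[dict]) -> dict[str, list[str]]:
    """Map each primary ID to the IDs of its variants (empty list if none)."""
    grouped: dict[str, list[str]] = {}
    for product in products:
        if product["type"] == "PRIMARY":
            grouped.setdefault(product["id"], [])
        elif product["type"] == "VARIANT":
            grouped.setdefault(product["primaryProductId"], []).append(product["id"])
    return grouped


print(variants_by_primary(catalog))
```

A primary with an empty variant list corresponds to the single-SKU case; a primary with variants corresponds to the style-and-sizes case.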
Collections group related products that a customer might buy. To represent them accurately in the reranking model, Vertex AI Search for commerce includes logic that credits collections with purchases. For example, a shopper clicks on products in a bedsheets set, then adds to cart or purchases a primary product in that collection. The collection is credited with that purchase, so the model accurately represents the popularity and value of collections.
There is also a variant-only catalog type, which is now deprecated. This catalog type can only be used with recommendations. For the variant-only catalog, the ingestionProductType is set to variant during import. A primary is inferred for each variant, based on a primary product ID specified for each variant.
Set up your product catalog
When planning your product catalog, you need to decide whether it contains products designated as only primaries, primaries and variants, or a mixture of the two arrangements. Think of it in terms of your products' SKU structure. Your products can be primary items, which may or may not have variants.
Based on how your product SKUs are designated, consider your options for setting up your product catalog:
- You want your SKU to be shown as an individual search result or recommendation: SKU=primary
- Your SKU should be part of a group of similar SKUs: SKU=variant, group of SKUs=primary
- A mixture of both combinations: SKU=primary, SKU=variant, group of SKUs=primary
If your product detail page shows an option, size, or color selector, these options are typically uploaded as variants into your product catalog. Consider whether you want versions of the same product that differ in attributes such as size and color to appear as a single search result or as separate ones. For example, for a book, decide whether the hardcover SKU and the softcover SKU of the same book should appear as separate search results (SKU = primary) or as one (SKU = variant, group of SKUs = primary).
When setting up your product catalog, keep in mind that recommendation and search results only return primary items.
Minimal primary products
If you determine that your catalog should have both primaries and variants, that is, SKU groups and SKUs, but you only have SKUs now, you need to create primaries for the SKU groups. These primaries are sometimes called "virtual primaries" or "fake primaries".
These primaries only need to contain minimal information: id, title, and categories.
If type is not specified, the product type defaults to primary. If you are importing, you don't need to specify name. For more information, see the preceding section, Required product information.
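Generating minimal primaries from existing SKU records might look like the following sketch. It assumes each SKU record carries a grouping key and a group title from your own data; the names style_id, style_title, and build_minimal_primaries are hypothetical, and "primaryProductId" is assumed to be the field that links a variant to its primary.

```python
def build_minimal_primaries(skus: list[dict]) -> tuple[list[dict], list[dict]]:
    """Create one minimal primary per style and link each SKU to it as a variant."""
    primaries: dict[str, dict] = {}
    variants: list[dict] = []
    for sku in skus:
        style_id = sku["style_id"]  # hypothetical grouping key
        if style_id not in primaries:
            # Minimal primary: id, title, and categories only.
            # "type" is omitted because it defaults to primary.
            primaries[style_id] = {
                "id": style_id,
                "title": sku["style_title"],  # hypothetical field
                "categories": list(sku["categories"]),
            }
        variant = {k: v for k, v in sku.items() if k not in ("style_id", "style_title")}
        variant["type"] = "VARIANT"
        variant["primaryProductId"] = style_id
        variants.append(variant)
    return list(primaries.values()), variants


skus = [
    {"id": "tee-s", "title": "V-neck shirt, size S", "categories": ["Clothing > Shirts"],
     "style_id": "tee-vneck", "style_title": "V-neck shirt"},
    {"id": "tee-xl", "title": "V-neck shirt, size XL", "categories": ["Clothing > Shirts"],
     "style_id": "tee-vneck", "style_title": "V-neck shirt"},
]
primaries, variants = build_minimal_primaries(skus)
print(primaries)
print([v["primaryProductId"] for v in variants])
```

The primary ID must not collide with any SKU ID, because IDs are unique across the entire catalog.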
Type is immutable
You cannot change the type of a product, for example, from variant to primary or from primary to variant.
If you do need to change the type of a product, delete the product and re-create a product with a different type. Before you can delete a primary product, the associated variants must be deleted.
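The ordering constraint can be sketched as a helper that lists the IDs to delete, variants first and the primary last. The function name is hypothetical and the records are plain dictionaries using the assumed "type" and "primaryProductId" fields; the actual delete calls are left out.

```python
def deletion_order(products: list[dict], primary_id: str) -> list[str]:
    """Return product IDs in a safe deletion order: variants first, then the primary."""
    variant_ids = [
        p["id"]
        for p in products
        if p.get("type") == "VARIANT" and p.get("primaryProductId") == primary_id
    ]
    return variant_ids + [primary_id]


products = [
    {"id": "tee-vneck", "type": "PRIMARY"},
    {"id": "tee-s", "type": "VARIANT", "primaryProductId": "tee-vneck"},
    {"id": "tee-xl", "type": "VARIANT", "primaryProductId": "tee-vneck"},
    {"id": "bananas-fresh", "type": "PRIMARY"},
]
print(deletion_order(products, "tee-vneck"))
```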
Catalog import
If you have your catalog in Merchant Center, then we recommend that you import your catalog by linking your Merchant Center account.
If your catalog is not in Merchant Center but is in Cloud Storage or BigQuery or some other storage, then do a bulk data import.
If you plan to import catalog data from Merchant Center in the future, review your data as described for Merchant Center imports to ensure you are making correct choices about your catalog. This is important because changing the configuration of an existing catalog requires deleting the catalog and uploading it again (see Change product-level configuration).
For detailed information about how to upload a catalog, see Import catalog information.
Product inventory
Product inventory encompasses:
- Price, both the current and original prices
- Availability, such as in stock, out-of-stock, back ordered, and pre-ordered
- Quantity available
- Fulfillment information such as pickup-in-store, ship-to-store, and next-day-delivery
There are two levels of inventory: product-level and local.
Product-level inventory
For retailers who only sell online, inventory is specified at the product level. Price, availability, and other inventory data are set for each product in the catalog.
For more information about product-level inventory, including how to maintain inventory data, see Update inventory for Vertex AI Search for commerce.
Local inventory
Retailers who have brick-and-mortar stores and an online store need to keep inventory information on a per-store basis. They use local inventory to do this.
There are two product fields that can be used to store local inventory. Both fields are lists of locations (place IDs) with associated inventory information:
- Product.fulfillmentInfo: Pickup and shipping methods at each store location
- Product.localInventories: Price information, product attributes, and pickup and shipping methods at each store location
You can use either or both fields for your store-level information.
For more information about local inventories, see Update local inventory for Vertex AI Search for commerce.
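To make the two local inventory fields concrete, here is a hedged sketch of a product record that uses both, along with a helper that lists the stores offering a given fulfillment method. The nested key names (type, placeIds, placeId, priceInfo, fulfillmentTypes) are assumptions based on the Product reference and should be verified there; the store IDs, prices, and helper name are hypothetical.

```python
product = {
    "id": "sku-123",
    # Fulfillment methods, each with the place IDs where it is offered.
    "fulfillmentInfo": [
        {"type": "pickup-in-store", "placeIds": ["store-1", "store-2"]},
        {"type": "ship-to-store", "placeIds": ["store-3"]},
    ],
    # Per-store entries that can also carry price information.
    "localInventories": [
        {
            "placeId": "store-4",
            "priceInfo": {"currencyCode": "USD", "price": 19.99},
            "fulfillmentTypes": ["pickup-in-store"],
        },
    ],
}


def stores_offering(product: dict, fulfillment_type: str) -> list[str]:
    """Collect place IDs offering a fulfillment type from both fields."""
    places: set[str] = set()
    for info in product.get("fulfillmentInfo", []):
        if info["type"] == fulfillment_type:
            places.update(info["placeIds"])
    for inventory in product.get("localInventories", []):
        if fulfillment_type in inventory.get("fulfillmentTypes", []):
            places.add(inventory["placeId"])
    return sorted(places)


print(stores_offering(product, "pickup-in-store"))
```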
Catalog data quality metrics
The Data quality page in the Search for commerce console assesses whether you need to update catalog data to improve the quality of search results and unlock search performance tiers.
The following table describes the quality metrics that Vertex AI Search for commerce uses to help you evaluate your product data. For details about how to view data quality metrics and search performance tiers in the Search for commerce console, see Unlock search performance tiers.
Catalog quality metric | Quality rule | Notes |
---|---|---|
URI is present and accessible | Product has a valid Product.uri. The URI needs to be accessible and match your domain. | Search uses web signals crawled using this URI to improve search quality. |
Meets time conformance | Product.availableTime is before current time, and Product.expireTime is after current time. | Only products that meet the time conformance are available for search. |
Searchable attribute is present | Product has at least one attribute set to searchable. | Custom attributes that are marked searchable can be searched by text queries. |
Description is present | Product has non-empty Product.description. | A comprehensive description helps to improve search quality. |
Title consists of at least two words | Product.title consists of at least two words. | A comprehensive title helps to improve search quality. |
Has variant with image | The variant product has at least one Product.image. You may ignore this metric if all your products are at primary level. | This metric is for informational purposes and does not impact search quality. |
Has variant with price info | The variant product has Product.priceInfo set. You may ignore this metric if all your products are at primary level. | This metric is for informational purposes and does not impact search quality. |
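Some of these rules can be approximated locally before upload. The following sketch checks four of them on a plain dictionary. It is not how the console computes its metrics: the function name is hypothetical, timestamps are assumed to be ISO 8601 strings with an explicit UTC offset, an unset availableTime or expireTime is treated as conformant (an assumption), and URI accessibility is not tested.

```python
from datetime import datetime, timezone


def quality_report(product: dict, now: datetime) -> dict[str, bool]:
    """Approximate a few catalog quality rules for one product record."""
    available = product.get("availableTime")
    expire = product.get("expireTime")
    time_ok = (available is None or datetime.fromisoformat(available) < now) and (
        expire is None or datetime.fromisoformat(expire) > now
    )
    return {
        "uri_present": bool(product.get("uri")),  # presence only, not accessibility
        "time_conformance": time_ok,
        "description_present": bool((product.get("description") or "").strip()),
        "title_two_words": len((product.get("title") or "").split()) >= 2,
    }


sample = {
    "title": "Shoes",
    "description": "Lightweight trail running shoe.",
    "uri": "https://example.com/products/sku-123",
    "availableTime": "2024-01-01T00:00:00+00:00",
    "expireTime": "2030-01-01T00:00:00+00:00",
}
report = quality_report(sample, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
print(report)
```

In this sample, only the title rule fails, because the title is a single word.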
Product schema for Vertex AI Search for commerce
When importing a catalog from BigQuery, use the following Vertex AI Search for commerce product schema to create a BigQuery table with the correct format and load it with your catalog data. Then, import the catalog.