About catalogs and products

This page provides best practices for creating your catalog information and populating your catalog data.

Overview

The catalog data you import into Vertex AI Search for retail has a direct effect on the quality of the resulting model, and therefore on the quality of search and recommendation results. In general, the more accurate and specific the catalog information you provide, the higher the quality of your model.

Your catalog should be kept up to date. You can upload catalog changes as often as needed; ideally, every day for catalogs with a high rate of change. You can upload (patch) existing product items; only the changed fields will be updated. There is no charge for uploading catalog information. For more information, see Keeping your catalog up to date.
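Because a patch only updates the fields that changed, it can help to work out which fields differ before uploading. The following is a minimal sketch in plain Python, not a call into the service: the helper name changed_fields and the sample products are hypothetical, and it compares top-level fields only.

```python
from typing import Any


def changed_fields(old: dict[str, Any], new: dict[str, Any]) -> list[str]:
    """Return the top-level field names whose values differ between two versions."""
    keys = set(old) | set(new)
    return sorted(key for key in keys if old.get(key) != new.get(key))


# Hypothetical before/after versions of one product item.
old_product = {
    "id": "sku-123",
    "title": "V-neck shirt",
    "categories": ["Apparel > Shirts"],
}
new_product = {
    "id": "sku-123",
    "title": "V-neck cotton shirt",
    "categories": ["Apparel > Shirts"],
}

print(changed_fields(old_product, new_product))  # prints ['title']
```

The resulting list could be used to decide which fields to include in a patch request. How the request expresses that selection is described in the import and update documentation.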

Catalog branches

If you use search, you can use catalog branches to test newly uploaded data offline before making it live on your site.

You can use up to three branches, identified as 0, 1, and 2. Your live site points to default_branch for its catalog data. Specify which branch is currently your live default_branch (it is set to branch 0 by default) using either setDefaultBranch or the Data tab in the Search for Retail console. Your site then uses the catalog data provided by the branch that default_branch points to.

As an example, say default_branch is currently set to branch ID 0, so your site is using the catalog data that you've uploaded to that branch. You can upload new catalog data to branch 1 and preview it. After you've confirmed that the catalog has been uploaded correctly, you can switch to branch 1 as the live default_branch.
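As an illustration of how branches are addressed, the sketch below builds the resource name for a branch. It assumes the common layout projects/PROJECT/locations/global/catalogs/default_catalog/branches/BRANCH_ID. The helper branch_name and the project ID my-project are hypothetical, so confirm the exact format on the API reference before relying on it.

```python
VALID_BRANCH_IDS = {"0", "1", "2"}


def branch_name(project: str, branch_id: str, catalog: str = "default_catalog") -> str:
    """Build a branch resource name (assumed layout, see the note above)."""
    # "default_branch" is the alias that the live site reads from.
    if branch_id not in VALID_BRANCH_IDS and branch_id != "default_branch":
        raise ValueError(f"branch_id must be 0, 1, 2, or default_branch, got {branch_id!r}")
    return f"projects/{project}/locations/global/catalogs/{catalog}/branches/{branch_id}"


# Upload and preview against branch 1 while the live site still reads branch 0.
preview_branch = branch_name("my-project", "1")
live_branch = branch_name("my-project", "default_branch")
print(preview_branch)
print(live_branch)
```

After the preview looks correct, branch 1 would be made the live default_branch using setDefaultBranch or the console, as described above.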

The catalog cache can take up to 30 minutes to update after branch switching.

If you use recommendations, we recommend using only the default branch because of the update delay during branch switching. If the data difference between branches is large, the update delay can negatively impact prediction results.

Products

The catalog is a collection of product objects.

Required product information

The following fields are required; you must provide values for them when you create product items in your catalog. They should also correspond with the values used in your internal product database, and should accurately reflect the product represented, because they are included in training your models.

In some cases, other fields are also required. Refer to the complete list of all product fields on the Product reference page.

All product information you provide can be used to improve the quality of recommendations and search results. Be sure to provide as many fields as possible.

Field | Notes
name | The full, unique resource name of the product. Required for all Product methods except for import. During import, the name is automatically generated and does not need to be manually provided.
id | The product ID used by your product database. The ID field must be unique across your entire catalog. The same value is used when you record a user event, and is also returned by the predict and search methods.
title | Product title from your product database. A UTF-8 encoded string. Limited to 1250 characters.
categories | Product categories. Every product must be assigned to at least one category. If a product belongs to more than one category, repeat the field for each category. The value must be a non-empty UTF-8 encoded string with a length limit of 5,000 characters. Always specify the full category path, for example: ["Sports & Fitness > Athletic Clothing > Shoes"].
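The limits in the table above can be checked locally before an import. The sketch below is one possible pre-import check, not part of the service: the function validate_required_fields is hypothetical, the limits are copied from the table, and name is omitted because it is generated during import.

```python
MAX_TITLE_CHARS = 1250      # title limit from the table above
MAX_CATEGORY_CHARS = 5000   # per-category limit from the table above


def validate_required_fields(product: dict) -> list[str]:
    """Return a list of problems found in the required fields; empty means OK."""
    problems = []
    if not product.get("id"):
        problems.append("id is missing")

    title = product.get("title", "")
    if not title:
        problems.append("title is missing")
    elif len(title) > MAX_TITLE_CHARS:
        problems.append("title exceeds 1250 characters")

    categories = product.get("categories") or []
    if not categories:
        problems.append("at least one category is required")
    for category in categories:
        if not category:
            problems.append("category must be a non-empty string")
        elif len(category) > MAX_CATEGORY_CHARS:
            problems.append("category exceeds 5,000 characters")
    return problems


product = {
    "id": "shoe-001",
    "title": "Trail running shoe",
    # Full category path, repeated once per category the product belongs to.
    "categories": ["Sports & Fitness > Athletic Clothing > Shoes"],
}
print(validate_required_fields(product))  # prints []
```

An empty list means the item passes these local checks only. The service may apply further validation described on the Product reference page.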

Product attributes

Providing values for predefined system attributes in Product such as brand, color, and size is highly encouraged. You can also include custom attributes that you define with Product.attributes.

If you are using search, attributes are included with a product in the search response if you mark them as retrievable in Product.retrievableFields. They can then be used for other search features such as filtering and facets.

For more information, see About product attributes.

Product levels

Product levels determine the hierarchy in your catalog. Typically, you need to choose between a single-level catalog and a two-level catalog.

For example, you can have a single-level catalog where each product item has a SKU. Alternatively, you might choose a two-level catalog that contains both SKU groups and individual SKUs.

Product-level types

There are three product-level types:

  • Primary items are returned in recommendation or search results. Primaries can be individual (SKU-level) items and groups of similar items (SKU groups).

  • Variant items are versions of a SKU-group primary product. Variants can only be individual (SKU-level) items. For example, if the primary product is "V-neck shirt", variants could be "Brown V-neck shirt, size XL" and "White V-neck shirt, size S". Primaries and variants are sometimes described as parent and child items.

  • Collection items are collections of products. Collections are bundles of primary products or variant products. For example, a collection might be a jewelry set with a necklace, earrings, and a ring. Collections are only available in search and are not widely used.

About catalog hierarchy

When planning your catalog hierarchy, you need to decide if your catalog should contain only primaries or primaries and variants. The key point to remember is that prediction and search results only return primary items.

For example, a primary-only catalog might work well for selling books, where a recommendations panel returns a selection of books, each with its own SKU. However, a primary-only catalog for tee-shirts would likely show the same tee-shirt in each available size in the recommendation panel.

The tee-shirt catalog would work better with both primaries and variants: the SKUs are variants (one variant for each size), and each primary represents the group of SKUs that make up the sizes of one tee-shirt style. This two-level catalog allows the recommendation panel to show a range of similar tee-shirt styles. The shopper can drill down on a particular primary (style) to select the variant (size) to purchase.
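A two-level catalog like the one described above might be laid out as follows. This is an illustrative sketch using plain dictionaries. The field names type and primaryProductId are taken to match the Product resource, so confirm them on the Product reference page. The product IDs and the helper variants_by_primary are hypothetical.

```python
catalog = [
    {"id": "vneck", "type": "PRIMARY", "title": "V-neck shirt",
     "categories": ["Apparel > Shirts"]},
    {"id": "vneck-brown-xl", "type": "VARIANT", "primaryProductId": "vneck",
     "title": "Brown V-neck shirt, size XL", "categories": ["Apparel > Shirts"]},
    {"id": "vneck-white-s", "type": "VARIANT", "primaryProductId": "vneck",
     "title": "White V-neck shirt, size S", "categories": ["Apparel > Shirts"]},
]


def variants_by_primary(products: list[dict]) -> dict[str, list[str]]:
    """Group variant IDs under the ID of the primary they belong to."""
    # A product with no explicit type is treated as a primary.
    groups = {p["id"]: [] for p in products if p.get("type", "PRIMARY") == "PRIMARY"}
    for p in products:
        if p.get("type") == "VARIANT":
            # Raises KeyError if a variant points at a primary that is not in the list.
            groups[p["primaryProductId"]].append(p["id"])
    return groups


print(variants_by_primary(catalog))
```

Only the keys of the returned mapping (the primaries) correspond to items that search or recommendations would return. The values are the sizes and colors a shopper can drill down into.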

There is also a variant-only catalog type, which is now deprecated. This catalog type can only be used with recommendations. For the variant-only catalog, the ingestionProductType is set to variant during import. A primary is inferred for each variant, based on a primary product ID specified for each variant.

Minimal primary products

If you determine that your catalog should have both primaries and variants, that is, SKU groups and SKUs, but you only have SKUs now, you need to create primaries for the SKU groups. These primaries are sometimes called "virtual primaries" or "fake primaries".

These primaries only need to contain minimal information: id, title, and categories.

If type is not specified, the product type defaults to primary. If you are importing, you do not need to specify name. For more information, see the preceding section, Required product information.
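One way to derive minimal primaries from an existing SKU list is sketched below. It assumes each SKU record carries a hypothetical style_id and style_title that identify its SKU group, and that style IDs do not collide with SKU IDs. The helper build_minimal_primaries is not part of any library.

```python
def build_minimal_primaries(skus: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (primaries, variants) built from flat SKU records."""
    primaries: dict[str, dict] = {}
    variants = []
    for sku in skus:
        group_id = sku["style_id"]
        # The first SKU seen for a group supplies the minimal primary fields:
        # id, title, and categories.
        primaries.setdefault(group_id, {
            "id": group_id,
            "type": "PRIMARY",
            "title": sku["style_title"],
            "categories": sku["categories"],
        })
        variants.append({
            "id": sku["id"],
            "type": "VARIANT",
            "primaryProductId": group_id,
            "title": sku["title"],
            "categories": sku["categories"],
        })
    return list(primaries.values()), variants


skus = [
    {"id": "tee-red-s", "title": "Red tee, size S", "style_id": "tee-red",
     "style_title": "Red tee", "categories": ["Apparel > Tees"]},
    {"id": "tee-red-m", "title": "Red tee, size M", "style_id": "tee-red",
     "style_title": "Red tee", "categories": ["Apparel > Tees"]},
]
primaries, variants = build_minimal_primaries(skus)
print(len(primaries), len(variants))  # prints 1 2
```

Setting type explicitly on the primaries is optional here because the type defaults to primary. It is included only to make the two levels easy to tell apart.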

Type is immutable

You cannot change the type of a product, for example, from variant to primary or from primary to variant.

If you do need to change the type of a product, delete the product and re-create a product with a different type. Before you can delete a primary product, the associated variants must be deleted.
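The ordering constraint above (variants before their primary) can be expressed as a small helper. This sketch only computes the order in which items would need to be deleted. It does not call the service, and the name deletion_order and the sample IDs are hypothetical.

```python
def deletion_order(primary_id: str, products: list[dict]) -> list[str]:
    """List product IDs in a safe deletion order: variants first, then the primary."""
    variant_ids = [p["id"] for p in products if p.get("primaryProductId") == primary_id]
    return variant_ids + [primary_id]


products = [
    {"id": "vneck", "type": "PRIMARY"},
    {"id": "vneck-brown-xl", "type": "VARIANT", "primaryProductId": "vneck"},
    {"id": "vneck-white-s", "type": "VARIANT", "primaryProductId": "vneck"},
]
print(deletion_order("vneck", products))
# prints ['vneck-brown-xl', 'vneck-white-s', 'vneck']
```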

Catalog import

If you currently have your catalog in Merchant Center, then we recommend that you import your catalog by linking your Merchant Center account.

If your catalog is not in Merchant Center but is in Cloud Storage or BigQuery or some other storage, then do a bulk data import.

If you plan to import catalog data from Merchant Center in the future, review your data as described for Merchant Center imports to ensure you are making correct choices about your catalog. This is important because changing the configuration of an existing catalog requires deleting the catalog and uploading it again (see Change product-level configuration).

For detailed information about how to upload a catalog, see Import catalog information.

Product inventory

Product inventory encompasses:

  • Price, both the current and original prices

  • Availability, such as in stock, out of stock, back-ordered, and pre-ordered

  • Quantity available

  • Fulfillment information such as pickup-in-store, ship-to-store, and next-day-delivery

There are two levels of inventory: product-level and local.

Product-level inventory

For retailers who only sell online, inventory is specified at the product level. Price, availability, and other inventory data is set for each product in the catalog.
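As a rough illustration, product-level inventory for one item might be assembled like this. The field names priceInfo, availability, and availableQuantity and the availability values are assumed to match the Product resource, and the check that the original price is not below the current price is a sanity check added for this sketch. Verify all of these against the Product reference page. The helper inventory_fields is hypothetical.

```python
AVAILABILITY_VALUES = {"IN_STOCK", "OUT_OF_STOCK", "PREORDER", "BACKORDER"}


def inventory_fields(price: float, original_price: float, currency_code: str,
                     availability: str, quantity: int) -> dict:
    """Build the inventory portion of a product item after basic sanity checks."""
    if availability not in AVAILABILITY_VALUES:
        raise ValueError(f"unknown availability: {availability!r}")
    if original_price < price:
        raise ValueError("original price should not be lower than the current price")
    if quantity < 0:
        raise ValueError("quantity cannot be negative")
    return {
        "priceInfo": {
            "currencyCode": currency_code,
            "price": price,                   # current price
            "originalPrice": original_price,  # price before any discount
        },
        "availability": availability,
        "availableQuantity": quantity,
    }


inventory = inventory_fields(19.99, 24.99, "USD", "IN_STOCK", 42)
print(inventory["availability"], inventory["priceInfo"]["price"])
```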

For more information about product-level inventory, including how to maintain inventory data, see Update inventory for Vertex AI Search for retail.

Local inventory

Retailers who have brick-and-mortar stores and an online store need to keep inventory information on a per-store basis. They use local inventory to do this.

There are two product fields that can be used to store local inventory. Both fields are lists of locations (place IDs) with associated inventory information.

You can use either or both fields for your store-level information.

For more information about local inventories, see Update local inventory for Vertex AI Search for retail.

Catalog data quality metrics

The Data quality page in the Search for Retail console assesses whether you need to update catalog data to improve the quality of search results and unlock search performance tiers.

The following table describes the quality metrics that Vertex AI Search for retail uses to help you evaluate your product data. For details about how to view data quality metrics and search performance tiers in the Search for Retail console, see Unlock search performance tiers.

Catalog quality metric | Quality rule | Notes
URI is present and accessible | Product has a valid Product.uri. | The URI needs to be accessible and match your domain. Search uses web signals crawled via this URI to improve search quality.
Meets time conformance | Product.availableTime is before current time, and Product.expireTime is after current time. | Only products that meet the time conformance are available for search.
Searchable attribute is present | Product has at least one attribute set to searchable. | Custom attributes that are marked searchable can be searched by text queries.
Description is present | Product has non-empty Product.description. | A comprehensive description helps to improve search quality.
Title consists of at least two words | Product.title consists of at least two words. | A comprehensive title helps to improve search quality.
Has variant with image | The variant product has at least one Product.image. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality.
Has variant with price info | The variant product has Product.priceInfo set. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality.
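Some of the rules in the table can be approximated locally before uploading. The sketch below is a simplified stand-in for the console's assessment, not a reproduction of it. It does not check that the URI is reachable, it assumes RFC 3339 timestamps, and it treats an unset availableTime or expireTime as passing. The function quality_report is hypothetical.

```python
from datetime import datetime, timezone


def parse_time(value: str) -> datetime:
    """Parse an RFC 3339 style timestamp such as 2024-01-01T00:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def quality_report(product: dict, now: datetime) -> dict[str, bool]:
    """Evaluate a few of the quality rules from the table for one product."""
    available = product.get("availableTime")
    expire = product.get("expireTime")
    time_ok = ((available is None or parse_time(available) < now)
               and (expire is None or parse_time(expire) > now))
    return {
        "uri_present": bool(product.get("uri")),
        "time_conformance": time_ok,
        "description_present": bool(product.get("description", "").strip()),
        "title_two_words": len(product.get("title", "").split()) >= 2,
    }


product = {
    "title": "Trail running shoe",
    "uri": "https://example.com/products/shoe-001",
    "description": "Lightweight shoe for rough terrain.",
    "availableTime": "2024-01-01T00:00:00Z",
    "expireTime": "2030-01-01T00:00:00Z",
}
report = quality_report(product, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
print(report)
```

Passing now explicitly keeps the check reproducible. In practice, datetime.now(timezone.utc) would be the natural value.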

Product schema for Vertex AI Search for retail

When importing a catalog from BigQuery, use the Vertex AI Search for retail product schema below to create a BigQuery table with the correct format and load it with your catalog data. Then, import the catalog.