About catalogs and products

This page provides best practices for creating your catalog information and populating your catalog data.

Overview

The catalog data you import into Vertex AI Search for commerce has a direct effect on the quality of the resulting model, and therefore on the quality of search and recommendation results. In general, the more accurate and specific the catalog information you provide, the higher the quality of your model.

Keep your catalog up to date. You can upload catalog changes as often as needed; for catalogs with a high rate of change, ideally every day. You can upload (patch) existing product items; only the changed fields are updated. There is no charge for uploading catalog information. For more information, see Keeping your catalog up to date.
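
As a minimal sketch of the patch idea, the following Python snippet works out which top-level fields differ between a stored product and its updated version, so that only those fields need to be sent. The helper name changed_fields and the sample values are hypothetical and are not part of the Vertex AI Search for commerce API; see the Product reference for how a real patch request names the fields to update.

```python
def changed_fields(current: dict, updated: dict) -> list[str]:
    """Return the top-level field names whose values differ, sorted."""
    keys = set(current) | set(updated)
    return sorted(k for k in keys if current.get(k) != updated.get(k))


# Hypothetical stored product and its updated version.
current = {
    "id": "sku-123",
    "title": "V-neck shirt",
    "categories": ["Apparel > Shirts"],
}
updated = {
    "id": "sku-123",
    "title": "V-neck shirt, cotton",
    "categories": ["Apparel > Shirts"],
}

# Only the fields that changed go into the patch payload.
mask = changed_fields(current, updated)
patch_body = {name: updated.get(name) for name in mask}
print(mask, patch_body)
```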

Catalog branches

If you use search, you can use catalog branches to test newly uploaded data offline before making it live on your site.

You can use up to three branches, identified as 0, 1, and 2. Your live site points to default_branch for its catalog data. Specify which branch is your live default_branch (branch 0 by default) using either setDefaultBranch or the Data tab in the Search for commerce console. Your site then uses the catalog data provided by the branch that default_branch points to.

As an example, say default_branch is set to branch ID 0, so your site is using the catalog data that you've uploaded to that branch. You can upload new catalog data to branch 1 and preview it. After you've confirmed that the catalog has been uploaded correctly, you can switch to branch 1 as the live default_branch.
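
The example above can be sketched with a small in-memory model. This is an illustration only: the class ToyCatalog and its methods are hypothetical stand-ins, not the real API, and the sketch does not model the cache delay that follows a branch switch.

```python
from dataclasses import dataclass, field

BRANCH_IDS = ("0", "1", "2")


@dataclass
class ToyCatalog:
    """In-memory stand-in for a catalog with three branches (illustration only)."""

    branches: dict = field(default_factory=lambda: {b: {} for b in BRANCH_IDS})
    default_branch: str = "0"  # branch 0 is the default

    def upload(self, branch_id: str, products: list) -> None:
        """Store products in the given branch, keyed by product ID."""
        if branch_id not in BRANCH_IDS:
            raise ValueError(f"unknown branch {branch_id!r}")
        for product in products:
            self.branches[branch_id][product["id"]] = product

    def set_default_branch(self, branch_id: str) -> None:
        """Point default_branch at another branch."""
        if branch_id not in BRANCH_IDS:
            raise ValueError(f"unknown branch {branch_id!r}")
        self.default_branch = branch_id

    def live_products(self) -> dict:
        """Return the products that the live site would see."""
        return self.branches[self.default_branch]


catalog = ToyCatalog()
catalog.upload("0", [{"id": "sku-1", "title": "Old shirt"}])
catalog.upload("1", [{"id": "sku-1", "title": "New shirt"}])  # preview data

before = catalog.live_products()["sku-1"]["title"]
catalog.set_default_branch("1")  # make branch 1 live after checking it
after = catalog.live_products()["sku-1"]["title"]
```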

The catalog cache can take up to 30 minutes to update after branch switching.

If you use recommendations, we recommend using only the default branch because of the update delay during branch switching. If the data difference between branches is large, the update delay can negatively impact prediction results.

Products

The catalog is a collection of product objects.

Required product information

The following fields are required; you must provide values for them when you create product items in your catalog. They should also correspond with the values used in your internal product database, and should accurately reflect the product represented, because they are included in training your models.

In some cases, other fields are also required. Refer to the complete list of all product fields on the Product reference page.

All product information you provide can be used to improve the quality of recommendations and search results. Be sure to provide as many fields as possible.

| Field | Notes |
| --- | --- |
| name | The full, unique resource name of the product. Required for all Product methods except for import. During import, the name is automatically generated and does not need to be manually provided. |
| id | The product ID used by your product database. The ID field must be unique across your entire catalog. The same value is used when you record a user event, and is also returned by the predict and search methods. |
| title | Product title from your product database. A UTF-8 encoded string. Limited to 1250 characters. |
| categories | Product categories. Every product must be assigned to at least one category. If a product belongs to more than one category, repeat the field for each category. The value must be a non-empty UTF-8 encoded string with a length limit of 5,000 characters. Always specify the full category path, for example: ["Sports & Fitness > Athletic Clothing > Shoes"]. |
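
A pre-import check for these required fields might look like the following sketch. The function validate_required_fields is a hypothetical helper, the length limits are the ones stated above, and name is left out because it is generated during import. Note that len() counts Unicode code points, which is used here as an approximation of "characters".

```python
MAX_TITLE_CHARS = 1250      # limit stated for title
MAX_CATEGORY_CHARS = 5000   # limit stated for each category value


def validate_required_fields(product: dict) -> list[str]:
    """Return a list of problems with the required fields (empty if none)."""
    errors = []
    if not product.get("id"):
        errors.append("id is required")

    title = product.get("title") or ""
    if not title:
        errors.append("title is required")
    elif len(title) > MAX_TITLE_CHARS:
        errors.append("title exceeds 1250 characters")

    categories = product.get("categories") or []
    if not categories:
        errors.append("at least one category is required")
    for category in categories:
        if not category:
            errors.append("category must be a non-empty string")
        elif len(category) > MAX_CATEGORY_CHARS:
            errors.append("category exceeds 5000 characters")
    return errors


good = {
    "id": "sku-123",
    "title": "Trail running shoes",
    "categories": ["Sports & Fitness > Athletic Clothing > Shoes"],
}
bad = {"id": "sku-124", "title": "", "categories": []}

print(validate_required_fields(good))
print(validate_required_fields(bad))
```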

Product attributes

Providing values for predefined system attributes in Product such as brand, color, and size is highly encouraged. You can also include custom attributes that you define with Product.attributes.

Retrievable fields

If you are using search, attributes are included with a product in the search response if you mark them as retrievable in Product.retrievableFields. They can then be used for other search features such as filtering and facets.

Exact-searchable option

The exact-searchable option is a setting on a catalog attribute field that makes specific string queries return a specific product. If the string in a query equals the value of that field for a product in your catalog, search returns exactly that product. This option works well for serial numbers, where customers expect a targeted search experience.

The ExactSearchableOption field is most useful for product attributes with an exact value, such as ModelId or ManufacturerId, and is usually set on custom attributes. Attributes like product_id are primary index fields and are exact-searchable by default. The item_id field is always on for exact match and can't be disabled.

  • To avoid returning unrelated items in searches, never enable the exact-searchable option for an attribute whose values are generic, such as battery.
  • To avoid under-serving search queries, don't set special fields like tag, which could have "iphone" as one of its string values, to exact-searchable. Doing so could limit the results that those queries return for all iphones in the product catalog.
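
The difference between a specific and a generic value can be seen in a toy matcher. This sketch only illustrates equality matching on one field; the function exact_matches, the flat product dictionaries, and the attribute names model_id and tag are hypothetical simplifications and do not reflect how the service indexes or ranks results.

```python
def exact_matches(query: str, products: list, field: str) -> list:
    """Return IDs of products with a value in `field` equal to the query string."""
    return [p["id"] for p in products if query in p.get(field, [])]


# Hypothetical catalog: model_id values are specific, tag values are generic.
products = [
    {"id": "p1", "model_id": ["MX-4100"], "tag": ["iphone", "case"]},
    {"id": "p2", "model_id": ["MX-4200"], "tag": ["iphone", "charger"]},
    {"id": "p3", "model_id": ["BT-900"], "tag": ["battery"]},
]

# A specific value pins down a single product.
by_model = exact_matches("MX-4100", products, "model_id")

# A generic value matches several products, so an exact match on it
# is a poor way to target one item.
by_tag = exact_matches("iphone", products, "tag")
print(by_model, by_tag)
```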

For more information, see About product attributes.

Product levels

Product SKU designations determine the hierarchy in your catalog.

Product designation types

There are three product designation types:

  1. Primary items are returned in recommendation or search results. Primaries can be individual (SKU-level) items and groups of similar items (SKU groups).

  2. Variant items are versions of a SKU-group primary product. Variants can only be individual (SKU-level) items. For example, if the primary product is "V-neck shirt", variants could be "Brown V-neck shirt, size XL" and "White V-neck shirt, size S". Primaries and variants are sometimes described as parent and child items.

  3. Collection items are bundles of primary products or variant products, such as a jewelry set with a necklace, earrings, and a ring. Collections are hierarchical structures, similar to primaries and variants, that group related primary products. Customers can't buy collections directly. Collections are not widely used and are only available in search.

Product examples

As an example, according to these product designation types, grocery items are better cataloged as primary products, each consisting of a single SKU, such as "bananas, fresh".

On the other hand, tee-shirts would be better structured hierarchically, as primaries with their corresponding set of variants. Each variant represents an individual SKU (for each size) and each primary item represents a group of SKUs, where each SKU is a different size for one overarching tee-shirt style. This organization by SKU structure allows the search results and recommendation panels to show a range of tee-shirt styles. It allows the shopper to drill down on a particular primary (style) to select the variant (size) to purchase.
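
The two structures can be sketched as plain Python dictionaries. The product IDs and titles are made up, and the field names type and primaryProductId are assumed from the Product reference and should be checked there; the helper variants_by_primary is hypothetical.

```python
# A tee-shirt style (primary) with one variant per size, plus a
# single-SKU grocery item that is a primary with no variants.
products = [
    {"id": "tee-style-1", "type": "PRIMARY", "title": "V-neck tee shirt"},
    {
        "id": "tee-style-1-s",
        "type": "VARIANT",
        "primaryProductId": "tee-style-1",
        "title": "V-neck tee shirt, size S",
    },
    {
        "id": "tee-style-1-xl",
        "type": "VARIANT",
        "primaryProductId": "tee-style-1",
        "title": "V-neck tee shirt, size XL",
    },
    {"id": "bananas-fresh", "type": "PRIMARY", "title": "Bananas, fresh"},
]


def variants_by_primary(items: list) -> dict:
    """Map each primary ID to the IDs of its variants (empty list if none)."""
    grouped = {p["id"]: [] for p in items if p["type"] == "PRIMARY"}
    for p in items:
        if p["type"] == "VARIANT":
            grouped[p["primaryProductId"]].append(p["id"])
    return grouped


hierarchy = variants_by_primary(products)
print(hierarchy)
```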

Collections group related products that a customer might buy. To represent them accurately in the reranking model, Vertex AI Search for commerce includes logic that credits them with purchases. For example, a shopper clicks on products in a bedsheets set, then adds to cart or purchases a primary product in that collection. The collection is credited with that purchase, so the model accurately represents the popularity and value of collections.

There is also a variant-only catalog type, which is now deprecated. This catalog type can only be used with recommendations. For the variant-only catalog, the ingestionProductType is set to variant during import. A primary is inferred for each variant, based on a primary product ID specified for each variant.

Set up your product catalog

When planning your product catalog, decide whether it contains only primaries, primaries with variants, or a mixture of the two arrangements. Think of it in terms of your products' SKU structure: your products can be primary items, which may or may not have variants.

Based on how your product SKUs are designated, consider your options for setting up your product catalog:

  • You want your SKU to be shown as an individual search result or recommendation: SKU=primary
  • Your SKU should be part of a group of similar SKUs: SKU=variant, group of SKUs=primary
  • A mixture of both combinations: SKU=primary, SKU=variant, group of SKUs=primary

    If your product detail page shows an option, size, or color selector, these options are typically uploaded as variants into your product catalog. Consider whether you want different versions of the same product, with different attributes such as size and color, to appear as a single search result or as separate ones. For example, for a book, decide whether a hardcover SKU and a softcover SKU of the same book should appear as separate search results (SKU = primary), or as one (SKU = variant, group of SKUs = primary).

When setting up your product catalog, keep in mind that recommendation and search results only return primary items.

Minimal primary products

If you determine that your catalog should have both primaries and variants, that is, SKU groups and SKUs, but you only have SKUs now, you need to create primaries for the SKU groups. These primaries are sometimes called "virtual primaries" or "fake primaries".

These primaries only need to contain minimal information: id, title, and categories.

If type is not specified, the product type defaults to primary. If you are importing, you don't need to specify name. For more information, see the preceding section, Required product information.
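
One way to derive minimal primaries from an existing list of SKUs is sketched below. It assumes each SKU record carries a hypothetical style_id and style_title that identify its SKU group; those keys, the helper build_virtual_primaries, and the sample data are illustrative, and the field names type and primaryProductId should be checked against the Product reference. The group ID must not collide with any SKU ID, because product IDs are unique across the catalog.

```python
def build_virtual_primaries(skus: list, group_key: str = "style_id") -> tuple:
    """Return (primaries, variants) built from SKU-level records."""
    primaries = {}
    variants = []
    for sku in skus:
        group_id = sku[group_key]
        if group_id not in primaries:
            # Minimal primary: only id, title, and categories.
            primaries[group_id] = {
                "id": group_id,
                "type": "PRIMARY",
                "title": sku["style_title"],
                "categories": list(sku["categories"]),
            }
        # Copy the SKU, drop the grouping helpers, and link it to its primary.
        variant = {
            k: v for k, v in sku.items() if k not in (group_key, "style_title")
        }
        variant["type"] = "VARIANT"
        variant["primaryProductId"] = group_id
        variants.append(variant)
    return list(primaries.values()), variants


skus = [
    {"id": "tee-1-s", "title": "V-neck tee shirt, size S",
     "categories": ["Apparel > Shirts"],
     "style_id": "tee-1", "style_title": "V-neck tee shirt"},
    {"id": "tee-1-xl", "title": "V-neck tee shirt, size XL",
     "categories": ["Apparel > Shirts"],
     "style_id": "tee-1", "style_title": "V-neck tee shirt"},
]

primaries, variants = build_virtual_primaries(skus)
print(primaries)
```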

Type is immutable

You cannot change the type of a product, for example, from variant to primary or from primary to variant.

If you do need to change the type of a product, delete the product and re-create a product with a different type. Before you can delete a primary product, the associated variants must be deleted.
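
The delete-and-re-create sequence can be sketched with an in-memory stand-in. The class InMemoryCatalog and the helper recreate_with_type are hypothetical and only model the ordering rule described above (variants are deleted before their primary); they are not the real API.

```python
class InMemoryCatalog:
    """Toy catalog that enforces 'delete variants before their primary'."""

    def __init__(self):
        self.products = {}

    def create(self, product: dict) -> None:
        if product["id"] in self.products:
            raise ValueError(f"{product['id']} already exists")
        self.products[product["id"]] = dict(product)

    def variant_ids(self, primary_id: str) -> list:
        return [pid for pid, p in self.products.items()
                if p.get("primaryProductId") == primary_id]

    def delete(self, product_id: str) -> None:
        if self.variant_ids(product_id):
            raise ValueError("delete the associated variants first")
        del self.products[product_id]


def recreate_with_type(catalog, product_id, new_type, primary_id=None):
    """Change a product's type by deleting it and creating it again."""
    old = catalog.products[product_id]
    for variant_id in catalog.variant_ids(product_id):
        catalog.delete(variant_id)  # variants go first
    catalog.delete(product_id)
    new = {k: v for k, v in old.items()
           if k not in ("type", "primaryProductId")}
    new["type"] = new_type
    if new_type == "VARIANT":
        new["primaryProductId"] = primary_id
    catalog.create(new)


catalog = InMemoryCatalog()
catalog.create({"id": "tee-1", "type": "PRIMARY", "title": "V-neck tee shirt"})
catalog.create({"id": "tee-1-s", "type": "VARIANT",
                "primaryProductId": "tee-1",
                "title": "V-neck tee shirt, size S"})

try:
    catalog.delete("tee-1")  # blocked: a variant still refers to it
    blocked = False
except ValueError:
    blocked = True

recreate_with_type(catalog, "tee-1-s", "PRIMARY")  # variant becomes a primary
```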

Catalog import

If you have your catalog in Merchant Center, then we recommend that you import your catalog by linking your Merchant Center account.

If your catalog is not in Merchant Center but is in Cloud Storage or BigQuery or some other storage, then do a bulk data import.

If you plan to import catalog data from Merchant Center in the future, review your data as described for Merchant Center imports to ensure you are making correct choices about your catalog. This is important because changing the configuration of an existing catalog requires deleting the catalog and uploading it again (see Change product-level configuration).

For detailed information about how to upload a catalog, see Import catalog information.

Product inventory

Product inventory encompasses:

  • Price, both the current and original prices

  • Availability, such as in stock, out-of-stock, back ordered, and pre-ordered

  • Quantity available

  • Fulfillment information such as pickup-in-store, ship-to-store, and next-day-delivery

There are two levels of inventory: product-level and local.

Product-level inventory

For retailers who only sell online, inventory is specified at the product level. Price, availability, and other inventory data are set for each product in the catalog.

For more information about product-level inventory, including how to maintain inventory data, see Update inventory for Vertex AI Search for commerce.

Local inventory

Retailers who have brick-and-mortar stores and an online store need to keep inventory information on a per-store basis. They use local inventory to do this.

There are two product fields that can be used to store local inventory. Both fields are lists of locations (place IDs) with associated inventory information:

You can use either or both fields for your store-level information.

For more information about local inventories, see Update local inventory for Vertex AI Search for commerce.

Catalog data quality metrics

The Data quality page in the Search for commerce console helps you assess whether you need to update catalog data to improve the quality of search results and unlock search performance tiers.

The following table describes the quality metrics that Vertex AI Search for commerce uses to help you evaluate your product data. For details about how to view data quality metrics and search performance tiers in the Search for commerce console, see Unlock search performance tiers.

| Catalog quality metric | Quality rule | Notes |
| --- | --- | --- |
| URI is present and accessible | Product has a valid Product.uri. The URI needs to be accessible and match your domain. | Search uses web signals crawled using this URI to improve search quality. |
| Meets time conformance | Product.availableTime is before current time, and Product.expireTime is after current time. | Only products that meet the time conformance are available for search. |
| Searchable attribute is present | Product has at least one attribute set to searchable. | Custom attributes that are marked searchable can be searched by text queries. |
| Description is present | Product has non-empty Product.description. | A comprehensive description helps to improve search quality. |
| Title consists of at least two words | Product.title consists of at least two words. | A comprehensive title helps to improve search quality. |
| Has variant with image | The variant product has at least one Product.image. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality. |
| Has variant with price info | The variant product has Product.priceInfo set. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality. |
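
Some of these rules can be approximated locally before import. The sketch below is a hypothetical helper, not the console's implementation: it only checks that a URI is present (not that it is accessible or matches your domain), it requires both timestamps to be set, and it skips the attribute and variant metrics. How the service treats unset timestamps is not covered here.

```python
from datetime import datetime, timezone


def quality_checks(product: dict, now: datetime) -> dict:
    """Evaluate a subset of the quality rules for one product dictionary."""
    available = product.get("availableTime")
    expire = product.get("expireTime")
    title = product.get("title") or ""
    return {
        # Presence only; accessibility and domain match are not checked.
        "uri_present": bool(product.get("uri")),
        # Simplification: both timestamps must be set in this sketch.
        "meets_time_conformance": (
            available is not None and expire is not None
            and available < now < expire
        ),
        "description_present": bool(product.get("description")),
        "title_has_two_words": len(title.split()) >= 2,
    }


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
product = {
    "uri": "https://example.com/products/sku-123",
    "availableTime": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "expireTime": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "description": "Lightweight cotton V-neck shirt.",
    "title": "V-neck shirt",
}

report = quality_checks(product, now)
print(report)
```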

Product schema for Vertex AI Search for commerce

When importing a catalog from BigQuery, use the following Vertex AI Search for commerce product schema to create a BigQuery table with the correct format and load it with your catalog data. Then, import the catalog.