Catalogs and catalog information

This page provides best practices for creating your catalog information and populating your catalog data.

Overview

The catalog data you import into Retail has a direct effect on the quality of the resulting model, and therefore on the quality of the results the Retail API provides. In general, the more accurate and specific the catalog information you provide, the higher the quality of your model.

Your catalog should be kept up to date. You can upload catalog changes as often as needed; ideally, every day for catalogs with a high rate of change. You can upload (patch) existing product items; only the changed fields will be updated. There is no charge for uploading catalog information. For more information, see Keeping your catalog up to date.
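As a rough illustration of preparing a patch, the sketch below compares the last uploaded copy of a product with its current version and keeps only the fields that differ. The record contents and the helper name changed_fields are hypothetical, and how the resulting field list is passed to the API (for example, as an update mask) is an assumption to check against the patch method reference.

```python
def changed_fields(stored: dict, updated: dict) -> dict:
    """Return the top-level fields in `updated` that are new or differ from `stored`."""
    return {
        key: value
        for key, value in updated.items()
        if key not in stored or stored[key] != value
    }


# Hypothetical product record as last uploaded, and its current version.
stored = {"id": "sku-123", "title": "V-neck shirt", "description": "Cotton shirt."}
updated = {"id": "sku-123", "title": "V-neck shirt", "description": "Soft cotton V-neck shirt."}

patch_fields = changed_fields(stored, updated)
# Names of the fields that would be listed in the update request.
update_mask = sorted(patch_fields)
print(update_mask)
```

The comparison is shallow: a nested field such as a price structure is treated as changed if any part of it differs.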

Catalog branches

If you use Retail Search, you can use catalog branches to test newly uploaded data offline before making it live on your site.

With the Retail API, you can use up to three branches, identified as 0, 1, and 2. Your live site points to default_branch for its catalog data. Specify which branch is currently your live default_branch (it is set to branch 0 by default) using either setDefaultBranch or the Data tab in Google Cloud console. Your site then uses the catalog data provided by the branch that default_branch points to.

As an example, say default_branch is currently set to branch ID 0, so your site is using the catalog data that you've uploaded to that branch. You can upload new catalog data to branch 1 and preview it. After you've confirmed that the catalog has been uploaded correctly, you can switch to branch 1 as the live default_branch.
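The example above can be sketched as plain data, without calling the API. The project ID my-project is hypothetical, and both the resource name pattern and the request body fields (branchId, note) reflect our understanding of the Retail REST API; treat them as assumptions and confirm them against the setDefaultBranch reference before use.

```python
VALID_BRANCH_IDS = ("0", "1", "2")


def branch_name(project_id: str, branch_id: str, catalog_id: str = "default_catalog") -> str:
    """Build the resource name of a numbered branch or of the default_branch alias."""
    if branch_id not in VALID_BRANCH_IDS + ("default_branch",):
        raise ValueError(f"unknown branch ID: {branch_id!r}")
    return (
        f"projects/{project_id}/locations/global/"
        f"catalogs/{catalog_id}/branches/{branch_id}"
    )


def set_default_branch_body(branch_id: str, note: str = "") -> dict:
    """Assumed JSON body for a setDefaultBranch request (field names not verified here)."""
    if branch_id not in VALID_BRANCH_IDS:
        raise ValueError(f"default_branch must point to one of {VALID_BRANCH_IDS}")
    return {"branchId": branch_id, "note": note}


# Upload and preview new catalog data on branch 1 while the site reads default_branch.
staging = branch_name("my-project", "1")
live = branch_name("my-project", "default_branch")

# After confirming the upload, prepare the switch to branch 1.
switch = set_default_branch_body("1", note="Promote previewed catalog")
```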

The catalog cache can take up to 30 minutes to update after branch switching.

If you use Recommendations AI, we recommend using only the default branch due to the update delay during branch switching. If the data difference between branches is large, update delay can negatively impact prediction results.

Products

The catalog is a collection of product objects.

Required product information

The following fields are required; you must provide values for them when you create product items in your catalog. They should also correspond with the values used in your internal product database, and should accurately reflect the product represented, because they are included in training your models.

In some cases, other fields are also required. Refer to the complete list of all product fields on the Product reference page.

All product information you provide can be used to improve the quality of recommendations and search results. Be sure to provide as many fields as possible.

Field | Notes
name | The full, unique resource name of the product. Required for all Product methods except for import. During import, the name is automatically generated and does not need to be manually provided.
id | The product ID used by your product database. The ID field must be unique across your entire catalog. The same value is used when you record a user event, and is also returned by the predict and search methods.
title | Product title from your product database. A UTF-8 encoded string. Limited to 1250 characters.
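A pre-upload check for these required fields might look like the following sketch. It operates on plain dictionaries rather than on API objects, the helper names are hypothetical, and it covers only the rules listed above (presence of id and title, the 1250-character title limit, unique IDs, and name outside of import).

```python
from collections import Counter

MAX_TITLE_CHARS = 1250


def validate_product(product: dict, importing: bool = True) -> list:
    """Return a list of problems found in one product record (empty if none)."""
    problems = []
    if not product.get("id"):
        problems.append("id is missing")
    title = product.get("title") or ""
    if not title:
        problems.append("title is missing")
    elif len(title) > MAX_TITLE_CHARS:
        problems.append(f"title exceeds {MAX_TITLE_CHARS} characters")
    # name is generated automatically during import, so only check it otherwise.
    if not importing and not product.get("name"):
        problems.append("name is required outside of import")
    return problems


def duplicate_ids(products: list) -> set:
    """Return the IDs that appear more than once in the catalog."""
    counts = Counter(p.get("id") for p in products if p.get("id"))
    return {product_id for product_id, n in counts.items() if n > 1}


catalog = [
    {"id": "sku-1", "title": "Brown V-neck shirt, size XL"},
    {"id": "sku-1", "title": "White V-neck shirt, size S"},
    {"id": "sku-2", "title": ""},
]
for item in catalog:
    print(item["id"], validate_product(item))
print(duplicate_ids(catalog))
```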

Product attributes

Providing values for predefined system attributes in Product such as brand, color, and size is highly encouraged. You can also include custom attributes that you define with Product.attributes.

If you are using Retail Search, attributes are included with a product in the search response if you mark them as retrievable in Product.retrievableFields. They can then be used for other Retail Search features such as filtering and facets.

Product levels

Product levels determine the hierarchy in your catalog. Typically, you need to choose between a single-level catalog and a two-level catalog.

For example, you can have a single-level catalog where each product item has a SKU. Alternatively, you might choose a two-level catalog that contains both SKU groups and individual SKUs.

Product-level types

There are three product-level types:

  • Primary items are what the Retail API returns in prediction or search results. Primaries can be individual (SKU-level) items and groups of similar items (SKU groups).

  • Variant items are versions of a SKU-group primary product. Variants can only be individual (SKU-level) items. For example, if the primary product is "V-neck shirt", variants could be "Brown V-neck shirt, size XL" and "White V-neck shirt, size S". Primaries and variants are sometimes described as parent and child items.

  • Collection items are collections of products. Collections are bundles of primary products or variant products. For example, a collection might be a jewelry set with a necklace, earrings, and ring. Collections are only available in Retail Search and are not widely used.

About catalog hierarchy

When planning your catalog hierarchy, you need to decide if your catalog should contain only primaries or primaries and variants. The key point to remember is that prediction and search results only return primary items.

For example, a primary-only catalog might work well for selling books, where a recommendations panel returns a selection of books, each with its own SKU. However, a primary-only catalog for tee-shirts would likely show the same tee-shirt in each available size in the recommendations panel.

The tee-shirt catalog would work better with both primaries and variants: the SKUs are variants (one variant for each size), and each primary represents the group of SKUs for all sizes of one tee-shirt style. This two-level catalog allows the recommendations panel to show a range of similar tee-shirt styles. The shopper can drill down on a particular primary (style) to select the variant (size) to purchase.

There is also a variant-only catalog type, which is now deprecated. This catalog type can only be used with Recommendations AI. For the variant-only catalog, the ingestionProductType is set to variant during import. A primary is inferred for each variant, based on a primary product ID specified for each variant.

Minimal primary products

If you determine that your catalog should have both primaries and variants, that is, SKU groups and SKUs, but you only have SKUs now, you need to create primaries for the SKU groups. These primaries are sometimes called "virtual primaries" or "fake primaries".

These primaries only need to contain minimal information: id, title, and categories.

If type is not specified, the product type defaults to primary. If you are importing, you do not need to specify name. For more information, see the preceding section, Required product information.
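The following sketch shows one way to derive minimal primaries from a SKU-only data set. The input fields style_id and style_title are hypothetical names for whatever grouping key your product database provides, and the output field names (type, primaryProductId) follow our reading of the Product reference; verify them before importing.

```python
def build_two_level_catalog(skus: list) -> list:
    """Create one minimal primary per SKU group, then attach each SKU as a variant."""
    primaries = {}
    variants = []
    for sku in skus:
        group_id = sku["style_id"]
        if group_id not in primaries:
            # Minimal primary: id, title, and categories only.
            primaries[group_id] = {
                "id": group_id,
                "type": "PRIMARY",
                "title": sku["style_title"],
                "categories": sku["categories"],
            }
        variants.append({
            "id": sku["id"],
            "type": "VARIANT",
            "primaryProductId": group_id,
            "title": sku["title"],
        })
    # Primaries are listed first so that each variant's primary precedes it.
    return list(primaries.values()) + variants


skus = [
    {"id": "sku-xl", "title": "Brown V-neck shirt, size XL", "style_id": "vneck",
     "style_title": "V-neck shirt", "categories": ["Apparel > Shirts"]},
    {"id": "sku-s", "title": "White V-neck shirt, size S", "style_id": "vneck",
     "style_title": "V-neck shirt", "categories": ["Apparel > Shirts"]},
]
products = build_two_level_catalog(skus)
```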

Type is immutable

You cannot change the type of a product, for example, from variant to primary or from primary to variant.

If you do need to change the type of a product, delete the product and re-create a product with a different type. Before you can delete a primary product, the associated variants must be deleted.
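The ordering constraint can be made explicit with a small planning helper. This is a sketch over plain dictionaries, not an API client: the function name retype_steps is hypothetical, and the type and primaryProductId field names are assumed to match the Product reference.

```python
def retype_steps(products: list, product_id: str, new_type: str) -> list:
    """Return ordered (action, product_id, type) steps to re-create a product with a new type."""
    by_id = {p["id"]: p for p in products}
    target = by_id[product_id]
    old_type = target.get("type", "PRIMARY")  # type defaults to primary when unset
    steps = []
    if old_type == "PRIMARY":
        # Variants must be deleted before their primary can be deleted.
        for p in products:
            if p.get("type") == "VARIANT" and p.get("primaryProductId") == product_id:
                steps.append(("delete", p["id"], "VARIANT"))
    steps.append(("delete", product_id, old_type))
    steps.append(("create", product_id, new_type))
    return steps


catalog = [
    {"id": "vneck", "type": "PRIMARY"},
    {"id": "sku-xl", "type": "VARIANT", "primaryProductId": "vneck"},
    {"id": "sku-s", "type": "VARIANT", "primaryProductId": "vneck"},
]
for step in retype_steps(catalog, "vneck", "VARIANT"):
    print(step)
```

Note that the plan only removes the variants; re-creating them under a different primary, if needed, is left to the caller.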

Catalog import

If you currently have your catalog in Merchant Center, then we recommend that you import your catalog by linking your Merchant Center account.

If your catalog is not in Merchant Center but is in Cloud Storage or BigQuery or some other storage, then do a bulk data import.

If you plan to import catalog data from Merchant Center in the future, review your data as described for Merchant Center imports to ensure you are making correct choices about your catalog. This is important because changing the configuration of an existing catalog requires deleting the catalog and uploading it again (see Change product-level configuration).

For detailed information about how to upload a catalog, see Import catalog information.

Catalog data quality metrics

To help you monitor the search quality of your catalog data, Retail evaluates your product data against a set of quality rules. You can view the percentage of products that meet each quality rule on the Retail Data page.

The following table describes the quality metrics that Retail uses to help you evaluate your product data:

Catalog quality metric | Quality rule | Notes
URI is present and accessible | Product has a valid Product.uri. | The URI needs to be accessible and match your domain. Cloud Retail Search uses web signals crawled via this URI to improve search quality.
Meets time conformance | Product.availableTime is before current time, and Product.expireTime is after current time. | Only products that meet the time conformance are available for search.
Searchable attribute is present | Product has at least one attribute set to searchable. | Custom attributes that are marked searchable can be searched by text queries.
Description is present | Product has non-empty Product.description. | A comprehensive description helps to improve search quality.
Title consists of at least two words | Product.title consists of at least two words. | A comprehensive title helps to improve search quality.
Has variant with image | The variant product has at least one Product.image. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality.
Has variant with price info | The variant product has Product.priceInfo set. | You may ignore this metric if all your products are at primary level. This metric is for informational purposes and does not impact search quality.
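Some of these rules can be approximated locally before upload. The sketch below checks three of them (description, title, and time conformance) on a plain dictionary; it is not how Retail computes the metrics. It assumes that availableTime and expireTime are timezone-aware datetime values and, as a further assumption, treats an unset time as passing.

```python
from datetime import datetime, timezone


def quality_report(product: dict, now: datetime = None) -> dict:
    """Approximate three catalog quality rules for one product record."""
    now = now or datetime.now(timezone.utc)
    available = product.get("availableTime")
    expire = product.get("expireTime")
    return {
        "description_present": bool((product.get("description") or "").strip()),
        "title_two_words": len((product.get("title") or "").split()) >= 2,
        # Assumption: a missing availableTime or expireTime does not fail the rule.
        "time_conformance": (available is None or available < now)
        and (expire is None or expire > now),
    }


product = {
    "id": "sku-xl",
    "title": "Shirt",
    "description": "Soft cotton V-neck shirt.",
    "availableTime": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "expireTime": datetime(2024, 6, 1, tzinfo=timezone.utc),
}
report = quality_report(product, now=datetime(2024, 3, 1, tzinfo=timezone.utc))
failed = [rule for rule, passed in report.items() if not passed]
print(failed)
```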

Retail schema

When importing a catalog from BigQuery, use the Retail schema below to create a BigQuery table with the correct format and load it with your catalog data. Then, import the catalog.