Catalogs and catalog information

This page provides best practices for creating your catalog information and populating your catalog data.

Overview

The catalog data you import into Recommendations AI has a direct effect on the quality of the resulting model, and therefore on the quality of the predictions Recommendations AI provides. In general, the more accurate and specific the catalog information you provide, the higher the quality of your model.

Your catalog should be kept up to date. You can upload catalog changes as often as needed; ideally, every day for catalogs with a high rate of change. You can upload (patch) existing catalog items; only the changed fields will be updated. There is no charge for uploading catalog information. For more information, see Keeping your catalog up to date.
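Because a patch updates only the fields you send, one way to prepare a regular update is to compare the previous and current copy of an item and send only the differences. The following is a minimal sketch, assuming items are held as plain Python dictionaries. The changed_fields helper is hypothetical (it is not part of any Recommendations AI client library) and it compares top-level fields only.

```python
def changed_fields(old_item, new_item):
    """Return the top-level fields of new_item that differ from old_item.

    Hypothetical helper: fields that were removed in new_item are not reported.
    """
    return {
        key: value
        for key, value in new_item.items()
        if key not in old_item or old_item[key] != value
    }


old = {"id": "sku-1", "title": "V-neck shirt", "tags": ["summer"]}
new = {"id": "sku-1", "title": "V-neck shirt", "tags": ["summer", "sale"]}

patch_body = changed_fields(old, new)
print(patch_body)  # {'tags': ['summer', 'sale']}
```

An empty result means nothing changed and no patch needs to be sent for that item.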

Catalog items

The catalog is a collection of catalog item objects. Use the guidance below to decide which fields to provide when you upload your catalog.

To see a complete list of all catalog item fields, see the catalog item reference page.

Required catalog item information

The following catalog item fields are required; you must provide values for them when you create catalog items. They should also correspond with the values used in your internal product database, and should accurately reflect the product represented, because they are included in training your models.

Field Notes
id The product id used by your product database. The id field must be unique across your entire catalog. The same value is used when you record a user event, and is also returned by the predict method.
categoryHierarchies[]

A list of category hierarchies the catalog item belongs to. Categories should be taken from your company's product classification, and should accurately describe the product.

You must provide at least one category; multiple categories are also accepted.

Subcategories are grouped together in square brackets. For example, suppose a footwear product belongs to both "Shoes & Accessories" -> "Shoes" and "Athletic Clothing" -> "Shoes". Its categoryHierarchies field would be:

categoryHierarchies: [
       { categories: ["Shoes & Accessories", "Shoes"] },
       { categories: ["Athletic Clothing", "Shoes"] }
]
title Catalog item title from your product database. A UTF-8 encoded string. Limited to 1250 characters.
productMetadata.currencyCode Required only if price is set. The three-character alphabetic ISO-4217 code for the currency used in the price and cost fields of this catalog item.
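The required-field rules above can be checked before upload. The following is a minimal sketch, assuming catalog items are plain Python dictionaries keyed by the field names in the table. The missing_required_fields helper is hypothetical, and the exactPrice and priceRange key names are an assumption about how a price appears inside productMetadata.

```python
TITLE_LIMIT = 1250  # character limit stated in the table above


def missing_required_fields(item):
    """Return a list of problems found in a catalog item dictionary."""
    problems = []
    if not item.get("id"):
        problems.append("id is required")
    title = item.get("title") or ""
    if not title:
        problems.append("title is required")
    elif len(title) > TITLE_LIMIT:
        problems.append("title exceeds the character limit")
    hierarchies = item.get("categoryHierarchies") or []
    if not any(h.get("categories") for h in hierarchies):
        problems.append("at least one category is required")
    metadata = item.get("productMetadata") or {}
    # Assumption: a price appears as an exactPrice or priceRange key.
    has_price = "exactPrice" in metadata or "priceRange" in metadata
    if has_price and not metadata.get("currencyCode"):
        problems.append("currencyCode is required when a price is set")
    return problems


item = {
    "id": "shoe-123",
    "title": "Trail running shoe",
    "categoryHierarchies": [
        {"categories": ["Shoes & Accessories", "Shoes"]},
        {"categories": ["Athletic Clothing", "Shoes"]},
    ],
}
print(missing_required_fields(item))  # []
```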

Optional catalog information

These fields are not required. However, all catalog information you provide can be used to improve the quality of the model, prediction results, or prediction metrics. Be sure to provide as many fields as possible.

Field Notes Used for
description Catalog item description from your product database. A UTF-8 encoded string. Limited to 1250 characters. Highly encouraged.
tags

The tags field enables you to filter your prediction results. Tag values must consist only of alphanumeric characters, underscores, and dashes.

Updates to tag values can take up to 24 hours before they can be used to filter prediction results.

Used for: Recommendation filtering
productMetadata An object with information about your catalog item. While productMetadata fields are optional, canonicalProductUri, price, and currencyCode are highly encouraged.
productMetadata.canonicalProductUri URL of the detail page for this catalog item. Highly encouraged.
productMetadata.price If specified, the price field must contain one of two values: ExactPrice or PriceRange. This information is used to compute model metrics; if PriceRange is specified, the low end of the range is used.
productMetadata.costs

When you provide costs for a catalog item, it increases the accuracy of the prediction metrics, because any costs you provide are subtracted from the price. You can provide any number of costs; each cost is represented as a string and a value. The strings are for your reference only; they can be whatever value makes sense for your data. For example:

{"manufacturing cost": 45.5, "shipping cost": 12.4}
Used for: Recommendation metrics
productMetadata.stockState

You can use this field to provide stock information about this catalog item. The possible values are:

  • IN_STOCK
  • OUT_OF_STOCK
  • PREORDER
  • BACKORDER

The default value is IN_STOCK.

Used for: Model quality
productMetadata.availableQuantity

How many of this item are available. A value of zero does not change the stockState field to OUT_OF_STOCK, nor does it prevent Recommendations AI from recommending this item.

Used for: Not currently used, but could be used for model quality.
productMetadata.images Images for this catalog item. Used for: Model quality
itemAttributes A place to provide information about your catalog item that is not included in other fields, and that you believe could help improve model quality. Highly encouraged. Used for: Model quality
itemGroupId ID for master item for this variant. Must be enabled by Recommendations AI. Used for: SKU grouping
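Two of the optional-field rules above lend themselves to a quick local check: the character restriction on tags, and the way costs are subtracted from the price. The following is a minimal sketch. Both helpers are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the price is passed as a plain number (for a price range, pass the low end, as described for productMetadata.price).

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
TAG_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def invalid_tags(tags):
    """Return the tags that contain characters other than letters, digits, _ and -."""
    return [tag for tag in tags if not TAG_PATTERN.fullmatch(tag)]


def price_after_costs(price, costs):
    """Subtract every cost in the costs mapping from the price."""
    return price - sum(costs.values())


print(invalid_tags(["summer_sale", "new-arrival", "50% off"]))  # ['50% off']

costs = {"manufacturing cost": 45.5, "shipping cost": 12.4}
print(round(price_after_costs(100.0, costs), 2))  # 42.1
```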

Using catalog levels

When you import your catalog for the first time, you must specify whether you are providing master items or variants with your user events, and whether you want master items or variants returned with your predictions.

During catalog import, set your product levels by using the Catalog.patch method. For example:

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
     --data '{
       "catalogItemLevelConfig": {
         "eventItemLevel": "event-data-level",
         "predictItemLevel": "prediction-level"
       }
     }' \
    "https://recommendationengine.googleapis.com/v1beta1/projects/PROJECT_ID/locations/global/catalogs/default_catalog"

Master items represent a group of similar items (in other words, a SKU group). An example of a master item might be "V-neck shirt", with variants such as "Brown v-neck shirt, size XL" and "White v-neck shirt, size S". Masters and variants are sometimes described as "parent" and "child" items.

The possible combinations are:

  • Variant/Variant: Capture variants with user events and return variants with predictions.

  • Master/Master: Capture master items with user events and return master items with predictions.

  • Variant/Master: Capture variants with user events and return master items with predictions.
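Each combination corresponds to one pair of values in the catalogItemLevelConfig body sent with the Catalog.patch request. The following is a minimal sketch of building that body. The enum strings "VARIANT" and "MASTER" are an assumption (the curl example on this page shows placeholders only), and level_config_body is a hypothetical helper.

```python
import json

# Assumption: the level values accepted by the API are "VARIANT" and "MASTER".
# Each entry maps a combination name to (eventItemLevel, predictItemLevel).
LEVELS = {
    "Variant/Variant": ("VARIANT", "VARIANT"),
    "Master/Master": ("MASTER", "MASTER"),
    "Variant/Master": ("VARIANT", "MASTER"),
}


def level_config_body(combination):
    """Return the JSON request body for one of the supported combinations."""
    event_level, predict_level = LEVELS[combination]  # KeyError if unsupported
    return json.dumps(
        {
            "catalogItemLevelConfig": {
                "eventItemLevel": event_level,
                "predictItemLevel": predict_level,
            }
        }
    )


print(level_config_body("Variant/Master"))
```

Looking up an unsupported name such as "Master/Variant" raises KeyError, which mirrors the fact that only the three combinations listed above are possible.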

To determine the best catalog level choices for your implementation, you will need to review your catalog data and your website logic. What item IDs are available to you when you capture user event data? What item IDs would be most effective to be returned with predictions? How do those IDs compare and relate to each other?

Use the appropriate steps, depending on whether you are importing from Merchant Center or not:

Determining your catalog levels for importing from Merchant Center

  1. Review your catalog data and your website logic to answer the following questions:

    • Do I have both masters and variants in my catalog?
    • If so, what level of item will I have available when I capture user event data?
    • What type of items do I need returned with my predictions?
  2. If your catalog has only one level of items (none of your catalog items have a value for item_group_id), use Variant/Variant.

  3. If you can capture only master items at the time you record user events, then you must use the Master/Master combination.

  4. If you can capture only variant items when you record user events, then you can choose either Variant/Variant (if you want predictions to also return variants) or Variant/Master (if you want predictions to return masters).
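The decision steps above can be sketched as a small function. This is an illustration of the rules as written on this page, not an official tool; merchant_center_levels and its parameter names are hypothetical.

```python
def merchant_center_levels(has_item_groups, events_capture, want_master_predictions=False):
    """Pick a catalog level combination for a Merchant Center import.

    has_item_groups: True if any catalog item has a value for item_group_id.
    events_capture: "master" or "variant", the item level available when
        user events are recorded.
    want_master_predictions: only consulted when events capture variants.
    """
    if not has_item_groups:
        # Only one level of items exists in the catalog.
        return "Variant/Variant"
    if events_capture == "master":
        return "Master/Master"
    if events_capture == "variant":
        return "Variant/Master" if want_master_predictions else "Variant/Variant"
    raise ValueError("events_capture must be 'master' or 'variant'")


print(merchant_center_levels(True, "variant", want_master_predictions=True))  # Variant/Master
```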

Determining your catalog levels for importing using JSON

Either all of your catalog items must have a value for itemGroupId or none of them can. You cannot import catalog data with itemGroupId set for some items but not for others.

  1. If your catalog has only one level of items, use either Variant/Variant (recommended) or Master/Master.

    If you are not importing from Merchant Center, these two choices are equivalent, and this choice can be made based on whether your data includes values for itemGroupId or not (the Master/Master choice requires a value for itemGroupId for every item, and it has to be equal to itemId).

    If you plan to import catalog data from Merchant Center in the future, review your data as described for Merchant Center imports to ensure you are making the correct choice; catalog levels can't be changed after you have imported data.

  2. Otherwise, use Variant/Master, and make sure itemGroupId has a meaningful value for all of your catalog items.
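The rules for JSON imports can be checked against your data before you import it. The following is a minimal sketch, assuming each catalog item is a Python dictionary with an id and an optional itemGroupId. The json_import_levels helper is hypothetical, and it treats a catalog where every itemGroupId equals the item's own id as a single-level catalog, which is one reading of the Master/Master requirement above.

```python
def json_import_levels(items):
    """Suggest a catalog level combination for a JSON import.

    Raises ValueError if itemGroupId is set for some items but not others.
    """
    with_group = [item for item in items if item.get("itemGroupId")]
    if with_group and len(with_group) != len(items):
        raise ValueError("itemGroupId must be set for all items or for none")
    if not with_group:
        # Single-level catalog without group IDs.
        return "Variant/Variant"
    if all(item["itemGroupId"] == item["id"] for item in items):
        # Single-level catalog where every item is its own master.
        return "Master/Master"
    return "Variant/Master"


catalog = [
    {"id": "shirt-brown-xl", "itemGroupId": "shirt"},
    {"id": "shirt-white-s", "itemGroupId": "shirt"},
]
print(json_import_levels(catalog))  # Variant/Master
```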

Recommendations AI schema

When importing a catalog from BigQuery, use the Recommendations AI schema below to create a BigQuery table with the correct format and load it with your catalog data. Then, import the catalog.