Import catalog information

This page describes how to import your catalog information and keep it up to date.

The import procedures on this page apply to both recommendations and search. After you import data, both services are able to use that data, so you don't need to import the same data twice if you use both services.

Import catalog data from BigQuery

This tutorial shows you how to use a BigQuery table to import large amounts of catalog data with no limits.




Import catalog data from Cloud Storage

This tutorial shows you how to import a large number of items to a catalog.




Import catalog data inline

This tutorial shows you how to import products into a catalog inline.




Before you begin

Before you can import your catalog information, you must have completed the instructions in Before you begin, specifically setting up your project, creating a service account, and adding the service account to your local environment.

You must have the Retail Admin IAM role to perform the import.

Catalog import best practices

High-quality data is needed to generate high-quality results. If your data is missing fields or has placeholder values instead of actual values, the quality of your predictions and search results suffers.

When you import catalog data, ensure that you implement the following best practices:

  • Decide carefully which products or groups of products are primary and which are variants. Before you upload any data, see Product levels.

    Changing the product level configuration after you have imported any data requires significant effort.

    Primary items are returned as search results or recommendations. Variant items are not.

    For example, if the primary SKU group is "V-neck shirt", then the recommendation model returns one V-neck shirt item, and, perhaps, a crew-neck shirt and a scoop-neck shirt. However, if variants are not used and each SKU is a primary, then every color/size combination of V-neck shirt is returned as a distinct item on the recommendation panel: "Brown V-neck shirt, size XL", "Brown V-neck shirt, size L", through to "White V-neck shirt, size M", "White V-neck shirt, size S".

  • Observe the product item import limits.

    For bulk import from Cloud Storage, the size of each file must be 2 GB or smaller. You can include up to 100 files at a time in a single bulk import request.

    For inline import, import no more than 5,000 product items at a time.

  • Make sure that all required catalog information is included and correct.

    Do not use placeholder values.

  • Include as much optional catalog information as possible.

  • Make sure your events all use a single currency, especially if you plan to use Google Cloud console to get revenue metrics. The Vertex AI Search for retail API does not support using multiple currencies per catalog.

  • Keep your catalog up to date.

    Ideally, you should update your catalog daily. Scheduling periodic catalog imports prevents model quality from going down over time. You can schedule automatic, recurring imports when you import your catalog using the Search for Retail console. Alternatively, you can use Google Cloud Scheduler to automate imports.

  • Do not record user events for product items that have not been imported yet.

  • After importing catalog information, review the error reporting and logging information for your project.

    A few errors are expected, but if you have a large number of errors, you should review them and fix any process issues that led to the errors.
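The import limits listed above can be checked before a request is sent. The following is a minimal sketch, not part of any client library: the function names are hypothetical, 2 GB is interpreted as 2 GiB (an assumption), and the inline batch size is a parameter so that you can lower it if your requests need smaller batches.

```python
from typing import Iterator, List, Sequence, Tuple

MAX_FILES_PER_REQUEST = 100       # Cloud Storage bulk import: files per request
MAX_FILE_BYTES = 2 * 1024 ** 3    # 2 GB per file, read here as 2 GiB (assumption)
MAX_INLINE_ITEMS = 5000           # inline import: product items per request


def batch_products(products: Sequence[dict],
                   batch_size: int = MAX_INLINE_ITEMS) -> Iterator[List[dict]]:
    """Yield successive slices of products, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(products), batch_size):
        yield list(products[start:start + batch_size])


def check_bulk_files(files: Sequence[Tuple[str, int]]) -> List[str]:
    """Return a list of problems for (uri, size_in_bytes) pairs; empty if none."""
    problems = []
    if len(files) > MAX_FILES_PER_REQUEST:
        problems.append(
            f"{len(files)} files exceed the limit of {MAX_FILES_PER_REQUEST} per request")
    for uri, size in files:
        if size > MAX_FILE_BYTES:
            problems.append(f"{uri} is larger than 2 GB")
    return problems
```

Each batch yielded by batch_products would become the body of one inline import request, and check_bulk_files can run against a bucket listing before a Cloud Storage import is started.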

About importing catalog data

You can import your product data from Merchant Center, Cloud Storage, or BigQuery, or specify the data inline in the request. Each of these procedures is a one-time import, with the exception of linking Merchant Center. Schedule regular catalog imports (ideally, daily) to ensure that your catalog is current. See Keep your catalog up to date.

You can also import individual product items. For more information, see Upload a product.

Catalog import considerations

This section describes the methods that can be used for batch importing of your catalog data, when you might use each method, and some of their limitations.

Merchant Center Syncing

  • Description: Imports catalog data through Merchant Center by linking the account with Vertex AI Search for retail. After linking, updates to catalog data in Merchant Center are synced in real time to Vertex AI Search for retail.

  • When to use: If you have an existing integration with Merchant Center.

  • Limitations:

    Limited schema support. For example, product collections are not supported by Merchant Center. Merchant Center becomes the source of truth for data until it is unlinked, so any custom attributes needed must be added to Merchant Center data.

    Limited control. You cannot specify certain fields or sets of items to import from Merchant Center; all items and fields existing in Merchant Center are imported.

BigQuery

  • Description: Import data from a previously loaded BigQuery table that uses the Vertex AI Search for retail schema or the Merchant Center schema. Can be performed using the Google Cloud console or curl.

  • When to use:

    If you have product catalogs with many attributes. BigQuery import uses the Vertex AI Search for retail schema, which has more product attributes than other import options, including key/value custom attributes.

    If you have large volumes of data. BigQuery import does not have a data limit.

    If you already use BigQuery.

  • Limitations: Requires the extra step of creating a BigQuery table that maps to the Vertex AI Search for retail schema.

Cloud Storage

  • Description: Import data in a JSON format from files loaded in a Cloud Storage bucket. Each file must be 2 GB or smaller and up to 100 files at a time can be imported. The import can be done using the Google Cloud console or curl. Uses the Product JSON data format, which allows custom attributes.

  • When to use: If you need to load a large amount of data in a single step.

  • Limitations: Not ideal for catalogs with frequent inventory and pricing updates because changes are not reflected immediately.

Inline import

  • Description: Import using a call to the Product.import method. Uses the ProductInlineSource object, which has fewer product catalog attributes than the Vertex AI Search for retail schema, but supports custom attributes.

  • When to use: If you have flat, non-relational catalog data or a high frequency of quantity or price updates.

  • Limitations: No more than 100 catalog items can be imported at a time. However, many load steps can be performed; there is no item limit.

Purge catalog branches

If you are importing new catalog data to an existing branch, it is important that the catalog branch is empty. This ensures the integrity of data imported to the branch. When the branch is empty, you can import new catalog data and then link the branch to a merchant account.

If you are serving live predict or search traffic and plan to purge your default branch, consider first specifying another branch as the default before purging. Because the default branch will serve empty results after being purged, purging a live default branch can cause an outage.

To purge data from a catalog branch, complete the following steps:

  1. Go to the Data page in the Search for Retail console.

  2. Select a catalog branch from the Branch name field.

  3. From the three-dot menu beside the Branch name field, choose Purge branch.

    A message is displayed warning you that you are about to delete all data in the branch as well as any attributes created for the branch.

  4. Enter the branch and click Confirm to purge the catalog data from the branch.

    A long-running operation is started to purge data from the catalog branch. When the purge operation is complete, the status of the purge is displayed in the Product catalog list in the Activity status window.

Sync Merchant Center to Vertex AI Search for retail

For continuous synchronization between Merchant Center and Vertex AI Search for retail, you can link your Merchant Center account to Vertex AI Search for retail. After linking, the catalog information in your Merchant Center account is immediately imported to Vertex AI Search for retail.

While Vertex AI Search for retail is linked to the Merchant Center account, changes to your product data in the Merchant Center account are automatically updated within minutes in Vertex AI Search for retail. If you want to prevent Merchant Center changes from being synced to Vertex AI Search for retail, you can unlink your Merchant Center account.

Unlinking your Merchant Center account does not delete any products in Vertex AI Search for retail. To delete imported products, see Delete product information.

To sync your Merchant Center account, complete the following steps.

Console

  1. Go to the Data page in the Search for Retail console.
  2. Click Import to open the Import Data panel.
  3. Choose Product catalog.
  4. Select Merchant Center Sync as your data source.
  5. Select your Merchant Center account. Check User Access if you don't see your account.
  6. Optional: Select Merchant Center feeds filter to import only offers from selected feeds.

    If not specified, offers from all feeds are imported (including future feeds).
  7. Optional: To import only offers targeted to certain countries or languages, expand Show Advanced Options and select Merchant Center countries of sale and languages to filter for.
  8. Select the branch you will upload your catalog to.
  9. Click Import.

curl

  1. Check that the service account in your local environment has access to both the Merchant Center account and Vertex AI Search for retail. To check which accounts have access to your Merchant Center account, see User access for Merchant Center.

  2. Use the MerchantCenterAccountLink.create method to establish the link.

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
     --data '{
      "merchantCenterAccountId": MERCHANT_CENTER_ID,
      "branchId": "BRANCH_ID",
      "feedFilters": [
        {"primaryFeedId": PRIMARY_FEED_ID_1},
        {"primaryFeedId": PRIMARY_FEED_ID_2}
      ],
      "languageCode": "LANGUAGE_CODE",
      "feedLabel": "FEED_LABEL"
     }' \
     "https://retail.googleapis.com/v2alpha/projects/PROJECT_ID/locations/global/catalogs/default_catalog/merchantCenterAccountLinks"
    
    • MERCHANT_CENTER_ID: The ID of the Merchant Center account.
    • BRANCH_ID: The ID of the branch to establish the link with. Accepts values '0', '1', or '2'.
    • LANGUAGE_CODE: (OPTIONAL) The two-letter language code of the products you want to import, as shown in the Language column for the product in Merchant Center. If not set, all languages are imported.
    • FEED_LABEL: (OPTIONAL) The feed label of the products you want to import. You can see the feed label in the product's Feed Label column in Merchant Center. If not set, all feed labels are imported.
    • FEED_FILTERS: (OPTIONAL) List of primary feeds from which products will be imported. Not selecting feeds means that all Merchant Center account feeds are shared. The IDs can be found in Content API datafeeds resource or by visiting Merchant Center, selecting a feed and getting the feed ID from the dataSourceId parameter in the site URL. For example, mc/products/sources/detail?a=MERCHANT_CENTER_ID&dataSourceId=PRIMARY_FEED_ID.
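Because three of the fields in the link request are optional, it can help to generate the request body rather than edit the JSON by hand. The following is a rough sketch under that assumption; build_link_body is a hypothetical helper, and it only assembles the fields shown in the curl example above.

```python
import json
from typing import Optional, Sequence


def build_link_body(merchant_center_id: int,
                    branch_id: str,
                    feed_ids: Sequence[int] = (),
                    language_code: Optional[str] = None,
                    feed_label: Optional[str] = None) -> dict:
    """Assemble a MerchantCenterAccountLink body, leaving out unset optional fields."""
    if branch_id not in {"0", "1", "2"}:
        raise ValueError("branch_id must be '0', '1', or '2'")
    body = {"merchantCenterAccountId": merchant_center_id, "branchId": branch_id}
    if feed_ids:
        # One filter entry per primary feed; omit the key to share all feeds.
        body["feedFilters"] = [{"primaryFeedId": feed_id} for feed_id in feed_ids]
    if language_code:
        body["languageCode"] = language_code
    if feed_label:
        body["feedLabel"] = feed_label
    return body


# Example values are placeholders, not real account or feed IDs.
print(json.dumps(build_link_body(12345, "0", feed_ids=[111, 222]), indent=2))
```

The printed JSON can be saved to a file and passed to curl with -d @FILE in place of the inline --data argument.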

To view your linked Merchant Center, go to the Search for Retail console Data page and click the Merchant Center button on the top right of the page. This opens the Linked Merchant Center Accounts panel. You can also add additional Merchant Center accounts from this panel.

See View aggregated information about your catalog for instructions on how to view the products that have been imported.

List your Merchant Center account links

Console

  1. Go to the Data page in the Search for Retail console.

  2. Click the Merchant Center button on the top right of the page to open a list of your linked Merchant Center accounts.

curl

Use the MerchantCenterAccountLink.list method to list the link resources.

curl -X GET \
 -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
 -H "Content-Type: application/json; charset=utf-8" \
 "https://retail.googleapis.com/v2alpha/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/merchantCenterAccountLinks"

Unlink your Merchant Center account

Unlinking your Merchant Center account stops that account from syncing catalog data to Vertex AI Search for retail. This procedure does not delete any products in Vertex AI Search for retail that have already been uploaded.

Console

  1. Go to the Data page in the Search for Retail console.

  2. Click the Merchant Center button on the top right of the page to open a list of your linked Merchant Center accounts.

  3. Click Unlink next to the Merchant Center account you're unlinking, and confirm your choice in the dialog that appears.

curl

Use the MerchantCenterAccountLink.delete method to remove the MerchantCenterAccountLink resource.

curl -X DELETE \
 -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
 -H "Content-Type: application/json; charset=utf-8" \
 "https://retail.googleapis.com/v2alpha/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/merchantCenterAccountLinks/BRANCH_ID_MERCHANT_CENTER_ID"
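The list and delete requests differ only in the final path segment. The sketch below builds both URLs. It assumes that the BRANCH_ID_MERCHANT_CENTER_ID placeholder means the branch ID and the Merchant Center account ID joined by an underscore; confirm the actual link ID from the list response before deleting. The function names are hypothetical.

```python
BASE_URL = "https://retail.googleapis.com/v2alpha"


def links_collection_url(project: str) -> str:
    """URL used to create or list Merchant Center account links."""
    return (f"{BASE_URL}/projects/{project}/locations/global/"
            "catalogs/default_catalog/merchantCenterAccountLinks")


def link_url(project: str, branch_id: str, merchant_center_id: int) -> str:
    """URL of a single link, used with DELETE to unlink an account."""
    # Assumption: the link ID is "<branch ID>_<Merchant Center ID>".
    link_id = f"{branch_id}_{merchant_center_id}"
    return f"{links_collection_url(project)}/{link_id}"
```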

Limitations on linking to Merchant Center

  • A Merchant Center account can be linked to any number of catalog branches, but a single catalog branch can only be linked to one Merchant Center account.

  • A Merchant Center account cannot be a multi-client account (MCA). However, you can link individual sub-accounts.

  • The first import after linking your Merchant Center account may take hours to finish. The amount of time depends on the number of offers in the Merchant Center account.

  • Any product modifications using API methods are disabled for branches linked to a Merchant Center account. Any changes to the product catalog data in those branches have to be made using Merchant Center. Those changes are then automatically synced to Vertex AI Search for retail.

  • The collection product type isn't supported for branches that use Merchant Center linking.

  • Your Merchant Center account can only be linked to empty catalog branches to ensure data correctness. In order to delete products from a catalog branch, see Delete product information.

Import catalog data from Merchant Center

Merchant Center is a tool you can use to make your store and product data available for Shopping ads and other Google services.

You can bulk import catalog data from Merchant Center as a one-time procedure from BigQuery using the Merchant Center schema (recommendations only).

Bulk import from Merchant Center

You can import catalog data from Merchant Center using the Search for Retail console or the products.import method. Bulk importing is a one-time procedure, and is only supported for recommendations.

To import your catalog from Merchant Center, complete the following steps:

  1. Using the instructions in Merchant Center transfers, set up a transfer from Merchant Center into BigQuery.

    You'll use the Google Merchant Center products table schema. Configure your transfer to repeat daily, but set your dataset expiration time to 2 days.

  2. If your BigQuery dataset is in another project, configure the required permissions so that Vertex AI Search for retail can access the BigQuery dataset. Learn more.

  3. Import your catalog data from BigQuery into Vertex AI Search for retail.

    Console

    1. Go to the Data page in the Search for Retail console.

    2. Click Import to open the Import panel.

    3. Choose Product catalog.

    4. Select BigQuery as your data source.

    5. Select the branch you will upload your catalog to.

    6. Select Merchant Center as the data schema.

    7. Enter the BigQuery table where your data is located.

    8. Optional: Enter the location of a Cloud Storage bucket in your project as a temporary location for your data.

      If not specified, a default location is used. If specified, the BigQuery and Cloud Storage bucket have to be in the same region.

    9. Choose whether to schedule a recurring upload of your catalog data.

    10. If this is the first time you are importing your catalog, or you are re-importing the catalog after purging it, select the product levels. Learn more about product levels.

      Changing the product level configuration after you have imported any data requires significant effort.

    11. Click Import.

    curl

    1. If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. This operation requires the Retail Admin role. Learn more about product levels.

      curl -X PATCH \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      --data '{
      "productLevelConfig": {
        "ingestionProductType": "PRODUCT_TYPE",
        "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
      }
      }' \
      "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
    2. Import your catalog using the Products.import method.

      • DATASET_ID: The ID of the BigQuery dataset.
      • TABLE_ID: The ID of the BigQuery table holding your data.
      • STAGING_DIRECTORY: Optional. A Cloud Storage directory that is used as an interim location for your data before it is imported into BigQuery. Leave this field empty to automatically create a temporary directory (recommended).
      • ERROR_DIRECTORY: Optional. A Cloud Storage directory for error information about the import. Leave this field empty to automatically create a temporary directory (recommended).
      • dataSchema: For the dataSchema property, use value product_merchant_center. See the Merchant Center products table schema.

      We recommend that you don't specify staging or error directories. That way, a Cloud Storage bucket with new staging and error directories is created automatically. These directories are created in the same region as the BigQuery dataset and are unique to each import, which prevents multiple import jobs from staging data to the same directory and potentially re-importing the same data. After three days, the bucket and directories are automatically deleted to reduce storage costs.

      An automatically created bucket name includes the project ID, bucket region, and data schema name, separated by underscores (for example, 4321_us_catalog_retail). The automatically created directories are called staging or errors, appended by a number (for example, staging2345 or errors5678).

      If you specify directories, the Cloud Storage bucket must be in the same region as the BigQuery dataset, or the import will fail. Provide the staging and error directories in the format gs://<bucket>/<folder>/; they should be different.

      curl -X POST \
           -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
           -H "Content-Type: application/json; charset=utf-8" \
           --data '{
             "inputConfig":{
                "bigQuerySource": {
                  "datasetId":"DATASET_ID",
                  "tableId":"TABLE_ID",
                  "dataSchema":"product_merchant_center"
                }
              }
          }' \
         "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
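The curl example above omits the optional staging and error directories. If you do need to set them, a sketch like the following can assemble the request body and enforce the rules described in this step (gs://<bucket>/<folder>/ format, two different directories). The helper names are hypothetical, and gcsStagingDir is assumed to be the BigQuerySource field for the staging directory; check the API reference before relying on it.

```python
from typing import Optional


def _check_gcs_dir(value: str, label: str) -> None:
    """Require the gs://<bucket>/<folder>/ form described above."""
    if not value.startswith("gs://") or not value.endswith("/"):
        raise ValueError(f"{label} must look like gs://<bucket>/<folder>/")
    parts = value[len("gs://"):].strip("/").split("/")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"{label} needs both a bucket and a folder")


def build_bigquery_import_body(dataset_id: str,
                               table_id: str,
                               data_schema: str = "product",
                               project_id: Optional[str] = None,
                               staging_directory: Optional[str] = None,
                               error_directory: Optional[str] = None) -> dict:
    """Assemble a products:import body that reads from a BigQuery table."""
    if data_schema not in {"product", "product_merchant_center"}:
        raise ValueError("data_schema must be 'product' or 'product_merchant_center'")
    if staging_directory and staging_directory == error_directory:
        raise ValueError("staging and error directories must be different")
    source = {"datasetId": dataset_id, "tableId": table_id, "dataSchema": data_schema}
    if project_id:
        source["projectId"] = project_id
    if staging_directory:
        _check_gcs_dir(staging_directory, "staging_directory")
        source["gcsStagingDir"] = staging_directory  # assumed field name
    body = {"inputConfig": {"bigQuerySource": source}}
    if error_directory:
        _check_gcs_dir(error_directory, "error_directory")
        body["errorsConfig"] = {"gcsPrefix": error_directory}
    return body
```

Calling build_bigquery_import_body("DATASET_ID", "TABLE_ID", "product_merchant_center") with no directories yields the same body as the curl example, which matches the recommendation to let the directories be created automatically.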
    

Import catalog data from BigQuery

To import catalog data in the correct format from BigQuery, use the Vertex AI Search for retail schema to create a BigQuery table with the correct format and load the empty table with your catalog data. Then, upload your data to Vertex AI Search for retail.

For more help with BigQuery tables, see Introduction to tables. For help with BigQuery queries, see Overview of querying BigQuery data.




To import your catalog:

  1. If your BigQuery dataset is in another project, configure the required permissions so that Vertex AI Search for retail can access the BigQuery dataset. Learn more.

  2. Import your catalog data to Vertex AI Search for retail.

    Console

    1. Go to the Data page in the Search for Retail console.
    2. Click Import to open the Import Data panel.
    3. Choose Product catalog.
    4. Select BigQuery as your data source.
    5. Select the branch you will upload your catalog to.
    6. Choose Retail Product Catalogs Schema. This is the Product schema for Vertex AI Search for retail.
    7. Enter the BigQuery table where your data is located.
    8. Optional: Under Show advanced options, enter the location of a Cloud Storage bucket in your project as a temporary location for your data.

      If not specified, a default location is used. If specified, the BigQuery and Cloud Storage bucket have to be in the same region.
    9. If you do not have search enabled and you are using the Merchant Center schema, select the product level.

      You must select the product level if this is the first time you are importing your catalog or you are re-importing the catalog after purging it. Learn more about product levels. Changing product levels after you have imported any data requires a significant effort.

      Important: You can't turn on search for projects with a product catalog that has been ingested as variants.
    10. Click Import.

    curl

    1. If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. This operation requires the Retail Admin role.

      curl -X PATCH \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
       --data '{
         "productLevelConfig": {
           "ingestionProductType": "PRODUCT_TYPE",
           "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
         }
       }' \
      "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
      
    2. Create a data file for the input parameters for the import.

      Use the BigQuerySource object to point to your BigQuery dataset.

      • DATASET_ID: The ID of the BigQuery dataset.
      • TABLE_ID: The ID of the BigQuery table holding your data.
      • PROJECT_ID: The project ID that the BigQuery source is in. If not specified, the project ID is inherited from the parent request.
      • STAGING_DIRECTORY: Optional. A Cloud Storage directory that is used as an interim location for your data before it is imported into BigQuery. Leave this field empty to automatically create a temporary directory (recommended).
      • ERROR_DIRECTORY: Optional. A Cloud Storage directory for error information about the import. Leave this field empty to automatically create a temporary directory (recommended).
      • dataSchema: For the dataSchema property, use value product (default). You'll use the Vertex AI Search for retail schema.

      We recommend that you don't specify staging or error directories. That way, a Cloud Storage bucket with new staging and error directories is created automatically. These directories are created in the same region as the BigQuery dataset and are unique to each import, which prevents multiple import jobs from staging data to the same directory and potentially re-importing the same data. After three days, the bucket and directories are automatically deleted to reduce storage costs.

      An automatically created bucket name includes the project ID, bucket region, and data schema name, separated by underscores (for example, 4321_us_catalog_retail). The automatically created directories are called staging or errors, appended by a number (for example, staging2345 or errors5678).

      If you specify directories, the Cloud Storage bucket must be in the same region as the BigQuery dataset, or the import will fail. Provide the staging and error directories in the format gs://<bucket>/<folder>/; they should be different.

      {
        "inputConfig": {
          "bigQuerySource": {
            "projectId": "PROJECT_ID",
            "datasetId": "DATASET_ID",
            "tableId": "TABLE_ID",
            "dataSchema": "product"
          }
        }
      }
      
    3. Import your catalog information by making a POST request to the Products:import REST method, providing the name of the data file (here, shown as input.json).

      curl -X POST \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" -d @./input.json \
      "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
      

      You can check the status programmatically using the API. You should receive a response object that looks something like this:

      {
      "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
      "done": false
      }
      

      The name field is the ID of the operation object. To check the status of the operation, request that object, replacing the operation name in the URL with the value returned by the import method, and repeat the request until the done field is true:

      curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456"
      

      When the operation completes, the returned object has a done value of true, and includes a Status object similar to the following example:

      { "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.retail.v2.ImportMetadata",
        "createTime": "2020-01-01T03:33:33.000001Z",
        "updateTime": "2020-01-01T03:34:33.000001Z",
        "successCount": "2",
        "failureCount": "1"
      },
      "done": true,
      "response": {
        "@type": "type.googleapis.com/google.cloud.retail.v2.ImportProductsResponse"
      },
      "errorsConfig": {
        "gcsPrefix": "gs://error-bucket/error-directory"
      }
      }
      

      You can inspect the files in the error directory in Cloud Storage to see if errors occurred during the import.
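The status check described above amounts to a polling loop. The following sketch shows one way to write it. The fetch argument is a hypothetical callable that performs the GET request shown earlier and returns the parsed JSON; it is passed in so that the loop can be exercised without network access. The interval and attempt count are arbitrary defaults, not documented values.

```python
import time
from typing import Callable


def wait_for_operation(fetch: Callable[[str], dict],
                       operation_name: str,
                       interval_seconds: float = 10.0,
                       max_attempts: int = 60) -> dict:
    """Poll an operation until its done field is true, then return it."""
    for attempt in range(max_attempts):
        operation = fetch(operation_name)
        if operation.get("done"):
            return operation
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"{operation_name} did not finish after {max_attempts} checks")


def summarize(operation: dict) -> str:
    """Describe a finished import using the fields shown in the example response."""
    metadata = operation.get("metadata", {})
    succeeded = int(metadata.get("successCount", 0))  # counts arrive as strings
    failed = int(metadata.get("failureCount", 0))
    error_dir = operation.get("errorsConfig", {}).get("gcsPrefix", "(none reported)")
    return f"{succeeded} imported, {failed} failed; error files: {error_dir}"
```

A nonzero failure count in the summary is the cue to look at the files under the reported error directory.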

Set up access to your BigQuery dataset

To set up access when your BigQuery dataset is in a different project than your Vertex AI Search for retail service, complete the following steps.

  1. Open the IAM page in the Google Cloud console.

    Open the IAM page

  2. Select your Vertex AI Search for retail project.

  3. Find the service account with the name Retail Service Account.

    If you have not previously initiated an import operation, this service account might not be listed. If you do not see this service account, return to the import task and initiate the import. When it fails due to permission errors, return here and complete this task.

  4. Copy the identifier for the service account, which looks like an email address (for example, service-525@gcp-sa-retail.iam.gserviceaccount.com).

  5. Switch to your BigQuery project (on the same IAM & Admin page) and click Grant Access.

  6. For New principals, enter the identifier for the Vertex AI Search for retail service account and select the BigQuery > BigQuery User role.

  7. Click Add another role and select BigQuery > BigQuery Data Editor.

    If you do not want to provide the Data Editor role to the entire project, you can add this role directly to the dataset. Learn more.

  8. Click Save.
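For reference, the grants made in the steps above can be written out as data. This sketch rests on two assumptions: that the number in the example service account address is the project number of the Vertex AI Search for retail project, and that the BigQuery User and BigQuery Data Editor roles correspond to the role IDs roles/bigquery.user and roles/bigquery.dataEditor. The function names are hypothetical.

```python
from typing import List


def retail_service_account(project_number: int) -> str:
    """Service account identifier in the form shown in step 4 (assumed pattern)."""
    return f"service-{project_number}@gcp-sa-retail.iam.gserviceaccount.com"


def bigquery_bindings(project_number: int) -> List[dict]:
    """IAM-style bindings for the two roles granted in steps 6 and 7."""
    member = f"serviceAccount:{retail_service_account(project_number)}"
    return [
        {"role": "roles/bigquery.user", "members": [member]},
        {"role": "roles/bigquery.dataEditor", "members": [member]},
    ]
```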

Import catalog data from Cloud Storage

To import catalog data in JSON format, create one or more JSON files that contain the catalog data you want to import, and upload them to Cloud Storage. From there, you can import the data to Vertex AI Search for retail.

For an example of the JSON product item format, see Product item JSON data format.

For help with uploading files to Cloud Storage, see Upload objects.

  1. Make sure the Vertex AI Search for retail service account has permission to read and write to the bucket.

    The Vertex AI Search for retail service account is listed on the IAM page in the Google Cloud console with the name Retail Service Account. Use the service account's identifier, which looks like an email address (for example, service-525@gcp-sa-retail.iam.gserviceaccount.com), when adding the account to your bucket permissions.

  2. Import your catalog data.

    Console

    1. Go to the Data page in the Search for Retail console.
    2. Click Import to open the Import Data panel.
    3. Choose Product catalog as your data source.
    4. Select the branch you will upload your catalog to.
    5. Choose Retail Product Catalogs Schema as the schema.
    6. Enter the Cloud Storage location of your data.
    7. If you do not have search enabled, select the product levels.

      You must select the product levels if this is the first time you are importing your catalog or you are re-importing the catalog after purging it. Learn more about product levels. Changing product levels after you have imported any data requires a significant effort.

      Important: You can't turn on search for projects with a product catalog that has been ingested as variants.
    8. Click Import.

    curl

    1. If this is the first time you are uploading your catalog, or you are re-importing the catalog after purging it, set your product levels by using the Catalog.patch method. Learn more about product levels.

      curl -X PATCH \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
       --data '{
         "productLevelConfig": {
           "ingestionProductType": "PRODUCT_TYPE",
           "merchantCenterProductIdField": "PRODUCT_ID_FIELD"
         }
       }' \
      "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog"
      
    2. Create a data file for the input parameters for the import. Use the GcsSource object to point to your Cloud Storage bucket.

      You can provide multiple files, or just one; this example uses two files.

      • INPUT_FILE: A file or files in Cloud Storage containing your catalog data.
      • ERROR_DIRECTORY: A Cloud Storage directory for error information about the import.

      The input file fields must be in the format gs://<bucket>/<path-to-file>/. The error directory must be in the format gs://<bucket>/<folder>/. If the error directory does not exist, it gets created. The bucket must already exist.

      {
      "inputConfig":{
       "gcsSource": {
         "inputUris": ["INPUT_FILE_1", "INPUT_FILE_2"]
        }
      },
      "errorsConfig":{"gcsPrefix":"ERROR_DIRECTORY"}
      }
      
    3. Import your catalog information by making a POST request to the Products:import REST method, providing the name of the data file (here, shown as input.json).

      curl -X POST \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" -d @./input.json \
      "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
      

      The easiest way to check the status of your import operation is to use the Google Cloud console. For more information, see "See status for a specific integration operation".

      You can also check the status programmatically using the API. The import request returns a response object similar to the following:

      {
      "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
      "done": false
      }
      

      The name field is the ID of the operation object. To check the status, request this object, replacing OPERATION_NAME with the final segment of the name value returned by the import method (import-products-123456 in the example above), and repeat the request until the done field is true:

      curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/OPERATION_NAME"
      

      When the operation completes, the returned object has a done value of true, and includes a Status object similar to the following example:

      {
      "name": "projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations/import-products-123456",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.retail.v2.ImportMetadata",
        "createTime": "2020-01-01T03:33:33.000001Z",
        "updateTime": "2020-01-01T03:34:33.000001Z",
        "successCount": "2",
        "failureCount": "1"
      },
      "done": true,
      "response": {
      "@type": "type.googleapis.com/google.cloud.retail.v2.ImportProductsResponse"
      },
      "errorsConfig": {
        "gcsPrefix": "gs://error-bucket/error-directory"
      }
      }
      

      You can inspect the files in the error directory in Cloud Storage to see what kind of errors occurred during the import.
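
The polling described in step 3 can be sketched in Java as follows. This is a minimal illustration, not part of the official samples: OperationPoller, isDone, and pollUntilDone are hypothetical names, the fetch supplier stands in for whatever HTTP call retrieves the operation, and the regular expression is a simplification that assumes the text "done": true does not appear inside another string value. A production client would use a JSON parser or the client library instead.

```java
import java.util.function.Supplier;
import java.util.regex.Pattern;

/** Hypothetical helper: polls an operation until its JSON reports "done": true. */
final class OperationPoller {
  // Matches "done": true with optional whitespace around the colon.
  private static final Pattern DONE = Pattern.compile("\"done\"\\s*:\\s*true");

  static boolean isDone(String operationJson) {
    return DONE.matcher(operationJson).find();
  }

  /**
   * Calls fetch (for example, a GET on the operation URL) up to maxAttempts times,
   * sleeping delayMillis between calls, and returns the last response body.
   */
  static String pollUntilDone(Supplier<String> fetch, int maxAttempts, long delayMillis) {
    String body = fetch.get();
    for (int attempt = 1; attempt < maxAttempts && !isDone(body); attempt++) {
      try {
        Thread.sleep(delayMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
        return body;
      }
      body = fetch.get();
    }
    return body;
  }

  public static void main(String[] args) {
    // Stand-in for the HTTP call: reports done on the third poll.
    int[] calls = {0};
    Supplier<String> fake = () -> "{\"done\": " + (++calls[0] >= 3) + "}";
    System.out.println(pollUntilDone(fake, 10, 10L));
  }
}
```

Passing the fetch step in as a Supplier keeps the loop independent of any particular HTTP client, which also makes it easy to exercise without network access.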

Import catalog data inline

curl

You import your catalog information inline by making a POST request to the Products:import REST method, using the productInlineSource object to specify your catalog data.

Provide an entire product on a single line. Each product should be on its own line.

For an example of the JSON product item format, see Product item JSON data format.

  1. Create the JSON file for your product and call it ./data.json:

    {
      "inputConfig": {
        "productInlineSource": {
          "products": [
            { PRODUCT_1 },
            { PRODUCT_2 }
          ]
        }
      }
    }
    
  2. Call the POST method:

    curl -X POST \
     -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     --data @./data.json \
    "https://retail.googleapis.com/v2/projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products:import"
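
If you generate ./data.json from code rather than by hand, the request body from step 1 can be assembled along these lines. This is a sketch under stated assumptions: InlineImportBody and build are hypothetical names, and each input string is assumed to already be a valid JSON product object.

```java
import java.util.List;

/** Hypothetical helper that assembles the inline import request body. */
final class InlineImportBody {
  /** Each element of productJsons is assumed to already be a valid JSON object string. */
  static String build(List<String> productJsons) {
    // Products in the array are separated by commas, as JSON requires.
    return "{\"inputConfig\":{\"productInlineSource\":{\"products\":["
        + String.join(",", productJsons)
        + "]}}}";
  }

  public static void main(String[] args) {
    System.out.println(build(List.of(
        "{\"id\":\"1234\",\"categories\":\"Apparel & Accessories > Shoes\",\"title\":\"ABC sneakers\"}",
        "{\"id\":\"5839\",\"categories\":\"casual attire > t-shirts\",\"title\":\"Crew t-shirt\"}")));
  }
}
```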
    

Java

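import com.google.cloud.retail.v2.ImportProductsRequest;
import com.google.cloud.retail.v2.ImportProductsRequest.ReconciliationMode;
import com.google.cloud.retail.v2.Product;
import com.google.cloud.retail.v2.ProductInlineSource;
import com.google.cloud.retail.v2.ProductInputConfig;
import com.google.cloud.retail.v2.ProductServiceClient;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

// IMPORT_PARENT, REQUEST_ID, and getProductServiceClient() are assumed to be
// defined elsewhere in the sample.
public static String importProductsFromInlineSource(
    List<Product> productsToImport)
    throws IOException, InterruptedException, ExecutionException {
  ProductServiceClient productClient = getProductServiceClient();

  // Wrap the products in an inline source.
  ProductInlineSource inlineSource = ProductInlineSource.newBuilder()
      .addAllProducts(productsToImport)
      .build();

  ProductInputConfig inputConfig = ProductInputConfig.newBuilder()
      .setProductInlineSource(inlineSource)
      .build();

  // INCREMENTAL updates the catalog with the supplied products.
  ImportProductsRequest importRequest = ImportProductsRequest.newBuilder()
      .setParent(IMPORT_PARENT)
      .setRequestId(REQUEST_ID)
      .setReconciliationMode(ReconciliationMode.INCREMENTAL)
      .setInputConfig(inputConfig)
      .build();

  // Start the long-running import and capture the operation name.
  String operationName = productClient
      .importProductsAsync(importRequest).getName();

  productClient.shutdownNow();
  productClient.awaitTermination(2, TimeUnit.SECONDS);

  return operationName;
}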

Product item JSON data format

The Product entries in your JSON file should look like the following examples.

Provide an entire product on a single line. Each product should be on its own line.

Minimum required fields:

  {
    "id": "1234",
    "categories": "Apparel & Accessories > Shoes",
    "title": "ABC sneakers"
  }
  {
    "id": "5839",
    "categories": "casual attire > t-shirts",
    "title": "Crew t-shirt"
  }

Complete object:

  {
    "name": "projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products/1234",
    "id": "1234",
    "categories": "Apparel & Accessories > Shoes",
    "title": "ABC sneakers",
    "description": "Sneakers for the rest of us",
    "attributes": { "vendor": {"text": ["vendor123", "vendor456"]} },
    "language_code": "en",
    "tags": [ "black-friday" ],
    "priceInfo": {
      "currencyCode": "USD", "price":100, "originalPrice":200, "cost": 50
    },
    "availableTime": "2020-01-01T03:33:33.000001Z",
    "availableQuantity": "1",
    "uri":"http://example.com",
    "images": [
      {"uri": "http://example.com/img1", "height": 320, "width": 320 }
    ]
  }
  {
    "name": "projects/PROJECT_NUMBER/locations/global/catalogs/default_catalog/branches/0/products/4567",
    "id": "4567",
    "categories": "casual attire > t-shirts",
    "title": "Crew t-shirt",
    "description": "A casual shirt for a casual day",
    "attributes": { "vendor": {"text": ["vendor789", "vendor321"]} },
    "language_code": "en",
    "tags": [ "black-friday" ],
    "priceInfo": {
      "currencyCode": "USD", "price":50, "originalPrice":60, "cost": 40
    },
    "availableTime": "2020-02-01T04:44:44.000001Z",
    "availableQuantity": "2",
    "uri":"http://example.com",
    "images": [
      {"uri": "http://example.com/img2", "height": 320, "width": 320 }
    ]
  }
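
The one-product-per-line requirement can be met by serializing each product without line breaks. The following sketch emits only the minimum required fields. ProductLines, quote, toLine, and toFileContents are hypothetical names, and the escaping handles only backslashes and double quotes, so it assumes the field values contain no control characters. A JSON library is the safer choice for real data.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that writes one minimal product per line. */
final class ProductLines {
  /** Escapes backslashes and double quotes; assumes no control characters in the input. */
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /** Serializes the minimum required fields as a single-line JSON object. */
  static String toLine(String id, String categories, String title) {
    return "{\"id\":" + quote(id)
        + ",\"categories\":" + quote(categories)
        + ",\"title\":" + quote(title) + "}";
  }

  /** Each row is {id, categories, title}; the result has one product per line. */
  static String toFileContents(List<String[]> rows) {
    return rows.stream()
        .map(r -> toLine(r[0], r[1], r[2]))
        .collect(Collectors.joining("\n"));
  }

  public static void main(String[] args) {
    System.out.println(toFileContents(List.of(
        new String[] {"1234", "Apparel & Accessories > Shoes", "ABC sneakers"},
        new String[] {"5839", "casual attire > t-shirts", "Crew t-shirt"})));
  }
}
```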

Historical catalog data

Vertex AI Search for retail supports importing and managing historical catalog data. Historical catalog data can be helpful when you use historical user events for model training. Past product information can be used to enrich historical user event data and improve model accuracy.

Historical products are stored as expired products. They are not returned in search responses, but are visible to the Update, List, and Delete API calls.

Import historical catalog data

When a product's expireTime field is set to a past timestamp, the product is considered a historical product. Set the product availability to OUT_OF_STOCK to avoid impacting recommendations.
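
As a quick local check before importing, you can confirm that a timestamp is in the past. This is a minimal sketch: HistoricalCheck and isHistorical are hypothetical names, and the timestamp is assumed to be an RFC 3339 UTC string such as 2021-10-02T15:01:23Z.

```java
import java.time.Instant;

/** Hypothetical pre-import check for historical products. */
final class HistoricalCheck {
  /** Returns true if expireTime (an RFC 3339 UTC timestamp) is strictly before now. */
  static boolean isHistorical(String expireTime, Instant now) {
    return Instant.parse(expireTime).isBefore(now);
  }

  public static void main(String[] args) {
    System.out.println(isHistorical("2021-10-02T15:01:23Z", Instant.now()));
  }
}
```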

We recommend using the following methods for importing historical catalog data:

Call the Product.Create method

Use the Product.Create method to create a Product entry with the expireTime field set to a past timestamp.

Inline import expired products

The steps are identical to inline import, except that the products should have the expireTime fields set to a past timestamp.

Provide an entire product on a single line. Each product should be on its own line.

An example of the ./data.json used in the inline import request, where each expireTime value is a past timestamp:

{
  "inputConfig": {
    "productInlineSource": {
      "products": [
        {
          "id": "historical_product_001",
          "categories": "Apparel & Accessories > Shoes",
          "title": "ABC sneakers",
          "expireTime": "2021-10-02T15:01:23Z"
        },
        {
          "id": "historical_product_002",
          "categories": "casual attire > t-shirts",
          "title": "Crew t-shirt",
          "expireTime": "2021-10-02T15:01:24Z"
        }
      ]
    }
  }
}

Import expired products from BigQuery or Cloud Storage

Use the same procedures documented for importing catalog data from BigQuery or importing catalog data from Cloud Storage. However, make sure to set the expireTime field to a past timestamp.

Keep your catalog up to date

For best results, your catalog must contain current information. We recommend that you import your catalog daily so that it stays current. You can use Google Cloud Scheduler to schedule imports, or choose an automatic scheduling option when you import data using the Google Cloud console.

You can update only new or changed product items, or you can import the entire catalog. If you import products that are already in your catalog, they are not added again. Any item that has changed is updated.
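
If you choose to send only new or changed items, one way to select them is to compare the current catalog against the last imported snapshot. The following sketch assumes you keep both as maps from product ID to the product's JSON string. CatalogDelta and changedOrNew are hypothetical names, and the string comparison treats any difference in serialization as a change. Because re-importing unchanged products does not add them again, this step is an optimization rather than a requirement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that selects the products worth re-importing. */
final class CatalogDelta {
  /** Returns entries of current that are new or differ from previous, in current's order. */
  static Map<String, String> changedOrNew(
      Map<String, String> previous, Map<String, String> current) {
    Map<String, String> delta = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : current.entrySet()) {
      // previous.get returns null for new IDs, so equals is false and the entry is kept.
      if (!e.getValue().equals(previous.get(e.getKey()))) {
        delta.put(e.getKey(), e.getValue());
      }
    }
    return delta;
  }
}
```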

To update a single item, see Update product information.

Batch update

You can use the import method to batch update your catalog. The procedure is the same as for the initial import; follow the steps in Import catalog data.

Monitor import health

To monitor catalog ingestion and health:

  1. View aggregated information about your catalog and preview uploaded products on the Catalog tab of the Search for Retail Data page.

    Go to the Data page

  2. Assess whether you need to update catalog data to improve the quality of search results and unlock search performance tiers on the Data quality page.

    For more about how to check search data quality and view search performance tiers, see Unlock search performance tiers. For a summary of available catalog metrics on this page, see Catalog quality metrics.

    Go to the Data Quality page

  3. To create alerts that let you know if something goes wrong with your data uploads, follow the procedures in Set up Cloud Monitoring alerts.

    Keeping your catalog up to date is important for getting high-quality results. Use alerts to monitor the import error rates and take action if needed.

What's next