This is the documentation for Recommendations AI, Retail Search, and the new Retail console. To use Retail Search in the restricted GA phase, contact Cloud sales.

If you are only using Recommendations AI, remain on the Recommendations console and refer to the Recommendations AI documentation.

Importing auto-completion data

Auto-completion is a feature that predicts the rest of a word a user is typing, which can improve the user search experience. It can provide typeahead suggestions based on the dataset you provide or on the user events you have provided.

This page describes how to import your auto-completion dataset to Retail. These instructions apply only to uploading your own auto-completion data. If you plan to rely on your uploaded dataset on an ongoing basis, keep it up to date. To retrieve auto-completion suggestions, refer to CompletionService.CompleteQuery. Auto-completion data is used only for Retail Search; it is not used by Recommendations AI.

Before you begin

Before you can import your auto-completion information, you must have completed the instructions in Before you begin, specifically setting up your project, creating a service account, and adding the service account to your local environment.

You must have the Retail Editor IAM role to perform the import.

Auto-completion import best practices

When you import auto-completion data, ensure that you implement the following best practices:

  • Read the schema files listed in the following sections and the API documentation.

    Do not use dummy or placeholder values.

  • Include as many fields as possible.

  • Keep your auto-completion dataset up to date if you plan to use your own uploaded dataset.

Import auto-completion data

Importing auto-completion data from BigQuery

To import auto-completion data from BigQuery, use the Retail auto-completion schema to create a BigQuery table in the correct format and load the table with your auto-completion data. Then, upload your data to Retail.

For more help with BigQuery tables, see Introduction to tables. For help with BigQuery queries, see Overview of querying BigQuery data.

Populating data to BigQuery

Use the Retail auto-completion schema to upload your auto-completion data to BigQuery.
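
For example, assuming your suggestions schema is saved as suggestions_schema.json and your data as newline-delimited JSON in suggestions.json (the file, dataset, and table names here are placeholders), you could create and load the table with the bq command-line tool:

# Create a dataset and a table that uses the auto-completion schema
# (the dataset and table names here are illustrative).
bq mk retail_autocomplete
bq mk --table retail_autocomplete.suggestions suggestions_schema.json

# Load newline-delimited JSON suggestion data into the table.
bq load --source_format=NEWLINE_DELIMITED_JSON \
  retail_autocomplete.suggestions ./suggestions.json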

BigQuery can use the schema to validate whether JSON-formatted data has correct field names and types (such as STRING, INTEGER, and RECORD), but cannot perform validations such as determining:

  • Whether a string field maps to a recognizable enum value.
  • Whether a string field uses the correct format.
  • Whether an integer or float field has a value in a valid range.
  • Whether a missing field is required.

To ensure the quality of your data and of the end-user search experience, refer to the schema and the reference documentation for details about values and formats.
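
For illustration, both of the following records pass BigQuery's type checks, but the second could still be rejected by Retail; its empty suggestion and out-of-range score are invented examples of the constraint violations described above:

{"suggestion": "sneakers", "globalScore": "0.8"}
{"suggestion": "", "globalScore": "-2.5"}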

Importing auto-completion data to Retail

curl

  1. If your BigQuery dataset is in another project, configure the required permissions so that Retail can access the BigQuery dataset. For details, see Setting up access to your BigQuery dataset below.

  2. Create a data file containing the input parameters for the import.

    Use the BigQuerySource object to point to your BigQuery dataset.

    • dataset-id: The ID of the BigQuery dataset.
    • table-id: The ID of the BigQuery table holding your data.
    • data-schema: For the dataSchema property, use the value suggestions (default), allowlist, or denylist, matching the Retail auto-completion schema you used.
    {
      "inputConfig":{
        "bigQuerySource": {
          "datasetId":"dataset-id",
          "tableId":"table-id",
          "dataSchema":"data-schema"
        }
      }
    }
    
  3. Import your auto-completion information to Retail by making a POST request to the CompletionData:import REST method, providing the name of the data file (here, shown as input.json).

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" -d @./input.json
    "https://retail.googleapis.com/v2alpha/projects/[PROJECT_NUMBER]/locations/global/catalogs/default_catalog/completionData:import"
    

    The import method returns a long-running operation whose status you can check programmatically using the API. You should receive a response object that looks something like this:

    {
      "name": "projects/[PROJECT_ID]/locations/global/catalogs/default_catalog/operations/123456",
      "done": false
    }
    

    The name field is the ID of the operation object. To request the status of this operation, make a GET request using the name value returned by the import method, repeating until the done field returns true (a polling sketch follows these steps):

    curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "https://retail.googleapis.com/v2alpha/projects/[PROJECT_ID]/locations/global/catalogs/default_catalog/operations/123456"
    

    When the operation completes, the returned object has a done value of true, and includes a Status object similar to the following example:

    {
      "name": "projects/[PROJECT_ID]/locations/global/catalogs/default_catalog/operations/123456",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.retail.v2alpha.ImportMetadata",
        "createTime": "2020-01-01T03:33:33.000001Z",
        "updateTime": "2020-01-01T03:34:33.000001Z",
        "successCount": "2",
        "failureCount": "1"
      },
      "done": true
      "response": {
        "@type": "type.googleapis.com/google.cloud.retail.v2alpha.ImportCompletionDataResponse",
      }
    }
    
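Rather than re-running the status request by hand, you can poll the operation until it finishes. A minimal sketch, assuming the jq tool is available and using the placeholder operation name from the examples above:

OPERATION="projects/[PROJECT_ID]/locations/global/catalogs/default_catalog/operations/123456"

# Request the operation status every 60 seconds until done is true.
until curl -s \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  "https://retail.googleapis.com/v2alpha/${OPERATION}" \
  | jq -e '.done == true' > /dev/null; do
  sleep 60
done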

Setting up access to your BigQuery dataset

To set up access when your BigQuery dataset is in a different project than your Retail service, complete the following steps. A command-line alternative is sketched after the steps.

  1. Open the IAM page in the Cloud Console.

    Open the IAM page

  2. Select your Retail project.

  3. Find the service account with the name Retail Service Account.

    If you have not previously initiated an import operation with Retail, this service account might not be listed. If you do not see this service account, return to the import task and initiate the import. When it fails due to permission errors, return here and complete this task.

  4. Copy the identifier for the service account, which looks like an email address (for example, cloud-retail-customer-data-access@system.gserviceaccount.com).

  5. On IAM & Admin and click Add.

  6. Enter the identifier for the Retail service account and select the BigQuery > BigQuery Data Viewer role.

    If you do not want to provide the Data Viewer role to the entire project, you can add this role directly to the dataset. Learn more.

  7. Click Save.
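
If you prefer the command line, the same grant can be expressed with gcloud. A sketch, assuming project-level access on the project that owns the BigQuery dataset and using the example service account address from step 4:

# Grant BigQuery Data Viewer on the dataset's project to the
# Retail service account (the project ID is a placeholder).
gcloud projects add-iam-policy-binding [BIGQUERY_PROJECT_ID] \
  --member="serviceAccount:cloud-retail-customer-data-access@system.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"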

Auto-completion data format

Your JSON file should look like the following examples. The line breaks are for readability; you should provide an entire suggestion on a single line. Each suggestion should be on its own line.

Minimum required fields for a suggestion:

{
  "suggestion": "ABC",
  "globalScore": "0.5"
}

Or:

{
  "suggestion": "ABC",
  "frequency": "100"
}
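
Put together, a complete data file with three suggestions, one per line, could look like this (the phrases and values are illustrative):

{"suggestion": "sneakers", "globalScore": "0.5"}
{"suggestion": "running shoes", "frequency": "100"}
{"suggestion": "sandals", "frequency": "25"}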

Auto-completion data import duration

One import from BigQuery usually takes about 10 hours. The long-running operation pushes the dataset to production serving. Once the dataset is ready for serving, the done field in the operation object is marked true. Until the operation is done, the latest suggestions from the imported dataset are not guaranteed to be served.

Keeping your auto-completion dataset up to date

If you plan to use your own uploaded dataset, it is a best practice to keep it up to date on a regular basis.

Batch updating

You can use the import method to batch update your auto-completion dataset. Do this the same way you performed the initial import, following the steps in Importing auto-completion data. This replaces the entire imported dataset.

Monitoring import health

Keeping your dataset up to date is important for getting high-quality suggestion results. Monitor import error rates and take action if needed.
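
One simple check is to read the successCount and failureCount fields from the metadata of a finished import operation, as shown in the example status response earlier. A sketch, assuming jq and a placeholder operation name:

# Print the success and failure counts for a finished import.
curl -s \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  "https://retail.googleapis.com/v2alpha/projects/[PROJECT_ID]/locations/global/catalogs/default_catalog/operations/123456" \
  | jq '.metadata | {successCount, failureCount}'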

Retail auto-completion schema

When importing an auto-completion dataset from BigQuery, use the Retail schemas below to create BigQuery tables in the correct format and load them with your auto-completion data.

Schema for suggestions

This dataset is used for suggesting phrases with typeahead.
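
A minimal BigQuery schema sketch, assuming only the fields shown in the Auto-completion data format section (provide suggestion plus either globalScore or frequency); the authoritative schema is in the API documentation:

[
  {"name": "suggestion", "type": "STRING", "mode": "REQUIRED"},
  {"name": "globalScore", "type": "FLOAT", "mode": "NULLABLE"},
  {"name": "frequency", "type": "INTEGER", "mode": "NULLABLE"}
]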

Schema for denylist

This dataset is used as a denylist to block phrases from being suggested.

Schema for allowlist

This dataset is used to skip post-processing steps, such as spell correction, for all phrases in the allowlist.
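
For the denylist and allowlist, a minimal sketch assuming each entry is just a phrase; again, the authoritative schemas are in the API documentation:

[
  {"name": "suggestion", "type": "STRING", "mode": "REQUIRED"}
]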