Monitor and troubleshoot

This page describes how to get information about errors that have occurred in catalog and user event imports and in other API operations in Vertex AI Search for retail.

For help with setting up alerts, see Set up Cloud Monitoring alerts.

Introduction

Providing accurate catalog information and user events to the API is important for getting the highest quality results. Monitoring errors and understanding their source helps you find and fix issues in your site.

See aggregated integration errors

To see the aggregated errors generated by your data upload processes and prediction or search requests, use the Monitoring page.

This page displays all errors for the Vertex AI Search for retail API. You can view errors related to the product catalog, user events, recommendations predictions, search results, and models. The system also logs errors from imports, such as a malformed line in your Cloud Storage file. The system logs up to 100 errors per import file. You can define the time period for which errors are displayed and filter based on the error type.

You can click an individual error to see the logs for that error in Cloud Logging.

In Cloud Logging, expand an individual log entry to see its full contents. Error logs provide more detail about the request, including the request and response payloads and error details. This information can help you determine where the erroneous method call originates in your site.

For invalid JSON errors, you can get more information about the issue by expanding the status field.

See status for a specific integration operation

You can see the status of a specific integration operation in the Activity status window:

  1. Go to the Data page in the Search for Retail console.

    Go to the Data page

  2. Click Activity status.

    The Activity status window shows the status of long-running operations on your product catalog, user events, and controls.

    You can inspect errors for specific integration operations in this window.

  3. Click View logs in the Detail column of any operation with an error to inspect its log files in Cloud Logging.
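
You can also check these long-running operations programmatically. The following sketch assumes the default catalog and calls the operations list method for it; replace PROJECT_ID with your project ID. Each returned operation reports whether it is done and, for failed imports, includes error details similar to what the Activity status window shows.

    curl -X GET \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/operations"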

View logs in Cloud Logging

To open your log files directly in Cloud Logging, use the following procedure. You must have the Logs Viewer (roles/logging.viewer) role to view logs.

  1. Go to the Logs Explorer page in the Google Cloud console.

    Go to Logs Explorer

  2. Select your Vertex AI Search for retail project from the project selector.

  3. Click the Resource drop-down menu and select Consumed API > Cloud Retail.

For more information about the Logs Explorer, see View logs by using the Logs Explorer.

For example, this link opens logs for all Vertex AI Search for retail errors in the past hour:

Open Vertex AI Search for retail logs
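
A roughly equivalent query can be run from the command line with the gcloud CLI. This is a sketch that assumes the Consumed API resource labels (resource.type consumed_api with a service label of retail.googleapis.com); adjust the filter if the Resource menu shows different labels for your project.

    gcloud logging read \
        'resource.type="consumed_api" AND resource.labels.service="retail.googleapis.com" AND severity>=ERROR' \
        --project=PROJECT_ID \
        --freshness=1h \
        --limit=50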

To configure which API logs are written, see Configure Logging.

Configure Logging

You can configure which service logs are written to Logging. Logging configuration provides a way to set the severity levels at which to write logs, turn logging on or off, and override default logging settings for specific services.

Each API request an end user makes can generate one logging entry. An entry contains information such as the API method, when it was invoked, the response code, and the request and response bodies. A project's logging configuration specifies which types of logs generated by the API get written to Logging, with the option to granularly specify logging configurations for specific API services.

To update logging configurations, you need the Vertex AI Search for retail editor role.

You can use the console or the LoggingConfig API to configure Logging.

Console

To update logging configurations in the console, follow these steps:

  1. Go to the Monitoring page in the Search for Retail console.

    Go to the Monitoring page

  2. Click Logging configuration.

  3. To set a global logging configuration, select a logging level. If you select LOG_ALL, also enter a sampling rate for successful logs.

  4. To set a service-level configuration, select a service to update and select its logging level. This setting overrides the global logging configuration.

curl

To update logging configurations using the API, use the LoggingConfig resource. See the LoggingConfig API reference.

  1. To view the current logging configuration, use loggingConfig.Get.

    curl -X GET \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        "https://retail.googleapis.com/v2alpha/projects/PROJECT_ID/loggingConfig"
    
    • PROJECT_ID: The ID of your project.
  2. To update the logging configuration, use the loggingConfig.Patch method. For more information, see the LoggingConfig API reference.

    This example uses loggingConfig.Patch to set the global logging configuration to LOG_ERRORS_AND_ABOVE. It also sets two service-level configurations: CatalogService is set to LOG_WARNINGS_AND_ABOVE, and ControlService is set to LOG_ALL with an INFO log sampling rate of 0.1.

    curl -X PATCH \
        -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
        -H "Content-Type: application/json; charset=utf-8" \
        "https://retail.googleapis.com/v2alpha/projects/PROJECT_ID/loggingConfig" \
        --data '{
          "name": "projects/PROJECT_ID/loggingConfig",
          "default_log_generation_rule": {"logging_level": "LOG_ERRORS_AND_ABOVE"},
          "service_log_generation_rules": [
            {
              "service_name": "CatalogService",
              "log_generation_rule": {
                "logging_level": "LOG_WARNINGS_AND_ABOVE"
              }
            },
            {
              "service_name": "ControlService",
              "log_generation_rule": {
                "logging_level": "LOG_ALL",
                "info_log_sample_rate": "0.1"
              }
            }
          ]
        }'
    

Logging levels

Only logs of some severity levels are written to Logging. The logging level settings determine which logs generated by an API method get written to Logging.

When no service-level logging config is set for an API method, the global logging level setting is used.

The default logging level setting is LOG_WARNINGS_AND_ABOVE.

The logging_level field accepts the following values:

  • LOGGING_DISABLED: No logs are written.
  • LOG_ERRORS_AND_ABOVE: Logs errors only.
  • LOG_WARNINGS_AND_ABOVE: Logs errors and warnings only.
  • LOG_ALL: Logs everything, including successful logs such as INFO logs.

Sampling rate for successful logs

If you set the logging level to LOG_ALL but don't want to log every successful request, you can specify a sampling rate. For example, you might want to periodically monitor logs for successful status confirmation, or to see only a percentage of successful logs. Specifying a sampling rate can help you do this without writing a high volume of INFO log entries to Logging, which can incur higher Logging costs.

To specify a sampling rate, set info_log_sample_rate to a valid float value greater than 0 and less than or equal to 1. The sampling rate determines the likelihood of an INFO log being written to Logging. The default value is 1 (all INFO logs are written).
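
For example, the following sketch uses the same loggingConfig.Patch method shown earlier to set the global logging level to LOG_ALL with an illustrative sampling rate of 0.2, so roughly 20% of INFO logs are written:

    curl -X PATCH \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json; charset=utf-8" \
        "https://retail.googleapis.com/v2alpha/projects/PROJECT_ID/loggingConfig" \
        --data '{
          "name": "projects/PROJECT_ID/loggingConfig",
          "default_log_generation_rule": {
            "logging_level": "LOG_ALL",
            "info_log_sample_rate": "0.2"
          }
        }'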

Service-level configurations

You can set logging configurations for specific services. This overrides the global logging setting for that service. For example, you might have the global logging level set to LOG_WARNINGS_AND_ABOVE, but set the UserEventService service logging level to LOG_ALL so you can check for successful user event integrations.

Use the ServiceLoggingLevel object to set granular logging levels.

The service_name field accepts the following values:

  • CatalogService
  • CompletionService
  • ControlService
  • MerchantCenterStreaming
  • ModelService
  • PredictionService
  • ProductService
  • ServingConfigService
  • UserEventService

Error types

This section provides definitions for error types that can appear in your logs:

  • MISSING_FIELD: A required field value is not set; for example, a catalog item is missing its title.
  • INVALID_TIMESTAMP: The timestamp is invalid, such as being too far in the future, or formatted incorrectly.
  • FIELD_VALUE_TOO_SMALL: The value in the field is lower than the required minimum; for example, a negative price.
  • INCORRECT_JSON_FORMAT: The JSON in the request is incorrectly formatted, such as a missing { bracket.
  • INVALID_LANGUAGE_CODE: The language code is incorrectly formatted.
  • FIELD_VALUE_EXCEEDED: The value in the field is higher than the allowed maximum.
  • INVALID_RESOURCE_ID: The resource ID is invalid; for example, a non-existent catalog_id in the resource name.
  • FIELD_SIZE_EXCEEDED: The number of entries in the field exceeds the maximum limit.
  • UNEXPECTED_FIELD: A field that was expected to be empty has a value; for example, a transaction for a detail page view event.
  • INVALID_FORMAT: The field is not formatted correctly, such as a malformed string.
  • RESOURCE_ALREADY_EXISTS: You tried to create a resource that already exists, such as a previously created catalog item.
  • INVALID_API_KEY: The API key does not match the project in your request.
  • INSUFFICIENT_PERMISSIONS: You do not have permission to execute the request; this error is usually related to the lack of a required IAM permission.
  • UNJOINED_WITH_CATALOG: The request includes a catalog item ID that does not exist in the catalog. Make sure your catalog is up to date.
  • BATCH_ERROR: The request has multiple errors; for example, an inline import with 10 items that fail validation for different reasons.
  • INACTIVE_RECOMMENDATION_MODEL: You queried a model that is not active for serving.
  • ABUSIVE_ENTITY: The visitor ID or user ID associated with the request has sent an abnormal number of events in a short period of time.
  • FILTER_TOO_STRICT: The prediction request filter blocked all prediction results. Generic (not personalized) popular items are returned, unless the call specified strictFiltering as true, in which case no items are returned. Some common reasons why this issue occurs (see the example request after this list):

    • You are specifying a filter tag that doesn't exist in your catalog. It can take up to a day for a filter tag update to take effect.
    • Your filter is too narrow.
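
One way to confirm that a filter is the cause is to re-send the prediction request with strictFiltering left at false, so that an over-strict filter falls back to popular items instead of an empty response, and compare the results with and without the filter. The sketch below assumes a serving config named SERVING_CONFIG_ID and uses placeholder values for the visitor ID, product ID, and filter tag:

    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/servingConfigs/SERVING_CONFIG_ID:predict" \
        --data '{
          "userEvent": {
            "eventType": "detail-page-view",
            "visitorId": "VISITOR_ID",
            "productDetails": [{"product": {"id": "PRODUCT_ID"}}]
          },
          "filter": "tag=\"FILTER_TAG\"",
          "params": {"strictFiltering": false}
        }'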

View data load metrics

To monitor your catalog and user event data ingestion in the Google Cloud console, follow these steps:

  1. View error metrics for your catalog and user event data ingestion on the Monitoring page.

    Go to the Monitoring page

  2. After your data upload system is running successfully, use the Catalog and Event tabs on the Data page to see aggregated information about your catalog, preview your uploaded products, and view visualizations of your user event integration metrics.

    Go to the Data page

  3. To create alerts that let you know if something goes wrong with your data uploads, follow the procedures in Set up Cloud Monitoring alerts.

Catalog data summary

Use the Catalog tab on the Data page to view high-level data statistics for each catalog branch. This page displays how many products you have imported, how many are in stock, and when you last imported products for each product catalog branch.

You can also see a preview of the catalog items you have uploaded, and filter based on product fields.

You can import data to different branches as a way to stage and preview recommendations or search results. For example, to prepare for a holiday season, you might upload new catalog data to a non-default branch and make sure Vertex AI Search for retail results are generated correctly before making the new data live on your website.
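
As a sketch of that staging flow, the following request imports products into branch 1 instead of the default branch 0, using the standard products import method; the Cloud Storage path is a placeholder:

    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        "https://retail.googleapis.com/v2/projects/PROJECT_ID/locations/global/catalogs/default_catalog/branches/1/products:import" \
        --data '{
          "inputConfig": {
            "gcsSource": {
              "inputUris": ["gs://BUCKET_NAME/products.json"]
            }
          }
        }'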

User event recording statistics

For each type of user event, the Event tab shows how many events you have recorded, how many of those could not be associated with a product (unjoined events), and how the numbers differ from previous periods. You can select a preset time period or enter a custom time range.

The metric graph displays user events ingested over time, which you can filter by user event type.

Data quality metrics

On the Data quality page, you can see metrics that show the percentages of products and user events that fulfill recommended standards of data quality for search. Use this page to assess what data you need to import or update in order to improve the quality of search results and unlock search performance tiers.

For more information about search performance tiers and checking the quality of your data, see Unlock search performance tiers.

For a list of all catalog data quality metrics, see Catalog data quality metrics.

For all user event requirements and best practices for both recommendations and search, see User event requirements and best practices.

Unjoined events

When a user event or API request refers to a product that has not been uploaded to Vertex AI Search for retail, it is an unjoined event. Unjoined user events are still logged, and unjoined requests are handled, but neither can be used to further enhance the model for future predictions. For this reason, you should make sure that your unjoined event percentage is very low for both user events and prediction requests.

You can see your unjoined user event percentage in the Event tab on the Data page.
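
To dig into individual unjoined events, you can search the logs for the UNJOINED_WITH_CATALOG error type. The query below is a sketch that assumes the error type string appears in the log entry payload and relies on a plain text match across fields:

    gcloud logging read \
        'resource.type="consumed_api" AND resource.labels.service="retail.googleapis.com" AND "UNJOINED_WITH_CATALOG"' \
        --project=PROJECT_ID \
        --freshness=1d \
        --limit=20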

API errors

You can see a graph of API errors over time, displayed by method name, by clicking View API metrics on the button bar of the Monitoring page.

Monitor API method activity

For visualizations of traffic, errors, and latency by API method, go to the Monitoring page. You can select a preset time period or enter a custom time range.

To see more details about each graph:

  • Underneath a graph, click a method name to isolate it in the graph.
  • Hover your cursor over a graph to see a callout with each method and its values at that point in time.
  • Click and drag over any section of the graph to zoom in on that period of time.

What's next