Troubleshooting BigQuery subscriptions

This page provides some common troubleshooting tips for BigQuery subscriptions.

Check the state of a BigQuery subscription

To check the state of a subscription, perform the following steps:

  1. In the Google Cloud console, go to the Pub/Sub subscription page.

    Go to Subscriptions

  2. Check the State icon for your BigQuery subscription.

    If the icon is a green check mark, the subscription is healthy.

    If the icon is a red exclamation mark, the subscription is in an error state.

  3. Click the BigQuery subscription.

    The subscription details page opens.

  4. Check Subscription state for the error message.

  5. Depending on the error message, go to the relevant section on this page to troubleshoot the issue.

After the issue is resolved, the subscription eventually returns to a healthy state.

Unable to create or update subscription

The following are common issues that can prevent you from creating or updating a BigQuery subscription.

Table not found error

If the table that you specify in the create or update subscription workflow does not exist, then the workflow returns a table not found error. In the Google Cloud console, the message is similar to the following:

The BigQuery table or dataset specified cannot be found.

To resolve the issue, create the table and check its state before using it with a BigQuery subscription.

Schema mismatch error

If the schemas of the table and the topic are not compatible, then the create or update subscription workflow returns a schema mismatch error. In the Google Cloud console, the message is similar to the following:

Incompatible schema type for field project_ids: expected INT64, got STRING

This example error message indicates a schema mismatch for a field called project_ids. Depending on the type of schema mismatch, you might see a different variation of the error message.

To resolve the issue, check if the schema mappings are compatible.
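
The compatibility check can be sketched as a lookup against a mapping of topic field types to acceptable BigQuery column types. The mapping below is an illustrative subset, not the full compatibility matrix, and the type names are assumptions for the sketch.

```python
# Simplified sketch of a topic-to-BigQuery type compatibility check.
# COMPATIBLE is a hypothetical, partial table: topic field type -> the set of
# BigQuery column types it can be written to.
COMPATIBLE = {
    "string": {"STRING", "JSON"},
    "long": {"INT64", "NUMERIC"},
    "boolean": {"BOOL"},
    "double": {"FLOAT64", "NUMERIC"},
}

def field_compatible(topic_type: str, bigquery_type: str) -> bool:
    """Return True if a topic field of topic_type can map to bigquery_type."""
    return bigquery_type in COMPATIBLE.get(topic_type, set())

# A topic field declared as "string" cannot be written to an INT64 column,
# which is the kind of mismatch the error message above reports.
print(field_compatible("string", "INT64"))  # False
print(field_compatible("long", "INT64"))    # True
```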

Service account error

If you have not configured the Pub/Sub service account with the right permissions, then the create or update subscription workflow returns an error. In the Google Cloud console, the message is similar to the following:

Service account service-1234234234@gcp-sa-pubsub.iam.gserviceaccount.com
is missing permissions required to write to the BigQuery table:
bigquery.tables.get, bigquery.tables.updateData.

To resolve the issue, check if the service account has the correct permissions.
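
The error message names the Google-managed Pub/Sub service agent, whose email follows a documented pattern based on your project number. A small sketch that builds that email, with the two permissions listed in the error message:

```python
# Build the email of the Pub/Sub service agent for a given project number.
def pubsub_service_agent(project_number: int) -> str:
    return f"service-{project_number}@gcp-sa-pubsub.iam.gserviceaccount.com"

# Permissions the agent needs on the destination table, as listed in the
# error message above.
REQUIRED_PERMISSIONS = ["bigquery.tables.get", "bigquery.tables.updateData"]

print(pubsub_service_agent(1234234234))
# service-1234234234@gcp-sa-pubsub.iam.gserviceaccount.com
```

Both permissions are included in the predefined BigQuery Data Editor role (roles/bigquery.dataEditor), which you can grant to this service agent on the project, dataset, or table.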

Subscription state shows a red exclamation mark

If you edit the table after creating a subscription, it can affect how Pub/Sub writes messages to the table. If a change results in an issue, then the state field of the subscription is set to an error state.

In the subscription details page, check the Subscription state field, which provides a more specific error. The error may be one of the following:

  • table not found: The table is deleted. Create a table and check the state of the table. See Get table information.

  • table permission denied: The Pub/Sub service account no longer has permission to write to the table. Check if the service account has the correct permissions.

  • table schema mismatch: The table schema is no longer compatible with the BigQuery subscription settings. Check if the schema mappings are compatible.
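
The three error states and their fixes can be summarized as a simple lookup. The state strings below follow the wording used on this page; this is an illustrative sketch, not an API.

```python
# Map each error state shown in the Subscription state field to its fix.
REMEDIATION = {
    "table not found": "Create the table, then check the table's state.",
    "table permission denied": "Restore the Pub/Sub service account's write permissions on the table.",
    "table schema mismatch": "Make the table schema compatible with the subscription settings.",
}

def next_step(state: str) -> str:
    """Return the troubleshooting action for a given subscription error state."""
    return REMEDIATION.get(state, "Subscription is healthy or in an unlisted state.")

print(next_step("table not found"))
```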

While a Pub/Sub subscription is in an error state, messages are not written to the BigQuery table and remain in the subscription backlog. Note that messages are not delivered to a dead-letter topic even if one is configured. Unacknowledged messages are retained for the period set in message_retention_duration (7 days by default).

A backlog is building up

If you see a backlog of messages building up in the subscription or messages going to a subscription's dead letter topic, review the following possible causes.

INVALID_ARGUMENT error message

This error occurs when a message is in a format that Pub/Sub accepts but that the BigQuery destination table schema rejects. That is, one or more fields in the message have values that are not permitted by the BigQuery table schema. Review the schema compatibility to verify that the data types and formats are correct. Some of the most common errors include:

  • An empty string ("") is not valid JSON. When sending data to a nullable JSON BigQuery table column, provide either an empty JSON object ({}), null, or an empty JSON string ("\"\"") to represent missing values. Sending an empty string results in an error.

  • If a message field value exceeds the BigQuery field's maximum length, the message fails due to size limitations.
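
The empty-string pitfall is easy to reproduce with the standard json module; the function below is a simplified stand-in for the validation BigQuery effectively applies to a JSON column.

```python
import json

def is_valid_json(value: str) -> bool:
    """Return True if value parses as JSON (a simplified stand-in for
    the validation applied to a JSON column)."""
    try:
        json.loads(value)
        return True
    except json.JSONDecodeError:
        return False

# An empty string is rejected; the alternatives below all represent
# a missing or empty value acceptably.
print(is_valid_json(""))        # False
print(is_valid_json("{}"))      # True  (empty JSON object)
print(is_valid_json("null"))    # True  (JSON null)
print(is_valid_json("\"\""))    # True  (empty JSON string)
```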

To troubleshoot INVALID_ARGUMENT errors, add a dead letter topic to the subscription of interest. The dead letter topic captures messages that couldn't be written to BigQuery, along with an attribute called CloudPubSubDeadLetterSourceDeliveryErrorMessage that explains the failure reason.

These delivery failures can also be seen in Metrics Explorer. Select the metric pubsub.googleapis.com/subscription/push_request_count and filter by response_code=invalid_argument.
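
The same filter can be expressed in Cloud Monitoring filter syntax, sketched here as a helper that builds the filter string for any response code (the label name response_code matches the filter described above):

```python
# Build a Cloud Monitoring filter string for the push_request_count metric,
# restricted to a single response code such as "invalid_argument".
def push_error_filter(response_code: str) -> str:
    return (
        'metric.type="pubsub.googleapis.com/subscription/push_request_count" '
        f'AND metric.labels.response_code="{response_code}"'
    )

print(push_error_filter("invalid_argument"))
```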

RESOURCE_EXHAUSTED error message

If messages are being written to BigQuery slowly, you might need to increase your project's Pub/Sub push quota or BigQuery storage write throughput quota. To check if you're encountering quota limitations, examine the push requests metric (subscription/push_request_count) for any resource_exhausted errors.

Another way to diagnose quota issues is to check the project's quota. Navigate to IAM & Admin > Quotas within the project containing your Pub/Sub resource or BigQuery instance. Search for the relevant quota, either pubsub.googleapis.com/regionalpushsubscriber or bigquerystorage.googleapis.com/write/append_bytes. If either quota needs an increase, you can request a higher quota.

Hourly partitioned table showing __UNPARTITIONED__ in partition ID column

When a BigQuery destination table is partitioned by hour, rows initially land in a special partition labeled __UNPARTITIONED__ within the INFORMATION_SCHEMA.PARTITIONS view. This is expected behavior for tables using ingestion-time partitioning.

BigQuery employs a streaming buffer to optimize the writing process. Data might reside in the __UNPARTITIONED__ partition until enough volume accumulates or at least an hour has passed. After these conditions are met, BigQuery re-partitions the data into its appropriate hourly partition.

You can monitor data within the __UNPARTITIONED__ partition using the INFORMATION_SCHEMA.PARTITIONS view.
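
A standard SQL query against that view shows how many rows are still in the streaming buffer; the sketch below builds the query as a string, with my_dataset and my_table as placeholders for your own names.

```python
# Build a standard SQL query that counts rows still sitting in the
# __UNPARTITIONED__ streaming-buffer partition of an hourly partitioned table.
def unpartitioned_rows_query(dataset: str, table: str) -> str:
    return (
        "SELECT partition_id, total_rows "
        f"FROM `{dataset}.INFORMATION_SCHEMA.PARTITIONS` "
        f"WHERE table_name = '{table}' "
        "AND partition_id = '__UNPARTITIONED__'"
    )

print(unpartitioned_rows_query("my_dataset", "my_table"))
```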

What's next

  • If you still have issues with your BigQuery subscription, see Getting support.