This document describes how to create a BigQuery subscription. You can use the Google Cloud console, the Google Cloud CLI, the client library, or the Pub/Sub API to create a BigQuery subscription.
Before you begin
Before reading this document, ensure that you're familiar with the following:
How subscriptions work.
The workflow for BigQuery subscriptions.
How to configure a dead letter topic to handle message failures.
In addition to your familiarity with Pub/Sub and BigQuery, ensure that you meet the following prerequisites before you create a BigQuery subscription:
A BigQuery table exists. Alternatively, you can create one when you create the BigQuery subscription as described in the later sections of this document.
Compatibility between the schema of the Pub/Sub topic and the BigQuery table. If you add a non-compatible BigQuery table, you get a compatibility-related error message. For more information, see Schema compatibility.
Required roles and permissions
The following is a list of guidelines regarding roles and permissions:
To create a subscription, you must configure access control at the project level.
You also need resource-level permissions if your subscriptions and topics are in different projects, as discussed later in this section.
To create a BigQuery subscription, the Pub/Sub service account must have permission to write to the specific BigQuery table. For more information about how to grant these permissions, see the next section of this document.
You can configure a BigQuery subscription in a project to write to a BigQuery table in a different project.
To get the permissions that you need to create BigQuery subscriptions, ask your administrator to grant you the Pub/Sub Editor (roles/pubsub.editor) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create BigQuery subscriptions. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create BigQuery subscriptions:
- Pull from a subscription: pubsub.subscriptions.consume
- Create a subscription: pubsub.subscriptions.create
- Delete a subscription: pubsub.subscriptions.delete
- Get a subscription: pubsub.subscriptions.get
- List subscriptions: pubsub.subscriptions.list
- Update a subscription: pubsub.subscriptions.update
- Attach a subscription to a topic: pubsub.topics.attachSubscription
- Get the IAM policy for a subscription: pubsub.subscriptions.getIamPolicy
- Configure the IAM policy for a subscription: pubsub.subscriptions.setIamPolicy
You might also be able to get these permissions with custom roles or other predefined roles.
If you need to create BigQuery subscriptions in one project that are associated with a topic in another project, ask your topic administrator to also grant you the Pub/Sub Editor (roles/pubsub.editor) IAM role on the topic.
Assign BigQuery roles to the Pub/Sub service account
Some Google Cloud services have Google Cloud-managed service accounts that let the services access your resources. These service accounts are known as service agents. Pub/Sub creates and maintains a service account for each project in the format service-project-number@gcp-sa-pubsub.iam.gserviceaccount.com.
To create a BigQuery subscription, the Pub/Sub service account must have permission to write to the specific BigQuery table and to read the table metadata.
Grant the BigQuery Data Editor (roles/bigquery.dataEditor) role to the Pub/Sub service account:
- In the Google Cloud console, go to the IAM page.
- Click Grant access.
- In the Add Principals section, enter the name of your Pub/Sub service account. The format of the service account is service-project-number@gcp-sa-pubsub.iam.gserviceaccount.com. For example, for a project with project-number=112233445566, the service account is of the format service-112233445566@gcp-sa-pubsub.iam.gserviceaccount.com.
- In the Assign Roles section, click Add another role.
- In the Select a role drop-down, enter BigQuery, and select the BigQuery Data Editor role.
- Click Save.
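Alternatively, you can make the same grant from the command line. The following is a minimal sketch; replace PROJECT_ID with your project's ID and 112233445566 with your project number:

```
# Grant the Pub/Sub service agent permission to write to BigQuery tables
# in the project.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:service-112233445566@gcp-sa-pubsub.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataEditor"
```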
For more information about BigQuery IAM, see BigQuery roles and permissions.
BigQuery subscription properties
When you configure a BigQuery subscription, you can specify the following properties.
Common properties
Learn about the common subscription properties that you can set across all subscriptions.
Use topic schema
This option lets Pub/Sub use the schema of the Pub/Sub topic to which the subscription is attached. In addition, Pub/Sub writes the fields in messages to the corresponding columns in the BigQuery table.
When you use this option, remember to check the following additional requirements:
The fields in the topic schema and the BigQuery schema must have the same names and their types must be compatible with each other.
Any optional field in the topic schema must also be optional in the BigQuery schema.
Required fields in the topic schema don't need to be required in the BigQuery schema.
If there are BigQuery fields that are not present in the topic schema, these BigQuery fields must be in mode NULLABLE.
If the topic schema has additional fields that are not present in the BigQuery schema and these fields can be dropped, select the option Drop unknown fields.
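For illustration, here is a hypothetical Avro topic schema alongside a compatible BigQuery table; the schema name, field names, and table IDs are examples only:

```
# Hypothetical Avro schema: a record with a string field and a double field.
gcloud pubsub schemas create demo-schema --type=AVRO \
    --definition='{"type":"record","name":"Payload","fields":[{"name":"customer_id","type":"string"},{"name":"amount","type":"double"}]}'

# A compatible BigQuery table: same field names, compatible types
# (Avro string -> STRING, Avro double -> FLOAT).
bq mk --table PROJECT_ID:DATASET_ID.TABLE_ID customer_id:STRING,amount:FLOAT
```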
You can select only one of the subscription properties, Use topic schema or Use table schema.
If you don't select the Use topic schema or Use table schema option, ensure that the BigQuery table has a column called data of type BYTES, STRING, or JSON. Pub/Sub writes the message to this BigQuery column.
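As a minimal sketch, such a table can be created with the bq tool (the IDs are placeholders):

```
# Create a table whose single "data" column receives the raw message body.
# STRING is used here; BYTES or JSON also work.
bq mk --table PROJECT_ID:DATASET_ID.TABLE_ID data:STRING
```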
You might not see changes to the Pub/Sub topic schema or the BigQuery table schema take effect immediately in messages written to the BigQuery table. For example, if the Drop unknown fields option is enabled and a field is present in the Pub/Sub schema but not in the BigQuery schema, messages written to the table might still omit that field for a while after you add it to the BigQuery schema. Eventually, the schemas synchronize and subsequent messages include the field.
When you use the Use topic schema option for your BigQuery subscription, you can also take advantage of BigQuery change data capture (CDC). CDC updates your BigQuery tables by processing and applying changes to existing rows.
To learn more about this feature, see Stream table updates with change data capture.
To learn how to use this feature with BigQuery subscriptions, see BigQuery change data capture.
Use table schema
This option lets Pub/Sub use the schema of the BigQuery table to write the fields of a JSON message to the corresponding columns. When you use this option, remember to check the following additional requirements:
Published messages must be in JSON format.
The following JSON conversions are supported:
| JSON type | BigQuery data type |
|---|---|
| string | NUMERIC, BIGNUMERIC, DATE, TIME, DATETIME, or TIMESTAMP |
| number | NUMERIC, BIGNUMERIC, DATE, TIME, DATETIME, or TIMESTAMP |
- When using number to DATE, DATETIME, TIME, or TIMESTAMP conversions, the number must adhere to the supported representations.
- When using number to NUMERIC or BIGNUMERIC conversion, the precision and range of values is limited to those accepted by the IEEE 754 standard for floating-point arithmetic. If you require high precision or a wider range of values, use string to NUMERIC or BIGNUMERIC conversions instead.
- When using string to NUMERIC or BIGNUMERIC conversions, Pub/Sub assumes the string is a human-readable number (for example, "123.124"). If processing the string as a human-readable number fails, Pub/Sub treats the string as bytes encoded with the BigDecimalByteStringEncoder.
If the subscription's topic has a schema associated with it, then the message encoding property must be set to JSON.
If there are BigQuery fields that are not present in the messages, these BigQuery fields must be in mode NULLABLE.
If the messages have additional fields that are not present in the BigQuery schema and these fields can be dropped, select the option Drop unknown fields.
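As a hypothetical example of these conversions, the following creates a table with a NUMERIC column and publishes a JSON message in which the value arrives as a string (all IDs are placeholders):

```
# "price" is published as a JSON string and converted to NUMERIC by Pub/Sub,
# which preserves precision better than a JSON number would for this type.
bq mk --table PROJECT_ID:DATASET_ID.TABLE_ID item:STRING,price:NUMERIC
gcloud pubsub topics publish TOPIC_ID --message='{"item":"widget","price":"19.99"}'
```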
You can select only one of the subscription properties, Use topic schema or Use table schema.
If you don't select the Use topic schema or Use table schema option, ensure that the BigQuery table has a column called data of type BYTES, STRING, or JSON. Pub/Sub writes the message to this BigQuery column.
You might not see changes to the BigQuery table schema take effect immediately in messages written to the BigQuery table. For example, if the Drop unknown fields option is enabled and a field is present in the messages but not in the BigQuery schema, messages written to the table might still omit that field for a while after you add it to the BigQuery schema. Eventually, the schema synchronizes and subsequent messages include the field.
When you use the Use table schema option for your BigQuery subscription, you can also take advantage of BigQuery change data capture (CDC). CDC updates your BigQuery tables by processing and applying changes to existing rows.
To learn more about this feature, see Stream table updates with change data capture.
To learn how to use this feature with BigQuery subscriptions, see BigQuery change data capture.
Drop unknown fields
This option is used with the Use topic schema or Use table schema option. This option lets Pub/Sub drop any field that is present in the topic schema or message but not in the BigQuery schema. Without Drop unknown fields set, messages with extra fields are not written to BigQuery and remain in the subscription backlog. The subscription ends up in an error state.
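With the gcloud CLI, these options correspond to flags on the create command. A sketch, assuming the flag names below are available in your gcloud version:

```
# Create a BigQuery subscription that maps topic-schema fields to columns
# and drops fields that are missing from the table schema.
gcloud pubsub subscriptions create SUBSCRIPTION_ID \
    --topic=TOPIC_ID \
    --bigquery-table=PROJECT_ID:DATASET_ID.TABLE_ID \
    --use-topic-schema \
    --drop-unknown-fields
```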
Write metadata
This option lets Pub/Sub write the metadata of each message to additional columns in the BigQuery table. Otherwise, the metadata is not written to the BigQuery table.
If you select the Write metadata option, ensure that the BigQuery table has the fields described in the following table.
If you don't select the Write metadata option, then the destination BigQuery table only requires the data field unless use_topic_schema is true. If you select both the Write metadata and Use topic schema options, then the schema of the topic must not contain any fields with names that match those of the metadata parameters. This limitation includes camelcase versions of these snake case parameters.
| Parameter | Type | Description |
|---|---|---|
| subscription_name | STRING | Name of a subscription. |
| message_id | STRING | ID of a message. |
| publish_time | TIMESTAMP | The time of publishing a message. |
| data | BYTES, STRING, or JSON | The message body. The data field is required for all destination BigQuery tables that don't select the Use topic schema or Use table schema option. |
| attributes | STRING or JSON | A JSON object containing all message attributes. It also contains additional fields that are part of the Pub/Sub message including the ordering key, if present. |
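For example, a table able to accept messages from a subscription with Write metadata enabled could be created as follows. This is a sketch: the IDs are placeholders, and STRING is chosen where the table above allows several types:

```
# The data column plus the four metadata columns described above.
bq mk --table PROJECT_ID:DATASET_ID.TABLE_ID \
    subscription_name:STRING,message_id:STRING,publish_time:TIMESTAMP,data:STRING,attributes:STRING
```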
Create a BigQuery subscription
The following samples demonstrate how to create a subscription with BigQuery delivery.
Console
- In the Google Cloud console, go to the Subscriptions page.
- Click Create subscription.
- For the Subscription ID field, enter a name.
For information on how to name a subscription, see Guidelines to name a topic or a subscription.
- Choose or create a topic from the drop-down menu. The subscription receives messages from the topic.
- Select Delivery type as Write to BigQuery.
- Select the project for the BigQuery table.
- Select an existing dataset or create a new one.
For information on how to create a dataset, see Creating datasets.
- Select an existing table or create a new one.
For information on how to create a table, see Creating tables.
- We strongly recommend that you enable Dead lettering to handle message failures. For more information, see Dead letter topic.
- Click Create.
You can also create a subscription from the Topics page. This shortcut is useful for associating topics with subscriptions.
- In the Google Cloud console, go to the Topics page.
- Click the more actions menu next to the topic for which you want to create a subscription.
- From the context menu, select Create subscription.
- Select Delivery type as Write to BigQuery.
- Select the project for the BigQuery table.
- Select an existing dataset or create a new one.
For information on how to create a dataset, see Creating datasets.
- Select an existing table or create a new one.
For information on how to create a table, see Creating tables.
- We strongly recommend that you enable Dead lettering to handle message failures. For more information, see Dead letter topic.
- Click Create.
gcloud
- In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
- To create a Pub/Sub subscription, use the gcloud pubsub subscriptions create command:

```
gcloud pubsub subscriptions create SUBSCRIPTION_ID \
    --topic=TOPIC_ID \
    --bigquery-table=PROJECT_ID:DATASET_ID.TABLE_ID
```
Replace the following:
- SUBSCRIPTION_ID: Specifies the ID of the subscription.
- TOPIC_ID: Specifies the ID of the topic.
- PROJECT_ID: Specifies the ID of the project.
- DATASET_ID: Specifies the ID of an existing dataset. To create a dataset, see Create datasets.
- TABLE_ID: Specifies the ID of an existing table. The table requires a data field if your topic doesn't have a schema. To create a table, see Create an empty table with a schema definition.
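For example, with hypothetical IDs, the command might look like this:

```
# Example values only: subscription "orders-bq-sub" on topic "orders",
# writing to table "orders" in dataset "sales" of project "my-project".
gcloud pubsub subscriptions create orders-bq-sub \
    --topic=orders \
    --bigquery-table=my-project:sales.orders
```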
C++
Before trying this sample, follow the C++ setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub C++ API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
C#
Before trying this sample, follow the C# setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub C# API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Go
Before trying this sample, follow the Go setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Go API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Java
Before trying this sample, follow the Java setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Java API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Node.js API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
PHP
Before trying this sample, follow the PHP setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub PHP API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Python
Before trying this sample, follow the Python setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Python API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
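The official Python sample isn't reproduced here; the following is a minimal sketch of creating a BigQuery subscription with the google-cloud-pubsub library. The project, topic, subscription, and table IDs are placeholders:

```python
from google.cloud import pubsub_v1

project_id = "my-project"        # placeholder
topic_id = "my-topic"            # placeholder
subscription_id = "my-bq-sub"    # placeholder
# BigQueryConfig expects a dot-separated table name.
bigquery_table = "my-project.my_dataset.my_table"  # placeholder

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()
topic_path = publisher.topic_path(project_id, topic_id)
subscription_path = subscriber.subscription_path(project_id, subscription_id)

# BigQueryConfig routes messages to the table; write_metadata adds the
# metadata columns described earlier in this document.
bigquery_config = pubsub_v1.types.BigQueryConfig(
    table=bigquery_table, write_metadata=True
)

with subscriber:
    subscription = subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "bigquery_config": bigquery_config,
        }
    )
    print(f"BigQuery subscription created: {subscription}")
```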
Ruby
Before trying this sample, follow the Ruby setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Ruby API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Monitor a BigQuery subscription
Cloud Monitoring provides a number of metrics to monitor subscriptions.
For a list of all the available metrics related to Pub/Sub and their descriptions, see the Monitoring documentation for Pub/Sub.
You can also monitor subscriptions from within Pub/Sub.
What's next
- Create or modify a subscription with gcloud commands.
- Create or modify a subscription with REST APIs.
- Troubleshoot a BigQuery subscription.