The Pub/Sub subscriber data APIs, such as pull, provide limited access to message data. Normally, acknowledged messages are inaccessible to subscribers of a given subscription. In addition, subscriber clients must process every message in a subscription even if only a subset is needed.
The Seek feature extends subscriber functionality by allowing you to alter the acknowledgement state of messages in bulk. For example, you can replay previously acknowledged messages or purge messages in bulk. In addition, you can copy the state of one subscription to another by using seek in combination with a Snapshot.
These features are described below. For a working example, see the quickstart.
Configuring message retention
To seek to a time in the past and replay previously-acknowledged messages, you must first configure message retention on the topic or configure the subscription to retain acknowledged messages.
If topic message retention is not configured, an unacknowledged message is deleted from the subscription when its age exceeds the subscription's message_retention_duration property. On the other hand, if topic message retention is configured, the unacknowledged message is deleted from the subscription only when its age exceeds the maximum of the topic's and the subscription's message_retention_duration values.
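The retention rule for unacknowledged messages can be sketched as a small function. This is an illustrative model only, not part of the Pub/Sub API: the name effective_unacked_retention and its parameters are hypothetical, and passing None for topic_retention is assumed to mean that topic message retention is not configured.

```python
from datetime import timedelta
from typing import Optional


def effective_unacked_retention(
    subscription_retention: timedelta,
    topic_retention: Optional[timedelta] = None,
) -> timedelta:
    """Model how long an unacknowledged message stays in a subscription.

    Hypothetical helper: None means topic message retention is not configured.
    """
    if topic_retention is None:
        # Only the subscription's message_retention_duration applies.
        return subscription_retention
    # With topic retention configured, the larger of the two durations applies.
    return max(topic_retention, subscription_retention)


week = timedelta(days=7)
print(effective_unacked_retention(week).days)                      # 7
print(effective_unacked_retention(week, timedelta(days=31)).days)  # 31
print(effective_unacked_retention(week, timedelta(days=1)).days)   # 7
```

The third call shows that a short topic retention never shortens the subscription's own retention, because the maximum of the two values is used.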
Topic message retention
By default, a Pub/Sub topic discards messages as soon as they are
acknowledged by all subscriptions attached to the topic.
Configuring a topic with message retention gives you more flexibility, allowing any subscription attached to the topic to seek back in time and replay previously acknowledged messages up to the topic's message_retention_duration.
Topic message retention also allows a subscription to replay messages that were published before the subscription was created.
A topic can retain published messages for a maximum of 31 days (configurable by the topic's message_retention_duration property) even after they have been acknowledged by all attached subscriptions. In cases where the topic's message_retention_duration is greater than the subscription's message_retention_duration, Pub/Sub discards a message only when its age exceeds the topic's message_retention_duration.
If topic message retention is enabled, storage costs for the messages retained by the topic are billed to the topic's project.
Console
To create a topic with message retention enabled, follow these steps:
In the Google Cloud console, go to the Pub/Sub topics page.
Click Create topic.
In the Topic ID field, enter an ID for your topic.
Check the box for Set message retention duration. Leave the other options in their default settings.
Use the drop-down menus below Message retention duration to select the number of days, hours, and minutes to retain messages.
Click Create topic to save the topic.
To update a topic's message retention settings:
Select your topic from the Pub/Sub topics page.
Click Edit at the top of the topic details page.
Adjust the retention time or enable/disable message retention by checking or unchecking the box labeled Enable message retention.
Click Update to save changes to the topic.
gcloud
To create a topic with a message retention duration of 7 days, use the following gcloud pubsub topics create command:
gcloud pubsub topics create TOPIC_ID --message-retention-duration=7d
You can update this setting by using gcloud pubsub topics update. This also lets you enable message retention for an existing topic:
gcloud pubsub topics update TOPIC_ID --message-retention-duration=1d
You can also disable message retention for a topic with the update command:
gcloud pubsub topics update TOPIC_ID --clear-message-retention-duration
Subscription message retention
By default, subscriptions discard messages as soon as they are acknowledged. Unacknowledged messages are retained for a default of 7 days (configurable by the subscription's message_retention_duration property).
Configuring a subscription to retain acknowledged messages (via the retain_acked_messages property) lets you replay previously-acked messages retained by the subscription. You can configure messages to be retained for a maximum of 7 days in a subscription. This configuration applies to both acknowledged and unacknowledged messages. However, messages can be retained in a subscription for more than 7 days if the message retention duration configured on its topic is greater than 7 days.
If a subscription is configured to retain acknowledged messages, storage costs for the acknowledged messages retained by the subscription will be billed to the subscription's project.
Console
To create a subscription with retention of acked messages enabled, follow these steps:
In the Google Cloud console, go to the Pub/Sub subscriptions page.
Click Create subscription.
In the Subscription ID field, enter an ID for your subscription.
Use the drop-down menus below Message retention duration to select the number of days, hours, and minutes to retain messages.
Check the box for Retain acknowledged messages. Leave the other options in their default settings.
Click Create subscription to save the subscription.
To update a subscription's message retention settings:
Select your subscription from the Pub/Sub subscriptions page.
Click Edit at the top of the subscription details page.
Adjust the message retention duration, or enable/disable retention of acked messages by checking or unchecking the box labeled Retain acknowledged messages.
Click Update to save changes to the subscription.
gcloud
To create a subscription with retention of acked messages enabled, use the following gcloud pubsub subscriptions create command:
gcloud pubsub subscriptions create SUBSCRIPTION_ID --retain-acked-messages=true --message-retention-duration=5d
You can update this setting by using gcloud pubsub subscriptions update. This also lets you enable retention of acked messages for an existing subscription:
gcloud pubsub subscriptions update SUBSCRIPTION_ID --message-retention-duration=1d
You can also disable retention of acked messages for a subscription with the update command:
gcloud pubsub subscriptions update SUBSCRIPTION_ID --retain-acked-messages=false
Seeking to a timestamp
Seeking to a time marks every message received by Pub/Sub before the time as acknowledged, and all messages received after the time as unacknowledged. You can seek to a time in the future to purge messages. To replay and reprocess previously acknowledged messages, seek to a time in the past. The message publication time is generated by the Pub/Sub servers (see publishTime in the API reference). This approach is imprecise due to:
- Possible clock skew among Pub/Sub servers.
- The fact that Pub/Sub has to work with the arrival time of the publish request rather than when an event occurred in the source system.
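The effect of seeking to a timestamp can be illustrated with a toy in-memory model. This is a sketch only, not the Pub/Sub client API: the Message class, the seek_to_time function, and the at helper are hypothetical, the backlog is assumed to contain only messages that are still retained, and treating a message published exactly at the seek time as unacknowledged is an assumption of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Message:
    message_id: str
    publish_time: datetime  # assigned by the server in the real service
    acked: bool = False


def seek_to_time(backlog: List[Message], seek_time: datetime) -> List[Message]:
    """Toy model of seek-to-timestamp; returns the messages left unacknowledged."""
    for message in backlog:
        # Published before the seek time: acknowledged. Otherwise: unacknowledged.
        # The handling of an exact tie is an assumption of this sketch.
        message.acked = message.publish_time < seek_time
    return [m for m in backlog if not m.acked]


def at(hour: int, minute: int = 0) -> datetime:
    """Hypothetical helper: a UTC time on an arbitrary example day."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


backlog = [
    Message("m1", at(10), acked=True),
    Message("m2", at(11), acked=True),
    Message("m3", at(12)),
]

# Seek into the past: m2 becomes unacknowledged again and is replayed with m3.
print([m.message_id for m in seek_to_time(backlog, at(10, 30))])  # ['m2', 'm3']

# Seek to a time after every message: the whole backlog is purged.
print([m.message_id for m in seek_to_time(backlog, at(23))])  # []
```

In the real service the comparison uses server-generated publish times, so messages published close to the seek time can land on either side of it for the reasons listed above.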
Seeking to a snapshot
The snapshot feature allows you to capture the message acknowledgment state of a subscription. Once a snapshot is created, it retains:
- All messages that were unacknowledged in the source subscription at the time of the snapshot's creation.
- Any messages published to the topic thereafter.
You can replay these unacknowledged messages by seeking any of the topic's subscriptions to the snapshot.
Unlike with seeking to a time, you don't need to perform any special subscription configuration to seek to a snapshot. You just need to create the snapshot ahead of time. For example, you might create a snapshot when deploying new subscriber code, in case you need to recover from unexpected or erroneous acknowledgements.
Snapshots expire and are deleted in the following cases (whichever comes first):
- The snapshot reaches a lifespan of seven days.
- The oldest unacknowledged message in the snapshot exceeds the message retention duration.
For example, consider a snapshot of a subscription with a backlog where the oldest unacknowledged message is a day old. The snapshot expires after six days, rather than seven. This timeline is necessary for snapshots to offer strong at-least-once delivery guarantees.
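The arithmetic behind this example can be sketched as follows. This is an illustrative calculation, not an API call: the function snapshot_expiry and its parameters are hypothetical, and the six-day result assumes a message retention duration of 7 days, which the example above implies but does not state.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_MAX_LIFESPAN = timedelta(days=7)


def snapshot_expiry(
    created_at: datetime,
    oldest_unacked_publish_time: datetime,
    retention: timedelta = timedelta(days=7),  # assumed retention duration
) -> datetime:
    """Return the earlier of the two expiry conditions for a snapshot."""
    # Condition 1: the snapshot reaches its seven-day lifespan.
    lifespan_limit = created_at + SNAPSHOT_MAX_LIFESPAN
    # Condition 2: the oldest unacknowledged message exceeds the retention duration.
    retention_limit = oldest_unacked_publish_time + retention
    return min(lifespan_limit, retention_limit)


created = datetime(2024, 1, 8, tzinfo=timezone.utc)
oldest = created - timedelta(days=1)  # oldest unacknowledged message is a day old

print((snapshot_expiry(created, oldest) - created).days)  # 6
```

With an empty backlog at creation time (modeled here by setting the oldest message time equal to the creation time), the lifespan limit applies and the snapshot lasts the full seven days.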
Eventual consistency
Seek operations are strictly consistent in regard to message delivery guarantees. This means that any message that is to become unacknowledged based on the seek condition is guaranteed to be eventually delivered after the seek operation succeeds. However, delivered messages do not instantly become consistent with the seek operation. So a message that was published before the seek timestamp or that is acknowledged in a snapshot might be delivered after the seek operation. In a sense, message delivery operates as an eventually consistent system with respect to the seek operation: it might take as long as a minute for the operation to take full effect.
Seeking with filters
You can replay messages from subscriptions with filters. If you seek to a timestamp using a subscription with a filter, the Pub/Sub service only redelivers the messages that match the filter.
A snapshot of a subscription with a filter contains the following messages:
- All messages that are newer than the snapshot, including messages that don't match the filter.
- Unacknowledged messages that are older than the snapshot.
If you seek to a snapshot using a subscription with a filter, the Pub/Sub service only redelivers the messages in the snapshot that match the filter of the subscription making the seek request.
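The interaction between a snapshot and a filter can be sketched with plain data structures. This is a toy model, not the Pub/Sub filter syntax or client API: messages are represented only by an attribute dictionary, the filter is a Python predicate, and the names redelivered_after_seek and eu_only are hypothetical.

```python
from typing import Callable, Dict, List

Attributes = Dict[str, str]


def redelivered_after_seek(
    snapshot_messages: List[Attributes],
    subscription_filter: Callable[[Attributes], bool],
) -> List[Attributes]:
    """Keep only the snapshot messages that match the seeking subscription's filter."""
    return [m for m in snapshot_messages if subscription_filter(m)]


# The snapshot can hold messages that do not match the filter.
snapshot_messages = [
    {"id": "1", "region": "eu"},
    {"id": "2", "region": "us"},
    {"id": "3", "region": "eu"},
]


def eu_only(attributes: Attributes) -> bool:
    """Stand-in for the filter of the subscription making the seek request."""
    return attributes.get("region") == "eu"


print([m["id"] for m in redelivered_after_seek(snapshot_messages, eu_only)])  # ['1', '3']
```

The point of the model is that the filter applied is the one on the subscription making the seek request, so the same snapshot can yield different message sets for differently filtered subscriptions.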
For more information about filters, see Filtering messages.
Seeking with dead-letter topics
If you seek messages in a subscription with a dead-letter topic, Pub/Sub sets the delivery attempts to 0. The messages that you receive from these subscriptions have a field that tallies the number of delivery attempts.
For more information about dead-letter topics, see Forwarding to dead-letter topics.
Seeking with retry policies
If you seek messages in a subscription with a retry policy, Pub/Sub resets the delay between the following:
- The acknowledgement deadline expiring or the subscriber sending a negative acknowledgment.
- Pub/Sub resending the message.
For more information about retry policies, see Using retry policies.
Seeking with exactly once delivery
If you seek messages in a subscription with exactly once delivery, Pub/Sub resends the previously acknowledged messages that are eligible for delivery. Any acknowledgement for a delivery made before the seek operation fails. Seek operations are eventually consistent.
For more information about exactly once delivery, see Exactly once delivery.
Use cases
- Update subscriber code safely. A concern with deploying new subscriber code is that the new executable may erroneously acknowledge messages, leading to message loss. Incorporating snapshots into your deployment process gives you a way to recover from bugs in new subscriber code.
- Recover from unexpected subscriber problems. In cases where subscriber problems are not associated with a specific deployment event, you might not have a relevant snapshot. In this case, if you have enabled acknowledged message retention for a subscription, seeking to a time in the past gives you a way to recover from the error.
- Save processing time and cost. Perform a bulk acknowledgement on a large backlog of messages that are no longer relevant.
- Test subscriber code on known data. When testing subscriber code for performance and consistency, it is useful to use the same data in every run. Snapshots enable consistent data with strong semantics. In addition, snapshots can be applied to any subscription on a given topic, including a newly created one.
What's next
You can use Pub/Sub with Dataflow. However, we do not recommend direct access to Pub/Sub Seek from within a running Dataflow pipeline. For the recommended workflow, see Using Pub/Sub with Dataflow.