Replaying and purging messages

The Pub/Sub subscriber data APIs, such as pull, provide limited access to message data. Normally, acknowledged messages are inaccessible to subscribers of a given subscription. In addition, subscriber clients must process every message in a subscription even if only a subset is needed.

The Seek feature extends subscriber functionality by allowing you to alter the acknowledgement state of messages in bulk. For example, you can replay previously acknowledged messages or purge messages in bulk. In addition, you can copy the state of one subscription to another by using seek in combination with a Snapshot.

These features are described below. For a working example, see the quickstart.

Configuring message retention

To seek to a time in the past and replay previously acknowledged messages, you must first either configure message retention on the topic or configure the subscription to retain acknowledged messages.

Topic message retention

By default, a Pub/Sub topic discards messages as soon as they are acknowledged by all subscriptions attached to the topic. Configuring a topic with message retention gives you more flexibility: any subscription attached to the topic can seek back in time and replay previously acked messages. Topic message retention also allows a subscription to replay messages that were published before the subscription was created.

If topic message retention is enabled, storage costs for the messages retained by the topic are billed to the topic's project.

Console

To create a topic with message retention enabled, follow these steps:

  1. In the Cloud Console, go to the Pub/Sub topics page.

  2. Click Create topic.

  3. In the Topic ID field, enter an ID for your topic.

  4. Check the box for Set message retention duration. Leave the other options in their default settings.

  5. Use the drop-down menus below Message retention duration to select the number of days, hours, and minutes to retain messages.

  6. Click Create topic to save the topic.

To update a topic's message retention settings:

  1. Select your topic from the Pub/Sub topics page.

  2. Click Edit at the top of the topic details page.

  3. Adjust the retention time or enable/disable message retention by checking or unchecking the box labeled Enable message retention.

  4. Click Update to save changes to the topic.

gcloud

To create a topic with a message retention duration of 7 days, use the following gcloud pubsub topics create command:

gcloud pubsub topics create TOPIC_ID --message-retention-duration=7d

To update this setting, use the gcloud pubsub topics update command. You can also use it to enable message retention on an existing topic:

gcloud pubsub topics update TOPIC_ID --message-retention-duration=1d

You can also disable message retention for a topic with the update command:

gcloud pubsub topics update TOPIC_ID --clear-message-retention-duration

Subscription message retention

By default, subscriptions discard messages as soon as they are acknowledged. Unacknowledged messages are retained for 7 days by default (configurable through the subscription's message_retention_duration property). Configuring a subscription to retain acknowledged messages (through the retain_acked_messages property) lets you replay previously acked messages sent to the subscription. Messages in a subscription can be retained for a maximum of 7 days, whether they are acknowledged or unacknowledged. In other words, the age of the oldest message in a subscription never exceeds 7 days.

If a subscription is configured to retain acknowledged messages, storage costs for the acknowledged messages retained by the subscription will be billed to the subscription's project.
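The retention rules above can be summarized as a small predicate. The following is only an illustrative sketch in plain Python, not a call into any Pub/Sub client library: the function name is_retained and its parameters are hypothetical, and the treatment of a message whose age exactly equals the retention duration is an assumption.

```python
from datetime import timedelta

MAX_RETENTION = timedelta(days=7)  # maximum stated in this document


def is_retained(age: timedelta, acked: bool, *,
                retain_acked_messages: bool = False,
                message_retention_duration: timedelta = MAX_RETENTION) -> bool:
    """Return True if a message of the given age is still held by the subscription.

    Hypothetical model of the rules described above, not a Pub/Sub API.
    """
    if message_retention_duration > MAX_RETENTION:
        raise ValueError("retention duration cannot exceed 7 days")
    if age > message_retention_duration:
        # Too old: dropped whether or not it was acknowledged.
        return False
    if acked:
        # Acknowledged messages are kept only when retention of acked
        # messages is enabled on the subscription.
        return retain_acked_messages
    return True
```

With the defaults, a one-day-old unacknowledged message is retained and a one-day-old acknowledged message is not. Passing retain_acked_messages=True keeps the acknowledged message until it exceeds the retention duration.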

Console

To create a subscription with retention of acked messages enabled, follow these steps:

  1. In the Cloud Console, go to the Pub/Sub subscriptions page.

  2. Click Create subscription.

  3. In the Subscription ID field, enter an ID for your subscription.

  4. Use the drop-down menus below Message retention duration to select the number of days, hours, and minutes to retain messages.

  5. Check the box for Retain acknowledged messages. Leave the other options in their default settings.

  6. Click Create subscription to save the subscription.

To update a subscription's message retention settings:

  1. Select your subscription from the Pub/Sub subscriptions page.

  2. Click Edit at the top of the subscription details page.

  3. Adjust the message retention duration, or enable/disable retention of acked messages by checking or unchecking the box labeled Retain acknowledged messages.

  4. Click Update to save changes to the subscription.

gcloud

To create a subscription with retention of acked messages enabled, use the following gcloud pubsub subscriptions create command:

gcloud pubsub subscriptions create SUBSCRIPTION_ID \
    --topic=TOPIC_ID \
    --retain-acked-messages \
    --message-retention-duration=5d

To update these settings, use the gcloud pubsub subscriptions update command. You can also use it to enable retention of acked messages for an existing subscription. For example, to set the message retention duration to 1 day:

gcloud pubsub subscriptions update SUBSCRIPTION_ID --message-retention-duration=1d

You can also disable retention of acked messages for a subscription with the update command:

gcloud pubsub subscriptions update SUBSCRIPTION_ID --no-retain-acked-messages

Seeking to a timestamp

Seeking to a time marks every message that Pub/Sub received before that time as acknowledged, and every message received after it as unacknowledged. To purge messages, seek to a time in the future. To replay and reprocess previously acknowledged messages, seek to a time in the past. The comparison uses the message publication time, which is generated by the Pub/Sub servers (see publishTime in the API reference). This approach is imprecise due to:

  • Possible clock skew among Pub/Sub servers.

  • The fact that Pub/Sub has to work with the arrival time of the publish request rather than when an event occurred in the source system.
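To make the effect of a seek on acknowledgement state concrete, here is a minimal in-memory model in plain Python. It does not use a Pub/Sub client. The Message class, seek_to_time, and at_hour are hypothetical names introduced for illustration, and treating a message published exactly at the seek time as unacknowledged is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Message:
    message_id: str
    publish_time: datetime  # assigned by the Pub/Sub servers, not the publisher
    acked: bool = False


def seek_to_time(messages: List[Message], target: datetime) -> None:
    """Mark messages published before `target` as acked, all others as unacked."""
    for m in messages:
        m.acked = m.publish_time < target


def at_hour(hour: int) -> datetime:
    return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)


backlog = [
    Message("a", at_hour(9), acked=True),
    Message("b", at_hour(10), acked=True),
    Message("c", at_hour(11)),
]

# Replay: seek to a time in the past; "b" and "c" become unacknowledged.
seek_to_time(backlog, at_hour(10))
replayed = [m.message_id for m in backlog if not m.acked]

# Purge: seek to a time later than every message; nothing is left unacknowledged.
seek_to_time(backlog, at_hour(23))
remaining = [m.message_id for m in backlog if not m.acked]
```

In the real service, the replay case only works for messages that are still retained by the topic or the subscription.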

Seeking to a snapshot

The snapshot feature allows you to capture the message acknowledgment state of a subscription. Once a snapshot is created, it retains:

  • All messages that were unacknowledged in the source subscription at the time of the snapshot's creation.

  • Any messages published to the topic thereafter.

You can replay these unacknowledged messages by seeking any of the topic's subscriptions to the snapshot.

Unlike with seeking to a time, you don't need to perform any special subscription configuration to seek to a snapshot. You just need to create the snapshot ahead of time. For example, you might create a snapshot when deploying new subscriber code, in case you need to recover from unexpected or erroneous acknowledgements.

Snapshots expire and are deleted in the following cases (whichever comes first):

  • The snapshot reaches a lifespan of seven days.
  • The oldest unacknowledged message in the snapshot exceeds the message retention duration.

For example, consider a snapshot of a subscription with a backlog where the oldest unacknowledged message is a day old. The snapshot expires after six days, rather than seven. This timeline is necessary for snapshots to offer strong at-least-once delivery guarantees.
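The arithmetic behind this example can be sketched as follows. This is a plain Python illustration of the two expiration rules, assuming a message retention duration of seven days. The function snapshot_expiry and the constants are hypothetical names, not part of any Pub/Sub API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SNAPSHOT_MAX_LIFESPAN = timedelta(days=7)
MESSAGE_RETENTION = timedelta(days=7)  # assumed retention duration for this example


def snapshot_expiry(created_at: datetime,
                    oldest_unacked_publish_time: Optional[datetime] = None) -> datetime:
    """Return the earlier of the two expiration conditions described above."""
    by_lifespan = created_at + SNAPSHOT_MAX_LIFESPAN
    if oldest_unacked_publish_time is None:
        # No backlog at creation time: only the seven-day lifespan applies.
        return by_lifespan
    by_retention = oldest_unacked_publish_time + MESSAGE_RETENTION
    return min(by_lifespan, by_retention)


created = datetime(2024, 1, 8, tzinfo=timezone.utc)
oldest_unacked = created - timedelta(days=1)  # oldest backlog message is a day old

lifetime = snapshot_expiry(created, oldest_unacked) - created
print(lifetime.days)  # 6
```

The oldest message reaches the retention limit six days after the snapshot is created, which is earlier than the seven-day lifespan, so the retention rule determines the expiration.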

Eventual consistency

Seek operations are strictly consistent in regard to message delivery guarantees. This means that any message that is to become unacknowledged based on the seek condition is guaranteed to be eventually delivered at least once after the seek operation succeeds. However, delivered messages do not instantly become consistent with the seek operation. So a message that was published before the seek timestamp or that is acknowledged in a snapshot might be delivered after the seek operation. In a sense, message delivery operates as an eventually consistent system with respect to the seek operation: it might take as long as a minute for the operation to take full effect.

Seeking with filters

You can replay messages from subscriptions with filters. If you seek to a timestamp using a subscription with a filter, the Pub/Sub service only redelivers the messages that match the filter.

A snapshot of a subscription with a filter contains the following messages:

  • All messages that are newer than the snapshot, including messages that don't match the filter.
  • Unacknowledged messages that are older than the snapshot.

If you seek to a snapshot using a subscription with a filter, the Pub/Sub service only redelivers the messages in the snapshot that match the filter of the subscription making the seek request.
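As a rough illustration of this rule, the sketch below models a snapshot as a list of messages and a subscription filter as a Python predicate. Both are simplifications: real filters use the Pub/Sub filter syntax, and the names Message, redelivered_after_seek, and eu_only are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    message_id: str
    attributes: Dict[str, str] = field(default_factory=dict)


def redelivered_after_seek(snapshot_messages: List[Message],
                           subscription_filter: Callable[[Message], bool]) -> List[Message]:
    """Return the snapshot messages that match the seeking subscription's filter."""
    return [m for m in snapshot_messages if subscription_filter(m)]


# The snapshot may hold messages that do not match the filter of the
# subscription that performs the seek.
snapshot = [
    Message("1", {"region": "eu"}),
    Message("2", {"region": "us"}),
    Message("3"),
]


def eu_only(m: Message) -> bool:
    return m.attributes.get("region") == "eu"


ids = [m.message_id for m in redelivered_after_seek(snapshot, eu_only)]
print(ids)  # ['1']
```

Only message "1" is redelivered, even though the snapshot also contains messages "2" and "3".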

For more information about filters, see Filtering messages.

Seeking with dead-letter topics

Messages that you receive from a subscription with a dead-letter topic have a field that tallies the number of delivery attempts. If you seek messages in such a subscription, Pub/Sub resets the delivery attempts to 0.

For more information about dead-letter topics, see Forwarding to dead-letter topics.

Seeking with retry policies

If you seek messages in a subscription with a retry policy, Pub/Sub resets the delay between the following:

  1. The acknowledgement deadline expiring or the subscriber sending a negative acknowledgment.
  2. Pub/Sub resending the message.

For more information about retry policies, see Using retry policies.

Use cases

  • Update subscriber code safely. A concern with deploying new subscriber code is that the new executable may erroneously acknowledge messages, leading to message loss. Incorporating snapshots into your deployment process gives you a way to recover from bugs in new subscriber code.
  • Recover from unexpected subscriber problems. In cases where subscriber problems are not associated with a specific deployment event, you might not have a relevant snapshot. In this case, if you have enabled acknowledged message retention for a subscription, seeking to a time in the past gives you a way to recover from the error.
  • Save processing time and cost. Perform a bulk acknowledgement on a large backlog of messages that are no longer relevant.
  • Test subscriber code on known data. When testing subscriber code for performance and consistency, it is useful to use the same data in every run. Snapshots enable consistent data with strong semantics. In addition, snapshots can be applied to any subscription on a given topic, including a newly created one.

What's next

You can use Pub/Sub with Dataflow. However, we do not recommend direct access to Pub/Sub Seek from within a running Dataflow pipeline. For the recommended workflow, see Using Pub/Sub with Dataflow.