
Manage Capacity with Pub/Sub Lite Reservations. It’s easier and cheaper.

September 27, 2021
Kir Titievsky

Product Manager, Google Cloud Pub/Sub

If you need inexpensive managed messaging for streaming analytics, Pub/Sub Lite was made for you. Lite can be as much as 10 times cheaper than Pub/Sub. Until now, though, the low price came with a lot more work: you had to manage the read and write throughput capacity of each partition of every topic. Have 10 single-partition topics? Then you had to watch 10 write and another 10 read capacity utilization metrics, or you might run out of capacity.

Hello, Reservations

We did not like this either. So we launched Pub/Sub Lite Reservations to manage throughput capacity for many topics with a single number.  A reservation is a regional pool of throughput capacity.  The capacity can be used interchangeably for read or write operations by any topic within the same project and region as the reservation. You can think of this as provisioning a cluster of machines and letting it handle the traffic.  Except instead of a cluster, there is just a single number. 

Less work is great, of course. It is even better when it saves you money. Without reservations, you must provision each topic partition for the highest peaks in throughput. Depending on how variable your traffic is, this can mean that half or more of the provisioned capacity sits unused most of the time. Unused, but not free.

Reservations allow you to handle the same traffic spikes with less spare capacity.  Usually, the peaks in throughput are not perfectly correlated among topics. If so, the peaks in the aggregate throughput of a reservation are smaller, relative to the time average, than the peaks in individual topics.  This makes for less variable throughput and reduces the need for spare capacity. 
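The effect can be checked with a few lines of arithmetic. The sketch below is only an illustration: the topic names and throughput numbers are made up, and it assumes each topic has a single partition. It compares provisioning every topic for its own peak against provisioning once for the peak of the combined traffic.

```python
# Hypothetical throughput samples (MiB/s) for three topics whose peaks
# fall at different times. All numbers are invented for illustration.
topics = {
    "topic-a": [2, 2, 9, 3, 2, 2],
    "topic-b": [1, 8, 2, 2, 1, 1],
    "topic-c": [3, 3, 3, 3, 10, 3],
}

# Without a reservation: each topic is provisioned for its own peak.
per_topic_capacity = sum(max(series) for series in topics.values())

# With a reservation: provision once for the peak of the summed traffic.
aggregate = [sum(values) for values in zip(*topics.values())]
reservation_capacity = max(aggregate)

print("sum of per-topic peaks:", per_topic_capacity)     # 27
print("peak of aggregate traffic:", reservation_capacity)  # 14
```

Because the three peaks do not coincide, the pooled peak (14 MiB/s) is well below the sum of the individual peaks (27 MiB/s). If the peaks were perfectly correlated, the two numbers would be equal and pooling would save nothing.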

As a cost-saving bonus, reservations do away with the explicit minimum capacity per partition. There is still a limit on the number of partitions per reservation, and that limit means you pay for at least 1 MiB/s per topic partition. This is not quite the “scale to zero” of Pub/Sub, but it beats the 4 MiB/s required for any partition without reservations.
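As a rough worked example, take the 10 single-partition topics mentioned earlier and apply the two floors quoted above. The helper name and the choice of 10 partitions are assumptions for illustration; actual billing depends on current Pub/Sub Lite pricing and limits.

```python
# Per-partition capacity floors quoted in the article, in MiB/s.
FLOOR_WITHOUT_RESERVATION = 4
FLOOR_WITH_RESERVATION = 1


def minimum_capacity(partitions: int, floor_mib_s: int) -> int:
    """Smallest capacity (MiB/s) you pay for, given a per-partition floor."""
    return partitions * floor_mib_s


partitions = 10  # e.g. 10 single-partition topics (hypothetical)
without_reservation = minimum_capacity(partitions, FLOOR_WITHOUT_RESERVATION)
with_reservation = minimum_capacity(partitions, FLOOR_WITH_RESERVATION)

print(without_reservation, "MiB/s minimum without reservations")  # 40
print(with_reservation, "MiB/s minimum with a reservation")       # 10
```

For mostly idle topics, the floor is what you pay for, so the lower floor translates directly into a smaller bill.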

An Illustration

Suppose you have three topics with traffic patterns that combine a diurnal rhythm with random surges.  The minimum capacity needed to accommodate this traffic is illustrated below.

https://storage.googleapis.com/gweb-cloudblog-publish/images/pub_sub_l18OqkJ.max-1000x1000.jpg

“Before,” you provision for the spikes in each topic independently. “After,” you aggregate the traffic, dampening most peaks. In practice, you will provision more than shown here to anticipate the peaks you haven’t seen. You will also likely have more topics. Both considerations increase the difference in favor of reservations.

Are Reservations Always Best?

For all its benefits, shared capacity has the “noisy neighbor” problem: a traffic peak on some topics can leave others without capacity. This is a concern if your application critically depends on consistent latency and high availability. In this case, you can isolate noisy topics in a separate reservation. In addition, you can limit the noise by setting throughput caps on individual topics.
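The sketch below is a toy model of why a per-topic cap helps. It is not how Pub/Sub Lite actually divides a reservation among topics; it simply assumes that when total demand exceeds the pool, every topic is scaled back proportionally. The function name, topic names, and numbers are all hypothetical.

```python
def admitted_throughput(demand, capacity, caps=None):
    """Toy model: share `capacity` among topics in proportion to demand.

    `demand` maps topic name to requested MiB/s. `caps` optionally maps
    topic name to a per-topic ceiling. Proportional sharing is an
    assumption made for illustration, not the service's real behavior.
    """
    caps = caps or {}
    # Apply per-topic caps first; topics without a cap keep their demand.
    limited = {t: min(d, caps.get(t, d)) for t, d in demand.items()}
    total = sum(limited.values())
    if total <= capacity:
        return limited
    # Over capacity: scale every topic back by the same factor.
    scale = capacity / total
    return {t: d * scale for t, d in limited.items()}


demand = {"noisy": 50, "quiet-1": 5, "quiet-2": 5}  # MiB/s, made up

uncapped = admitted_throughput(demand, capacity=20)
capped = admitted_throughput(demand, capacity=20, caps={"noisy": 10})

print("uncapped:", uncapped)
print("capped:  ", capped)
```

In this model the uncapped surge squeezes the quiet topics to a third of what they asked for, while a 10 MiB/s cap on the noisy topic lets both quiet topics get their full 5 MiB/s.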

All in all, if you need to stream tens of megabytes per second at a low cost, Lite is now an even better option. 

Reservations are generally available to all Pub/Sub Lite users.  You can use reservations with existing topics. Start saving money and time by creating a Pub/Sub Lite reservation in the Cloud Console and let us know how it goes at pubsub-lite-helpline@google.com.
