Introducing logical replication and decoding for Cloud SQL for PostgreSQL
Bala Narasimhan
Group Product Manager, Google Cloud
Last week, we announced the Preview of Datastream, our serverless and easy-to-use change data capture and replication service. Datastream currently streams low-latency data from Oracle and MySQL databases into Google Cloud services such as Cloud Spanner, Cloud SQL, and BigQuery.
We also understand that our customers use different tools and technologies, and we want to meet them where they are. Logical replication and decoding, for example, are an inherent and commonly used part of the PostgreSQL ecosystem. Therefore, today we are excited to announce the public preview of logical replication and decoding for Cloud SQL for PostgreSQL. By releasing these capabilities and enabling change data capture (CDC) from Cloud SQL for PostgreSQL, we strengthen our commitment to building an open database platform that meets critical application requirements and integrates seamlessly with the PostgreSQL ecosystem.
Let’s take, for example, a retailer’s ecommerce system in which each order is saved in a database. Placing the order in the database is just one part of the order processing. How does the inventory get updated? By leveraging CDC, downstream systems can be notified of such changes and act accordingly—in this case, update the inventory in the warehouse.
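To make the inventory example concrete, here is a minimal sketch of a downstream consumer. The event shape (`table`, `kind`, `row`) and the column names `sku` and `quantity` are assumptions made for illustration, not a format defined by Cloud SQL or PostgreSQL.

```python
from collections import defaultdict


def apply_order_event(inventory, event):
    """Decrement stock for a newly inserted order row; ignore everything else."""
    # Hypothetical event shape: {"table": ..., "kind": ..., "row": {...}}
    if event["table"] != "orders" or event["kind"] != "insert":
        return inventory
    row = event["row"]
    inventory[row["sku"]] -= row["quantity"]
    return inventory


# Starting stock levels (made-up sample data).
inventory = defaultdict(int, {"widget": 10, "gadget": 4})

# A stream of captured changes; only inserts into "orders" affect inventory.
events = [
    {"table": "orders", "kind": "insert", "row": {"sku": "widget", "quantity": 3}},
    {"table": "customers", "kind": "update", "row": {"id": 7}},
    {"table": "orders", "kind": "insert", "row": {"sku": "gadget", "quantity": 1}},
]

for event in events:
    apply_order_event(inventory, event)

print(dict(inventory))  # {'widget': 7, 'gadget': 3}
```

In a real pipeline the events would arrive from a replication stream rather than a list, and the handler would also need to deal with updated or cancelled orders.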
Another common use case is data analytics pipelines. Businesses want to perform analytics on the freshest data possible. For example, low stock on some products might need to kick off certain logistical processes, such as restocking or alerting. You can leverage logical decoding and replication to get the freshest data from the operational systems to your data pipelines and from there to your analytics platform with low latency.
What is logical replication and decoding?
Logical replication enables the mirroring of database changes between two Postgres instances in a storage-agnostic fashion. Logical replication provides flexibility both in terms of what data can be replicated between instances and what versions those instances can be running.
Logical decoding captures all changes to tables within a database and emits them in different formats, such as JSON or plain text. Once captured, the changes can be consumed through a streaming protocol or a SQL interface.
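As a rough illustration of consuming decoded changes, the sketch below flattens a JSON payload into one record per change. It assumes an output plugin that emits JSON shaped like wal2json's default format; the slot name `demo_slot`, the sample payload, and the helper `rows_from_payload` are hypothetical, and only inserts and updates are handled.

```python
import json

# On the database side, the SQL interface could be used roughly like this
# (the slot name is a placeholder):
#   SELECT pg_create_logical_replication_slot('demo_slot', 'wal2json');
#   SELECT data FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

# Made-up payload in the assumed shape: one transaction with two changes.
SAMPLE_PAYLOAD = """
{"change": [
  {"kind": "insert", "schema": "public", "table": "orders",
   "columnnames": ["id", "sku", "quantity"],
   "columnvalues": [1, "widget", 3]},
  {"kind": "update", "schema": "public", "table": "orders",
   "columnnames": ["id", "sku", "quantity"],
   "columnvalues": [1, "widget", 5]}
]}
"""


def rows_from_payload(payload):
    """Turn one decoded transaction into a list of flat change records."""
    records = []
    for change in json.loads(payload).get("change", []):
        # Pair each column name with its value for easier downstream use.
        values = dict(zip(change.get("columnnames", []),
                          change.get("columnvalues", [])))
        records.append({
            "kind": change["kind"],
            "table": f'{change["schema"]}.{change["table"]}',
            "values": values,
        })
    return records


for record in rows_from_payload(SAMPLE_PAYLOAD):
    print(record["kind"], record["table"], record["values"])
```

Records in this flattened form can then be handed to whatever downstream system needs them, such as a queue or an analytics loader.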
What problems can I solve with logical replication and decoding?
Here’s what you can solve easily with logical replication and decoding:
Selective replication of sets of tables between instances so that only relevant data sets need be shared
Selective replication of table rows between instances, mainly to reduce the size of the data
Selective replication of table columns from the source to remove non-essential or sensitive data
Gathering and merging data from multiple sources to form a data lake
Streaming fresh data from the operational database to the data warehouse for near-real-time analysis
Upgrading instances between major versions with near-zero downtime
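The first item in the list, replicating only a chosen set of tables, can be sketched with PostgreSQL's native `CREATE PUBLICATION` and `CREATE SUBSCRIPTION` statements. The helper functions, object names, and connection string below are hypothetical; the sketch only builds the SQL text, takes unqualified table names, and leaves out row and column filtering. Follow the product documentation for the exact setup steps.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_publication_sql(publication, tables):
    """Build a statement that publishes only the listed tables (run on the source)."""
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(publication)} FOR TABLE {table_list};"


def create_subscription_sql(subscription, conninfo, publication):
    """Build a statement that subscribes to a publication (run on the target)."""
    escaped = conninfo.replace("'", "''")  # escape quotes inside the SQL literal
    return (
        f"CREATE SUBSCRIPTION {quote_ident(subscription)} "
        f"CONNECTION '{escaped}' "
        f"PUBLICATION {quote_ident(publication)};"
    )


# Placeholder names and connection details, for illustration only.
print(create_publication_sql("orders_pub", ["orders", "order_items"]))
print(create_subscription_sql(
    "orders_sub",
    "host=SOURCE_HOST dbname=shop user=replicator",
    "orders_pub",
))
```

Only the tables named in the publication are replicated, so the target instance receives just the relevant data sets.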
How can I participate in the public preview?
To get started, check out the documentation for this feature and the release notes. To use this feature in public preview, spin up a new instance of Postgres (any version is fine) and follow the instructions in the documentation.