Configure an AlloyDB for PostgreSQL database

The following sections cover how to configure an AlloyDB for PostgreSQL database.

Set up connectivity between Datastream and AlloyDB

To enable Datastream to connect to the AlloyDB instance, you need to set up a TCP proxy in the consumer project. The TCP proxy image is publicly available to all customers.

  1. Launch the proxy container:

    1. Activate Cloud Shell in the consumer project.
    2. Run the following script to create the TCP proxy:
    gcloud compute instances create-with-container \
      --zone=REGION_ID VM_NAME \
      --container-image gcr.io/dms-images/tcp-proxy \
      --tags=dms-tcp-proxy \
      --container-env=SOURCE_CONFIG=ALLOYDB_IP:ALLOYDB_PORT \
      --can-ip-forward \
      --network=SOURCE_AND_DEST_VPC \
      --machine-type=VM_TIER
    

    Replace the following:

    • REGION_ID: The region in which you want to create the TCP proxy.
    • VM_NAME: The name of the virtual machine.
    • ALLOYDB_IP: The IP address of the AlloyDB for PostgreSQL instance.
    • ALLOYDB_PORT: The port number of the AlloyDB for PostgreSQL instance.
    • SOURCE_AND_DEST_VPC: The VPC network to which the source and destination are connected.
    • VM_TIER: The type of your virtual machine.

    An example command with updated parameters:

    gcloud compute instances create-with-container \
      --zone=us-central1-c ds-tcp-proxy \
      --container-image gcr.io/dms-images/tcp-proxy \
      --tags=ds-tcp-proxy \
      --container-env=SOURCE_CONFIG=10.16.0.5:5432 \
      --can-ip-forward \
      --network=default \
      --machine-type=e2-micro
    
  2. Run the script to print the IP address of the TCP Proxy with the following inputs:

    gcloud compute instances describe VM_NAME \
      --format="yaml(networkInterfaces[].networkIP)" \
      --zone=REGION_ID
    

    Replace the following:

    • VM_NAME: The name of your virtual machine.
    • REGION_ID: The region in which you created the TCP proxy.
  3. (Optional) Create a firewall rule to limit ingress traffic to the TCP proxy:

    gcloud compute firewall-rules create FIREWALL_RULE_NAME \
      --direction=INGRESS \
      --priority=1000 \
      --target-tags=dms-tcp-proxy \
      --network=SOURCE_VPC \
      --action=ALLOW \
      --rules=tcp:ALLOYDB_PORT \
      --source-ranges=IP_RANGE
    

    Replace the following:

    • FIREWALL_RULE_NAME: The name of your firewall rule.
    • SOURCE_VPC: The VPC network to which your source is connected.
    • ALLOYDB_PORT: The port number of the AlloyDB for PostgreSQL instance.
    • IP_RANGE: The IP address range to which you want to limit ingress traffic.

    An example command to allow ingress traffic from all IP addresses:

    gcloud compute firewall-rules create ds-proxy1 \
      --direction=INGRESS \
      --priority=1000 \
      --target-tags=ds-tcp-proxy \
      --network=default \
      --action=ALLOW \
      --rules=tcp:5432
    

Configure AlloyDB for replication

  1. Enable logical decoding for the AlloyDB primary instance. In Google Cloud, set the value for the alloydb.logical_decoding flag to ON. For information about updating the database flags, see Configure an instance's database flags.

  2. Connect to the AlloyDB instance using the TCP proxy IP address by running the following command from any VM on the same VPC:

      psql -h PROXY_IP \
        -U DB_USER \
        -d DB_NAME
    

    Replace the following:

    • PROXY_IP: The IP address of the TCP proxy.
    • DB_USER: The username for the AlloyDB database.
    • DB_NAME: The AlloyDB database name.
  3. Run the following command to grant replication privileges to your database user:

    ALTER USER DB_USER WITH REPLICATION;
    
  4. Create a publication. We recommend that you create a publication only for the tables that you want to replicate. This allows Datastream to read-only the relevant data, and lowers the load on the database and Datastream:

    CREATE PUBLICATION PUBLICATION_NAME
    FOR TABLE SCHEMA1.TABLE1, SCHEMA2.TABLE2;

    Replace the following:

    • PUBLICATION_NAME: The name of your publication. You'll need to provide this name when you create a stream in the Datastream stream creation wizard.
    • SCHEMA: The name of the schema that contains the table.
    • TABLE: The name of the table that you want to replicate.

    You can create a publication for all tables in a schema. This approach lets you replicate changes for tables in the specified list of schemas, including tables that you create in the future:

    CREATE PUBLICATION PUBLICATION_NAME
    FOR TABLES IN SCHEMA1, SCHEMA2;

    You can also create a publication for all tables in your database. Note that this approach increases the load on both the source database and Datastream:

    CREATE PUBLICATION PUBLICATION_NAME FOR ALL TABLES;
    
  5. Create a replication slot by executing the following command.

    SELECT PG_CREATE_LOGICAL_REPLICATION_SLOT('REPLICATION_SLOT_NAME', 'pgoutput');
    

    Replace the following:

    • REPLICATION_SLOT_NAME: The name of your replication slot. You'll need to provide this name when you create a stream in the Datastream stream creation wizard.

Create a Datastream user

  1. To create a Datastream user, enter the following PostgreSQL command:

    CREATE USER USER_NAME WITH REPLICATION LOGIN PASSWORD 'USER_PASSWORD';
    ALTER USER USER_NAME createdb;
    

    Replace the following:

    • USER_NAME: The name of the Datastream user that you want to create.
    • USER_PASSWORD: The login password for the Datastream user that you want to create.
  2. Grant the following privileges to the user you created:

    GRANT SELECT ON ALL TABLES IN SCHEMA SCHEMA_NAME TO USER_NAME;
    GRANT USAGE ON SCHEMA SCHEMA_NAME TO USER_NAME;
    ALTER DEFAULT PRIVILEGES IN SCHEMA SCHEMA_NAME GRANT SELECT ON TABLES TO USER_NAME;
    

    Replace the following:

    • SCHEMA_NAME: The name of the schema to which you want to grant the privileges.
    • USER_NAME: The user to whom you want to grant the privileges.