Read from a PostgreSQL database

This page describes how you read data from a PostgreSQL database in a Cloud Data Fusion instance.

Before you begin

  • Create a Cloud Data Fusion instance.
  • Enable your PostgreSQL database to accept connections from Cloud Data Fusion. To do this securely, we recommend that you use a private Cloud Data Fusion instance.

Open your instance in Cloud Data Fusion

  1. In the Google Cloud console, go to the Cloud Data Fusion Instances page.

    Go to Instances

    This page lets you manage your instances.

  2. Click View instance to open your instance in the Cloud Data Fusion UI.

Store your PostgreSQL password as a secure key

Enter your PostgreSQL password as a secure key to encrypt in your Cloud Data Fusion instance. For more information about keys, see Cloud KMS.

  1. In the Cloud Data Fusion UI, click System admin > Configuration.

  2. Click Make HTTP Calls.

    Configuration.

  3. In the dropdown menu, choose PUT.

  4. In the path field, enter namespaces/default/securekeys/pg_password.

  5. In the Body field, enter {"data":"POSTGRESQL_PASSWORD"}. Replace POSTGRESQL_PASSWORD with your PostgreSQL password.

  6. Click Send.

    Password.

The Response field notifies you of any errors.

Connect to Cloud SQL for PostgreSQL

  1. In the Cloud Data Fusion UI, click the menu and navigate to the Wrangler page.

  2. Click Add connection.

  3. Choose Database as the source type to connect.

    Add connection.

  4. Under Google Cloud SQL for PostgreSQL, click Upload.

    Choose source.

  5. Upload a JAR file that contains your PostgreSQL driver. Your JAR file must follow the format NAME-VERSION.jar. If your JAR file doesn't follow this format, rename it before you upload.

  6. Click Next.

  7. Enter the driver's name, class name, and version in the fields.

  8. Click Finish.

  9. In the Add connection window that opens, click Google Cloud SQL for PostgreSQL. Your JAR name should appear under Google Cloud SQL for PostgreSQL.

    JAR uploaded.

  10. Fill in the required connection fields. In the Password field, select the secure key you stored previously. This ensures that your password is retrieved using Cloud KMS.

    Choose password.

  11. In the Connection string field, enter your connection string as:

    jdbc:postgresql://google/DATABASE_NAME?cloudSqlInstance=INSTANCE_CONNECTION_NAME&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false
    

    Replace the following:

    • DATABASE_NAME: the Cloud SQL database name as listed in the Databases tab of the instance details page.
    • INSTANCE_CONNECTION_NAME: the Cloud SQL instance connection name as displayed in the Overview tab of the instance details page.
    Instance connection name.

    For example:

    jdbc:postgresql://google/postgres?cloudSqlInstance=dis-demo:us-central1:pgsql-1&socketFactory=com.google.cloud.sql.postgres.SocketFactory&useSSL=false
    

  12. Enable the Cloud SQL Admin API.

  13. Grant the following IAM roles to the service-PROJECT_NUMBER@gcp-sa-datafusion.iam.gserviceaccount.com service account:

    For more information about granting roles, see Manage access.

  14. Click Test connection to ensure that the connection can be established with the database.

  15. Click Add connection.

After your PostgreSQL database is connected, you can apply transformations to your data (in Wrangler), create a pipeline, and write your output to a sink (in Studio).