Network connectivity options

Overview

To use Datastream to create a stream from the source database to the destination, you must establish connectivity to the source database.

Datastream supports the IP allowlist, forward SSH tunnel, and VPC peering network connectivity methods.

Use the following comparison to decide which method works best for your workload.

Networking method: IP allowlist

Description: Works by configuring the source database server to allow incoming connections from Datastream's external IP addresses. To find out the IP addresses for your regions, see IP allowlists and regions.

Advantages:

  • Easy to configure

Disadvantages:

  • The source database is exposed to a public IP address.
  • The connection isn't encrypted by default. SSL must be enabled on the source database to encrypt the connection.
  • Configuring the firewall may require assistance from the IT department.

Networking method: Forward SSH tunnel

Description: Establish an encrypted connection over public networks between Datastream and the source, through a forward-SSH tunnel. Learn more about SSH tunnels.

Advantages:

  • Secure

Disadvantages:

  • Limited bandwidth
  • You must set up and maintain the bastion host.

Networking method: VPC peering

Description: Works by creating a private connectivity configuration. Datastream uses this configuration to communicate with the data source over a private network. This communication happens through a Virtual Private Cloud (VPC) peering connection.

Advantages:

  • Secure, private channel
  • Easy to configure

Disadvantages:

  • Requires a private network connection (VPN, Interconnect, etc.) between the database and Google Cloud.

Configure connectivity using IP allowlists

For Datastream to transfer data from a source database to a destination, Datastream first needs to connect to this database.

One way to configure this connectivity is through IP allowlists. Public IP connectivity is most appropriate when the source database is external to Google Cloud and has an externally accessible IPv4 address and TCP port.

If your source database is external to Google Cloud, then add Datastream's public IP addresses as an inbound firewall rule on the source network. In generic terms (your specific network settings may differ), do the following:

  1. Open your source database machine's network firewall rules.

  2. Create an inbound rule.

  3. Set the source IP addresses for the rule to Datastream's IP addresses.

  4. Set the protocol to TCP.

  5. Set the port associated with the TCP protocol. The default values are:

    • 1521 for an Oracle database
    • 3306 for a MySQL database
    • 5432 for a PostgreSQL database
    • 1433 for a SQL Server database
  6. Save the firewall rule, and then exit.
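
On a Linux host that filters traffic with iptables, steps 2 through 5 might look like the following sketch. The addresses 203.0.113.10 and 203.0.113.11 are documentation placeholders, not real Datastream addresses; substitute the addresses listed for your region. The helper name print_allow_rules is hypothetical, and the script only prints the commands so that you can review them before running them as root.

```shell
#!/bin/sh
# Sketch: print iptables commands that allow inbound TCP traffic from a list
# of source addresses to one database port. Nothing is applied to the firewall.

# Placeholder addresses (RFC 5737 documentation range), NOT real Datastream IPs.
DATASTREAM_IPS="203.0.113.10 203.0.113.11"

# Default MySQL port; use 1521, 5432, or 1433 for the other database types.
DB_PORT=3306

# print_allow_rules PORT IP...
# Hypothetical helper: emits one ACCEPT rule per source address.
print_allow_rules() {
    port="$1"
    shift
    for ip in "$@"; do
        echo "iptables -A INPUT -p tcp -s ${ip} --dport ${port} -j ACCEPT"
    done
}

# Word splitting of DATASTREAM_IPS is intentional here.
print_allow_rules "$DB_PORT" $DATASTREAM_IPS
```

Other firewalls (cloud security groups, ufw, hardware appliances) express the same rule differently, but the fields are the same: source addresses, protocol TCP, and the database port.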

Use an SSH tunnel

The following steps describe how to set up connectivity to a source database using a forward SSH tunnel.

Step 1: Choose a host on which to terminate the tunnel

The first step to set up SSH tunnel access for your database is to choose the host that will be used to terminate the tunnel. The tunnel can be terminated on either the database host itself, or on a separate host (the tunnel server).

Use the database server

Terminating the tunnel on the database server has the advantage of simplicity. There's one fewer host involved, so there are no additional machines or associated costs. The disadvantage is that your database server might be on a protected network that doesn't have direct access from the internet.

Use a tunnel server

Terminating the tunnel on a separate server has the advantage of keeping your database server inaccessible from the internet. If the tunnel server is compromised, then it's one step removed from the database server. We recommend that you remove all non-essential software and users from the tunnel server and closely monitor it with tools such as an intrusion detection system (IDS).

The tunnel server can be any Unix or Linux host that:

  1. Can be accessed from the internet using SSH.
  2. Can access the database.
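
One way to check both conditions before involving Datastream is to open a forward tunnel by hand from a machine outside the network. The sketch below only builds and prints the ssh command. The host names, account name, addresses, ports, and key path are all placeholders, and tunnel_command is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: build the ssh command for a manual forward-tunnel test.
# Every value below is a placeholder for illustration.

TUNNEL_HOST="bastion.example.com"   # tunnel server reachable from the internet
TUNNEL_USER="datastream"            # account on the tunnel server
DB_HOST="10.0.0.5"                  # database address as seen from the tunnel server
DB_PORT=3306                        # database port
LOCAL_PORT=13306                    # local port that forwards to the database
KEY_FILE="$HOME/.ssh/tunnel_key"    # private key for the tunnel account

# -N: do not run a remote command.
# -L: forward LOCAL_PORT on this machine to DB_HOST:DB_PORT through the tunnel.
tunnel_command() {
    echo "ssh -N -i ${KEY_FILE} -p 22 -L ${LOCAL_PORT}:${DB_HOST}:${DB_PORT} ${TUNNEL_USER}@${TUNNEL_HOST}"
}

tunnel_command
```

If the printed command connects, and a database client on the same machine can then reach localhost on the local port, the tunnel server satisfies both requirements.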

Step 2: Create an IP allowlist

The second step to set up SSH tunnel access for your database is to allow network traffic to reach the tunnel server or the database host using SSH, which is generally on TCP port 22.

Allow network traffic from each of the IP addresses for the region where Datastream resources are created.

Step 3: Use the SSH tunnel

Provide the tunnel details in the connection profile configuration. For more information, see Create a connection profile.

To authenticate the SSH tunnel session, Datastream requires either the password for the tunnel account, or a unique private key. To use a unique private key, you can use OpenSSH or OpenSSL command-line tools to generate keys.

Datastream stores the private key securely as part of the Datastream connection profile configuration. You must add the public key manually to the bastion host's ~/.ssh/authorized_keys file.
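
Adding the public key on the bastion host can be sketched as follows. The helper install_public_key is hypothetical, the key material is a placeholder, and the demonstration writes to a scratch directory rather than a real ~/.ssh directory. The permissions shown (700 on the directory, 600 on the file) are the ones that sshd commonly requires.

```shell
#!/bin/sh
# Sketch: append a public key to an authorized_keys file with the
# permissions that sshd normally expects.

# install_public_key PUBLIC_KEY_FILE SSH_DIR
# Hypothetical helper; SSH_DIR is the .ssh directory of the tunnel account.
install_public_key() {
    pub_file="$1"
    ssh_dir="$2"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    # Skip the append if the exact key line is already present.
    if ! grep -qxF -e "$(cat "$pub_file")" "$ssh_dir/authorized_keys" 2>/dev/null; then
        cat "$pub_file" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demonstration in a scratch directory so that the real ~/.ssh is untouched.
demo_dir="$(mktemp -d)"
echo "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPLACEHOLDER datastream-tunnel" > "$demo_dir/key.pub"
install_public_key "$demo_dir/key.pub" "$demo_dir/.ssh"
install_public_key "$demo_dir/key.pub" "$demo_dir/.ssh"   # second call adds nothing
cat "$demo_dir/.ssh/authorized_keys"
```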

Generate private and public keys

You can generate SSH keys using the following method:

  • ssh-keygen: An OpenSSH command-line tool to generate SSH key pairs.

    Useful flags:

    • -t: Specifies the type of key to create, for example:

      ssh-keygen -t rsa

      ssh-keygen -t ed25519

    • -b: Specifies the number of bits in the key to create, for example:

      ssh-keygen -t rsa -b 2048

    • -y: Reads a private OpenSSH format file and prints an OpenSSH public key to standard output.

    • -f: Specifies the filename of the key file, for example:

      ssh-keygen -y [-f KEY_FILENAME]

    For more information about supported flags, see OpenBSD documentation.
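
Putting the flags together, a complete key-generation run might look like the following sketch. The filename, the comment string, and the empty passphrase are assumptions made for illustration; the key is written to a scratch directory.

```shell
#!/bin/sh
# Sketch: create an Ed25519 key pair in a scratch directory, then
# re-derive the public key from the private key.

key_dir="$(mktemp -d)"
key_file="$key_dir/datastream_tunnel_key"   # placeholder filename

# -N "" sets an empty passphrase (an assumption of this sketch),
# -C adds a comment, and -q suppresses progress output.
ssh-keygen -t ed25519 -f "$key_file" -N "" -C "datastream-tunnel" -q

# ssh-keygen writes the private key to $key_file and the
# public key to $key_file.pub.
ls "$key_dir"

# -y prints the public key that matches the private key named by -f.
ssh-keygen -y -f "$key_file"
```

The line printed by the last command is what goes into the authorized_keys file on the bastion host.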

You can generate a private PEM key using the following method:

  • openssl genpkey: An OpenSSL command-line tool to generate a PEM private key.

    Useful flags:

    • -algorithm: Specifies the public key algorithm to use, for example:

      openssl genpkey -algorithm RSA

    • -out: Specifies the filename to which to output the key, for example:

      openssl genpkey -algorithm RSA -out PRIVATE_KEY_FILENAME.pem

    For more information about supported flags, see OpenSSL documentation.
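
A PEM private key still needs a matching public key in OpenSSH format for the authorized_keys file. The following sketch generates a 2048-bit RSA key and derives that public key with ssh-keygen. The key length and the scratch directory are assumptions made for illustration, and the sketch assumes an OpenSSH build that can read PEM private keys.

```shell
#!/bin/sh
# Sketch: generate a 2048-bit RSA private key in PEM format, then derive
# the matching public key in OpenSSH format.

pem_dir="$(mktemp -d)"
pem_file="$pem_dir/PRIVATE_KEY_FILENAME.pem"

# -pkeyopt rsa_keygen_bits sets the RSA key length explicitly.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out "$pem_file"

# ssh-keygen refuses private keys that other users can read.
chmod 600 "$pem_file"

# Print the public key as a single authorized_keys-style line.
ssh-keygen -y -f "$pem_file"
```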

Use private connectivity

Private connectivity is a connection between your VPC network and Datastream's private network, enabling Datastream to communicate with internal resources by using internal IP addresses. Using private connectivity establishes a dedicated connection on the Datastream network, meaning no other customers can share it.

If your source database is external to Google Cloud, then private connectivity enables Datastream to communicate with your database over VPN or Interconnect.

After a private connectivity configuration is created, a single configuration can service all streams in a project within a single region.

At a high level, establishing private connectivity requires:

  • An existing Virtual Private Cloud (VPC)
  • An available IP range with a CIDR block of /29
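
For a sense of scale, a /29 block is small: it fixes the first 29 bits of the address and leaves 3 bits free. The arithmetic can be sketched as follows; cidr_size is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: number of IPv4 addresses in a CIDR block with the given prefix length.

# cidr_size PREFIX_LENGTH  (hypothetical helper)
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

cidr_size 29   # prints 8
```

For example, the block 10.0.0.0/29 spans the eight addresses 10.0.0.0 through 10.0.0.7.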

If your project is using a shared VPC, then you'll also need to enable the Datastream and Google Compute Engine APIs, as well as grant permissions to Datastream's service account on the host project.

Learn more about how to create a private connectivity configuration.

What's next