Using the Datastream UI

Quickstart

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Datastream API.

    Enable the API

  5. Make sure you have the Datastream Admin role assigned to your user account.

    Go to the IAM page

Requirements

Datastream offers a variety of source options, destination options, and networking connectivity methods.

In this quickstart, we assume that you're using a standalone Oracle database as the source and a Cloud Storage bucket as the destination. You must be able to configure your network to add an inbound firewall rule for the source database. The source database can be on-premises or hosted by a cloud provider. Because the destination is Cloud Storage, it resides in Google Cloud.

Because we can't know the specifics of your environment, we can't provide detailed steps when it comes to your networking configuration.

For this quickstart, you'll select IP allowlisting as the connectivity method. IP allowlisting is a security feature often used for limiting and controlling access to the data in your source database to trusted users. You can use IP allowlists to create lists of trusted IP addresses or IP ranges from which your users and other Cloud services such as Datastream can access this data. To use IP allowlists, you must open the source database or firewall to incoming connections from Datastream.
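To make the allowlisting idea concrete, the following is a minimal sketch in Python of the check that a firewall or database listener performs conceptually: an incoming address is accepted only if it falls inside one of the trusted ranges. The ranges shown are documentation-reserved example networks, not real Datastream addresses, and the function name is hypothetical. Use the public IP addresses that Datastream displays for your region when you configure your own firewall.

```python
import ipaddress

# Hypothetical allowlist. These are documentation-reserved example ranges,
# NOT real Datastream addresses.
ALLOWLIST = ["203.0.113.0/24", "198.51.100.17/32"]


def is_allowed(client_ip: str, allowlist: list[str]) -> bool:
    """Return True if client_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(entry) for entry in allowlist)


print(is_allowed("203.0.113.42", ALLOWLIST))  # address inside 203.0.113.0/24
print(is_allowed("192.0.2.10", ALLOWLIST))    # address outside every listed range
```

A real firewall rule is configured in your network or database tooling rather than in application code; the sketch only illustrates the membership test behind an allowlist.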

Create connection profiles

By creating connection profiles for a source database and a destination, you're creating records that contain information about the source and the destination.

In this quickstart, you'll select Oracle as the profile type for your source connection profile, and Cloud Storage as the profile type for your destination connection profile. Datastream uses the information in the connection profiles to transfer data from the source Oracle database into a destination bucket in Cloud Storage.

Create a source connection profile for Oracle database

  1. Go to the Connection profiles page for Datastream in the Google Cloud Console.

    Go to the Connection profiles page

  2. Click CREATE PROFILE.

  3. In the Create a connection profile page, click the Oracle profile type (because you want to create a source connection profile for Oracle database).

  4. Supply the following information in the Define connection settings section of the Create Oracle profile page:

    • Enter My Source Connection Profile as the Connection profile name for your source database.
    • Keep the auto-generated Connection profile ID.
    • Enter Connection details:
      • In the Hostname or IP field, enter a hostname or public IP address that Datastream can use to connect to the source Oracle database. You're providing a public IP address because IP allowlisting is the network connectivity method for this quickstart.
      • In the Port field, enter the port number that's reserved for the source database. For an Oracle database, the default port is typically 1521.
      • Enter a Username and Password to authenticate to your source database.
      • In the System identifier (SID) field, enter the SID or service name that identifies the database instance. For Oracle databases, this is typically ORCL.
  5. In the Define connection settings section, click CONTINUE. The Define connectivity method section of the Create Oracle profile page is active.

  6. Choose the networking method that you'd like to use to establish connectivity between Datastream and the source database. For this quickstart, use the Connectivity method drop-down list to select IP allowlisting as the networking method.

  7. Use the Region drop-down menu to specify the regions from which connections can be established.

  8. Configure your source database to allow incoming connections from the Datastream public IP addresses that appear.

  9. In the Define connectivity method section, click CONTINUE. The Test connection profile section of the Create Oracle profile page is active.

  10. From the Region drop-down menu, select the regions from which you want to test connectivity from Datastream to the source Oracle database.

  11. Click RUN TEST to verify that the source Oracle database and Datastream can communicate with each other.

  12. Verify that you see the "Test passed on region-name!" status for each region that you selected.

  13. If the test fails, you can address the problem in the appropriate part of the flow, and then return to re-test.

  14. Click CREATE.

Create a destination connection profile for Cloud Storage

  1. Go to the Connection profiles page for Datastream in the Google Cloud Console.

    Go to the Connection profiles page

  2. Click CREATE PROFILE.

  3. In the Create a connection profile page, click the Cloud Storage profile type (because you want to create a destination connection profile for Cloud Storage).

  4. Supply the following information in the Create Cloud Storage profile page:

    • Enter My Destination Connection Profile as the Connection profile name for your destination Cloud Storage service.
    • Keep the auto-generated Connection profile ID.
    • Click BROWSE.
    • In the Select bucket pane, select the destination bucket in Cloud Storage into which Datastream will transfer data from the source database, and then click SELECT.

      Your bucket appears in the Bucket name field of the Create Cloud Storage profile page.

    • Optionally, in the Connection profile path prefix field, you can provide a prefix for the path that will be appended to the bucket name when Datastream transfers data to the destination.

  5. Click CREATE.

After creating a source connection profile for Oracle database and a destination connection profile for Cloud Storage, you can use them to create a stream.
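The information captured in the two connection profiles can be pictured as structured records. The following is a minimal sketch in Python that builds such records as dictionaries. The field names (displayName, oracleProfile, databaseService, staticServiceIpConnectivity, gcsProfile, rootPath) are assumed to mirror the ConnectionProfile resource of the Datastream REST API and should be verified against the API reference. The helper functions, hostname, credentials, and bucket name are hypothetical placeholders.

```python
import json


def oracle_profile(display_name, hostname, username, password,
                   database_service="ORCL", port=1521):
    """Build a record describing a source Oracle connection profile."""
    return {
        "displayName": display_name,
        "oracleProfile": {
            "hostname": hostname,
            "port": port,  # 1521 is the typical Oracle default
            "username": username,
            "password": password,
            "databaseService": database_service,
        },
        # Assumption: an empty object selects IP allowlisting.
        "staticServiceIpConnectivity": {},
    }


def gcs_profile(display_name, bucket, root_path="/"):
    """Build a record describing a destination Cloud Storage connection profile."""
    return {
        "displayName": display_name,
        "gcsProfile": {"bucket": bucket, "rootPath": root_path},
    }


# Placeholder values only; substitute your own host, credentials, and bucket.
source = oracle_profile("My Source Connection Profile", "203.0.113.10",
                        "example_user", "example_password")
destination = gcs_profile("My Destination Connection Profile", "example-bucket")

print(json.dumps(source, indent=2))
print(json.dumps(destination, indent=2))
```

The sketch only assembles data and sends nothing to Google Cloud. In this quickstart, the console creates the profiles for you.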

Create a stream

In this section, you learn how to create a stream. Datastream uses this stream to transfer data from a source Oracle database to a destination bucket in Cloud Storage.

Creating a stream includes:

  • Defining settings for the stream.
  • Selecting the connection profile that you created for your source database (the source connection profile). For this quickstart, this is My Source Connection Profile.
  • Configuring information about the source database for the stream by specifying the tables and schemas in the source database that Datastream:
    • Can transfer into the destination.
    • Is restricted from transferring into the destination.
  • Selecting the connection profile that you created for Cloud Storage (the destination connection profile). For this quickstart, this is My Destination Connection Profile.
  • Configuring information about the destination bucket for the stream. This information includes:
    • The folder of the destination bucket into which Datastream will transfer schemas, tables, and data from a source Oracle database.
    • The size (in MB) of the files that contain data transferred from the source database into a folder in the Cloud Storage destination bucket. By default, data retrieved from the source database is written into 50-MB files. Data that exceeds this size is segmented into multiple 50-MB files.
    • The number of seconds that will elapse before Datastream closes an existing file in a folder of the Cloud Storage destination bucket and opens another file to contain data being transferred from the source database. By default, the file rotation interval is set to 60 seconds.
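The interaction between the file size and the rotation interval can be illustrated with a short Python sketch. It models the two defaults described above (50 MB and 60 seconds) as a simple policy: a file is closed as soon as either threshold is reached. This is an illustration of the described behavior under stated assumptions, not Datastream's actual implementation. The class and function names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


@dataclass
class RotationPolicy:
    max_file_bytes: int = 50 * MB       # default file size from the text
    max_file_age_seconds: float = 60.0  # default rotation interval from the text

    def should_rotate(self, current_bytes: int, file_age_seconds: float) -> bool:
        """Close the current file when either threshold is reached."""
        return (current_bytes >= self.max_file_bytes
                or file_age_seconds >= self.max_file_age_seconds)


def split_into_files(total_bytes: int, max_file_bytes: int = 50 * MB) -> list[int]:
    """Return the file sizes produced when total_bytes is segmented."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    full_files, remainder = divmod(total_bytes, max_file_bytes)
    return [max_file_bytes] * full_files + ([remainder] if remainder else [])


policy = RotationPolicy()
print(policy.should_rotate(10 * MB, 61.0))  # age threshold reached
print(policy.should_rotate(10 * MB, 5.0))   # neither threshold reached
print([size // MB for size in split_into_files(120 * MB)])  # sizes in MB
```

Under this model, 120 MB of data arriving well within the rotation interval yields two full 50-MB files and one 20-MB file, while a slow trickle of data is flushed to a new file every 60 seconds regardless of size.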

Define settings for the stream

  1. Go to the Streams page for Datastream in the Google Cloud Console.

    Go to the Streams page

  2. Click CREATE STREAM.

  3. Supply the following information in the Define stream details panel of the Create stream page:

    • Enter My Stream as the Stream name.
    • Keep the auto-generated Stream ID.
    • From the Region menu, select the region where you created your source connection profile.
    • From the Source type menu, select the Oracle profile type.
    • From the Destination type menu, select the Cloud Storage profile type.
  4. Review the required prerequisites that are generated automatically to reflect how your environment must be prepared for a stream. These prerequisites can include how to configure the source database and how to connect Datastream to the destination bucket in Cloud Storage.

  5. Click CONTINUE. The Define Oracle connection profile panel of the Create stream page appears.

Specify information about the source connection profile

  1. From the Source connection profile menu, select your source connection profile for Oracle database.

  2. Click RUN TEST to verify that the source database and Datastream can communicate with each other.

    If the test fails, then the issue associated with the connection profile appears. Make the necessary changes to correct the issue, and then retest.

  3. Click CONTINUE. The Configure stream source panel of the Create stream page appears.

Configure information about the source database for the stream

  1. Use the Objects to include menu to specify the tables and schemas in your source database that Datastream can transfer into a folder in the destination bucket in Cloud Storage.

    For this quickstart, you want Datastream to transfer all tables and schemas. Therefore, select All tables from all schemas from the menu.

  2. Click CONTINUE. The Define Cloud Storage connection profile panel of the Create stream page appears.

Select a destination connection profile

  1. From the destination connection profile menu, select your destination connection profile for Cloud Storage.

  2. Click CONTINUE. The Configure stream destination panel of the Create stream page appears.

Configure information about the destination for the stream

  1. In the Stream path prefix field, enter the folder of the destination bucket into which Datastream will transfer schemas, tables, and data from a source Oracle database.

    For this quickstart, you want Datastream to transfer data from the source database into the /root/tutorial folder in the destination bucket of Cloud Storage. Therefore, enter /root/tutorial in the Stream path prefix field.

  2. In the Target file size field, leave the default setting of 50. This setting represents the size (in MB) of the files that contain data transferred from the source database into a folder in the Cloud Storage destination bucket.

  3. In the Writing data delay limitation field, leave the default setting of 60. This setting represents how many seconds will elapse before Datastream closes an existing file in a folder of the Cloud Storage destination bucket and opens another file to contain data being transferred from the source database.

  4. In the Output format field, select the format of files written to Cloud Storage. For this quickstart, Avro is the file format.

  5. Click CONTINUE. The Review stream details and create panel of the Create stream page appears.

Create the stream

  1. Verify details about the stream as well as the source and destination connection profiles that the stream will use to transfer data from a source Oracle database to a destination bucket in Cloud Storage.

  2. Click CREATE.

  3. In the Create stream? dialog box, click CREATE.

After creating a stream, you can start it.

Start the stream

In the previous section of the quickstart, you created a stream, but you didn't start it. You can do this now.

For this quickstart, you create and start the stream in separate steps in case the stream creation process increases the load on your source database. To defer that load, you create the stream without starting it, and then start it when the source database can absorb the load.

After you start the stream, Datastream can transfer data, schemas, and tables from the source database to the destination.

  1. Go to the Streams page for Datastream in the Google Cloud Console.

    Go to the Streams page

  2. Select the check box to the left of the stream that you want to start. For this quickstart, this is My Stream.

  3. Click START.

  4. In the dialog box, click START. The status of the stream changes from Created to Running.

After starting a stream, you can verify that Datastream transferred data from the source database to the destination.

Verify the stream

In this section, you confirm that Datastream:

  • Transfers the data from all tables of your source Oracle database into the /root/tutorial folder in the Cloud Storage destination bucket.
  • Translates the data into the Avro file format.

  1. Go to the Streams page for Datastream in the Google Cloud Console.

    Go to the Streams page

  2. Click the stream that you created. For this quickstart, this is My Stream.

  3. In the Stream details page, click the link that appears below the Destination write path field. The Bucket details page of Cloud Storage opens in a separate tab.

  4. Verify that you see folders that represent tables of your source Oracle database.

  5. Click one of the table folders and drill down until you see data that's associated with the table.

  6. Click a file that represents the data and click DOWNLOAD.

  7. Open this file in an Avro tool (for example, Avro Viewer) to ensure that the content is readable. This confirms that Datastream also translated the data into the Avro file format.
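As a quick complement to opening the file in an Avro tool, you can check that a downloaded file at least starts like an Avro object container file. Per the Avro specification, such files begin with the four bytes "Obj" followed by the byte 1. The following minimal Python sketch performs only that header check; it doesn't decode records or validate the schema. The function name is hypothetical, and the sample file that the sketch writes is a stand-in for the file that you downloaded.

```python
import os
import tempfile

# Avro object container files start with ASCII "Obj" followed by the byte 1.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(path: str) -> bool:
    """Return True if the file at path begins with the Avro magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(AVRO_MAGIC)) == AVRO_MAGIC


# Demonstration with a throwaway file. In practice, pass the path of the
# file that you downloaded from the bucket.
with tempfile.NamedTemporaryFile(delete=False, suffix=".avro") as sample:
    sample.write(AVRO_MAGIC + b"placeholder bytes, not a real Avro payload")
    sample_path = sample.name

print(looks_like_avro(sample_path))
os.remove(sample_path)
```

A passing header check is a necessary but not sufficient sign of a valid Avro file; opening the file in an Avro reader remains the way to confirm that the content is readable.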

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, use the Google Cloud Console to delete your project, Cloud Storage destination bucket, stream, and connection profiles.

After you clean up the resources that you created on Datastream, they no longer take up quota and you aren't billed for them in the future. The following sections describe how to delete or turn off these resources.

Delete your project

The easiest way to eliminate billing is to delete the project that you created for this quickstart.

  1. In the Cloud Console, go to the Manage resources page.

    Go to the Manage resources page

  2. In the project list, select the project that you want to delete, and then click Delete.

  3. In the dialog box, type the project ID, and then click Shut down to delete the project.

Delete your Cloud Storage destination bucket

  1. Go to the Storage browser page in Cloud Storage.

    Go to the Storage browser page

  2. Select the check box to the left of your bucket, and then click DELETE.

  3. In the dialog box, enter the name of your bucket in the text field, and then click DELETE.

Delete the stream

  1. Go to the Streams page for Datastream in the Google Cloud Console.

    Go to the Streams page

  2. Click the stream that you want to delete. For this quickstart, this is My Stream.

  3. Click PAUSE.

  4. In the dialog box, click PAUSE.

  5. In the Stream status pane of the Stream details page, verify that the status of the stream is Paused.

  6. Click DELETE.

  7. In the dialog box, enter Delete in the text field, and then click DELETE.

Delete the connection profiles

  1. Go to the Connection profiles page for Datastream in the Google Cloud Console.

    Go to the Connection profiles page

  2. Select the check box for each connection profile that you want to delete. For this quickstart, select the check boxes for My Source Connection Profile and My Destination Connection Profile.

  3. Click DELETE.

  4. In the dialog box, click DELETE.

What's next

  • Learn more about Datastream.
  • Try out other Google Cloud features for yourself. Have a look at our quickstarts.