Upgrade a replication job

This page describes how to upgrade a replication job after upgrading a Cloud Data Fusion instance. The process depends on whether you upgrade to version 6.7.0 or later, or to version 6.6.0 or earlier.

Version 6.7.0+

To upgrade a replication job to version 6.7.0 or later, follow the same steps that you use to upgrade batch pipelines.

Version 6.6.0 and earlier

To back up and upgrade the replication job to version 6.6.0 or earlier, follow these steps:

  1. Stop the replication job.

  2. Get the generation ID by making an HTTP GET request in the Cloud Data Fusion web interface or with the REST API. The path is similar to the following:

    namespaces/NAMESPACE_ID/apps/REPLICATION_JOB_ID/workers/DeltaWorker
    

    Replace the following:

    • NAMESPACE_ID: the string name of the namespace for the replication job. If your pipeline belongs to a Basic edition instance, the ID is always default.

    • REPLICATION_JOB_ID: the string name of the replication job.

    Interface

    To make the HTTP GET request in the Cloud Data Fusion web interface, follow these steps:

    1. Go to your instance:
      1. In the Google Cloud console, go to the Cloud Data Fusion page.

      2. To open the instance in the Cloud Data Fusion web interface, click Instances, and then click View instance.

    2. Click System Admin > Configuration.
    3. Click Make HTTP calls.
    4. Select GET and enter the path described previously in step 2.
    5. Click Send, and then find the generation ID in the call response.

    REST API

    To make the GET request with the REST API, see the CDAP API reference.

  3. Back up the existing Cloud Storage directory.

    1. To edit the directory in the Google Cloud console, go to the Buckets page.

    2. Click the bucket name to open the Bucket details page. The bucket name is the generation ID number.

    The Cloud Storage bucket path in the Google Cloud console has a format similar to the following:

    Buckets > OFFSET_BASE_PATH > NAMESPACE_ID > REPLICATION_JOB_ID

    You can find the bucket by using the offsetBasePath value for the job. To get the value, make a GET request in Cloud Data Fusion with the following path:

    namespaces/NAMESPACE_ID/apps/REPLICATION_JOB_ID
    
  4. Perform the upgrade using the same steps as you would to upgrade batch pipelines.

  5. The upgraded job has a new generation ID. Use the new ID as the Cloud Storage directory name.

  6. Start the replication job.
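
The two GET requests in the steps above can also be scripted. The following is a minimal Python sketch, not an official client. The helper names `app_path`, `build_url`, and `get_json` are hypothetical. The sketch assumes that the instance's API endpoint serves these paths under a `v3` prefix and accepts an OAuth 2.0 bearer token; confirm both against the CDAP API reference before you rely on them. Because this page doesn't describe the response format, the sketch returns the parsed JSON unchanged so that you can look for the generation ID or the offsetBasePath value yourself.

```python
import json
import urllib.request
from urllib.parse import quote


def app_path(namespace_id: str, job_id: str, worker: bool = False) -> str:
    """Build one of the two request paths shown in the steps above."""
    path = f"namespaces/{quote(namespace_id, safe='')}/apps/{quote(job_id, safe='')}"
    if worker:
        # Path used in step 2 to look up the generation ID.
        path += "/workers/DeltaWorker"
    return path


def build_url(api_endpoint: str, path: str, api_prefix: str = "v3") -> str:
    """Join the instance API endpoint and a request path.

    Assumption: paths are served under "<api_endpoint>/v3/".
    """
    return f"{api_endpoint.rstrip('/')}/{api_prefix}/{path}"


def get_json(url: str, access_token: str):
    """Send a GET request and return the parsed JSON response.

    Assumption: the endpoint accepts an OAuth 2.0 bearer token.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `build_url(endpoint, app_path("default", "my-job", worker=True))` yields the URL for the generation ID lookup, and omitting `worker=True` yields the URL for the request that returns the job details, including offsetBasePath. Here `endpoint` and `my-job` are placeholders for your instance's API endpoint and replication job name.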

What's next