Quickstart: Export conversations to BigQuery

Cloud Data Fusion provides plugins for importing and exporting CCAI Insights data. This guide walks you through creating a pipeline in Cloud Data Fusion and exporting your conversation data to BigQuery.

Prerequisites

  1. Enable the CCAI Insights Accelerator on a Cloud Data Fusion instance.
  2. Install the CCAI Insights plugins.
  3. Import conversation data into CCAI Insights.

Create a pipeline

  1. Go to the project selector page and select your project.
  2. Navigate to the Cloud Data Fusion Instances page.
  3. Click View instance next to the instance you want to use. In the new window that opens, click the HUB tab to open the instance's main HUB page.
  4. Enable the RDD Repartitioner plugin on your Cloud Data Fusion instance.
    1. Navigate to your instance's HUB page.
    2. Click on the General tab.
    3. Type "Repartitioner" into the search bar and select the RDD Repartitioner card. Click Deploy, then click Finish.
    4. If the plugin fails to deploy, it is already enabled for this instance. You can navigate back to the main HUB page and proceed to the next step.
    5. If the plugin was not previously deployed, you will be presented with the option to either customize a pipeline or return to the homepage. Select the homepage option, navigate back to the main HUB page, and proceed to the next step.
  5. Click on the CCAI tab, navigate to the CCAI Export Quickstart card, and click Create. Click Finish to create the pipeline, then click the Customize Pipeline button.
  6. Enter a name and description for your pipeline by clicking the pipeline's name in the upper left corner of the Cloud Data Fusion Studio screen.

  7. After naming your pipeline, click Save, then click Deploy.

    Optionally, you can set other configurations, such as running the pipeline on a schedule, by clicking on the Schedule or Properties buttons.


Export conversation data

  1. When the new pipeline is ready, you are redirected to the pipeline view. Click Run. In the window that opens, enter your BigQuery dataset and table names in the Value fields next to their respective keys, then click Run again to start the pipeline.
  2. The pipeline has completed successfully when its Status is Succeeded. This can take from minutes to hours, depending on how much data you are exporting.
  3. Navigate to the BigQuery dashboard in the Google Cloud console to view your exported data.
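
As an alternative to browsing the table in the console, the exported data can also be spot-checked from a script. The following is a minimal sketch, not part of the quickstart itself: the helper names `build_preview_query` and `preview_rows` and the example project, dataset, and table names are hypothetical. The identifier checks are deliberately conservative and may reject names that BigQuery itself accepts. The second helper assumes the `google-cloud-bigquery` client library is installed and that application default credentials are configured.

```python
import re

# Conservative identifier checks. Assumption: these are stricter than
# BigQuery's own naming rules, so some valid names may be rejected.
_PROJECT_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")
_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def build_preview_query(project: str, dataset: str, table: str, limit: int = 10) -> str:
    """Return a SQL statement that selects a few rows from the exported table."""
    if not _PROJECT_RE.fullmatch(project):
        raise ValueError(f"unexpected project ID: {project!r}")
    for label, value in (("dataset", dataset), ("table", table)):
        if not _NAME_RE.fullmatch(value):
            raise ValueError(f"unexpected {label} name: {value!r}")
    limit = int(limit)
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {limit}"


def preview_rows(project: str, dataset: str, table: str, limit: int = 10) -> list:
    """Run the preview query and return the rows as a list of dictionaries."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    sql = build_preview_query(project, dataset, table, limit)
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Hypothetical names: use the dataset and table you entered when
    # running the pipeline.
    print(build_preview_query("my-project", "ccai_export", "conversations", 5))
```

Calling `preview_rows` with the same dataset and table names you entered in the Run dialog should return a handful of exported rows once the pipeline status is Succeeded.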