BigQuery Data Transfer API Connector Overview

Workflows connectors define the built-in functions that can be used to access other Google Cloud products within a workflow.

This page provides an overview of the BigQuery Data Transfer API connector. There is no need to import or load connector libraries in a workflow; connectors work out of the box when used in a call step.
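
For example, a connector call is an ordinary call step that names the connector function directly. This minimal sketch (the project ID is a placeholder) lists the data sources available in the us location:

- list_data_sources:
    call: googleapis.bigquerydatatransfer.v1.projects.locations.dataSources.list
    args:
      parent: "projects/my-project/locations/us"
    result: dataSourcesResp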

BigQuery Data Transfer API

Schedule queries or transfer external data from SaaS applications to Google BigQuery on a regular basis. To learn more, see the BigQuery Data Transfer API documentation.

BigQuery Data Transfer connector sample

YAML

# This workflow creates a new dataset and a new table inside that dataset, which are required
# for the BigQuery Data Transfer run. It creates a new transfer configuration (TransferConfig)
# and starts a manual run of the transfer (scheduled 30 seconds after the config is created).
# The transfer run is a blocking long-running operation (LRO); the workflow waits for it to complete.
# All resources are deleted once the transfer run completes.
#
# On success, it returns "SUCCESS".
#
# Features included in this test:
# - BigQuery Data Transfer connector
# - Waiting for long-running transfer run to complete
#
# This workflow expects the following items to be provided through the input argument at execution:
#   - projectID (string)
#     - The user project ID.
#   - datasetID (string)
#     - The dataset name; expected to have a unique value to avoid the
#       instance being referred to by multiple tests.
#   - tableID (string)
#     - The table name; expected to have a unique value to avoid the
#       instance being referred to by multiple tests.
#   - runConfigDisplayName (string)
#     - The transfer run configuration display name.
#
# Expected successful output: "SUCCESS"
main:
  params: [args]
  steps:
    - init:
        assign:
          - project_id: ${args.projectID}
          - destination_dataset: ${args.datasetID}
          - destination_table: ${args.tableID}
          - run_config_display_name: ${args.runConfigDisplayName}
          - run_config_data_source_id: "google_cloud_storage"
          - location: "us"
          - data_path_template: "gs://xxxxxx-bucket/xxxxx/xxxx"
    - create_dataset:
        call: googleapis.bigquery.v2.datasets.insert
        args:
          projectId: ${project_id}
          body:
            datasetReference:
              datasetId: ${destination_dataset}
              projectId: ${project_id}
    - create_table:
        call: googleapis.bigquery.v2.tables.insert
        args:
          datasetId: ${destination_dataset}
          projectId: ${project_id}
          body:
            tableReference:
              datasetId: ${destination_dataset}
              projectId: ${project_id}
              tableId: ${destination_table}
            schema:
              fields:
                - name: "column1"
                  type: "STRING"
                - name: "column2"
                  type: "STRING"
    - list_config:
        call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.list
        args:
          parent: ${"projects/" + project_id + "/locations/us"}
    - create_run_config:
        call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.create
        args:
          parent: ${"projects/" + project_id + "/locations/" + location}
          body:
            displayName: ${run_config_display_name}
            schedule: "every day 19:22"
            scheduleOptions:
              disableAutoScheduling: true
            destinationDatasetId: ${destination_dataset}
            dataSourceId: ${run_config_data_source_id}
            params:
              destination_table_name_template: ${destination_table}
              file_format: "CSV"
              data_path_template: ${data_path_template}
        result: config
    - get_time_in_30s:
        assign:
          - now_plus_30s: ${time.format(sys.now() + 30)}
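    # The connector treats the manual transfer run as a blocking long-running
    # operation: the workflow pauses on start_run until the run completes.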
    - start_run:
        call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.startManualRuns
        args:
          parent: ${config.name}
          body:
            requestedRunTime: ${now_plus_30s}
        result: runsResp
    - remove_run_config:
        call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.delete
        args:
          name: ${config.name}
    - delete_table:
        call: googleapis.bigquery.v2.tables.delete
        args:
          datasetId: ${destination_dataset}
          projectId: ${project_id}
          tableId: ${destination_table}
    - delete_dataset:
        call: googleapis.bigquery.v2.datasets.delete
        args:
          projectId: ${project_id}
          datasetId: ${destination_dataset}
    - the_end:
        return: "SUCCESS"

JSON

{
  "main": {
    "params": [
      "args"
    ],
    "steps": [
      {
        "init": {
          "assign": [
            {
              "project_id": "${args.projectID}"
            },
            {
              "destination_dataset": "${args.datasetID}"
            },
            {
              "destination_table": "${args.tableID}"
            },
            {
              "run_config_display_name": "${args.runConfigDisplayName}"
            },
            {
              "run_config_data_source_id": "google_cloud_storage"
            },
            {
              "location": "us"
            },
            {
              "data_path_template": "gs://xxxxxx-bucket/xxxxx/xxxx"
            }
          ]
        }
      },
      {
        "create_dataset": {
          "call": "googleapis.bigquery.v2.datasets.insert",
          "args": {
            "projectId": "${project_id}",
            "body": {
              "datasetReference": {
                "datasetId": "${destination_dataset}",
                "projectId": "${project_id}"
              }
            }
          }
        }
      },
      {
        "create_table": {
          "call": "googleapis.bigquery.v2.tables.insert",
          "args": {
            "datasetId": "${destination_dataset}",
            "projectId": "${project_id}",
            "body": {
              "tableReference": {
                "datasetId": "${destination_dataset}",
                "projectId": "${project_id}",
                "tableId": "${destination_table}"
              },
              "schema": {
                "fields": [
                  {
                    "name": "column1",
                    "type": "STRING"
                  },
                  {
                    "name": "column2",
                    "type": "STRING"
                  }
                ]
              }
            }
          }
        }
      },
      {
        "list_config": {
          "call": "googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.list",
          "args": {
            "parent": "${\"projects/\" + project_id + \"/locations/us\"}"
          }
        }
      },
      {
        "create_run_config": {
          "call": "googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.create",
          "args": {
            "parent": "${\"projects/\" + project_id + \"/locations/\" + location}",
            "body": {
              "displayName": "${run_config_display_name}",
              "schedule": "every day 19:22",
              "scheduleOptions": {
                "disableAutoScheduling": true
              },
              "destinationDatasetId": "${destination_dataset}",
              "dataSourceId": "${run_config_data_source_id}",
              "params": {
                "destination_table_name_template": "${destination_table}",
                "file_format": "CSV",
                "data_path_template": "${data_path_template}"
              }
            }
          },
          "result": "config"
        }
      },
      {
        "get_time_in_30s": {
          "assign": [
            {
              "now_plus_30s": "${time.format(sys.now() + 30)}"
            }
          ]
        }
      },
      {
        "start_run": {
          "call": "googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.startManualRuns",
          "args": {
            "parent": "${config.name}",
            "body": {
              "requestedRunTime": "${now_plus_30s}"
            }
          },
          "result": "runsResp"
        }
      },
      {
        "remove_run_config": {
          "call": "googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.delete",
          "args": {
            "name": "${config.name}"
          }
        }
      },
      {
        "delete_table": {
          "call": "googleapis.bigquery.v2.tables.delete",
          "args": {
            "datasetId": "${destination_dataset}",
            "projectId": "${project_id}",
            "tableId": "${destination_table}"
          }
        }
      },
      {
        "delete_dataset": {
          "call": "googleapis.bigquery.v2.datasets.delete",
          "args": {
            "projectId": "${project_id}",
            "datasetId": "${destination_dataset}"
          }
        }
      },
      {
        "the_end": {
          "return": "SUCCESS"
        }
      }
    ]
  }
}
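
When executing this workflow, pass the expected input arguments as a JSON object; the values below are placeholders:

{
  "projectID": "my-project",
  "datasetID": "transfer_test_dataset",
  "tableID": "transfer_test_table",
  "runConfigDisplayName": "transfer-test-config"
}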

Module: googleapis.bigquerydatatransfer.v1.projects.dataSources

Functions
checkValidCreds: Returns true if valid credentials exist for the given data source and requesting user.
get: Retrieves a supported data source and returns its settings.
list: Lists supported data sources and returns their settings.
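
As a minimal sketch, checkValidCreds can be called from a workflow like this; the project ID is a placeholder, and the name format follows the REST resource path for a data source:

- check_creds:
    call: googleapis.bigquerydatatransfer.v1.projects.dataSources.checkValidCreds
    args:
      name: "projects/my-project/dataSources/google_cloud_storage"
    result: credsResp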

Module: googleapis.bigquerydatatransfer.v1.projects.locations

Functions
get: Gets information about a location.
list: Lists information about the supported locations for this service.
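
For example, a workflow can list the locations the service supports; this sketch assumes the standard Cloud Locations name format and a placeholder project ID:

- list_locations:
    call: googleapis.bigquerydatatransfer.v1.projects.locations.list
    args:
      name: "projects/my-project"
    result: locationsResp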

Module: googleapis.bigquerydatatransfer.v1.projects.locations.dataSources

Functions
checkValidCreds: Returns true if valid credentials exist for the given data source and requesting user.
get: Retrieves a supported data source and returns its settings.
list: Lists supported data sources and returns their settings.

Module: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs

Functions
create: Creates a new data transfer configuration.
delete: Deletes a data transfer configuration, including any associated transfer runs and logs.
get: Returns information about a data transfer config.
list: Returns information about all transfer configs owned by a project in the specified location.
patch: Updates a data transfer configuration. All fields must be set, even if they are not updated.
startManualRuns: Starts manual transfer runs to be executed now, with schedule_time equal to the current time. Transfer runs can be created for a time range where run_time is between start_time (inclusive) and end_time (exclusive), or for a specific run_time.
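
The sample workflow above exercises create, list, startManualRuns, and delete. As a sketch, the configuration it creates could also be read back with get, reusing the config variable that holds the create result:

- get_config:
    call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.get
    args:
      name: ${config.name}
    result: currentConfig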

Module: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.runs

Functions
delete: Deletes the specified transfer run.
get: Returns information about the particular transfer run.
list: Returns information about running and completed transfer runs.
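
For example, the state of a manual run started in the sample above could be inspected with get. This sketch assumes runsResp holds the startManualRuns response, whose runs field lists the created transfer runs:

- get_run:
    call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.runs.get
    args:
      name: ${runsResp.runs[0].name}
    result: transferRun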

Module: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.runs.transferLogs

Functions
list: Returns log messages for the transfer run.
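
As a sketch, the log messages for that same run could then be listed, passing the run name as the parent:

- list_run_logs:
    call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.runs.transferLogs.list
    args:
      parent: ${runsResp.runs[0].name}
    result: logsResp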

Module: googleapis.bigquerydatatransfer.v1.projects.transferConfigs

Functions
create: Creates a new data transfer configuration.
delete: Deletes a data transfer configuration, including any associated transfer runs and logs.
get: Returns information about a data transfer config.
list: Returns information about all transfer configs owned by a project in the specified location.
patch: Updates a data transfer configuration. All fields must be set, even if they are not updated.
startManualRuns: Starts manual transfer runs to be executed now, with schedule_time equal to the current time. Transfer runs can be created for a time range where run_time is between start_time (inclusive) and end_time (exclusive), or for a specific run_time.

Module: googleapis.bigquerydatatransfer.v1.projects.transferConfigs.runs

Functions
delete: Deletes the specified transfer run.
get: Returns information about the particular transfer run.
list: Returns information about running and completed transfer runs.

Module: googleapis.bigquerydatatransfer.v1.projects.transferConfigs.runs.transferLogs

Functions
list: Returns log messages for the transfer run.