Jump to Content
Data Analytics

Automate your data warehouse migration to BigQuery with new data migration tool

August 18, 2023
Abdullateef Abdulsalam

Data & Analytics Cloud Consultant

Preethi Igoor

Data & Analytics Cloud Consultant

As customers embark on their digital transformation journey, migrating on-premises data warehouses to the cloud accounts for a significant percentage of their overall transformation efforts. 

Today, to accelerate data warehouse migrations to BigQuery, we are announcing additional services in the form of a data migration tool that automates the migration of on-premises data warehouses to BigQuery. 

Currently in preview, the data migration tool supports key phases of a data warehouse migration, from schema migration and historical data migration to SQL translation and validation and is built  on Google Cloud. This tool works in close coordination with migration accelerators like Data Validation Tool and BigQuery migration services to automate the end-to-end data warehouse migration. Migrations from Teradata, Hive, Redshift, and Oracle are currently supported  with support for more data warehouses in the future.

This is a fully configurable open-source solution that can be deployed via Terraform and triggered with a simple JSON configuration file. You can scale compute up or down based on data volume or complexity. The key benefits include:

  • A proven approach to enterprise data warehouse migration with reduced risk of errors, bugs, and data inconsistencies

  • Automated orchestration of data migration, SQL translation and validation 

  • Reporting dashboard with results of data validation, job runs, and SQL translation in Looker

  • Reduced time to initial landing of data and SQL artifacts into Google Cloud

  • Lower total cost of migration

  • Limited need for an additional customer development team

  • A low learning curve

https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_ubIsmEg.max-1000x1000.png

For example, here’s how the data migration tool facilitates a migration from Teradata to BigQuery:

  1. A user uploads the relevant input files (table list csv, SQL/DDL files) and a JSON configuration file to a Cloud Storage bucket, triggering the data migration tool.

  2. The data migration tool triggers Data Transfer Service to convert table schema, and transfer table data from the on-prem source data warehouse to BigQuery tables.

  3. After the data transfer is completed, the data migration tool triggers Data Transfer Service to compare source and target tables on aggregated column values, and outputs the validation result to BigQuery reporting tables.

  4. The data migration tool triggers the BigQuery SQL Translation Service to translate SQL queries (uploaded to a Cloud Storage bucket) from source syntax to BigQuery SQL, and output the translated SQL to a different bucket. The data migration tool writes translation statistics to BigQuery reporting tables.

  5. The tool then triggers a data validation tool to compare source and target SQL on their result set. 

  6. The data migration tool writes validation statistics to BigQuery reporting tables.

  7. User reviews the Migration Dashboard in Looker Studio to inspect the results of the data migration, SQL translation, and data validation.

And that’s it! In a few short steps, the data migration tool orchestrates what can otherwise be a time-consuming, error-prone process. This simple and proven migration approach can help you reduce the risk and accelerate the migration process, freeing up time and resources to focus on other aspects of your digital transformation journey. Better yet, there are no license fees to use the data migration tool on Google Cloud; the only costs are storage and compute for the services that you consume. For more on the data migration tool, check out the project on Github.

Posted in