Automate your data warehouse migration to BigQuery with new data migration tool
Abdullateef Abdulsalam
Data & Analytics Cloud Consultant
Preethi Igoor
Data & Analytics Cloud Consultant
As customers embark on their digital transformation journey, migrating on-premises data warehouses to the cloud accounts for a significant percentage of their overall transformation efforts.
Today, to accelerate data warehouse migrations to BigQuery, we are announcing additional services in the form of a data migration tool that automates the migration of on-premises data warehouses to BigQuery.
Currently in preview, the data migration tool supports key phases of a data warehouse migration, from schema migration and historical data migration to SQL translation and validation and is built on Google Cloud. This tool works in close coordination with migration accelerators like Data Validation Tool and BigQuery migration services to automate the end-to-end data warehouse migration. Migrations from Teradata, Hive, Redshift, and Oracle are currently supported with support for more data warehouses in the future.
This is a fully configurable open-source solution that can be deployed via Terraform and triggered with a simple JSON configuration file. You can scale compute up or down based on data volume or complexity. The key benefits include:
A proven approach to enterprise data warehouse migration with reduced risk of errors, bugs, and data inconsistencies
Automated orchestration of data migration, SQL translation and validation
Reporting dashboard with results of data validation, job runs, and SQL translation in Looker
Reduced time to initial landing of data and SQL artifacts into Google Cloud
Lower total cost of migration
Limited need for an additional customer development team
A low learning curve
For example, here’s how the data migration tool facilitates a migration from Teradata to BigQuery:
A user uploads the relevant input files (table list csv, SQL/DDL files) and a JSON configuration file to a Cloud Storage bucket, triggering the data migration tool.
The data migration tool triggers Data Transfer Service to convert table schema, and transfer table data from the on-prem source data warehouse to BigQuery tables.
After the data transfer is completed, the data migration tool triggers Data Transfer Service to compare source and target tables on aggregated column values, and outputs the validation result to BigQuery reporting tables.
The data migration tool triggers the BigQuery SQL Translation Service to translate SQL queries (uploaded to a Cloud Storage bucket) from source syntax to BigQuery SQL, and output the translated SQL to a different bucket. The data migration tool writes translation statistics to BigQuery reporting tables.
The tool then triggers a data validation tool to compare source and target SQL on their result set.
The data migration tool writes validation statistics to BigQuery reporting tables.
User reviews the Migration Dashboard in Looker Studio to inspect the results of the data migration, SQL translation, and data validation.
And that’s it! In a few short steps, the data migration tool orchestrates what can otherwise be a time-consuming, error-prone process. This simple and proven migration approach can help you reduce the risk and accelerate the migration process, freeing up time and resources to focus on other aspects of your digital transformation journey. Better yet, there are no license fees to use the data migration tool on Google Cloud; the only costs are storage and compute for the services that you consume. For more on the data migration tool, check out the project on Github.