This document describes how to use the batch SQL translator in BigQuery to translate Teradata SQL to BigQuery Standard SQL. This document is intended for users who are familiar with the Google Cloud Console.
The batch SQL translator is part of the BigQuery Migration Service. You can use it to translate a batch of other SQL dialects to BigQuery Standard SQL. You can also translate SQL scripting languages such as Basic Teradata Query (BTEQ).
The following SQL dialects are supported:
- Teradata SQL
- Basic Teradata Query (BTEQ)
You need the following permissions on the project to enable, access, and use the BigQuery Migration Service:
Alternatively, you can use the following roles to get the same permissions:
- `bigquerymigration.viewer` - Read-only access.
- `bigquerymigration.editor` - Read/write access.
To access the Cloud Storage buckets for input and output files, you need the following permissions:
- `storage.objects.get` on the source Cloud Storage bucket.
- `storage.objects.list` on the source Cloud Storage bucket.
- `storage.objects.create` on the destination Cloud Storage bucket.
You can get all of the necessary Cloud Storage permissions from the following roles:
Before you begin
Before you can submit a translation job, you must enable the API and upload the source files to Cloud Storage.
Enable the Migration API
Enable the BigQuery Migration API as follows:
In the Cloud Console, go to the BigQuery Migration API page.
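If you prefer the command line, the API can typically also be enabled with `gcloud`. This is a sketch, not taken from this page: the service name `bigquerymigration.googleapis.com` is an assumption, so verify it (for example with `gcloud services list --available`) before relying on it.

```shell
# Hypothetical command-line alternative to the console flow.
# The service name is an assumption; verify it for your project first.
gcloud services enable bigquerymigration.googleapis.com
```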
Upload input files to Cloud Storage
Before you start a translation job, you must upload source files with the queries and scripts you want to translate. For information about what a source file can contain, see the Source File Format section. For information on how to create buckets and upload files to Cloud Storage, see Creating storage buckets and Uploading Objects.
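As a sketch of the upload step with the `gsutil` CLI, the following creates a bucket and copies local SQL files into it. The bucket name, region, local directory, and object prefix are all placeholders, not values from this page.

```shell
# Hypothetical names; replace the bucket, location, and paths with your own.
# Create a bucket, then upload the SQL source files to translate.
gsutil mb -l US gs://my-teradata-input
gsutil cp ./queries/*.sql gs://my-teradata-input/input/
```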
Translate from Teradata SQL
Follow these steps to start a translation, view its progress, and see the results. These steps assume that you have already uploaded your source files to a Cloud Storage bucket.
In the Cloud Console, go to the BigQuery page.
In the navigation panel, go to SQL translation.
Click Start Translation.
Fill in the translation configuration dialog.
- The name can contain letters, numbers, or underscores.
- Choose a location for the translation job. For the most efficient run, choose the same location as your source file bucket.
- Choose Teradata SQL as the translation mode.
Choose a source path. You can type in the path or use the Browse option. You can also exclude certain files in your bucket by adding prefix filters.
Click Continue to move to the next step.
Choose a destination path. As with the source path, you can type the path or use the Browse option.
If you're doing simple translations, you can click Create to start the translation job. For information about using the optional settings, see the Optional settings section.
You can see the status of the job in the translation jobs list.
Once the translation is complete, click Show details to see the results of the translation job.
Alternatively, you can start the translation workflow by calling the BigQuery Migration API.
Additional settings for BTEQ translation
The following table lists the extra fields that are required for BTEQ translation. The BTEQ options are set in the BTEQ settings tab, after you set the destination path.

| Field | Required | Description |
| --- | --- | --- |
|  |  | This dataset ID is used as the dataset prefix for tables referenced in the script. For example, an insert statement that inserts new data into |
| Default path URI | Required | If specified, this path is used as the default path in |
| Path replacement | Not required | This field can be added multiple times. When added, you specify a file path for any files mentioned in the BTEQ script. When translating, the file is replaced with the path provided. For example, if you provide the following pairs of file and path: |
Optional settings
This tab contains advanced translation settings.
| Setting | Description |
| --- | --- |
| Schema path | A Cloud Storage path that contains data definition language (DDL) statements for table schemas. Providing table schemas can improve translation results. |
|  | This field specifies the encoding for the input file. The default is `utf-8`. The following encodings are available: |
| Output SQL identifier cases | Output casing for table and column names. The default is to keep the original casing. The following are the options for this field: |
| Special token map | If the translation input files contain SQL templates that use special tokens as placeholders, use this section to provide a map from the special tokens to their data types. |

The following is an example special token pair:

`select trim(<b>) from x;` translates to the following BigQuery Standard SQL:

`SELECT TRIM(CAST(<b> AS STRING)) FROM x;`
Explore the output files
After running the translation, you can see the results and errors in the destination path that you specified. The batch SQL translator outputs the following files into the destination path:
- Translated file
- Error summary files
For each input file, a corresponding output file is generated in the destination path. The output file contains the translated queries and errors.
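To inspect what was written, you might list the destination path recursively with `gsutil`; the bucket name below is a placeholder, not a value from this page.

```shell
# Hypothetical destination bucket; list the translated files and
# error summary CSVs that the batch translator wrote there.
gsutil ls -r gs://my-teradata-output/
```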
Error summary file
The error summary file is a CSV file that contains a table of all the errors
encountered during the translation. Each input file has a corresponding error
summary file. The summary files are named
The following is an example of an error summary file:
| Query | Line, column | Error type | Error message | Severity |
| --- | --- | --- | --- | --- |
| "selct * from abc" | 5,1 | SqlParseException | Non-query expression encountered in illegal context | ERROR |
| "RENAM TBLE DB.table1 to DB.table2;" | 1,1 | SqlParseException | Non-query expression encountered in illegal context | ERROR |
Source File Format
The source files in your Cloud Storage bucket should be valid Teradata SQL or BTEQ script files.
Source files can contain the following:
- Valid SQL statements, terminated by a semicolon, on a single line or across multiple lines.
- Stored procedures. Each stored procedure (SP) must be in its own file, separate from other SPs and other SQL statements.
- Valid supported BTEQ statements and valid multi-statement SQL requests, if BTEQ mode is chosen.
- Comments annotated by `*`, if translating BTEQ.
- Special placeholder tokens which are not applicable for translation. You must specify these tokens in a config file. See Optional settings for more information.
Source files cannot contain the following:
- Unsupported BTEQ statements. For more information, see Unsupported BTEQ statements.
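To make the "can contain" rules concrete, the following sketch creates a small source file holding one valid SQL statement that spans multiple lines and ends with a semicolon. The file name, table, and columns are hypothetical, not from this page.

```shell
# Create a sample source file (hypothetical name and contents) that follows
# the rules above: one valid SQL statement, terminated by a semicolon,
# spread across multiple lines.
cat > sample_queries.sql <<'EOF'
SELECT employee_id,
       employee_name
FROM hr_db.employees
WHERE hire_date > DATE '2020-01-01';
EOF
```

A file like this could then be uploaded to the input bucket alongside your other source files.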
Supported BTEQ Statements
The following statements are supported in BTEQ translations.
Unsupported BTEQ Statements
The following statements are not supported.
No Schema Mapping Support
The SQL translator doesn't translate schema changes between the source data warehouse and BigQuery. Any references to column or table names that are changed in the schema will not be reflected in the input queries during translation. For example, if a column named "foo" is renamed to "bar" during migration, translation won't automatically update queries that reference "foo" to reference "bar". To update these references, perform a post processing step to replace those SQL identifiers in the translated output queries.
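As a sketch of such a post-processing step, a simple stream edit can rewrite identifier references in the translated output. This assumes the renamed identifier is a whole word and that the translated files have been downloaded locally; the directory, file, and column names are hypothetical.

```shell
# Post-processing sketch: create a stand-in for a downloaded translated file,
# then rewrite references to the old column name "foo" as the new name "bar".
mkdir -p translated
printf 'SELECT foo FROM t WHERE foo > 1;\n' > translated/q1.sql
# \b word boundaries (GNU sed) keep identifiers like "food" untouched.
sed -i 's/\bfoo\b/bar/g' translated/*.sql
cat translated/q1.sql
```

For real migrations, a schema-aware rewrite (parsing the SQL rather than pattern-matching it) is safer; the `sed` form only works when the old name never appears inside strings or comments.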
What's next
- For information about running interactive SQL translations, see Interactive SQL translator.
- For information about using the BigQuery Data Transfer Service to transfer your data, see Migrating data from Teradata.
- For information about a tool for data validation, see the Data Validation Tool.