Amazon Redshift SQL translation guide
This document details the similarities and differences in SQL syntax between Amazon Redshift and BigQuery to help you plan your migration. Use batch SQL translation to migrate your SQL scripts in bulk, or interactive SQL translation to translate ad hoc queries.
The intended audience for this guide is enterprise architects, database administrators, application developers, and IT security specialists. It assumes you are familiar with Amazon Redshift.
Data types
This section shows equivalents between data types in Amazon Redshift and in BigQuery.
Amazon Redshift data type | Amazon Redshift alias | BigQuery data type | Notes |
---|---|---|---|
SMALLINT | INT2 | INT64 | Amazon Redshift's SMALLINT is 2 bytes, whereas BigQuery's INT64 is 8 bytes. |
INTEGER | INT, INT4 | INT64 | Amazon Redshift's INTEGER is 4 bytes, whereas BigQuery's INT64 is 8 bytes. |
BIGINT | INT8 | INT64 | Both Amazon Redshift's BIGINT and BigQuery's INT64 are 8 bytes. |
DECIMAL | NUMERIC | NUMERIC | |
REAL | FLOAT4 | FLOAT64 | Amazon Redshift's REAL is 4 bytes, whereas BigQuery's FLOAT64 is 8 bytes. |
DOUBLE PRECISION | FLOAT8, FLOAT | FLOAT64 | |
BOOLEAN | BOOL | BOOL | Amazon Redshift's BOOLEAN can use TRUE, t, true, y, yes, and 1 as valid literal values for true. BigQuery's BOOL data type uses case-insensitive TRUE. |
CHAR | CHARACTER, NCHAR, BPCHAR | STRING | |
VARCHAR | CHARACTER VARYING, NVARCHAR, TEXT | STRING | |
DATE | | DATE | |
TIMESTAMP | TIMESTAMP WITHOUT TIME ZONE | DATETIME | |
TIMESTAMPTZ | TIMESTAMP WITH TIME ZONE | TIMESTAMP | Note: In BigQuery, time zones are used when parsing timestamps or formatting timestamps for display. A string-formatted timestamp might include a time zone, but when BigQuery parses the string, it stores the timestamp in the equivalent UTC time. When a time zone is not explicitly specified, the default time zone, UTC, is used. Time zone names or offsets from UTC (±HH:MM) are supported, but time zone abbreviations such as PDT are not supported. |
GEOMETRY | | GEOGRAPHY | Support for querying geospatial data. |
BigQuery also has the following data types that do not have a direct Amazon Redshift analog:

- ARRAY
- BYTES
- TIME
- STRUCT
Implicit conversion types
When migrating to BigQuery, you need to convert most of your Amazon Redshift implicit conversions to BigQuery's explicit conversions. BigQuery performs implicit conversions only for the following data types:

From BigQuery type | To BigQuery type |
---|---|
INT64 | FLOAT64 |
INT64 | NUMERIC |
NUMERIC | FLOAT64 |
BigQuery also performs implicit conversions for the following literals:

From BigQuery type | To BigQuery type |
---|---|
STRING literal (for example, "2008-12-25") | DATE |
STRING literal (for example, "2008-12-25 15:30:00") | TIMESTAMP |
STRING literal (for example, "2008-12-25T07:30:00") | DATETIME |
STRING literal (for example, "15:30:00") | TIME |
Explicit conversion types

You can convert Amazon Redshift data types that BigQuery doesn't implicitly convert using BigQuery's CAST(expression AS type) function or any of the DATE and TIMESTAMP conversion functions.

When migrating your queries, change any occurrences of the Amazon Redshift CONVERT(type, expression) function (or the :: syntax) to BigQuery's CAST(expression AS type) function, as shown in the table in the Data type formatting functions section.
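For example, a cast written with Redshift's CONVERT or the :: shorthand maps to CAST or SAFE_CAST. A minimal sketch, assuming a hypothetical sales table with a price_string column:

```sql
-- Amazon Redshift:
--   SELECT CONVERT(integer, price_string) FROM sales;
--   SELECT price_string::integer FROM sales;

-- BigQuery equivalent:
SELECT
  CAST(price_string AS INT64) AS price,
  SAFE_CAST(price_string AS INT64) AS price_or_null  -- NULL instead of an error on bad input
FROM sales;
```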
Query syntax
This section addresses differences in query syntax between Amazon Redshift and BigQuery.
SELECT statement

Most Amazon Redshift SELECT statements are compatible with BigQuery. The following table contains a list of minor differences.

Amazon Redshift | BigQuery |
---|---|
SELECT TOP number expression FROM table | SELECT expression FROM table LIMIT number |
BigQuery also supports the following expressions in SELECT statements, which do not have an Amazon Redshift equivalent:

- SELECT * EXCEPT (column_list), which excludes the listed columns from the result.
- SELECT * REPLACE (expression AS column_name), which replaces the value of a column in the result.
FROM clause

A FROM clause in a query lists the table references that data is selected from. In Amazon Redshift, possible table references include tables, views, and subqueries. All of these table references are supported in BigQuery.

BigQuery tables can be referenced in the FROM clause using the following:

- [project_id].[dataset_id].[table_name]
- [dataset_id].[table_name]
- [table_name]

BigQuery also supports the following additional table references, illustrated in the sketch after this list:

- Historical versions of the table definition and rows using FOR SYSTEM_TIME AS OF.
- Field paths, or any path that resolves to a field within a data type (such as a STRUCT).
- Flattened arrays.
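A minimal sketch of the first and last of these, assuming a hypothetical mydataset.orders table with an ARRAY column named items:

```sql
-- Historical rows: read the table as it was one hour ago.
SELECT *
FROM mydataset.orders FOR SYSTEM_TIME AS OF
  TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

-- Flattened array: produce one row per element of the items array.
SELECT order_id, item
FROM mydataset.orders, UNNEST(items) AS item;
```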
JOIN types

Both Amazon Redshift and BigQuery support the following types of join:

- [INNER] JOIN
- LEFT [OUTER] JOIN
- RIGHT [OUTER] JOIN
- FULL [OUTER] JOIN
- CROSS JOIN and the equivalent implicit comma cross join.

The following minor difference applies: in BigQuery, JOIN clauses require a JOIN condition unless the clause is a CROSS JOIN or one of the joined tables is a field within a data type or an array.
WITH clause

A BigQuery WITH clause contains one or more named subqueries that execute when a subsequent SELECT statement references them. Amazon Redshift WITH clauses behave the same as BigQuery's, with the exception that in Amazon Redshift you can evaluate the clause once and reuse its results.
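The syntax itself is identical in both dialects. A minimal sketch, assuming a hypothetical mydataset.sales table:

```sql
WITH recent_sales AS (
  SELECT customer_id, amount
  FROM mydataset.sales
  WHERE sale_date >= DATE '2024-01-01'
)
SELECT customer_id, SUM(amount) AS total_amount
FROM recent_sales
GROUP BY customer_id;
```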
Set operators

There are some minor differences between Amazon Redshift set operators and BigQuery set operators. However, all set operations that are feasible in Amazon Redshift are replicable in BigQuery.

Amazon Redshift | BigQuery |
---|---|
SELECT * FROM table1 UNION SELECT * FROM table2 | SELECT * FROM table1 UNION DISTINCT SELECT * FROM table2 Note: Both BigQuery and Amazon Redshift support the UNION ALL operator. |
SELECT * FROM table1 INTERSECT SELECT * FROM table2 | SELECT * FROM table1 INTERSECT DISTINCT SELECT * FROM table2 |
SELECT * FROM table1 EXCEPT SELECT * FROM table2 | SELECT * FROM table1 EXCEPT DISTINCT SELECT * FROM table2 |
SELECT * FROM table1 MINUS SELECT * FROM table2 | SELECT * FROM table1 EXCEPT DISTINCT SELECT * FROM table2 |
SELECT * FROM table1 UNION SELECT * FROM table2 UNION ALL SELECT * FROM table3 | SELECT * FROM table1 UNION DISTINCT (SELECT * FROM table2 UNION ALL SELECT * FROM table3) Note: BigQuery requires parentheses to separate different set operations. If the same set operator is repeated, parentheses are not necessary. |
ORDER BY clause

There are some minor differences between Amazon Redshift ORDER BY clauses and BigQuery ORDER BY clauses.

Amazon Redshift | BigQuery |
---|---|
In Amazon Redshift, NULLs are ranked last by default (ascending order). | In BigQuery, NULLs are ranked first by default (ascending order). |
SELECT expression FROM table ORDER BY expression LIMIT ALL | SELECT expression FROM table ORDER BY expression Note: BigQuery does not use the LIMIT ALL syntax, but ORDER BY sorts all rows by default, resulting in the same behavior as Amazon Redshift's LIMIT ALL clause. We highly recommend including a LIMIT clause with every ORDER BY clause. Ordering all result rows unnecessarily degrades query execution performance. |
SELECT expression FROM table ORDER BY expression OFFSET start | SELECT expression FROM table ORDER BY expression LIMIT count OFFSET start Note: In BigQuery, OFFSET must be used together with a LIMIT count. Make sure to set the count INT64 value to the minimum necessary ordered rows. Ordering all result rows unnecessarily degrades query execution performance. |
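Because the NULL ordering defaults differ, consider stating the ordering explicitly when migrating. A minimal sketch, assuming a hypothetical mydataset.results table:

```sql
-- Reproduce Amazon Redshift's default (NULLs last in ascending order) in BigQuery:
SELECT name, score
FROM mydataset.results
ORDER BY score ASC NULLS LAST
LIMIT 100;  -- recommended, to avoid ordering all result rows
```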
Conditions

The following table shows Amazon Redshift conditions, or predicates, that are specific to Amazon Redshift and must be converted to their BigQuery equivalent.

Amazon Redshift | BigQuery |
---|---|
expression ILIKE pattern | LOWER(expression) LIKE LOWER(pattern) |
expression LIKE pattern ESCAPE 'escape_char' | expression LIKE pattern Note: BigQuery does not support custom escape characters. You must use two backslashes \\ as escape characters for BigQuery. |
expression [NOT] SIMILAR TO pattern | IF(LENGTH(REGEXP_REPLACE(expression, pattern, '')) = 0, TRUE, FALSE) Note: If NOT is specified, wrap the above IF expression in a NOT expression as shown below: NOT IF(LENGTH(REGEXP_REPLACE(expression, pattern, '')) = 0, TRUE, FALSE) |
expression [!] ~ pattern | [NOT] REGEXP_CONTAINS(expression, regex) |
Functions
The following sections list Amazon Redshift functions and their BigQuery equivalents.
Aggregate functions
The following table shows mappings between common Amazon Redshift aggregate, aggregate analytic, and approximate aggregate functions and their BigQuery equivalents.

Amazon Redshift | BigQuery |
---|---|
APPROXIMATE COUNT(DISTINCT expression) | APPROX_COUNT_DISTINCT(expression) |
APPROXIMATE PERCENTILE_DISC(...) | APPROX_QUANTILES(expression, 100) |
AVG([DISTINCT] expression) | AVG([DISTINCT] expression) |
COUNT(expression) | COUNT(expression) |
LISTAGG(...) | STRING_AGG(...) |
MAX(expression) | MAX(expression) |
MEDIAN(median_expression) | PERCENTILE_CONT(median_expression, 0.5) OVER() |
MIN(expression) | MIN(expression) |
PERCENTILE_CONT(...) | PERCENTILE_CONT(...) Note: Does not cover aggregation use cases. |
STDDEV([DISTINCT] expression) | STDDEV([DISTINCT] expression) |
STDDEV_SAMP([DISTINCT] expression) | STDDEV_SAMP([DISTINCT] expression) |
STDDEV_POP([DISTINCT] expression) | STDDEV_POP([DISTINCT] expression) |
SUM([DISTINCT] expression) | SUM([DISTINCT] expression) |
VARIANCE([DISTINCT] expression) | VARIANCE([DISTINCT] expression) |
VAR_SAMP([DISTINCT] expression) | VAR_SAMP([DISTINCT] expression) |
VAR_POP([DISTINCT] expression) | VAR_POP([DISTINCT] expression) |
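As an illustration of the MEDIAN mapping above, the following sketch computes a median with the analytic PERCENTILE_CONT, assuming a hypothetical mydataset.sales table. Because PERCENTILE_CONT is an analytic function in BigQuery, DISTINCT reduces the repeated per-row result to a single value:

```sql
-- Amazon Redshift: SELECT MEDIAN(price) FROM sales;
SELECT DISTINCT PERCENTILE_CONT(price, 0.5) OVER () AS median_price
FROM mydataset.sales;
```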
BigQuery also offers the following aggregate, aggregate analytic, and approximate aggregate functions, which do not have a direct analog in Amazon Redshift:

- ANY_VALUE
- APPROX_TOP_COUNT
- APPROX_TOP_SUM
- ARRAY_AGG
- ARRAY_CONCAT_AGG
- COUNTIF
- CORR
- COVAR_POP
- COVAR_SAMP
Bitwise aggregate functions
The following table shows mappings between common Amazon Redshift bitwise aggregate functions and their BigQuery equivalents.

Amazon Redshift | BigQuery |
---|---|
BIT_AND(expression) | BIT_AND(expression) |
BIT_OR(expression) | BIT_OR(expression) |
BOOL_AND(expression) | LOGICAL_AND(expression) |
BOOL_OR(expression) | LOGICAL_OR(expression) |

BigQuery also offers the following bitwise aggregate function, which does not have a direct analog in Amazon Redshift:

- BIT_XOR
Window functions
The following table shows mappings between common Amazon Redshift window functions and their BigQuery equivalents. Windowing functions in BigQuery include analytic aggregate functions, aggregate functions, navigation functions, and numbering functions.

Amazon Redshift | BigQuery |
---|---|
AVG(expression) OVER | AVG(expression) OVER |
COUNT(expression) OVER | COUNT(expression) OVER |
CUME_DIST() OVER | CUME_DIST() OVER |
DENSE_RANK() OVER | DENSE_RANK() OVER |
FIRST_VALUE(expression) OVER | FIRST_VALUE(expression) OVER |
LAST_VALUE(expression) OVER | LAST_VALUE(expression) OVER |
LAG(value_expr [, offset]) OVER | LAG(value_expr [, offset]) OVER |
LEAD(value_expr [, offset]) OVER | LEAD(value_expr [, offset]) OVER |
LISTAGG(...) | STRING_AGG(...) |
MAX(expression) OVER | MAX(expression) OVER |
MEDIAN(median_expression) OVER | PERCENTILE_CONT(median_expression, 0.5) OVER |
MIN(expression) OVER | MIN(expression) OVER |
NTH_VALUE(expression, offset) OVER ([PARTITION BY window_partition] [ORDER BY window_ordering frame_clause]) | NTH_VALUE(expression, offset) OVER |
NTILE(expr) OVER | NTILE(expr) OVER |
PERCENT_RANK() OVER | PERCENT_RANK() OVER |
PERCENTILE_CONT(percentile) WITHIN GROUP (ORDER BY expr) OVER | PERCENTILE_CONT(expr, percentile) OVER |
PERCENTILE_DISC(percentile) WITHIN GROUP (ORDER BY expr) OVER | PERCENTILE_DISC(expr, percentile) OVER |
RANK() OVER | RANK() OVER |
RATIO_TO_REPORT(ratio_expression) OVER | ratio_expression / SUM(ratio_expression) OVER |
ROW_NUMBER() OVER | ROW_NUMBER() OVER |
STDDEV(expression) OVER | STDDEV(expression) OVER |
STDDEV_SAMP(expression) OVER | STDDEV_SAMP(expression) OVER |
STDDEV_POP(expression) OVER | STDDEV_POP(expression) OVER |
SUM(expression) OVER | SUM(expression) OVER |
VAR_POP(expression) OVER | VAR_POP(expression) OVER |
VAR_SAMP(expression) OVER | VAR_SAMP(expression) OVER |
VARIANCE(expression) OVER | VARIANCE(expression) OVER |
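As an example of the RATIO_TO_REPORT mapping above, divide the expression by its windowed sum. A minimal sketch, assuming a hypothetical mydataset.regional_sales table:

```sql
-- Amazon Redshift: RATIO_TO_REPORT(sales) OVER (PARTITION BY region)
SELECT
  region,
  store_id,
  sales / SUM(sales) OVER (PARTITION BY region) AS sales_ratio
FROM mydataset.regional_sales;
```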
Conditional expressions
The following table shows mappings between common Amazon Redshift conditional expressions and their BigQuery equivalents.

Amazon Redshift | BigQuery |
---|---|
CASE expression | CASE expression |
COALESCE(expression1[, ...]) | COALESCE(expression1[, ...]) |
DECODE(...) | CASE expression |
GREATEST(value [, ...]) | GREATEST(value [, ...]) |
LEAST(value [, ...]) | LEAST(value [, ...]) |
NVL(expression1[, ...]) | COALESCE(expression1[, ...]) |
NVL2(...) | IF(...) |
NULLIF(expression1, expression2) | NULLIF(expression1, expression2) |
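For example, Redshift's NVL2 can be rewritten with BigQuery's IF. A minimal sketch, assuming a hypothetical mydataset.contacts table:

```sql
-- Amazon Redshift: SELECT NVL2(phone, 'has phone', 'no phone') FROM contacts;
SELECT IF(phone IS NOT NULL, 'has phone', 'no phone') AS phone_status
FROM mydataset.contacts;
```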
BigQuery also offers the following conditional expressions, which do not have a direct analog in Amazon Redshift:

- IF
- IFNULL
Date and time functions
The following table shows mappings between common Amazon Redshift date and time functions and their BigQuery equivalents. BigQuery date and time functions include date functions, datetime functions, time functions, and timestamp functions.
Keep in mind that functions that seem identical in Amazon Redshift and BigQuery might return different data types.
Amazon Redshift | BigQuery |
---|---|
ADD_MONTHS(date, integer) | CAST(DATE_ADD(date, INTERVAL integer MONTH) AS TIMESTAMP) |
timestamptz_or_timestamp AT TIME ZONE timezone | PARSE_TIMESTAMP(...) Note: Time zones are used when parsing timestamps or formatting timestamps for display. A string-formatted timestamp might include a time zone, but when BigQuery parses the string, it stores the timestamp in the equivalent UTC time. When a time zone is not explicitly specified, the default time zone, UTC, is used. Time zone names or offsets from UTC (-HH:MM) are supported, but time zone abbreviations (such as PDT) are not supported. |
CONVERT_TIMEZONE(...) | PARSE_TIMESTAMP(...) Note: source_timezone is UTC in BigQuery. |
CURRENT_DATE Note: Returns start date for the current transaction in the current session time zone (UTC by default). | CURRENT_DATE() Note: Returns start date for the current statement in the UTC time zone. |
DATE_CMP(date1, date2) | CASE ... |
DATE_CMP_TIMESTAMP(date1, date2) | CASE ... |
DATE_CMP_TIMESTAMPTZ(date, timestamptz) | CASE ... |
DATE_PART_YEAR(date) | EXTRACT(YEAR FROM date) |
DATEADD(date_part, interval, date) | CAST(...) |
DATEDIFF(...) | DATE_DIFF(...) |
DATE_PART(date_part, date) | EXTRACT(date_part FROM date) |
DATE_TRUNC('date_part', timestamp) | TIMESTAMP_TRUNC(timestamp, date_part) |
EXTRACT(date_part FROM timestamp) | EXTRACT(date_part FROM timestamp) |
GETDATE() | PARSE_TIMESTAMP(...) |
INTERVAL_CMP(...) | For intervals in Redshift, there are 360 days in a year. In BigQuery, you can use a user-defined function (UDF) to parse a Redshift interval and translate it to seconds (CREATE TEMP FUNCTION ...). To compare interval literals, perform: IF(...) |
LAST_DAY(date) | DATE_SUB(...) See the sketch after this table. |
MONTHS_BETWEEN(...) | DATE_DIFF(...) |
NEXT_DAY(date, day) | DATE_ADD(...) |
SYSDATE Note: Returns start timestamp for the current transaction in the current session time zone (UTC by default). | CURRENT_TIMESTAMP() Note: Returns start timestamp for the current statement in the UTC time zone. |
TIMEOFDAY() | FORMAT_TIMESTAMP(...) |
TIMESTAMP_CMP(...) | CASE ... |
TIMESTAMP_CMP_DATE(...) | CASE ... |
TIMESTAMP_CMP_TIMESTAMPTZ(...) Note: Redshift compares timestamps in the user session-defined time zone. Default user session time zone is UTC. | CASE ... Note: BigQuery compares timestamps in the UTC time zone. |
TIMESTAMPTZ_CMP(...) Note: Redshift compares timestamps in the user session-defined time zone. Default user session time zone is UTC. | CASE ... Note: BigQuery compares timestamps in the UTC time zone. |
TIMESTAMPTZ_CMP_DATE(...) Note: Redshift compares timestamps in the user session-defined time zone. Default user session time zone is UTC. | CASE ... Note: BigQuery compares timestamps in the UTC time zone. |
TIMESTAMPTZ_CMP_TIMESTAMP(...) Note: Redshift compares timestamps in the user session-defined time zone. Default user session time zone is UTC. | CASE ... Note: BigQuery compares timestamps in the UTC time zone. |
TIMEZONE(...) | PARSE_TIMESTAMP(...) Note: Time zones are used when parsing timestamps or formatting timestamps for display. A string-formatted timestamp might include a time zone, but when BigQuery parses the string, it stores the timestamp in the equivalent UTC time. When a time zone is not explicitly specified, the default time zone, UTC, is used. Time zone names or offsets from UTC (-HH:MM) are supported, but time zone abbreviations (such as PDT) are not supported. |
TO_TIMESTAMP(timestamp, format) | PARSE_TIMESTAMP(...) Note: BigQuery follows a different set of format elements. Time zones are used when parsing timestamps or formatting timestamps for display. A string-formatted timestamp might include a time zone, but when BigQuery parses the string, it stores the timestamp in the equivalent UTC time. When a time zone is not explicitly specified, the default time zone, UTC, is used. Time zone names or offsets from UTC (-HH:MM) are supported in the format string, but time zone abbreviations (such as PDT) are not supported. |
TRUNC(timestamp) | CAST(timestamp AS DATE) |
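As one worked example from the table above, Redshift's LAST_DAY can be emulated by advancing to the first day of the next month and stepping back one day. A sketch, assuming a hypothetical mydataset.sales table with a DATE column sale_date:

```sql
-- Amazon Redshift: SELECT LAST_DAY(sale_date) FROM sales;
SELECT
  DATE_SUB(
    DATE_ADD(DATE_TRUNC(sale_date, MONTH), INTERVAL 1 MONTH),
    INTERVAL 1 DAY
  ) AS month_end
FROM mydataset.sales;
```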
BigQuery also offers the following date and time functions, which do not have a direct analog in Amazon Redshift:

- EXTRACT
- DATE
- DATE_SUB
- DATE_ADD (returning DATE data type)
- DATE_FROM_UNIX_DATE
- FORMAT_DATE
- PARSE_DATE
- UNIX_DATE
- DATETIME
- DATETIME_ADD
- DATETIME_SUB
- DATETIME_DIFF
- DATETIME_TRUNC
- FORMAT_DATETIME
- PARSE_DATETIME
- CURRENT_TIME
- TIME
- TIME_ADD
- TIME_SUB
- TIME_DIFF
- TIME_TRUNC
- FORMAT_TIME
- PARSE_TIME
- TIMESTAMP_SECONDS
- TIMESTAMP_MILLIS
- TIMESTAMP_MICROS
- UNIX_SECONDS
- UNIX_MILLIS
- UNIX_MICROS
Mathematical operators
The following table shows mappings between common Amazon Redshift mathematical operators and their BigQuery equivalents.

Amazon Redshift | BigQuery |
---|---|
X + Y | X + Y |
X - Y | X - Y |
X * Y | X * Y |
X / Y Note: If the operator is performing integer division (in other words, if X and Y are both integers), an integer is returned. If the operator is performing non-integer division, a non-integer is returned. | If integer division: CAST(FLOOR(X / Y) AS INT64) If not integer division: X / Y Note: Division in BigQuery returns a non-integer. To prevent errors from a division operation (division by zero error), use SAFE_DIVIDE(X, Y) or IEEE_DIVIDE(X, Y). |
X % Y | MOD(X, Y) Note: To prevent errors from a division operation (division by zero error), use SAFE.MOD(X, Y). SAFE.MOD(X, 0) results in 0. |
X ^ Y | POWER(X, Y) Note: Unlike Amazon Redshift, the ^ operator in BigQuery performs bitwise xor. |
\|/ X | SQRT(X) Note: To prevent errors from a square root operation (negative input), use SAFE.SQRT(X). Negative input with SAFE.SQRT(X) results in NULL. |
\|\|/ X | POWER(X, 1/3) Note: BigQuery's POWER(X, Y) returns an error if X is a finite value less than 0 and Y is a noninteger. |
@ X | ABS(X) |
X << Y | X << Y Note: This operator returns 0 or a byte sequence of b'\x00' if the second operand Y is greater than or equal to the bit length of the first operand X (for example, 64 if X has the type INT64). This operator throws an error if Y is negative. |
X >> Y | X >> Y Note: Shifts the first operand X to the right. This operator does not do sign bit extension with a signed type (it fills vacant bits on the left with 0). This operator returns 0 or a byte sequence of b'\x00' if the second operand Y is greater than or equal to the bit length of the first operand X (for example, 64 if X has the type INT64). This operator throws an error if Y is negative. |
X & Y | X & Y |
X \| Y | X \| Y |
~X | ~X |
BigQuery also offers the following mathematical operator, which does not have a direct analog in Amazon Redshift:

- X ^ Y (bitwise xor)
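The following sketch contrasts the division and modulo variants mentioned in the notes above; all values are literals, so it runs as-is:

```sql
SELECT
  10 / 4                        AS plain_division,    -- 2.5; / never performs integer division
  CAST(FLOOR(10 / 4) AS INT64)  AS integer_division,  -- 2
  SAFE_DIVIDE(10, 0)            AS safe_division,     -- NULL instead of a division-by-zero error
  IEEE_DIVIDE(10, 0)            AS ieee_division,     -- +inf as FLOAT64
  SAFE.MOD(10, 0)               AS safe_modulo;       -- no division-by-zero error
```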
Math functions
Amazon Redshift | BigQuery |
---|---|
ABS(number) | ABS(number) |
ACOS(number) | ACOS(number) |
ASIN(number) | ASIN(number) |
ATAN(number) | ATAN(number) |
ATAN2(number1, number2) | ATAN2(number1, number2) |
CBRT(number) | POWER(number, 1/3) |
CEIL(number) | CEIL(number) |
CEILING(number) | CEILING(number) |
CHECKSUM(expression) | FARM_FINGERPRINT(expression) |
COS(number) | COS(number) |
COT(number) | 1/TAN(number) |
DEGREES(number) | number*180/ACOS(-1) |
DEXP(number) | EXP(number) |
DLOG1(number) | LN(number) |
DLOG10(number) | LOG10(number) |
EXP(number) | EXP(number) |
FLOOR(number) | FLOOR(number) |
LN(number) | LN(number) |
LOG(number) | LOG10(number) |
MOD(number1, number2) | MOD(number1, number2) |
PI | ACOS(-1) |
POWER(expression1, expression2) | POWER(expression1, expression2) |
RADIANS(number) | ACOS(-1)*(number/180) |
RANDOM() | RAND() |
ROUND(number [, integer]) | ROUND(number [, integer]) |
SIN(number) | SIN(number) |
SIGN(number) | SIGN(number) |
SQRT(number) | SQRT(number) |
TAN(number) | TAN(number) |
TO_HEX(number) | FORMAT('%x', number) |
TRUNC(number [, integer]) | TRUNC(number [, integer]) |
String functions
Amazon Redshift | BigQuery |
---|---|
string1 \|\| string2 | CONCAT(string1, string2) |
BPCHARCMP(string1, string2) | CASE ... |
BTRIM(string [, matching_string]) | TRIM(string [, matching_string]) |
BTTEXT_PATTERN_CMP(string1, string2) | CASE ... |
CHAR_LENGTH(expression) | CHAR_LENGTH(expression) |
CHARACTER_LENGTH(expression) | CHARACTER_LENGTH(expression) |
CHARINDEX(substring, string) | STRPOS(string, substring) |
CHR(number) | CODE_POINTS_TO_STRING([number]) |
CONCAT(string1, string2) | CONCAT(string1, string2) Note: BigQuery's CONCAT(...) supports concatenating any number of strings. |
CRC32 | Custom user-defined function |
FUNC_SHA1(string) | SHA1(string) |
INITCAP | INITCAP |
LEFT(string, integer) | SUBSTR(string, 0, integer) |
RIGHT(string, integer) | SUBSTR(string, -integer) |
LEN(expression) | LENGTH(expression) |
LENGTH(expression) | LENGTH(expression) |
LOWER(string) | LOWER(string) |
LPAD(string1, length[, string2]) | LPAD(string1, length[, string2]) |
RPAD(string1, length[, string2]) | RPAD(string1, length[, string2]) |
LTRIM(string, trim_chars) | LTRIM(string, trim_chars) |
MD5(string) | MD5(string) |
OCTET_LENGTH(expression) | BYTE_LENGTH(expression) |
POSITION(substring IN string) | STRPOS(string, substring) |
QUOTE_IDENT(string) | CONCAT('"', string, '"') |
QUOTE_LITERAL(string) | CONCAT("'", string, "'") |
REGEXP_COUNT(source_string, pattern ...) | ARRAY_LENGTH(REGEXP_EXTRACT_ALL(...)) If position is specified: ARRAY_LENGTH(REGEXP_EXTRACT_ALL(...)) Note: BigQuery provides regular expression support using the re2 library; see that documentation for its regular expression syntax. |
REGEXP_INSTR(...) | IFNULL(STRPOS(...), ...) If source_string is specified: REGEXP_REPLACE(...) If position is specified: IFNULL(STRPOS(...), ...) If occurrence is specified: IFNULL(STRPOS(...), ...) Note: BigQuery provides regular expression support using the re2 library; see that documentation for its regular expression syntax. |
REGEXP_REPLACE(source_string, ...) | REGEXP_REPLACE(...) If source_string is specified: REGEXP_REPLACE(...) If position is specified: CASE ... |
REGEXP_SUBSTR(source_string, pattern ...) | REGEXP_EXTRACT(...) If position is specified: REGEXP_EXTRACT(...) If occurrence is specified: REGEXP_EXTRACT_ALL(...) Note: BigQuery provides regular expression support using the re2 library; see that documentation for its regular expression syntax. |
REPEAT(string, integer) | REPEAT(string, integer) |
REPLACE(string, old_chars, new_chars) | REPLACE(string, old_chars, new_chars) |
REPLICATE(string, integer) | REPEAT(string, integer) |
REVERSE(expression) | REVERSE(expression) |
RTRIM(string, trim_chars) | RTRIM(string, trim_chars) |
SPLIT_PART(string, delimiter, part) | SPLIT(...) |
STRPOS(string, substring) | STRPOS(string, substring) |
STRTOL(string, base) | No direct equivalent; can be implemented with a user-defined function. |
SUBSTRING(...) | SUBSTR(...) |
TEXTLEN(expression) | LENGTH(expression) |
TRANSLATE(...) | Can be implemented using UDFs (CREATE TEMP FUNCTION ...), as shown in the sketch after this table. |
TRIM([BOTH] string) | TRIM(string) |
TRIM([BOTH] characters FROM string) | TRIM(string, characters) |
UPPER(string) | UPPER(string) |
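The TRANSLATE row above calls for a user-defined function. One possible sketch uses a temporary JavaScript UDF; the function name and body are assumptions, written to follow Redshift's TRANSLATE semantics, where characters without a counterpart are removed:

```sql
CREATE TEMP FUNCTION translate(expression STRING, from_chars STRING, to_chars STRING)
RETURNS STRING
LANGUAGE js AS r"""
  if (expression === null) return null;
  let out = '';
  for (const ch of expression) {
    const i = from_chars.indexOf(ch);
    if (i === -1) {
      out += ch;                 // character not listed: keep it
    } else if (i < to_chars.length) {
      out += to_chars[i];        // replace with the corresponding character
    }                            // else: no counterpart, drop the character
  }
  return out;
""";

SELECT translate('mint tea', 'inea', 'osin');  -- returns "most tin"
```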
Data type formatting functions
Amazon Redshift | BigQuery |
---|---|
CAST(expression AS type) | CAST(expression AS type) |
expression :: type | CAST(expression AS type) |
CONVERT(type, expression) | CAST(expression AS type) |
TO_CHAR(timestamp_expression, format) | FORMAT_TIMESTAMP(format, timestamp_expression) Note: BigQuery and Amazon Redshift differ in how to specify a format string for timestamp_expression. |
TO_CHAR(numeric_expression, format) | FORMAT(format, numeric_expression) Note: BigQuery and Amazon Redshift differ in how to specify a format string for numeric_expression. |
TO_DATE(date_string, format) | PARSE_DATE(format, date_string) Note: BigQuery and Amazon Redshift differ in how to specify a format string for date_string. |
TO_NUMBER(string, format) | CAST(...) Note: BigQuery and Amazon Redshift differ in how to specify a numeric format string. |
BigQuery also supports SAFE_CAST(expression AS typename), which returns NULL if BigQuery is unable to perform a cast; for example, SAFE_CAST("apple" AS INT64) returns NULL.
DML syntax
This section addresses differences in data manipulation language (DML) syntax between Amazon Redshift and BigQuery.
INSERT statement

Amazon Redshift offers a configurable DEFAULT keyword for columns. In BigQuery, the DEFAULT value for nullable columns is NULL, and DEFAULT is not supported for required columns. Most Amazon Redshift INSERT statements are compatible with BigQuery. The following table shows exceptions.

Amazon Redshift | BigQuery |
---|---|
INSERT INTO table (column1 [, ...]) ... | INSERT [INTO] table (column1 [, ...]) ... |
INSERT INTO table (column1, [,...]) VALUES (...) | INSERT [INTO] table (column1, [,...]) ... |
BigQuery also supports inserting values using a subquery (where one of the values is computed using a subquery), which is not supported in Amazon Redshift. For example:
```sql
INSERT INTO table (column1, column2)
VALUES ('value_1', (
  SELECT column2
  FROM table2
));
```
COPY statement

Amazon Redshift's COPY command loads data into a table from data files or from an Amazon DynamoDB table. BigQuery does not use the SQL COPY command to load data, but you can use any of several non-SQL tools and options to load data into BigQuery tables. You can also use data pipeline sinks provided in Apache Spark or Apache Beam to write data into BigQuery.
UPDATE statement

Most Amazon Redshift UPDATE statements are compatible with BigQuery. The following table shows exceptions.

Amazon Redshift | BigQuery |
---|---|
UPDATE table ... | UPDATE table ... Note: All UPDATE statements in BigQuery require a WHERE keyword, followed by a condition. |
UPDATE table ... | UPDATE table ... Note: BigQuery's UPDATE command does not support DEFAULT values. If the Amazon Redshift UPDATE statement does not include a WHERE clause, the BigQuery UPDATE statement should be conditioned WHERE TRUE. |
DELETE and TRUNCATE statements

The DELETE and TRUNCATE statements are both ways to remove rows from a table without affecting the table schema or indexes.

In Amazon Redshift, the TRUNCATE statement is recommended over an unqualified DELETE statement because it is faster and does not require VACUUM and ANALYZE operations afterward. However, you can use DELETE statements to achieve the same effect.

In BigQuery, the DELETE statement must have a WHERE clause. For more information about DELETE in BigQuery, see the BigQuery DELETE examples in the DML documentation.
Amazon Redshift | BigQuery |
---|---|
DELETE [FROM] table_name TRUNCATE [TABLE] table_name | DELETE FROM table_name WHERE TRUE BigQuery DELETE statements require a WHERE clause. |
DELETE FROM table_name USING ... | DELETE FROM table_name ... In Amazon Redshift, USING allows additional tables to be referenced in the WHERE clause. This can be achieved in BigQuery by using a subquery in the WHERE clause, as shown in the sketch after this table. |
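For example, a Redshift DELETE ... USING statement can be rewritten with a subquery. A minimal sketch, assuming hypothetical orders and customers tables:

```sql
-- Amazon Redshift:
--   DELETE FROM orders USING customers
--   WHERE orders.customer_id = customers.id AND customers.status = 'closed';

-- BigQuery:
DELETE FROM mydataset.orders
WHERE customer_id IN (
  SELECT id FROM mydataset.customers WHERE status = 'closed'
);
```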
MERGE statement

The MERGE statement can combine INSERT, UPDATE, and DELETE operations into a single upsert statement and perform the operations atomically. The MERGE operation must match at most one source row for each target row.

Amazon Redshift does not support a single MERGE command. However, a merge operation can be performed in Amazon Redshift by performing INSERT, UPDATE, and DELETE operations in a transaction.
Merge operation by replacing existing rows

In Amazon Redshift, an overwrite of all of the columns in the target table can be performed using a DELETE statement and then an INSERT statement. The DELETE statement removes rows that should be updated, and then the INSERT statement inserts the updated rows. BigQuery tables are limited to 1,000 DML statements per day, so you should consolidate INSERT, UPDATE, and DELETE statements into a single MERGE statement as shown in the following table.

Amazon Redshift | BigQuery |
---|---|
See Performing a merge operation by replacing existing rows. CREATE TEMP TABLE temp_table; ... | MERGE target ... Note: All columns must be listed if updating all columns. |
See Performing a merge operation by specifying a column list. CREATE TEMP TABLE temp_table; ... | MERGE target ... |
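The consolidated statement might look like the following sketch, assuming hypothetical target and source tables keyed on id:

```sql
MERGE mydataset.target AS t
USING mydataset.source AS s
ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET value = s.value      -- list every column when replacing whole rows
WHEN NOT MATCHED THEN
  INSERT (id, value) VALUES (s.id, s.value);
```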
DDL syntax
This section addresses differences in data definition language syntax between Amazon Redshift and BigQuery.
SELECT INTO statement

In Amazon Redshift, the SELECT INTO statement can be used to insert the results of a query into a new table, combining table creation and insertion.

Amazon Redshift | BigQuery |
---|---|
SELECT expression, ... INTO table ... | INSERT table ... |
WITH subquery_table AS (SELECT ...) SELECT expression, ... INTO table ... | INSERT table ... |
SELECT expression ... INTO TEMP table ... | BigQuery offers several ways to emulate temporary tables. See the temporary tables section for more information. |
CREATE TABLE statement

Most Amazon Redshift CREATE TABLE statements are compatible with BigQuery, except for the following syntax elements, which are not used in BigQuery:

Amazon Redshift | BigQuery |
---|---|
CREATE TABLE table_name (...) Note: UNIQUE and PRIMARY KEY constraints are informational and are not enforced by the Amazon Redshift system. | CREATE TABLE table_name (...) |
CREATE TABLE table_name ... Note: UNIQUE and PRIMARY KEY constraints are informational and are not enforced by the Amazon Redshift system. | CREATE TABLE table_name ... Note: BigQuery does not use UNIQUE, PRIMARY KEY, or FOREIGN KEY table constraints. To achieve similar optimization that these constraints provide during query execution, partition and cluster your BigQuery tables. CLUSTER BY supports up to 4 columns. |
CREATE TABLE table_name LIKE ... | Reference this example to learn how to use the INFORMATION_SCHEMA tables to copy column names, data types, and NOT NULL constraints to a new table. |
CREATE TABLE table_name ... BACKUP NO Note: In Amazon Redshift, the BACKUP NO setting is specified to save processing time and reduce storage space. | The BACKUP NO table option is not used or needed because BigQuery automatically keeps up to 7 days of historical versions of all of your tables with no effect on processing time or billed storage. |
CREATE TABLE table_name ... SORTKEY ... | BigQuery supports clustering, which allows storing keys in sorted order. |
CREATE TABLE table_name ... | CREATE TABLE table_name ... |
CREATE TABLE IF NOT EXISTS table_name ... | CREATE TABLE IF NOT EXISTS table_name ... |
BigQuery also supports the DDL statement CREATE OR REPLACE TABLE, which overwrites a table if it already exists.

BigQuery's CREATE TABLE statement also supports the following clauses, which do not have an Amazon Redshift equivalent:

- PARTITION BY
- CLUSTER BY
- OPTIONS

For more information about CREATE TABLE in BigQuery, see the BigQuery CREATE TABLE examples in the DDL documentation.
Temporary tables
Amazon Redshift supports temporary tables, which are only visible within the current session. There are several ways to emulate temporary tables in BigQuery:

- Dataset TTL: Create a dataset that has a short time to live (for example, one hour) so that any tables created in the dataset are effectively temporary because they won't persist longer than the dataset's time to live. You can prefix all of the table names in this dataset with temp to clearly denote that the tables are temporary.
- Table TTL: Create a table that has a table-specific short time to live using DDL statements similar to the following (the column names and types are illustrative):

```sql
CREATE TABLE temp.name (col1 INT64, col2 STRING)
OPTIONS (expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR));
```
CREATE VIEW statement

The following table shows equivalents between Amazon Redshift and BigQuery for the CREATE VIEW statement.

Amazon Redshift | BigQuery |
---|---|
CREATE VIEW view_name AS SELECT ... | CREATE VIEW view_name AS SELECT ... |
CREATE OR REPLACE VIEW view_name AS SELECT ... | CREATE OR REPLACE VIEW view_name AS SELECT ... |
CREATE VIEW view_name ... | CREATE VIEW view_name AS SELECT ... |
Not supported. | CREATE VIEW IF NOT EXISTS view_name ... Creates a new view only if the view does not exist in the specified dataset. |
CREATE VIEW view_name ... In Amazon Redshift, a late binding view is required in order to reference an external table. | In BigQuery, to create a view, all referenced objects must already exist. BigQuery allows you to query external data sources. |
User-defined functions (UDFs)
A UDF lets you create functions for custom operations. These functions accept columns of input, perform actions, and return the result of those actions as a value.
Both Amazon Redshift and BigQuery support UDFs using SQL expressions. Additionally, in Amazon Redshift you can create a Python-based UDF, and in BigQuery you can create a JavaScript-based UDF.
Refer to the Google Cloud BigQuery utilities GitHub repository for a library of common BigQuery UDFs.
CREATE FUNCTION syntax

The following table addresses differences in SQL UDF creation syntax between Amazon Redshift and BigQuery.

Amazon Redshift | BigQuery |
---|---|
CREATE [OR REPLACE] FUNCTION ... | CREATE [OR REPLACE] FUNCTION ... Note: In a BigQuery SQL UDF, a return data type is optional. BigQuery infers the result type of the function from the SQL function body when a query calls the function. |
CREATE [OR REPLACE] FUNCTION ... | CREATE [OR REPLACE] FUNCTION ... Note: Function volatility is not a configurable parameter in BigQuery. All BigQuery UDF volatility is equivalent to Amazon Redshift's IMMUTABLE volatility (that is, it does not do database lookups or otherwise use information not directly present in its argument list). |
CREATE [OR REPLACE] FUNCTION ... Note: Amazon Redshift supports only a SQL SELECT clause as function definition. Also, the SELECT clause cannot include any of the FROM, INTO, WHERE, GROUP BY, ORDER BY, and LIMIT clauses. | CREATE [OR REPLACE] FUNCTION ... Note: BigQuery supports any SQL expression as function definition. However, referencing tables, views, or models is not supported. |
CREATE [OR REPLACE] FUNCTION ... | CREATE [OR REPLACE] FUNCTION function_name ([sql_arg_name sql_arg_data_type[, ...]]) RETURNS data_type AS sql_function_definition Note: A language literal need not be specified in a GoogleSQL UDF. BigQuery interprets the SQL expression by default. Also, the Amazon Redshift dollar quoting ($$) is not supported in BigQuery. |
CREATE [OR REPLACE] FUNCTION function_name (integer, integer) RETURNS integer IMMUTABLE AS $$ SELECT $1 + $2 $$ LANGUAGE sql | CREATE [OR REPLACE] FUNCTION ... Note: BigQuery UDFs require all input arguments to be named. The Amazon Redshift argument variables ($1, $2, …) are not supported in BigQuery. |
CREATE [OR REPLACE] FUNCTION ... Note: Amazon Redshift does not support ANY TYPE for SQL UDFs. However, it supports using the ANYELEMENT data type in Python-based UDFs. | CREATE [OR REPLACE] FUNCTION ... Note: BigQuery supports using ANY TYPE as argument type. The function accepts an input of any type for this argument. For more information, see templated parameter in BigQuery. |
BigQuery also supports the CREATE FUNCTION IF NOT EXISTS statement, which treats the query as successful and takes no action if a function with the same name already exists.

BigQuery's CREATE FUNCTION statement also supports creating TEMPORARY or TEMP functions, which do not have an Amazon Redshift equivalent.

See calling UDFs for details on executing a BigQuery persistent UDF.
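As an illustration of the syntax differences above, here is one possible translation of a simple Redshift SQL UDF; the function name is hypothetical:

```sql
-- Amazon Redshift:
--   CREATE FUNCTION f_multiply (integer, integer) RETURNS integer IMMUTABLE AS $$
--     SELECT $1 * $2
--   $$ LANGUAGE sql;

-- BigQuery: arguments must be named; RETURNS is optional for SQL UDFs.
CREATE TEMP FUNCTION multiply(x INT64, y INT64) AS (x * y);

SELECT multiply(6, 7) AS product;  -- 42
```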
DROP FUNCTION syntax

The following table addresses differences in DROP FUNCTION syntax between Amazon Redshift and BigQuery.

Amazon Redshift | BigQuery |
---|---|
DROP FUNCTION ... | DROP FUNCTION ... Note: BigQuery does not require using the function's signature for deleting the function. Also, removing function dependencies is not supported in BigQuery. |

BigQuery also supports the DROP FUNCTION IF EXISTS statement, which deletes the function only if the function exists in the specified dataset.

BigQuery requires that you specify the project_name if the function is not located in the current project.
UDF components

This section highlights the similarities and differences in UDF components between Amazon Redshift and BigQuery.

Component | Amazon Redshift | BigQuery |
---|---|---|
Name | Amazon Redshift recommends using the prefix f_ for function names to avoid conflicts with existing or future built-in SQL function names. | In BigQuery, you can use any custom function name. |
Arguments | Arguments are optional. You can use name and data types for Python UDF arguments and only data types for SQL UDF arguments. In SQL UDFs, you must refer to arguments using $1, $2, and so on. Amazon Redshift also restricts the number of arguments to 32. | Arguments are optional, but if you specify arguments, they must use both name and data types for both JavaScript and SQL UDFs. The maximum number of arguments for a persistent UDF is 256. |
Data type | Amazon Redshift supports a different set of data types for SQL and Python UDFs. For a Python UDF, the data type might also be ANYELEMENT. You must specify a RETURN data type for both SQL and Python UDFs. See Data types in this document for equivalents between data types in Amazon Redshift and in BigQuery. | BigQuery supports a different set of data types for SQL and JavaScript UDFs. For a SQL UDF, the data type might also be ANY TYPE. For more information, see templated parameters in BigQuery. The RETURN data type is optional for SQL UDFs. See SQL type encodings in JavaScript for information on how BigQuery data types map to JavaScript data types. |
Definition | For both SQL and Python UDFs, you must enclose the function definition using dollar quoting, as in a pair of dollar signs ($$), to indicate the start and end of the function statements. For SQL UDFs, Amazon Redshift supports only a SQL SELECT clause as the function definition. Also, the SELECT clause cannot include any of the FROM, INTO, WHERE, GROUP BY, ORDER BY, and LIMIT clauses. For Python UDFs, you can write a Python program using the Python 2.7 Standard Library or import your custom modules by creating one using the CREATE LIBRARY command. | In BigQuery, you need to enclose the JavaScript code in quotes. See Quoting rules for more information. For SQL UDFs, you can use any SQL expression as the function definition. However, BigQuery doesn't support referencing tables, views, or models. For JavaScript UDFs, you can include external code libraries directly using the OPTIONS section. You can also use the BigQuery UDF test tool to test your functions. |
Language | You must use the LANGUAGE literal to specify the language as either sql for SQL UDFs or plpythonu for Python UDFs. | You need not specify LANGUAGE for SQL UDFs but must specify the language as js for JavaScript UDFs. |
State | Amazon Redshift does not support creating temporary UDFs. Amazon Redshift provides an option to define the volatility of a function using VOLATILE, STABLE, or IMMUTABLE literals. This is used for optimization by the query optimizer. | BigQuery supports both persistent and temporary UDFs. You can reuse persistent UDFs across multiple queries, whereas you can only use temporary UDFs in a single query. Function volatility is not a configurable parameter in BigQuery. All BigQuery UDF volatility is equivalent to Amazon Redshift's IMMUTABLE volatility. |
Security and privileges | To create a UDF, you must have permission for usage on language for SQL or plpythonu (Python). By default, USAGE ON LANGUAGE SQL is granted to PUBLIC, but you must explicitly grant USAGE ON LANGUAGE PLPYTHONU to specific users or groups. Also, you must be a superuser to replace a UDF. | Granting explicit permissions for creating or deleting any type of UDF is not necessary in BigQuery. Any user assigned a role of BigQuery Data Editor (having bigquery.routines.* as one of the permissions) can create or delete functions for the specified dataset. BigQuery also supports creating custom roles. This can be managed using Cloud IAM. |
Limits | See Python UDF limits. | See UDF limits. |
Metadata and transaction SQL statements

Amazon Redshift | BigQuery |
---|---|
SELECT * FROM STL_ANALYZE WHERE name ... | Not used in BigQuery. You don't need to gather statistics in order to improve query performance. To get information about your data distribution, you can use approximate aggregate functions. |
ANALYZE [ table_name [ (column_name [, ...] ) ] ] | Not used in BigQuery. |
LOCK TABLE table_name; | Not used in BigQuery. |
BEGIN TRANSACTION; SELECT ... END TRANSACTION; | BigQuery uses snapshot isolation. For details, see Consistency guarantees. |
EXPLAIN ... | Not used in BigQuery. Similar features are the query plan explanation in the BigQuery Google Cloud console, and in audit logging in Cloud Monitoring. |
SELECT * FROM SVV_TABLE_INFO WHERE ... | SELECT * EXCEPT(is_typed) FROM mydataset.INFORMATION_SCHEMA.TABLES For more information, see Introduction to BigQuery INFORMATION_SCHEMA. |
VACUUM [table_name] | Not used in BigQuery. BigQuery clustered tables are automatically sorted. |
Multi-statement and multi-line SQL statements
Both Amazon Redshift and BigQuery support transactions (sessions) and therefore support statements separated by semicolons that are consistently executed together. For more information, see Multi-statement transactions.
Procedural SQL statements

CREATE PROCEDURE statement

Amazon Redshift | BigQuery |
---|---|
CREATE [OR REPLACE] PROCEDURE ... | CREATE PROCEDURE ... if a name is required. Otherwise, use inline with BEGIN or in a single line with CREATE TEMP FUNCTION. |
CALL ... | CALL ... |

Variable declaration and assignment

Amazon Redshift | BigQuery |
---|---|
DECLARE ... | DECLARE ... Declares a variable of the specified type. |
SET ... | SET ... Sets a variable to have the value of the provided expression, or sets multiple variables at the same time based on the result of multiple expressions. |
Error condition handlers

In Amazon Redshift, an error encountered during the execution of a stored procedure ends the execution flow, ends the transaction, and rolls back the transaction. These results occur because subtransactions are not supported. In an Amazon Redshift stored procedure, the only supported handler_statement is RAISE. In BigQuery, error handling is a core feature of the main control flow, similar to what other languages provide with TRY ... CATCH blocks.

Amazon Redshift | BigQuery |
---|---|
BEGIN ... EXCEPTION WHEN OTHERS THEN ... | BEGIN ... EXCEPTION WHEN ERROR THEN ... |
RAISE ... | RAISE ... |
[ <<label>> ] [ DECLARE declarations ] BEGIN ... | BEGIN ... |
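A minimal sketch of the BigQuery form, runnable as a script; the division by zero is deliberate:

```sql
BEGIN
  SELECT 1 / 0;  -- raises a division-by-zero error
EXCEPTION WHEN ERROR THEN
  SELECT @@error.message AS handled_error;  -- inspect the error, then continue or RAISE
END;
```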
Cursor declarations and operations

Because BigQuery doesn't support cursors or sessions, the following statements aren't used in BigQuery:

- DECLARE cursor_name CURSOR [FOR] ...
- PREPARE plan_name [ (datatype [, ...] ) ] AS statement
- OPEN cursor_name FOR SELECT ...
- FETCH [ NEXT | ALL | {FORWARD [ count | ALL ] } ] FROM cursor_name
- CLOSE cursor_name;

If you're using the cursor to return a result set, you can achieve similar behavior using temporary tables in BigQuery.
Dynamic SQL statements

The scripting feature in BigQuery supports dynamic SQL statements like those shown in the following table.

Amazon Redshift | BigQuery |
---|---|
EXECUTE ... | EXECUTE IMMEDIATE ... |
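A minimal sketch of EXECUTE IMMEDIATE in a BigQuery script, assuming a hypothetical mydataset.sales table:

```sql
DECLARE table_name STRING DEFAULT 'mydataset.sales';
DECLARE row_count INT64;

EXECUTE IMMEDIATE FORMAT('SELECT COUNT(*) FROM %s', table_name) INTO row_count;
SELECT row_count;
```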
Flow-of-control statements

Amazon Redshift | BigQuery |
---|---|
IF..THEN..ELSIF..THEN..ELSE..END IF | IF condition THEN stmts ELSEIF condition THEN stmts ELSE stmts END IF |
name CURSOR [ ( arguments ) ] FOR query | Cursors or sessions are not used in BigQuery. |
[<<label>>] LOOP statements END LOOP | LOOP sql_statement_list END LOOP |
WHILE condition LOOP stmts END LOOP | WHILE condition DO stmts END WHILE |
EXIT | BREAK |
Consistency guarantees and transaction isolation
Both Amazon Redshift and BigQuery are atomic, that is, ACID-compliant on a per-mutation level across many rows.
Transactions
Amazon Redshift supports serializable isolation by default for transactions. Amazon Redshift lets you specify any of the four SQL standard transaction isolation levels but processes all isolation levels as serializable.
BigQuery also supports transactions. BigQuery helps ensure optimistic concurrency control (first to commit has priority) with snapshot isolation, in which a query reads the last committed data before the query starts. This approach guarantees the same level of consistency on a per-row, per-mutation basis and across rows within the same DML statement, yet avoids deadlocks. In the case of multiple DML updates against the same table, BigQuery switches to pessimistic concurrency control. Load jobs can run completely independently and append to tables.
Rollback

If Amazon Redshift encounters any error while running a stored procedure, it rolls back all changes made in a transaction. Additionally, you can use the ROLLBACK transaction control statement in a stored procedure to discard all changes.

In BigQuery, you can use the ROLLBACK TRANSACTION statement.
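A minimal sketch of an explicit transaction in BigQuery, assuming a hypothetical mydataset.inventory table:

```sql
BEGIN TRANSACTION;

DELETE FROM mydataset.inventory WHERE quantity = 0;

-- Discard the change instead of committing it:
ROLLBACK TRANSACTION;
```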
Database limits
Check the BigQuery public documentation for the latest quotas and limits. Many quotas for large-volume users can be raised by contacting the Cloud support team. The following table shows a comparison of the Amazon Redshift and BigQuery database limits.
Limit | Amazon Redshift | BigQuery |
---|---|---|
Tables in each database for large and xlarge cluster node types | 9,900 | Unrestricted |
Tables in each database for 8xlarge cluster node types | 20,000 | Unrestricted |
User-defined databases you can create for each cluster | 60 | Unrestricted |
Maximum row size | 4 MB | 100 MB |