IBM Netezza SQL translation guide

IBM Netezza data warehousing is designed to work with Netezza-specific SQL syntax. Netezza SQL is based on Postgres 7.2. SQL scripts written for Netezza can't be used in a BigQuery data warehouse without alterations, because the SQL dialects vary.

This document details the similarities and differences in SQL syntax between Netezza and BigQuery in the following areas:

  • Data types
  • SQL language elements
  • Query syntax
  • Data manipulation language (DML)
  • Data definition language (DDL)
  • Stored procedures
  • Functions

You can also use batch SQL translation to migrate your SQL scripts in bulk, or interactive SQL translation to translate ad-hoc queries. IBM Netezza SQL/NZPLSQL is supported by both tools in preview.

Data types

Netezza BigQuery Notes
INTEGER/INT/INT4 INT64
SMALLINT/INT2 INT64
BYTEINT/INT1 INT64
BIGINT/INT8 INT64
DECIMAL NUMERIC The DECIMAL data type in Netezza is an alias for the NUMERIC data type.
NUMERIC NUMERIC INT64
NUMERIC(p,s) NUMERIC The NUMERIC type in BigQuery does not enforce custom digit or scale bounds (constraints) like Netezza does. BigQuery has fixed 9 digits after the decimal, while Netezza allows a custom setup. In Netezza, precision p can range from 1 to 38, and scale s from 0 to the precision.
FLOAT(p) FLOAT64
REAL/FLOAT(6) FLOAT64
DOUBLE PRECISION/FLOAT(14) FLOAT64
CHAR/CHARACTER STRING The STRING type in BigQuery is variable-length and does not require manually setting a max character length as the Netezza CHARACTER and VARCHAR types require. The default value of n in CHAR(n) is 1. The maximum character string size is 64,000.
VARCHAR STRING The STRING type in BigQuery is variable-length and does not require manually setting a max character length as the Netezza CHARACTER and VARCHAR types require. The maximum character string size is 64,000.
NCHAR STRING The STRING type in BigQuery is stored as variable length UTF-8 encoded Unicode. The maximum length is 16,000 characters.
NVARCHAR STRING The STRING type in BigQuery is stored as variable-length UTF-8-encoded Unicode. The maximum length is 16,000 characters.
VARBINARY BYTES
ST_GEOMETRY GEOGRAPHY
BOOLEAN/BOOL BOOL The BOOL type in BigQuery can only accept TRUE/FALSE, unlike the BOOL type in Netezza, which can accept a variety of values like 0/1, yes/no, true/false, on/off.
DATE DATE
TIME TIME
TIMETZ/TIME WITH TIME ZONE TIME Netezza stores the TIME data type in UTC and allows you to pass an offset from UTC using the WITH TIME ZONE syntax. The TIME data type in BigQuery represents a time that's independent of any date or time zone.
TIMESTAMP DATETIME The Netezza TIMESTAMP type does not include a time zone, the same as the BigQuery DATETIME type.
ARRAY There is no array data type in Netezza. The array type is instead stored in a varchar field.

Timestamp and date type formatting

For more information about the date type formatting that Netezza SQL uses, see the Netezza Date Time template patterns documentation. For more information about the date time functions, see the Netezza date/time functions documentation.

When you convert date type formatting elements from Netezza to GoogleSQL, you must pay particular attention to time zone differences between TIMESTAMP and DATETIME, as summarized in the following table:

Netezza BigQuery
CURRENT_TIMESTAMP
CURRENT_TIME

TIME information in Netezza can have different time zone information, which is defined using the WITH TIME ZONE syntax.
If possible, use the CURRENT_TIMESTAMP function, which is formatted correctly. However, the output format does not always show the UTC time zone (internally, BigQuery does not have a time zone). The DATETIME object in the bq command-line tool and Google Cloud console is formatted using a T separator according to RFC 3339. However, in Python and Java JDBC, a space is used as a separator. Use the explicit FORMAT_DATETIME function to define the date format correctly. Otherwise, an explicit cast is made to a string, for example:
CAST(CURRENT_DATETIME() AS STRING)
This also returns a space separator.
CURRENT_DATE CURRENT_DATE
CURRENT_DATE-3 BigQuery does not support arithmetic data operations. Instead, use the DATE_ADD function.

SELECT statement

Generally, the Netezza SELECT statement is compatible with BigQuery. The following table contains a list of exceptions:

Netezza BigQuery
A SELECT statement without FROM clause Supports special case such as the following:

SELECT 1 UNION ALL SELECT 2;

SELECT
  (subquery) AS flag,
  CASE WHEN flag = 1 THEN ...

In BigQuery, columns cannot reference the output of other columns defined within the same query. You must duplicate the logic or move the logic into a nested query.

Option 1

SELECT
  (subquery) AS flag,
  CASE WHEN (subquery) = 1 THEN ...

Option 2

SELECT
  q.*,
  CASE WHEN flag = 1 THEN ...
FROM (
  SELECT
    (subquery) AS flag,
    ...
  ) AS q

Comparison operators

Netezza BigQuery Description
exp = exp2 exp = exp2 Equal
exp <= exp2 exp <= exp2 Less than or equal to
exp < exp2 exp < exp2 Less than
exp <> exp2
exp != exp2
exp <> exp2
exp != exp2
Not equal
exp >= exp2 exp >= exp2 Greater than or equal to
exp > exp2 exp > exp2 Greater than

Built-in SQL functions

Netezza BigQuery Description
CURRENT_DATE CURRENT_DATE Get the current date (year, month, and day).
CURRENT_TIME CURRENT_TIME Get the current time with fraction.
CURRENT_TIMESTAMP CURRENT_TIMESTAMP Get the current system date and time, to the nearest full second.
NOW CURRENT_TIMESTAMP Get the current system date and time, to the nearest full second.
COALESCE(exp, 0) COALESCE(exp, 0) Replace NULL with zero.
NVL(exp, 0) IFNULL(exp, 0) Replace NULL with zero.
EXTRACT(DOY FROM timestamp_expression) EXTRACT(DAYOFYEAR FROM timestamp_expression) Return the number of days from the beginning of the year.
ADD_MONTHS(date_expr, num_expr) DATE_ADD(date, INTERVAL k MONTH) Add months to a date.
DURATION_ADD(date, k) DATE_ADD(date, INTERVAL k DAY) Perform addition on dates.
DURATION_SUBTRACT(date, k) DATE_SUB(date, INTERVAL k DAY) Perform subtraction on dates.
str1 || str2 CONCAT(str1, str2) Concatenate strings.

Functions

This section compares Netezza and BigQuery functions.

Aggregate functions

Analytical functions

Date and time functions

String functions

Math functions

DML syntax

This section compares Netezza and BigQuery DML syntax.

INSERT statement

Netezza BigQuery

INSERT INTO table VALUES (...);

INSERT INTO table (...) VALUES (...);


Netezza offers a DEFAULT keyword and other constraints for columns. In BigQuery, omitting column names in the INSERT statement is valid only if all columns are given.

INSERT INTO table (...) VALUES (...);
INSERT INTO table (...) VALUES (...);

INSERT INTO table VALUES (), ();

BigQuery imposes DML quotas, which restrict the number of DML statements you can execute daily. To make the best use of your quota, consider the following approaches:

  • Combine multiple rows in a single INSERT statement, instead of one row per INSERT statement.
  • Combine multiple DML statements (including an INSERT statement) using a MERGE statement.
  • Use a CREATE TABLE ... AS SELECT statement to create and populate new tables.

DML scripts in BigQuery have slightly different consistency semantics than the equivalent statements in Netezza. Also note that BigQuery does not offer constraints apart from NOT NULL.

For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation.

UPDATE statement

In Netezza, the WHERE clause is optional, but in BigQuery it is necessary.

Netezza BigQuery

UPDATE tbl
SET
tbl.col1=val1;

Not supported without the WHERE clause. Use a WHERE true clause to update all rows.

UPDATE A
SET
  y = B.y,
  z = B.z + 1
FROM B
WHERE A.x = B.x
  AND A.y IS NULL;

UPDATE A
SET
  y = B.y,
  z = B.z + 1
FROM B
WHERE A.x = B.x
  AND A.y IS NULL;

UPDATE A alias
SET x = x + 1
WHERE f(x) IN (0, 1)

UPDATE A
SET x = x + 1
WHERE f(x) IN (0, 1);

UPDATE A
SET z = B.z
FROM B
WHERE A.x = B.x
  AND A.y = B.y

UPDATE A
SET z = B.z
FROM B
WHERE A.x = B.x
  AND A.y = B.y;

For examples, see UPDATE examples.

Because of DML quotas, we recommend that you use larger MERGE statements instead of multiple single UPDATE and INSERT statements. DML scripts in BigQuery have slightly different consistency semantics than equivalent statements in Netezza. For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation.

DELETE and TRUNCATE statements

The DELETE and TRUNCATE statements are both ways to remove rows from a table without affecting the table schema or indexes. The TRUNCATE statement has the same effect as the DELETE statement, but is much faster than the DELETE statement for large tables. The TRUNCATE statement is supported in Netezza but not supported in BigQuery. However, you can use DELETE statements in both Netezza and BigQuery.

In BigQuery, the DELETE statement must have a WHERE clause. In Netezza, the WHERE clause is optional. If the WHERE clause is not specified, all the rows in the Netezza table are deleted.

Netezza BigQuery Description

BEGIN;
LOCK TABLE A IN EXCLUSIVE MODE;
DELETE FROM A;
INSERT INTO A SELECT * FROM B;
COMMIT;

Replacing the contents of a table with query output is the equivalent of a transaction. You can do this with either a query or a copy (cp) operation.

bq query \
--replace \
--destination_table \
tableA \
'SELECT * \
FROM tableB \
WHERE ...'

bq cp \
-f tableA tableB

Replace the contents of a table with the results of a query.

DELETE FROM database.table

DELETE FROM table WHERE TRUE;

In Netezza, when a delete statement is run, the rows are not deleted physically but only marked for deletion. Running the GROOM TABLE or nzreclaim commands later removes the rows marked for deletion and reclaims the corresponding disk space.
GROOM TABLE Netezza uses the GROOM TABLE command to reclaim disk space by removing rows marked for deletion.

MERGE statement

A MERGE statement must match at most one source row for each target row. DML scripts in BigQuery have slightly different consistency semantics than the equivalent statements in Netezza. For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation. For examples, see BigQuery MERGE examples and Netezza MERGE examples.

DDL syntax

This section compares Netezza and BigQuery DDL syntax.

CREATE TABLE statement

Netezza BigQuery Description
TEMP
TEMPORARY
With BigQuery's DDL support, you can create a table from the results of a query and specify its expiration at creation time. For example, for three days:

CREATE TABLE 'fh-bigquery.public_dump.vtemp'
OPTIONS(
expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(),
INTERVAL 3 DAY))
Create tables temporary to a session.
ZONE MAPS Not supported. Quick search for WHERE condition.
DISTRIBUTE ON PARTITION BY Partitioning. This is not a direct translation. DISTRIBUTE ON shares data between nodes, usually with a unique key for even distribution, while PARTITION BY prunes data into segments.
ORGANIZE ON CLUSTER BY Both Netezza and BigQuery support up to four keys for clustering. Netezza clustered base tables (CBT) provide equal precedence to each of the clustering columns. BigQuery gives precedence to the first column on which the table is clustered, followed by the second column, and so on.
ROW SECURITY Authorized View Row-level security.
CONSTRAINT Not supported Check constraints.

DROP statement

Netezza BigQuery Description
DROP TABLE DROP TABLE
DROP DATABASE DROP DATABASE
DROP VIEW DROP VIEW

Column options and attributes

Netezza BigQuery Description
NULL
NOT NULL
NULLABLE
REQUIRED
Specify if the column is allowed to contain NULL values.
REFERENCES Not supported Specify column constraint.
UNIQUE Not supported Each value in the column must be unique.
DEFAULT Not supported Default value for all values in the column.

Temporary tables

Netezza supports TEMPORARY tables that exist during the duration of a session.

To build a temporary table in BigQuery, do the following:

  1. Create a dataset that has a short time to live (for example, 12 hours).
  2. Create the temporary table in the dataset, with a table name prefix of temp. For example, to create a table that expires in one hour, do this:

    CREATE TABLE temp.name (col1, col2, ...)
    OPTIONS(expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(),
    INTERVAL 1 HOUR));
  3. Start reading and writing from the temporary table.

You can also remove duplicates independently in order to find errors in downstream systems.

Note that BigQuery does not support DEFAULT and IDENTITY (sequences) columns.

Procedural SQL statements

Netezza uses the NZPLSQL scripting language to work with stored procedures. NZPLSQL is based on Postgres' PL/pgSQL language. This section describes how to convert procedural SQL statements used in stored procedures, functions, and triggers from Netezza to BigQuery.

CREATE PROCEDURE statement

Netezza and BigQuery both support creating stored procedures by using the CREATE PROCEDURE statement. For more information, see Work with SQL stored procedures.

Variable declaration and assignment

Netezza BigQuery Description
DECLARE var datatype(len) [DEFAULT value]; DECLARE Declare variable.
SET var = value; SET Assign value to variable.

Exception handlers

Netezza supports exception handlers that can be triggered for certain error conditions. BigQuery does not support condition handlers.

Netezza BigQuery Description
EXCEPTION Not supported Declare SQL exception handler for general errors.

Dynamic SQL statements

Netezza supports dynamic SQL queries inside stored procedures. BigQuery does not support dynamic SQL statements.

Netezza BigQuery Description
EXECUTE IMMEDIATE sql_str; EXECUTE IMMEDIATE sql_str; Execute dynamic SQL.

Flow-of-control statements

Netezza BigQuery Description
IF THEN ELSE STATEMENT
IF condition
THEN ...
ELSE ...
END IF;
IF condition
THEN ...
ELSE ...
END IF;
Execute conditionally.
Iterative Control
FOR var AS SELECT ...
DO stmts END FOR;
FOR var AS cur CURSOR
FOR SELECT ...
DO stmts END FOR;
Not supported Iterate over a collection of rows.
Iterative Control
LOOP stmts END LOOP;
LOOP
sql_statement_list END LOOP;
Loop block of statements.
EXIT WHEN BREAK Exit a procedure.
WHILE *condition* LOOP WHILE condition
DO ...
END WHILE
Execute a loop of statements until a while condition fails.

Other statements and procedural language elements

Netezza BigQuery Description
CALL proc(param,...) Not supported Execute a procedure.
EXEC proc(param,...) Not supported Execute a procedure.
EXECUTE proc(param,...) Not supported Execute a procedure.

Multi-statement and multi-line SQL statements

Both Netezza and BigQuery support transactions (sessions) and therefore support statements separated by semicolons that are consistently executed together. For more information, see Multi-statement transactions.

Other SQL statements

Netezza BigQuery Description
GENERATE STATISTICS Generate statistics for all the tables in the current database.
GENERATE STATISTICS ON table_name Generate statistics for a specific table.
GENERATE STATISTICS ON table_name(col1,col4) Either use statistical functions like MIN, MAX, AVG, etc., use the UI, or use the Cloud Data Loss Prevention API. Generate statistics for specific columns in a table.
GENERATE STATISTICS ON table_name APPROX_COUNT_DISTINCT(col) Show the number of unique values for columns.
INSERT INTO table_name INSERT INTO table_name Insert a row.
LOCK TABLE table_name FOR EXCLUSIVE; Not supported Lock row.
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL ... BigQuery always uses Snapshot Isolation. For details, see Consistency guarantees and transaction isolation. Define the transaction isolation level.
BEGIN TRANSACTION
END TRANSACTION
COMMIT
BigQuery always uses Snapshot Isolation. For details, see Consistency guarantees and transaction isolation. Define the transaction boundary for multi-statement requests.
EXPLAIN ... Not supported. Similar features in the query plan and timeline Show query plan for a SELECT statement.
User Views metadata
System Views metadata
SELECT
* EXCEPT(is_typed)
FROM
mydataset.INFORMATION_SCHEMA.TABLES;

BigQuery Information Schema
Query objects in the database

Consistency guarantees and transaction isolation

Both Netezza and BigQuery are atomic, that is, ACID compliant on a per-mutation level across many rows. For example, a MERGE operation is completely atomic, even with multiple inserted values.

Transactions

Netezza syntactically accepts all four modes of ANSI SQL transaction isolation. However, regardless of what mode is specified, only the SERIALIZABLE mode is used, which provides the highest possible level of consistency. This mode also avoids dirty, nonrepeatable, and phantom reads between concurrent transactions. Netezza does not use conventional locking to enforce consistency. Instead, it uses serialization dependency checking, a form of optimistic concurrency control to automatically roll back the latest transaction when two transactions attempt to modify the same data.

BigQuery also supports transactions. BigQuery helps ensure optimistic concurrency control (first to commit has priority) with snapshot isolation, in which a query reads the last committed data before the query starts. This approach guarantees the same level of consistency on a per-row, per-mutation basis and across rows within the same DML statement, yet avoids deadlocks. In the case of multiple DML updates against the same table, BigQuery switches to pessimistic concurrency control. Load jobs can run completely independently and append to tables.

Rollback

Netezza supports the ROLLBACK statement to abort the current transaction and roll back all the changes made in the transaction.

In BigQuery, you can use the ROLLBACK TRANSACTION statement.

Database limits

Limit Netezza BigQuery
Tables per database 32,000 Unrestricted
Columns per table 1600 10000
Maximum row size 64 KB 100 MB
Column and table name length 128 bytes 16,384 Unicode characters
Rows per table Unlimited Unlimited
Maximum SQL request length 1 MB (maximum unresolved standard SQL query length).

12 MB (maximum resolved legacy and standard SQL query length).

Streaming:
10 MB (HTTP request size limit)
10,000 (maximum rows per request)
Maximum request and response size 10 MB (request) and 10 GB (response) or virtually unlimited if using pagination or the Cloud Storage API.
Maximum number of concurrent sessions 63 concurrent read-write transactions. 2000 concurrent connections to the server. 100 concurrent queries (can be raised with slot reservation), 300 concurrent API requests per user.

What's next