Data types

This page provides an overview of all GoogleSQL for BigQuery data types, including information about their value domains. For information on data type literals and constructors, see Lexical Structure and Syntax.

Data type properties

When storing and querying data, it is helpful to keep the following data type properties in mind:

Nullable data types

For nullable data types, NULL is a valid value. Currently, all existing data types are nullable, but conditions apply for arrays.

Orderable data types

Expressions of orderable data types can be used in an ORDER BY clause. Applies to all data types except for:

  • ARRAY
  • STRUCT
  • GEOGRAPHY
  • JSON

Ordering NULLs

In the context of the ORDER BY clause, NULLs are the minimum possible value; that is, NULLs appear first in ASC sorts and last in DESC sorts.

NULL values can be specified as the first or last values for a column irrespective of ASC or DESC by using the NULLS FIRST or NULLS LAST modifiers respectively.

To learn more about using ASC, DESC, NULLS FIRST and NULLS LAST, see the ORDER BY clause.

Ordering floating points

Floating point values are sorted in this order, from least to greatest:

  1. NULL
  2. NaN — All NaN values are considered equal when sorting.
  3. -inf
  4. Negative numbers
  5. 0 or -0 — All zero values are considered equal when sorting.
  6. Positive numbers
  7. +inf

Groupable data types

Groupable data types can generally appear in an expression following GROUP BY, DISTINCT, and PARTITION BY. However, PARTITION BY expressions cannot include floating point types. All data types are supported except for:

  • GEOGRAPHY
  • JSON
  • ARRAY
  • STRUCT

Special floating point values are grouped in the following way, including both grouping done by a GROUP BY clause and grouping done by the DISTINCT keyword:

  • NULL
  • NaN — All NaN values are considered equal when grouping.
  • -inf
  • 0 or -0 — All zero values are considered equal when grouping.
  • +inf

Comparable data types

Values of the same comparable data type can be compared to each other. All data types are supported except for:

  • GEOGRAPHY
  • JSON

Notes:

  • Equality comparisons for structs are supported field by field, in field order. Field names are ignored. Less than and greater than comparisons are not supported.
  • To compare geography values, use ST_Equals.
  • All types that support comparisons can be used in a JOIN condition. See JOIN Types for an explanation of join conditions.

Collatable data types

Collatable data types support collation, which determines how to sort and compare strings. These data types support collation:

  • String
  • String fields in a struct
  • String elements in an array

The maximum size of a column value is 10MiB, which applies to scalar and array types.

Data type sizes

Use the following table to see the size in logical bytes for each supported data type.

Data type Size
ARRAY The sum of the size of its elements. For example, an array defined as (ARRAY<INT64>) that contains 4 entries is calculated as 32 logical bytes (4 entries x 8 logical bytes).
BIGNUMERIC 32 logical bytes
BOOL 1 logical byte
BYTES 2 logical bytes + the number of logical bytes in the value
DATE 8 logical bytes
DATETIME 8 logical bytes
FLOAT64 8 logical bytes
GEOGRAPHY 16 logical bytes + 24 logical bytes * the number of vertices in the geography type. To verify the number of vertices, use the ST_NumPoints function.
INT64 8 logical bytes
INTERVAL 16 logical bytes
JSON The number of logical bytes in UTF-8 encoding of the JSON-formatted string equivalent after canonicalization.
NUMERIC 16 logical bytes
STRING 2 logical bytes + the UTF-8 encoded string size
STRUCT 0 logical bytes + the size of the contained fields
TIME 8 logical bytes
TIMESTAMP 8 logical bytes

A NULL value for any data type is calculated as 0 logical bytes.

A repeated column is stored as an array, and the size is calculated based on the column data type and the number of values. For example, an integer column (INT64) that is repeated (ARRAY<INT64>) and contains 4 entries is calculated as 32 logical bytes (4 entries x 8 logical bytes). The total size of all values in a table row can't exceed the maximum row size.

Parameterized data types

Syntax:

DATA_TYPE(param[, ...])

You can use parameters to specify constraints for the following data types:

  • STRING
  • BYTES
  • NUMERIC
  • BIGNUMERIC

A data type that is declared with parameters is called a parameterized data type. You can only use parameterized data types with columns and script variables. A column with a parameterized data type is a parameterized column and a script variable with a parameterized data type is a parameterized script variable. Parameterized type constraints are enforced when writing a value to a parameterized column or when assigning a value to a parameterized script variable.

A data type's parameters are not propagated in an expression, only the data type is.

Examples

-- Declare a variable with type parameters.
DECLARE x STRING(10);

-- This is a valid assignment to x.
SET x = "hello";

-- This assignment to x violates the type parameter constraint and results in an OUT_OF_RANGE error.
SET x = "this string is too long"
-- Declare variables with type parameters.
DECLARE x NUMERIC(10) DEFAULT 12345;
DECLARE y NUMERIC(5, 2) DEFAULT 123.45;

-- The variable x is treated as a NUMERIC value when read, so the result of this query
-- is a NUMERIC without type parameters.
SELECT x;

-- Type parameters are not propagated within expressions, so variables x and y are treated
-- as NUMERIC values when read and the result of this query is a NUMERIC without type parameters.
SELECT x + y;

Array type

Name Description
ARRAY Ordered list of zero or more elements of any non-array type.

An array is an ordered list of zero or more elements of non-array values. Elements in an array must share the same type.

Arrays of arrays are not allowed. Queries that would produce an array of arrays will return an error. Instead, a struct must be inserted between the arrays using the SELECT AS STRUCT construct.

To learn more about the literal representation of an array type, see Array literals.

To learn more about using arrays in GoogleSQL, see Work with arrays.

NULLs and the array type

Currently, GoogleSQL for BigQuery has the following rules with respect to NULLs and arrays:

  • An array can be NULL.

    For example:

    SELECT CAST(NULL AS ARRAY<INT64>) IS NULL AS array_is_null;
    
    /*---------------*
     | array_is_null |
     +---------------+
     | TRUE          |
     *---------------*/
    
  • GoogleSQL for BigQuery translates a NULL array into an empty array in the query result, although inside the query, NULL and empty arrays are two distinct values.

    For example:

    WITH Items AS (
      SELECT [] AS numbers, "Empty array in query" AS description UNION ALL
      SELECT CAST(NULL AS ARRAY<INT64>), "NULL array in query")
    SELECT numbers, description, numbers IS NULL AS numbers_null
    FROM Items;
    
    /*---------+----------------------+--------------*
     | numbers | description          | numbers_null |
     +---------+----------------------+--------------+
     | []      | Empty array in query | false        |
     | []      | NULL array in query  | true         |
     *---------+----------------------+--------------*/
    

    When you write a NULL array to a table, it is converted to an empty array. If you write Items to a table from the previous query, then each array is written as an empty array:

    SELECT numbers, description, numbers IS NULL AS numbers_null
    FROM Items;
    
    /*---------+----------------------+--------------*
     | numbers | description          | numbers_null |
     +---------+----------------------+--------------+
     | []      | Empty array in query | false        |
     | []      | NULL array in query  | false        |
     *---------+----------------------+--------------*/
    
  • GoogleSQL for BigQuery raises an error if the query result has an array which contains NULL elements, although such an array can be used inside the query.

    For example, this works:

    SELECT FORMAT("%T", [1, NULL, 3]) as numbers;
    
    /*--------------*
     | numbers      |
     +--------------+
     | [1, NULL, 3] |
     *--------------*/
    

    But this raises an error:

    -- error
    SELECT [1, NULL, 3] as numbers;
    

Declaring an array type

ARRAY<T>

Array types are declared using the angle brackets (< and >). The type of the elements of an array can be arbitrarily complex with the exception that an array cannot directly contain another array.

Examples

Type Declaration Meaning
ARRAY<INT64> Simple array of 64-bit integers.
ARRAY<BYTES(5)> Simple array of parameterized bytes.
ARRAY<STRUCT<INT64, INT64>> An array of structs, each of which contains two 64-bit integers.
ARRAY<ARRAY<INT64>>
(not supported)
This is an invalid type declaration which is included here just in case you came looking for how to create a multi-level array. Arrays cannot contain arrays directly. Instead see the next example.
ARRAY<STRUCT<ARRAY<INT64>>> An array of arrays of 64-bit integers. Notice that there is a struct between the two arrays because arrays cannot hold other arrays directly.

Constructing an array

You can construct an array using array literals or array functions.

Using array literals

You can build an array literal in GoogleSQL using brackets ([ and ]). Each element in an array is separated by a comma.

SELECT [1, 2, 3] AS numbers;

SELECT ["apple", "pear", "orange"] AS fruit;

SELECT [true, false, true] AS booleans;

You can also create arrays from any expressions that have compatible types. For example:

SELECT [a, b, c]
FROM
  (SELECT 5 AS a,
          37 AS b,
          406 AS c);

SELECT [a, b, c]
FROM
  (SELECT CAST(5 AS INT64) AS a,
          CAST(37 AS FLOAT64) AS b,
          406 AS c);

Notice that the second example contains three expressions: one that returns an INT64, one that returns a FLOAT64, and one that declares a literal. This expression works because all three expressions share FLOAT64 as a supertype.

<