Query syntax

Query statements scan one or more tables or expressions and return the computed result rows. This topic describes the syntax for SQL queries in GoogleSQL for BigQuery.

SQL syntax notation rules

The GoogleSQL documentation commonly uses the following syntax notation rules:

  • Square brackets [ ]: Optional clause.
  • Curly braces with vertical bars { a | b | c }: Logical OR. Select one option.
  • Ellipsis ...: Preceding item can repeat.
  • Double quotes ": Syntax wrapped in double quotes ("") is required.

SQL syntax

query_statement:
  query_expr

query_expr:
  [ WITH [ RECURSIVE ] { non_recursive_cte | recursive_cte }[, ...] ]
  { select | ( query_expr ) | set_operation }
  [ ORDER BY expression [{ ASC | DESC }] [, ...] ]
  [ LIMIT count [ OFFSET skip_rows ] ]

select:
  SELECT
    [ WITH differential_privacy_clause ]
    [ { ALL | DISTINCT } ]
    [ AS { STRUCT | VALUE } ]
    select_list
  [ FROM from_clause[, ...] ]
  [ WHERE bool_expression ]
  [ GROUP BY group_by_specification ]
  [ HAVING bool_expression ]
  [ QUALIFY bool_expression ]
  [ WINDOW window_clause ]

SELECT statement

SELECT
  [ WITH differential_privacy_clause ]
  [ { ALL | DISTINCT } ]
  [ AS { STRUCT | VALUE } ]
  select_list

select_list:
  { select_all | select_expression } [, ...]

select_all:
  [ expression. ]*
  [ EXCEPT ( column_name [, ...] ) ]
  [ REPLACE ( expression AS column_name [, ...] ) ]

select_expression:
  expression [ [ AS ] alias ]

The SELECT list defines the columns that the query will return. Expressions in the SELECT list can refer to columns in any of the from_items in its corresponding FROM clause.

Each item in the SELECT list is one of:

  • *
  • expression
  • expression.*

SELECT *

SELECT *, often referred to as select star, produces one output column for each column that's visible after executing the full query.

SELECT * FROM (SELECT "apple" AS fruit, "carrot" AS vegetable);

/*-------+-----------*
 | fruit | vegetable |
 +-------+-----------+
 | apple | carrot    |
 *-------+-----------*/

SELECT expression

Items in a SELECT list can be expressions. These expressions evaluate to a single value and produce one output column, with an optional explicit alias.

If the expression doesn't have an explicit alias, it receives an implicit alias according to the rules for implicit aliases, if possible. Otherwise, the column is anonymous and you can't refer to it by name elsewhere in the query.

SELECT expression.*

An item in a SELECT list can also take the form of expression.*. This produces one output column for each column or top-level field of expression. The expression must either be a table alias or evaluate to a single value of a data type with fields, such as a STRUCT.

The following query produces one output column for each column in the table groceries, aliased as g.

WITH groceries AS
  (SELECT "milk" AS dairy,
   "eggs" AS protein,
   "bread" AS grain)
SELECT g.*
FROM groceries AS g;

/*-------+---------+-------*
 | dairy | protein | grain |
 +-------+---------+-------+
 | milk  | eggs    | bread |
 *-------+---------+-------*/

More examples:

WITH locations AS
  (SELECT STRUCT("Seattle" AS city, "Washington" AS state) AS location
  UNION ALL
  SELECT STRUCT("Phoenix" AS city, "Arizona" AS state) AS location)
SELECT l.location.*
FROM locations l;

/*---------+------------*
 | city    | state      |
 +---------+------------+
 | Seattle | Washington |
 | Phoenix | Arizona    |
 *---------+------------*/
WITH locations AS
  (SELECT ARRAY<STRUCT<city STRING, state STRING>>[("Seattle", "Washington"),
    ("Phoenix", "Arizona")] AS location)
SELECT l.LOCATION[offset(0)].*
FROM locations l;

/*---------+------------*
 | city    | state      |
 +---------+------------+
 | Seattle | Washington |
 *---------+------------*/

SELECT * EXCEPT

A SELECT * EXCEPT statement specifies the names of one or more columns to exclude from the result. All matching column names are omitted from the output.

WITH orders AS
  (SELECT 5 as order_id,
  "sprocket" as item_name,
  200 as quantity)
SELECT * EXCEPT (order_id)
FROM orders;

/*-----------+----------*
 | item_name | quantity |
 +-----------+----------+
 | sprocket  | 200      |
 *-----------+----------*/

SELECT * REPLACE

A SELECT * REPLACE statement specifies one or more expression AS identifier clauses. Each identifier must match a column name from the SELECT * statement. In the output column list, the column that matches the identifier in a REPLACE clause is replaced by the expression in that REPLACE clause.

A SELECT * REPLACE statement doesn't change the names or order of columns. However, it can change the value and the value type.

WITH orders AS
  (SELECT 5 as order_id,
  "sprocket" as item_name,
  200 as quantity)
SELECT * REPLACE ("widget" AS item_name)
FROM orders;

/*----------+-----------+----------*
 | order_id | item_name | quantity |
 +----------+-----------+----------+
 | 5        | widget    | 200      |
 *----------+-----------+----------*/

WITH orders AS
  (SELECT 5 as order_id,
  "sprocket" as item_name,
  200 as quantity)
SELECT * REPLACE (quantity/2 AS quantity)
FROM orders;

/*----------+-----------+----------*
 | order_id | item_name | quantity |
 +----------+-----------+----------+
 | 5        | sprocket  | 100      |
 *----------+-----------+----------*/

SELECT DISTINCT

A SELECT DISTINCT statement discards duplicate rows and returns only the remaining rows. SELECT DISTINCT can't return columns of the following types:

In the following example, SELECT DISTINCT is used to produce distinct arrays:

WITH PlayerStats AS (
  SELECT ['Coolidge', 'Adams'] as Name, 3 as PointsScored UNION ALL
  SELECT ['Adams', 'Buchanan'], 0 UNION ALL
  SELECT ['Coolidge', 'Adams'], 1 UNION ALL
  SELECT ['Kiran', 'Noam'], 1)
SELECT DISTINCT Name

/*------------------+
 | Name             |
 +------------------+
 | [Coolidge,Adams] |
 | [Adams,Buchanan] |
 | [Kiran,Noam]     |
 +------------------*/

In the following example, SELECT DISTINCT is used to produce distinct structs:

WITH
  PlayerStats AS (
    SELECT
      STRUCT<last_name STRING, first_name STRING, age INT64>(
        'Adams', 'Noam', 20) AS Player,
      3 AS PointsScored UNION ALL
    SELECT ('Buchanan', 'Jie', 19), 0 UNION ALL
    SELECT ('Adams', 'Noam', 20), 4 UNION ALL
    SELECT ('Buchanan', 'Jie', 19), 13
  )
SELECT DISTINCT Player
FROM PlayerStats;

/*--------------------------+
 | player                   |
 +--------------------------+
 | {                        |
 |   last_name: "Adams",    |
 |   first_name: "Noam",    |
 |   age: 20                |
 |  }                       |
 +--------------------------+
 | {                        |
 |   last_name: "Buchanan", |
 |   first_name: "Jie",     |
 |   age: 19                |
 |  }                       |
 +---------------------------*/

SELECT ALL

A SELECT ALL statement returns all rows, including duplicate rows. SELECT ALL is the default behavior of SELECT.

SELECT AS STRUCT

SELECT AS STRUCT expr [[AS] struct_field_name1] [,...]

This produces a value table with a STRUCT row type, where the STRUCT field names and types match the column names and types produced in the SELECT list.

Example:

SELECT ARRAY(SELECT AS STRUCT 1 a, 2 b)

SELECT AS STRUCT can be used in a scalar or array subquery to produce a single STRUCT type grouping multiple values together. Scalar and array subqueries (see Subqueries) are normally not allowed to return multiple columns, but can return a single column with STRUCT type.

SELECT AS VALUE

SELECT AS VALUE produces a value table from any SELECT list that produces exactly one column. Instead of producing an output table with one column, possibly with a name, the output will be a value table where the row type is just the value type that was produced in the one SELECT column. Any alias the column had will be discarded in the value table.

Example:

SELECT AS VALUE STRUCT(1 AS a, 2 AS b) xyz

The query above produces a table with row type STRUCT<a int64, b int64>.

FROM clause

FROM from_clause[, ...]

from_clause:
  from_item
  [ { pivot_operator | unpivot_operator } ]
  [ tablesample_operator ]

from_item:
  {
    table_name [ as_alias ] [ FOR SYSTEM_TIME AS OF timestamp_expression ] 
    | { join_operation | ( join_operation ) }
    | ( query_expr ) [ as_alias ]
    | field_path
    | unnest_operator
    | cte_name [ as_alias ]
  }

as_alias:
  [ AS ] alias

The FROM clause indicates the table or tables from which to retrieve rows, and specifies how to join those rows together to produce a single stream of rows for processing in the rest of the query.

pivot_operator

See PIVOT operator.

unpivot_operator

See UNPIVOT operator.

tablesample_operator

See TABLESAMPLE operator.

table_name

The name (optionally qualified) of an existing table.

SELECT * FROM Roster;
SELECT * FROM dataset.Roster;
SELECT * FROM project.dataset.Roster;

FOR SYSTEM_TIME AS OF

FOR SYSTEM_TIME AS OF references the historical versions of the table definition and rows that were current at timestamp_expression.

Limitations:

The source table in the FROM clause containing FOR SYSTEM_TIME AS OF must not be any of the following:

  • An array scan, including a flattened array or the output of the UNNEST operator.
  • A common table expression defined by a WITH clause.
  • The source table in a CREATE TABLE FUNCTION statement creating a new table-valued function

timestamp_expression must be a constant expression. It can't contain the following:

  • Subqueries.
  • Correlated references (references to columns of a table that appear at a higher level of the query statement, such as in the SELECT list).
  • User-defined functions (UDFs).

The value of timestamp_expression can't fall into the following ranges:

  • After the current timestamp (in the future).
  • More than seven (7) days before the current timestamp.

A single query statement can't reference a single table at more than one point in time, including the current time. That is, a query can reference a table multiple times at the same timestamp, but not the current version and a historical version, or two different historical versions.

The default time zone for timestamp_expression in a FOR SYSTEM_TIME AS OF expression is America/Los_Angeles, even though the default time zone for timestamp literals is UTC.

Examples:

The following query returns a historical version of the table from one hour ago.

SELECT *
FROM t
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

The following query returns a historical version of the table at an absolute point in time.

SELECT *
FROM t
  FOR SYSTEM_TIME AS OF '2017-01-01 10:00:00-07:00';

The following query returns an error because the timestamp_expression contains a correlated reference to a column in the containing query.

SELECT *
FROM t1
WHERE t1.a IN (SELECT t2.a
               FROM t2 FOR SYSTEM_TIME AS OF t1.timestamp_column);

The following operations show accessing a historical version of the table before table is replaced.

DECLARE before_replace_timestamp TIMESTAMP;

-- Create table books.
CREATE TABLE books AS
SELECT 'Hamlet' title, 'William Shakespeare' author;

-- Get current timestamp before table replacement.
SET before_replace_timestamp = CURRENT_TIMESTAMP();

-- Replace table with different schema(title and release_date).
CREATE OR REPLACE TABLE books AS
SELECT 'Hamlet' title, DATE '1603-01-01' release_date;

-- This query returns Hamlet, William Shakespeare as result.
SELECT * FROM books FOR SYSTEM_TIME AS OF before_replace_timestamp;

The following operations show accessing a historical version of the table before a DML job.

DECLARE JOB_START_TIMESTAMP TIMESTAMP;

-- Create table books.
CREATE OR REPLACE TABLE books AS
SELECT 'Hamlet' title, 'William Shakespeare' author;

-- Insert two rows into the books.
INSERT books (title, author)
VALUES('The Great Gatsby', 'F. Scott Fizgerald'),
      ('War and Peace', 'Leo Tolstoy');

SELECT * FROM books;

SET JOB_START_TIMESTAMP = (
  SELECT start_time
  FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
  WHERE job_type="QUERY"
    AND statement_type="INSERT"
  ORDER BY start_time DESC
  LIMIT 1
 );

-- This query only returns Hamlet, William Shakespeare as result.
SELECT * FROM books FOR SYSTEM_TIME AS OF JOB_START_TIMESTAMP;

The following query returns an error because the DML operates on the current version of the table, and a historical version of the table from one day ago.

INSERT INTO t1
SELECT * FROM t1
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);

join_operation

See Join operation.

query_expr

( query_expr ) [ [ AS ] alias ] is a table subquery.

field_path

In the FROM clause, field_path is any path that resolves to a field within a data type. field_path can go arbitrarily deep into a nested data structure.

Some examples of valid field_path values include:

SELECT * FROM T1 t1, t1.array_column;

SELECT * FROM T1 t1, t1.struct_column.array_field;

SELECT (SELECT ARRAY_AGG(c) FROM t1.array_column c) FROM T1 t1;

SELECT a.struct_field1 FROM T1 t1, t1.array_of_structs a;

SELECT (SELECT STRING_AGG(a.struct_field1) FROM t1.array_of_structs a) FROM T1 t1;

Field paths in the FROM clause must end in an array field. In addition, field paths can't contain arrays before the end of the path. For example, the path array_column.some_array.some_array_field is invalid because it contains an array before the end of the path.

unnest_operator

See UNNEST operator.

cte_name

Common table expressions (CTEs) in a WITH Clause act like temporary tables that you can reference anywhere in the FROM clause. In the example below, subQ1 and subQ2 are CTEs.

Example:

WITH
  subQ1 AS (SELECT * FROM Roster WHERE SchoolID = 52),
  subQ2 AS (SELECT SchoolID FROM subQ1)
SELECT DISTINCT * FROM subQ2;

The WITH clause hides any permanent tables with the same name for the duration of the query, unless you qualify the table name, for example:

dataset.Roster or project.dataset.Roster.

UNNEST operator

unnest_operator:
  {
    UNNEST( array ) [ as_alias ]
    | array_path [ as_alias ]
  }
  [ WITH OFFSET [ as_alias ] ]

array:
  { array_expression | array_path }

as_alias:
  [AS] alias

The UNNEST operator takes an array and returns a table with one row for each element in the array. The output of UNNEST is one value table column. For these ARRAY element types, SELECT * against the value table column returns multiple columns:

  • STRUCT

Input values:

  • array_expression: An expression that produces an array.
  • array_path: The path to an ARRAY type.

    • In an implicit UNNEST operation, the path must start with a range variable name.
    • In an explicit UNNEST operation, the path can optionally start with a range variable name.

    The UNNEST operation with any correlated array_path must be on the right side of a CROSS JOIN, LEFT JOIN, or INNER JOIN operation.

  • as_alias: If specified, defines the explicit name of the value table column containing the array element values. It can be used to refer to the column elsewhere in the query.

  • WITH OFFSET: UNNEST destroys the order of elements in the input array. Use this optional clause to return an additional column with the array element indexes, or offsets. Offset counting starts at zero for each row produced by the UNNEST operation. This column has an optional alias; If the optional alias isn't used, the default column name is offset.

    Example:

    SELECT * FROM UNNEST ([10,20,30]) as numbers WITH OFFSET;
    
    /*---------+--------*
     | numbers | offset |
     +---------+--------+
     | 10      | 0      |
     | 20      | 1      |
     | 30      | 2      |
     *---------+--------*/
    

You can also use UNNEST outside of the FROM clause with the IN operator.

For several ways to use UNNEST, including construction, flattening, and filtering, see Work with arrays.

To learn more about the ways you can use UNNEST explicitly and implicitly, see Explicit and implicit UNNEST.

UNNEST and structs

For an input array of structs, UNNEST returns a row for each struct, with a separate column for each field in the struct. The alias for each column is the name of the corresponding struct field.

Example:

SELECT *
FROM UNNEST(
  ARRAY<
    STRUCT<
      x INT64,
      y STRING,
      z STRUCT<a INT64, b INT64>>>[
        (1, 'foo', (10, 11)),
        (3, 'bar', (20, 21))]);

/*---+-----+----------*
 | x | y   | z        |
 +---+-----+----------+
 | 1 | foo | {10, 11} |
 | 3 | bar | {20, 21} |
 *---+-----+----------*/

Because the UNNEST operator returns a value table, you can alias UNNEST to define a range variable that you can reference elsewhere in the query. If you reference the range variable in the SELECT list, the query returns a struct containing all of the fields of the original struct in the input table.

Example:

SELECT *, struct_value
FROM UNNEST(
  ARRAY<
    STRUCT<
    x INT64,
    y STRING>>[
      (1, 'foo'),
      (3, 'bar')]) AS struct_value;

/*---+-----+--------------*
 | x | y   | struct_value |
 +---+-----+--------------+
 | 3 | bar | {3, bar}     |
 | 1 | foo | {1, foo}     |
 *---+-----+--------------*/

Explicit and implicit UNNEST

Array unnesting can be either explicit or implicit. To learn more, see the following sections.

Explicit unnesting

The UNNEST keyword is required in explicit unnesting. For example:

WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, UNNEST(Coordinates.position) AS results;

In explicit unnesting, array_expression must return an array value but doesn't need to resolve to an array.

Implicit unnesting

The UNNEST keyword isn't used in implicit unnesting.

For example:

WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;
Tables and implicit unnesting

When you use array_path with implicit UNNEST, array_path must be prepended with the table. For example:

WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;

UNNEST and NULL values

UNNEST treats NULL values as follows:

  • NULL and empty arrays produce zero rows.
  • An array containing NULL values produces rows containing NULL values.

PIVOT operator

FROM from_item[, ...] pivot_operator

pivot_operator:
  PIVOT(
    aggregate_function_call [as_alias][, ...]
    FOR input_column
    IN ( pivot_column [as_alias][, ...] )
  ) [AS alias]

as_alias:
  [AS] alias

The PIVOT operator rotates rows into columns, using aggregation. PIVOT is part of the FROM clause.

  • PIVOT can be used to modify any table expression.
  • Combining PIVOT with FOR SYSTEM_TIME AS OF isn't allowed, although users may use PIVOT against a subquery input which itself uses FOR SYSTEM_TIME AS OF.
  • A WITH OFFSET clause immediately preceding the PIVOT operator isn't allowed.

Conceptual example:

-- Before PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:
/*---------+-------+---------+------*
 | product | sales | quarter | year |
 +---------+-------+---------+------|
 | Kale    | 51    | Q1      | 2020 |
 | Kale    | 23    | Q2      | 2020 |
 | Kale    | 45    | Q3      | 2020 |
 | Kale    | 3     | Q4      | 2020 |
 | Kale    | 70    | Q1      | 2021 |
 | Kale    | 85    | Q2      | 2021 |
 | Apple   | 77    | Q1      | 2020 |
 | Apple   | 0     | Q2      | 2020 |
 | Apple   | 1     | Q1      | 2021 |
 *---------+-------+---------+------*/

-- After PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:
/*---------+------+----+------+------+------*
 | product | year | Q1 | Q2   | Q3   | Q4   |
 +---------+------+----+------+------+------+
 | Apple   | 2020 | 77 | 0    | NULL | NULL |
 | Apple   | 2021 | 1  | NULL | NULL | NULL |
 | Kale    | 2020 | 51 | 23   | 45   | 3    |
 | Kale    | 2021 | 70 | 85   | NULL | NULL |
 *---------+------+----+------+------+------*/

Definitions

Top-level definitions:

  • from_item: The table, subquery, or table-valued function (TVF) on which to perform a pivot operation. The from_item must follow these rules.
  • pivot_operator: The pivot operation to perform on a from_item.
  • alias: An alias to use for an item in the query.

pivot_operator definitions:

  • aggregate_function_call: An aggregate function call that aggregates all input rows such that input_column matches a particular value in pivot_column. Each aggregation corresponding to a different pivot_column value produces a different column in the output. Follow these rules when creating an aggregate function call.
  • input_column: Takes a column and retrieves the row values for the column, following these rules.
  • pivot_column: A pivot column to create for each aggregate function call. If an alias isn't provided, a default alias is created. A pivot column value type must match the value type in input_column so that the values can be compared. It's possible to have a value in pivot_column that doesn't match a value in input_column. Must be a constant and follow these rules.

Rules

Rules for a from_item passed to PIVOT:

  • The from_item may consist of any table, subquery, or table-valued function (TVF) result.
  • The from_item may not produce a value table.
  • The from_item may not be a subquery using SELECT AS STRUCT.

Rules for aggregate_function_call:

  • Must be an aggregate function. For example, SUM.
  • You may reference columns in a table passed to PIVOT, as well as correlated columns, but may not access columns defined by the PIVOT clause itself.
  • A table passed to PIVOT may be accessed through its alias if one is provided.
  • You can only use an aggregate function that takes one argument.
  • Except for COUNT, you can only use aggregate functions that ignore NULL inputs.
  • If you are using COUNT, you can use * as an argument.

Rules for input_column:

  • May access columns from the input table, as well as correlated columns, not columns defined by the PIVOT clause, itself.
  • Evaluated against each row in the input table; aggregate and window function calls are prohibited.
  • Non-determinism is okay.
  • The type must be groupable.
  • The input table may be accessed through its alias if one is provided.

Rules for pivot_column:

  • A pivot_column must be a constant.
  • Named constants, such as variables, aren't supported.
  • Query parameters aren't supported.
  • If a name is desired for a named constant or query parameter, specify it explicitly with an alias.
  • Corner cases exist where a distinct pivot_columns can end up with the same default column names. For example, an input column might contain both a NULL value and the string literal "NULL". When this happens, multiple pivot columns are created with the same name. To avoid this situation, use aliases for pivot column names.
  • If a pivot_column doesn't specify an alias, a column name is constructed as follows:
From To Example
NULL NULL Input: NULL
Output: "NULL"
INT64
NUMERIC
BIGNUMERIC
The number in string format with the following rules:
  • Positive numbers are preceded with _.
  • Negative numbers are preceded with minus_.
  • A decimal point is replaced with _point_.
Input: 1
Output: _1

Input: -1
Output: minus_1

Input: 1.0
Output: _1_point_0
BOOL TRUE or FALSE. Input: TRUE
Output: TRUE

Input: FALSE
Output: FALSE
STRING The string value. Input: "PlayerName"
Output: PlayerName
DATE The date in _YYYY_MM_DD format. Input: DATE '2013-11-25'
Output: _2013_11_25
ENUM The name of the enumeration constant. Input: COLOR.RED
Output: RED
STRUCT A string formed by computing the pivot_column name for each field and joining the results together with an underscore. The following rules apply:
  • If the field is named: <field_name>_<pivot_column_name_for_field_name>.
  • If the field is unnamed: <pivot_column_name_for_field_name>.

<pivot_column_name_for_field_name> is determined by applying the rules in this table, recursively. If no rule is available for any STRUCT field, the entire pivot column is unnamed.

Due to implicit type coercion from the IN list values to the type of <value-expression>, field names must be present in input_column to have an effect on the names of the pivot columns.

Input: STRUCT("one", "two")
Output: one_two

Input: STRUCT("one" AS a, "two" AS b)
Output: one_a_two_b
All other data types Not supported. You must provide an alias.

Examples

The following examples reference a table called Produce that looks like this:

WITH Produce AS (
  SELECT 'Kale' as product, 51 as sales, 'Q1' as quarter, 2020 as year UNION ALL
  SELECT 'Kale', 23, 'Q2', 2020 UNION ALL
  SELECT 'Kale', 45, 'Q3', 2020 UNION ALL
  SELECT 'Kale', 3, 'Q4', 2020 UNION ALL
  SELECT 'Kale', 70, 'Q1', 2021 UNION ALL
  SELECT 'Kale', 85, 'Q2', 2021 UNION ALL
  SELECT 'Apple', 77, 'Q1', 2020 UNION ALL
  SELECT 'Apple', 0, 'Q2', 2020 UNION ALL
  SELECT 'Apple', 1, 'Q1', 2021)
SELECT * FROM Produce

/*---------+-------+---------+------*
 | product | sales | quarter | year |
 +---------+-------+---------+------|
 | Kale    | 51    | Q1      | 2020 |
 | Kale    | 23    | Q2      | 2020 |
 | Kale    | 45    | Q3      | 2020 |
 | Kale    | 3     | Q4      | 2020 |
 | Kale    | 70    | Q1      | 2021 |
 | Kale    | 85    | Q2      | 2021 |
 | Apple   | 77    | Q1      | 2020 |
 | Apple   | 0     | Q2      | 2020 |
 | Apple   | 1     | Q1      | 2021 |
 *---------+-------+---------+------*/

With the PIVOT operator, the rows in the quarter column are rotated into these new columns: Q1, Q2, Q3, Q4. The aggregate function SUM is implicitly grouped by all unaggregated columns other than the pivot_column: product and year.

SELECT * FROM
  Produce
  PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'))

/*---------+------+----+------+------+------*
 | product | year | Q1 | Q2   | Q3   | Q4   |
 +---------+------+----+------+------+------+
 | Apple   | 2020 | 77 | 0    | NULL | NULL |
 | Apple   | 2021 | 1  | NULL | NULL | NULL |
 | Kale    | 2020 | 51 | 23   | 45   | 3    |
 | Kale    | 2021 | 70 | 85   | NULL | NULL |
 *---------+------+----+------+------+------*/

If you don't include year, then SUM is grouped only by product.

SELECT * FROM
  (SELECT product, sales, quarter FROM Produce)
  PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'))

/*---------+-----+-----+------+------*
 | product | Q1  | Q2  | Q3   | Q4   |
 +---------+-----+-----+------+------+
 | Apple   | 78  | 0   | NULL | NULL |
 | Kale    | 121 | 108 | 45   | 3    |
 *---------+-----+-----+------+------*/

You can select a subset of values in the pivot_column:

SELECT * FROM
  (SELECT product, sales, quarter FROM Produce)
  PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3'))

/*---------+-----+-----+------*
 | product | Q1  | Q2  | Q3   |
 +---------+-----+-----+------+
 | Apple   | 78  | 0   | NULL |
 | Kale    | 121 | 108 | 45   |
 *---------+-----+-----+------*/
SELECT * FROM
  (SELECT sales, quarter FROM Produce)
  PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3'))

/*-----+-----+----*
 | Q1  | Q2  | Q3 |
 +-----+-----+----+
 | 199 | 108 | 45 |
 *-----+-----+----*/

You can include multiple aggregation functions in the PIVOT. In this case, you must specify an alias for each aggregation. These aliases are used to construct the column names in the resulting table.

SELECT * FROM
  (SELECT product, sales, quarter FROM Produce)
  PIVOT(SUM(sales) AS total_sales, COUNT(*) AS num_records FOR quarter IN ('Q1', 'Q2'))

/*--------+----------------+----------------+----------------+----------------*
 |product | total_sales_Q1 | num_records_Q1 | total_sales_Q2 | num_records_Q2 |
 +--------+----------------+----------------+----------------+----------------+
 | Kale   | 121            | 2              | 108            | 2              |
 | Apple  | 78             | 2              | 0              | 1              |
 *--------+----------------+----------------+----------------+----------------*/

UNPIVOT operator

FROM from_item[, ...] unpivot_operator

unpivot_operator:
  UNPIVOT [ { INCLUDE NULLS | EXCLUDE NULLS } ] (
    { single_column_unpivot | multi_column_unpivot }
  ) [unpivot_alias]

single_column_unpivot:
  values_column
  FOR name_column
  IN (columns_to_unpivot)

multi_column_unpivot:
  values_column_set
  FOR name_column
  IN (column_sets_to_unpivot)

values_column_set:
  (values_column[, ...])

columns_to_unpivot:
  unpivot_column [row_value_alias][, ...]

column_sets_to_unpivot:
  (unpivot_column [row_value_alias][, ...])

unpivot_alias and row_value_alias:
  [AS] alias

The UNPIVOT operator rotates columns into rows. UNPIVOT is part of the FROM clause.

  • UNPIVOT can be used to modify any table expression.
  • Combining UNPIVOT with FOR SYSTEM_TIME AS OF isn't allowed, although users may use UNPIVOT against a subquery input which itself uses FOR SYSTEM_TIME AS OF.
  • A WITH OFFSET clause immediately preceding the UNPIVOT operator isn't allowed.
  • PIVOT aggregations can't be reversed with UNPIVOT.

Conceptual example:

-- Before UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:
/*---------+----+----+----+----*
 | product | Q1 | Q2 | Q3 | Q4 |
 +---------+----+----+----+----+
 | Kale    | 51 | 23 | 45 | 3  |
 | Apple   | 77 | 0  | 25 | 2  |
 *---------+----+----+----+----*/

-- After UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:
/*---------+-------+---------*
 | product | sales | quarter |
 +---------+-------+---------+
 | Kale    | 51    | Q1      |
 | Kale    | 23    | Q2      |
 | Kale    | 45    | Q3      |
 | Kale    | 3     | Q4      |
 | Apple   | 77    | Q1      |
 | Apple   | 0     | Q2      |
 | Apple   | 25    | Q3      |
 | Apple   | 2     | Q4      |
 *---------+-------+---------*/

Definitions

Top-level definitions:

  • from_item: The table, subquery, or table-valued function (TVF) on which to perform a pivot operation. The from_item must follow these rules.
  • unpivot_operator: The pivot operation to perform on a from_item.

unpivot_operator definitions:

  • INCLUDE NULLS: Add rows with NULL values to the result.
  • EXCLUDE NULLS: don't add rows with NULL values to the result. By default, UNPIVOT excludes rows with NULL values.
  • single_column_unpivot: Rotates columns into one values_column and one name_column.
  • multi_column_unpivot: Rotates columns into multiple values_columns and one name_column.
  • unpivot_alias: An alias for the results of the UNPIVOT operation. This alias can be referenced elsewhere in the query.

single_column_unpivot definitions:

  • values_column: A column to contain the row values from columns_to_unpivot. Follow these rules when creating a values column.
  • name_column: A column to contain the column names from columns_to_unpivot. Follow these rules when creating a name column.
  • columns_to_unpivot: The columns from the from_item to populate values_column and name_column. Follow these rules when creating an unpivot column.
    • row_value_alias: An optional alias for a column that's displayed for the column in name_column. If not specified, the string value of the column name is used. Follow these rules when creating a row value alias.

multi_column_unpivot definitions:

  • values_column_set: A set of columns to contain the row values from columns_to_unpivot. Follow these rules when creating a values column.
  • name_column: A set of columns to contain the column names from columns_to_unpivot. Follow these rules when creating a name column.
  • column_sets_to_unpivot: The columns from the from_item to unpivot. Follow these rules when creating an unpivot column.
    • row_value_alias: An optional alias for a column set that's displayed for the column set in name_column. If not specified, a string value for the column set is used and each column in the string is separated with an underscore (_). For example, (col1, col2) outputs col1_col2. Follow these rules when creating a row value alias.

Rules

Rules for a from_item passed to UNPIVOT:

  • The from_item may consist of any table, subquery, or table-valued function (TVF) result.
  • The from_item may not produce a value table.
  • Duplicate columns in a from_item can't be referenced in the UNPIVOT clause.

Rules for unpivot_operator:

  • Expressions aren't permitted.
  • Qualified names aren't permitted. For example, mytable.mycolumn isn't allowed.
  • In the case where the UNPIVOT result has duplicate column names:
    • SELECT * is allowed.
    • SELECT values_column causes ambiguity.

Rules for values_column:

  • It can't be a name used for a name_column or an unpivot_column.
  • It can be the same name as a column from the from_item.

Rules for name_column:

  • It can't be a name used for a values_column or an unpivot_column.
  • It can be the same name as a column from the from_item.

Rules for unpivot_column:

  • Must be a column name from the from_item.
  • It can't reference duplicate from_item column names.
  • All columns in a column set must have equivalent data types.
    • Data types can't be coerced to a common supertype.
    • If the data types are exact matches (for example, a struct with different field names), the data type of the first input is the data type of the output.
  • You can't have the same name in the same column set. For example, (emp1, emp1) results in an error.
  • You can have a the same name in different column sets. For example, (emp1, emp2), (emp1, emp3) is valid.

Rules for row_value_alias:

  • This can be a string or an INT64 literal.
  • The data type for all row_value_alias clauses must be the same.
  • If the value is an INT64, the row_value_alias for each unpivot_column must be specified.

Examples

The following examples reference a table called Produce that looks like this:

WITH Produce AS (
  SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
  SELECT 'Apple', 77, 0, 25, 2)
SELECT * FROM Produce

/*---------+----+----+----+----*
 | product | Q1 | Q2 | Q3 | Q4 |
 +---------+----+----+----+----+
 | Kale    | 51 | 23 | 45 | 3  |
 | Apple   | 77 | 0  | 25 | 2  |
 *---------+----+----+----+----*/

With the UNPIVOT operator, the columns Q1, Q2, Q3, and Q4 are rotated. The values of these columns now populate a new column called Sales and the names of these columns now populate a new column called Quarter. This is a single-column unpivot operation.

SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))

/*---------+-------+---------*
 | product | sales | quarter |
 +---------+-------+---------+
 | Kale    | 51    | Q1      |
 | Kale    | 23    | Q2      |
 | Kale    | 45    | Q3      |
 | Kale    | 3     | Q4      |
 | Apple   | 77    | Q1      |
 | Apple   | 0     | Q2      |
 | Apple   | 25    | Q3      |
 | Apple   | 2     | Q4      |
 *---------+-------+---------*/

In this example, we UNPIVOT four quarters into two semesters. This is a multi-column unpivot operation.

SELECT * FROM Produce
UNPIVOT(
  (first_half_sales, second_half_sales)
  FOR semesters
  IN ((Q1, Q2) AS 'semester_1', (Q3, Q4) AS 'semester_2'))

/*---------+------------------+-------------------+------------*
 | product | first_half_sales | second_half_sales | semesters  |
 +---------+------------------+-------------------+------------+
 | Kale    | 51               | 23                | semester_1 |
 | Kale    | 45               | 3                 | semester_2 |
 | Apple   | 77               | 0                 | semester_1 |
 | Apple   | 25               | 2                 | semester_2 |
 *---------+------------------+-------------------+------------*/

TABLESAMPLE operator

TABLESAMPLE SYSTEM ( percent PERCENT )

Description

You can use the TABLESAMPLE operator to select a random sample of a dataset. This operator is useful when you're working with tables that have large amounts of data and you don't need precise answers.

Sampling returns a variety of records while avoiding the costs associated with scanning and processing an entire table. Each execution of the query might return different results because each execution processes an independently computed sample. GoogleSQL doesn't cache the results of queries that include a TABLESAMPLE clause.

Replace percent with the percentage of the dataset that you want to include in the results. The value must be between 0 and 100. The value can be a literal value or a query parameter. It can't be a variable.

For more information, see Table sampling.

Example

The following query selects approximately 10% of a table's data:

SELECT * FROM dataset.my_table TABLESAMPLE SYSTEM (10 PERCENT)

Join operation

join_operation:
  { cross_join_operation | condition_join_operation }

cross_join_operation:
  from_item cross_join_operator from_item

condition_join_operation:
  from_item condition_join_operator from_item join_condition

cross_join_operator:
  { CROSS JOIN | , }

condition_join_operator:
  {
    [INNER] JOIN
    | FULL [OUTER] JOIN
    | LEFT [OUTER] JOIN
    | RIGHT [OUTER] JOIN
  }

join_condition:
  { on_clause | using_clause }

on_clause:
  ON bool_expression

using_clause:
  USING ( column_list )

The JOIN operation merges two from_items so that the SELECT clause can query them as one source. The join operator and join condition specify how to combine and discard rows from the two from_items to form a single source.

[INNER] JOIN

An INNER JOIN, or simply JOIN, effectively calculates the Cartesian product of the two from_items and discards all rows that don't meet the join condition. Effectively means that it's possible to implement an INNER JOIN without actually calculating the Cartesian product.

FROM A INNER JOIN B ON A.w = B.y

/*
Table A       Table B       Result
+-------+     +-------+     +---------------+
| w | x |  *  | y | z |  =  | w | x | y | z |
+-------+     +-------+     +---------------+
| 1 | a |     | 2 | k |     | 2 | b | 2 | k |
| 2 | b |     | 3 | m |     | 3 | c | 3 | m |
| 3 | c |     | 3 | n |     | 3 | c | 3 | n |
| 3 | d |     | 4 | p |     | 3 | d | 3 | m |
+-------+     +-------+     | 3 | d | 3 | n |
                            +---------------+
*/
FROM A INNER JOIN B USING (x)

/*
Table A       Table B       Result
+-------+     +-------+     +-----------+
| x | y |  *  | x | z |  =  | x | y | z |
+-------+     +-------+     +-----------+
| 1 | a |     | 2 | k |     | 2 | b | k |
| 2 | b |     | 3 | m |     | 3 | c | m |
| 3 | c |     | 3 | n |     | 3 | c | n |
| 3 | d |     | 4 | p |     | 3 | d | m |
+-------+     +-------+     | 3 | d | n |
                            +-----------+
*/

Example

This query performs an INNER JOIN on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Buchanan   | Lakers       |
 | Coolidge   | Lakers       |
 | Davis      | Knights      |
 *---------------------------*/

You can use a correlated INNER JOIN to flatten an array into a set of rows. To learn more, see Convert elements in an array to rows in a table.

CROSS JOIN

CROSS JOIN returns the Cartesian product of the two from_items. In other words, it combines each row from the first from_item with each row from the second from_item.

If the rows of the two from_items are independent, then the result has M * N rows, given M rows in one from_item and N in the other. Note that this still holds for the case when either from_item has zero rows.

In a FROM clause, a CROSS JOIN can be written like this:

FROM A CROSS JOIN B

/*
Table A       Table B       Result
+-------+     +-------+     +---------------+
| w | x |  *  | y | z |  =  | w | x | y | z |
+-------+     +-------+     +---------------+
| 1 | a |     | 2 | c |     | 1 | a | 2 | c |
| 2 | b |     | 3 | d |     | 1 | a | 3 | d |
+-------+     +-------+     | 2 | b | 2 | c |
                            | 2 | b | 3 | d |
                            +---------------+
*/

You can use a correlated cross join to convert or flatten an array into a set of rows, though the (equivalent) INNER JOIN is preferred over CROSS JOIN for this case. To learn more, see Convert elements in an array to rows in a table.

Examples

This query performs an CROSS JOIN on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster CROSS JOIN TeamMascot;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Adams      | Knights      |
 | Adams      | Lakers       |
 | Adams      | Mustangs     |
 | Buchanan   | Jaguars      |
 | Buchanan   | Knights      |
 | Buchanan   | Lakers       |
 | Buchanan   | Mustangs     |
 | ...                       |
 *---------------------------*/

Comma cross join (,)

CROSS JOINs can be written implicitly with a comma. This is called a comma cross join.

A comma cross join looks like this in a FROM clause:

FROM A, B

/*
Table A       Table B       Result
+-------+     +-------+     +---------------+
| w | x |  *  | y | z |  =  | w | x | y | z |
+-------+     +-------+     +---------------+
| 1 | a |     | 2 | c |     | 1 | a | 2 | c |
| 2 | b |     | 3 | d |     | 1 | a | 3 | d |
+-------+     +-------+     | 2 | b | 2 | c |
                            | 2 | b | 3 | d |
                            +---------------+
*/

You can't write comma cross joins inside parentheses. To learn more, see Join operations in a sequence.

FROM (A, B)  // INVALID

You can use a correlated comma cross join to convert or flatten an array into a set of rows. To learn more, see Convert elements in an array to rows in a table.

Examples

This query performs a comma cross join on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster, TeamMascot;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Adams      | Knights      |
 | Adams      | Lakers       |
 | Adams      | Mustangs     |
 | Buchanan   | Jaguars      |
 | Buchanan   | Knights      |
 | Buchanan   | Lakers       |
 | Buchanan   | Mustangs     |
 | ...                       |
 *---------------------------*/

FULL [OUTER] JOIN

A FULL OUTER JOIN (or simply FULL JOIN) returns all fields for all matching rows in both from_items that meet the join condition. If a given row from one from_item doesn't join to any row in the other from_item, the row returns with NULL values for all columns from the other from_item.

FROM A FULL OUTER JOIN B ON A.w = B.y

/*
Table A       Table B       Result
+-------+     +-------+     +---------------------------+
| w | x |  *  | y | z |  =  | w    | x    | y    | z    |
+-------+     +-------+     +---------------------------+
| 1 | a |     | 2 | k |     | 1    | a    | NULL | NULL |
| 2 | b |     | 3 | m |     | 2    | b    | 2    | k    |
| 3 | c |     | 3 | n |     | 3    | c    | 3    | m    |
| 3 | d |     | 4 | p |     | 3    | c    | 3    | n    |
+-------+     +-------+     | 3    | d    | 3    | m    |
                            | 3    | d    | 3    | n    |
                            | NULL | NULL | 4    | p    |
                            +---------------------------+
*/
FROM A FULL OUTER JOIN B USING (x)

/*
Table A       Table B       Result
+-------+     +-------+     +--------------------+
| x | y |  *  | x | z |  =  | x    | y    | z    |
+-------+     +-------+     +--------------------+
| 1 | a |     | 2 | k |     | 1    | a    | NULL |
| 2 | b |     | 3 | m |     | 2    | b    | k    |
| 3 | c |     | 3 | n |     | 3    | c    | m    |
| 3 | d |     | 4 | p |     | 3    | c    | n    |
+-------+     +-------+     | 3    | d    | m    |
                            | 3    | d    | n    |
                            | 4    | NULL | p    |
                            +--------------------+
*/

Example

This query performs a FULL JOIN on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster FULL JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Buchanan   | Lakers       |
 | Coolidge   | Lakers       |
 | Davis      | Knights      |
 | Eisenhower | NULL         |
 | NULL       | Mustangs     |
 *---------------------------*/

LEFT [OUTER] JOIN

The result of a LEFT OUTER JOIN (or simply LEFT JOIN) for two from_items always retains all rows of the left from_item in the JOIN operation, even if no rows in the right from_item satisfy the join predicate.

All rows from the left from_item are retained; if a given row from the left from_item doesn't join to any row in the right from_item, the row will return with NULL values for all columns exclusively from the right from_item. Rows from the right from_item that don't join to any row in the left from_item are discarded.

FROM A LEFT OUTER JOIN B ON A.w = B.y

/*
Table A       Table B       Result
+-------+     +-------+     +---------------------------+
| w | x |  *  | y | z |  =  | w    | x    | y    | z    |
+-------+     +-------+     +---------------------------+
| 1 | a |     | 2 | k |     | 1    | a    | NULL | NULL |
| 2 | b |     | 3 | m |     | 2    | b    | 2    | k    |
| 3 | c |     | 3 | n |     | 3    | c    | 3    | m    |
| 3 | d |     | 4 | p |     | 3    | c    | 3    | n    |
+-------+     +-------+     | 3    | d    | 3    | m    |
                            | 3    | d    | 3    | n    |
                            +---------------------------+
*/
FROM A LEFT OUTER JOIN B USING (x)

/*
Table A       Table B       Result
+-------+     +-------+     +--------------------+
| x | y |  *  | x | z |  =  | x    | y    | z    |
+-------+     +-------+     +--------------------+
| 1 | a |     | 2 | k |     | 1    | a    | NULL |
| 2 | b |     | 3 | m |     | 2    | b    | k    |
| 3 | c |     | 3 | n |     | 3    | c    | m    |
| 3 | d |     | 4 | p |     | 3    | c    | n    |
+-------+     +-------+     | 3    | d    | m    |
                            | 3    | d    | n    |
                            +--------------------+
*/

Example

This query performs a LEFT JOIN on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster LEFT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Buchanan   | Lakers       |
 | Coolidge   | Lakers       |
 | Davis      | Knights      |
 | Eisenhower | NULL         |
 *---------------------------*/

RIGHT [OUTER] JOIN

The result of a RIGHT OUTER JOIN (or simply RIGHT JOIN) for two from_items always retains all rows of the right from_item in the JOIN operation, even if no rows in the left from_item satisfy the join predicate.

All rows from the right from_item are returned; if a given row from the right from_item doesn't join to any row in the left from_item, the row will return with NULL values for all columns exclusively from the left from_item. Rows from the left from_item that don't join to any row in the right from_item are discarded.

FROM A RIGHT OUTER JOIN B ON A.w = B.y

/*
Table A       Table B       Result
+-------+     +-------+     +---------------------------+
| w | x |  *  | y | z |  =  | w    | x    | y    | z    |
+-------+     +-------+     +---------------------------+
| 1 | a |     | 2 | k |     | 2    | b    | 2    | k    |
| 2 | b |     | 3 | m |     | 3    | c    | 3    | m    |
| 3 | c |     | 3 | n |     | 3    | c    | 3    | n    |
| 3 | d |     | 4 | p |     | 3    | d    | 3    | m    |
+-------+     +-------+     | 3    | d    | 3    | n    |
                            | NULL | NULL | 4    | p    |
                            +---------------------------+
*/
FROM A RIGHT OUTER JOIN B USING (x)

/*
Table A       Table B       Result
+-------+     +-------+     +--------------------+
| x | y |  *  | x | z |  =  | x    | y    | z    |
+-------+     +-------+     +--------------------+
| 1 | a |     | 2 | k |     | 2    | b    | k    |
| 2 | b |     | 3 | m |     | 3    | c    | m    |
| 3 | c |     | 3 | n |     | 3    | c    | n    |
| 3 | d |     | 4 | p |     | 3    | d    | m    |
+-------+     +-------+     | 3    | d    | n    |
                            | 4    | NULL | p    |
                            +--------------------+
*/

Example

This query performs a RIGHT JOIN on the Roster and TeamMascot tables.

SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster RIGHT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;

/*---------------------------*
 | LastName   | Mascot       |
 +---------------------------+
 | Adams      | Jaguars      |
 | Buchanan   | Lakers       |
 | Coolidge   | Lakers       |
 | Davis      | Knights      |
 | NULL       | Mustangs     |
 *---------------------------*/

Join conditions

In a join operation, a join condition helps specify how to combine rows in two from_items to form a single source.

The two types of join conditions are the ON clause and USING clause. You must use a join condition when you perform a conditional join operation. You can't use a join condition when you perform a cross join operation.

ON clause

ON bool_expression

Description

Given a row from each table, if the ON clause evaluates to TRUE, the query generates a consolidated row with the result of combining the given rows.

Definitions:

  • bool_expression: The boolean expression that specifies the condition for the join. This is frequently a comparison operation or logical combination of comparison operators.

Details:

Similarly to CROSS JOIN, ON produces a column once for each column in each input table.

A NULL join condition evaluation is equivalent to a FALSE evaluation.

If a column-order sensitive operation such as UNION or SELECT * is used with the ON join condition, the resulting table contains all of the columns from the left-hand input in order, and then all of the columns from the right-hand input in order.

Examples

The following examples show how to use the ON clause:

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B ON A.x = B.x;

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT A.x, B.x FROM A INNER JOIN B ON A.x = B.x;

/*
Table A   Table B   Result (A.x, B.x)
+---+     +---+     +-------+
| x |  *  | x |  =  | x | x |
+---+     +---+     +-------+
| 1 |     | 2 |     | 2 | 2 |
| 2 |     | 3 |     | 3 | 3 |
| 3 |     | 4 |     +-------+
+---+     +---+
*/
WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B ON A.x = B.x;

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x, B.x FROM A LEFT OUTER JOIN B ON A.x = B.x;

/*
Table A    Table B   Result
+------+   +---+     +-------------+
| x    | * | x |  =  | x    | x    |
+------+   +---+     +-------------+
| 1    |   | 2 |     | 1    | NULL |
| 2    |   | 3 |     | 2    | 2    |
| 3    |   | 4 |     | 3    | 3    |
| NULL |   | 5 |     | NULL | NULL |
+------+   +---+     +-------------+
*/
WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A FULL OUTER JOIN B ON A.x = B.x;

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x, B.x FROM A FULL OUTER JOIN B ON A.x = B.x;

/*
Table A    Table B   Result
+------+   +---+     +-------------+
| x    | * | x |  =  | x    | x    |
+------+   +---+     +-------------+
| 1    |   | 2 |     | 1    | NULL |
| 2    |   | 3 |     | 2    | 2    |
| 3    |   | 4 |     | 3    | 3    |
| NULL |   | 5 |     | NULL | NULL |
+------+   +---+     | NULL | 4    |
                     | NULL | 5    |
                     +-------------+
*/

USING clause

USING ( column_name_list )

column_name_list:
    column_name[, ...]

Description

When you are joining two tables, USING performs an equality comparison operation on the columns named in column_name_list. Each column name in column_name_list must appear in both input tables. For each pair of rows from the input tables, if the equality comparisons all evaluate to TRUE, one row is added to the resulting column.

Definitions:

  • column_name_list: A list of columns to include in the join condition.
  • column_name: The column that exists in both of the tables that you are joining.

Details:

A NULL join condition evaluation is equivalent to a FALSE evaluation.

If a column-order sensitive operation such as UNION or SELECT * is used with the USING join condition, the resulting table contains columns in this order:

  • The columns from column_name_list in the order they appear in the USING clause.
  • All other columns of the left-hand input in the order they appear in the input.
  • All other columns of the right-hand input in the order they appear in the input.

A column name in the USING clause must not be qualified by a table name.

If the join is an INNER JOIN or a LEFT OUTER JOIN, the output columns are populated from the values in the first table. If the join is a RIGHT OUTER JOIN, the output columns are populated from the values in the second table. If the join is a FULL OUTER JOIN, the output columns are populated by coalescing the values from the left and right tables in that order.

Examples

The following example shows how to use the USING clause with one column name in the column name list:

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT * FROM A INNER JOIN B USING (x);

/*
Table A    Table B   Result
+------+   +---+     +---+
| x    | * | x |  =  | x |
+------+   +---+     +---+
| 1    |   | 2 |     | 2 |
| 2    |   | 9 |     | 9 |
| 9    |   | 9 |     | 9 |
| NULL |   | 5 |     +---+
+------+   +---+
*/

The following example shows how to use the USING clause with multiple column names in the column name list:

WITH
  A AS (
    SELECT 1 as x, 15 as y UNION ALL
    SELECT 2, 10 UNION ALL
    SELECT 9, 16 UNION ALL
    SELECT NULL, 12),
  B AS (
    SELECT 2 as x, 10 as y UNION ALL
    SELECT 9, 17 UNION ALL
    SELECT 9, 16 UNION ALL
    SELECT 5, 15)
SELECT * FROM A INNER JOIN B USING (x, y);

/*
Table A         Table B        Result
+-----------+   +---------+     +---------+
| x    | y  | * | x  | y  |  =  | x  | y  |
+-----------+   +---------+     +---------+
| 1    | 15 |   | 2  | 10 |     | 2  | 10 |
| 2    | 10 |   | 9  | 17 |     | 9  | 16 |
| 9    | 16 |   | 9  | 16 |     +---------+
| NULL | 12 |   | 5  | 15 |
+-----------+   +---------+
*/

The following examples show additional ways in which to use the USING clause with one column name in the column name list:

WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A INNER JOIN B USING (x)

/*
Table A    Table B   Result
+------+   +---+     +--------------------+
| x    | * | x |  =  | x    | A.x  | B.x  |
+------+   +---+     +--------------------+
| 1    |   | 2 |     | 2    | 2    | 2    |
| 2    |   | 9 |     | 9    | 9    | 9    |
| 9    |   | 9 |     | 9    | 9    | 9    |
| NULL |   | 5 |     +--------------------+
+------+   +---+
*/
WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A LEFT OUTER JOIN B USING (x)

/*
Table A    Table B   Result
+------+   +---+     +--------------------+
| x    | * | x |  =  | x    | A.x  | B.x  |
+------+   +---+     +--------------------+
| 1    |   | 2 |     | 1    | 1    | NULL |
| 2    |   | 9 |     | 2    | 2    | 2    |
| 9    |   | 9 |     | 9    | 9    | 9    |
| NULL |   | 5 |     | 9    | 9    | 9    |
+------+   +---+     | NULL | NULL | NULL |
                     +--------------------+
*/
WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A RIGHT OUTER JOIN B USING (x)

/*
Table A    Table B   Result
+------+   +---+     +--------------------+
| x    | * | x |  =  | x    | A.x  | B.x  |
+------+   +---+     +--------------------+
| 1    |   | 2 |     | 2    | 2    | 2    |
| 2    |   | 9 |     | 2    | 2    | 2    |
| 2    |   | 9 |     | 9    | NULL | 9    |
| NULL |   | 5 |     | 9    | NULL | 9    |
+------+   +---+     | 5    | NULL | 5    |
                     +--------------------+
*/
WITH
  A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT NULL),
  B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A FULL OUTER JOIN B USING (x);

/*
Table A    Table B   Result
+------+   +---+     +--------------------+
| x    | * | x |  =  | x    | A.x  | B.x  |
+------+   +---+     +--------------------+
| 1    |   | 2 |     | 1    | 1    | NULL |
| 2    |   | 9 |     | 2    | 2    | 2    |
| 2    |   | 9 |     | 2    | 2    | 2    |
| NULL |   | 5 |     | NULL | NULL | NULL |
+------+   +---+     | 9    | NULL | 9    |
                     | 9    | NULL | 9    |
                     | 5    | NULL | 5    |
                     +--------------------+
*/

The following example shows how to use the USING clause with only some column names in the column name list.

WITH
  A AS (
    SELECT 1 as x, 15 as y UNION ALL
    SELECT 2, 10 UNION ALL
    SELECT 9, 16 UNION ALL
    SELECT NULL, 12),
  B AS (
    SELECT 2 as x, 10 as y UNION ALL
    SELECT 9, 17 UNION ALL
    SELECT 9, 16 UNION ALL
    SELECT 5, 15)
SELECT * FROM A INNER JOIN B USING (x);

/*
Table A         Table B         Result
+-----------+   +---------+     +-----------------+
| x    | y  | * | x  | y  |  =  | x   | A.y | B.y |
+-----------+   +---------+     +-----------------+
| 1    | 15 |   | 2  | 10 |     | 2   | 10  | 10  |
| 2    | 10 |   | 9  | 17 |     | 9   | 16  | 17  |
| 9    | 16 |   | 9  | 16 |     | 9   | 16  | 16  |
| NULL | 12 |   | 5  | 15 |     +-----------------+
+-----------+   +---------+
*/

The following query performs an INNER JOIN on the Roster and TeamMascot table. The query returns the rows from Roster and TeamMascot where Roster.SchoolID is the same as TeamMascot.SchoolID. The results include a single SchoolID column.

SELECT * FROM Roster INNER JOIN TeamMascot USING (SchoolID);

/*----------------------------------------*
 | SchoolID   | LastName   | Mascot       |
 +----------------------------------------+
 | 50         | Adams      | Jaguars      |
 | 52         | Buchanan   | Lakers       |
 | 52         | Coolidge   | Lakers       |
 | 51         | Davis      | Knights      |
 *----------------------------------------*/

ON and USING equivalency

The ON and USING join conditions aren't equivalent, but they share some rules and sometimes they can produce similar results.

In the following examples, observe what is returned when all rows are produced for inner and outer joins. Also, look at how each join condition handles NULL values.

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B USING (x);

/*
Table A   Table B   Result ON     Result USING
+---+     +---+     +-------+     +---+
| x |  *  | x |  =  | x | x |     | x |
+---+     +---+     +-------+     +---+
| 1 |     | 2 |     | 2 | 2 |     | 2 |
| 2 |     | 3 |     | 3 | 3 |     | 3 |
| 3 |     | 4 |     +-------+     +---+
+---+     +---+
*/
WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B USING (x);

/*
Table A    Table B   Result ON           Result USING
+------+   +---+     +-------------+     +------+
| x    | * | x |  =  | x    | x    |     | x    |
+------+   +---+     +-------------+     +------+
| 1    |   | 2 |     | 1    | NULL |     | 1    |
| 2    |   | 3 |     | 2    | 2    |     | 2    |
| 3    |   | 4 |     | 3    | 3    |     | 3    |
| NULL |   | 5 |     | NULL | NULL |     | NULL |
+------+   +---+     +-------------+     +------+
*/
WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A FULL OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A FULL OUTER JOIN B USING (x);

/*
Table A   Table B   Result ON           Result USING
+---+     +---+     +-------------+     +---+
| x |  *  | x |  =  | x    | x    |     | x |
+---+     +---+     +-------------+     +---+
| 1 |     | 2 |     | 1    | NULL |     | 1 |
| 2 |     | 3 |     | 2    | 2    |     | 2 |
| 3 |     | 4 |     | 3    | 3    |     | 3 |
+---+     +---+     | NULL | 4    |     | 4 |
                    +-------------+     +---+
*/

Although ON and USING aren't equivalent, they can return the same results in some situations if you specify the columns you want to return.

In the following examples, observe what is returned when a specific row is produced for inner and outer joins. Also, look at how each join condition handles NULL values.

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A INNER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A INNER JOIN B USING (x);

/*
Table A    Table B   Result ON     Result USING
+------+   +---+     +---+         +---+
| x    | * | x |  =  | x |         | x |
+------+   +---+     +---+         +---+
| 1    |   | 2 |     | 2 |         | 2 |
| 2    |   | 3 |     | 3 |         | 3 |
| 3    |   | 4 |     +---+         +---+
| NULL |   | 5 |
+------+   +---+
*/
WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A LEFT OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A LEFT OUTER JOIN B USING (x);

/*
Table A    Table B   Result ON    Result USING
+------+   +---+     +------+     +------+
| x    | * | x |  =  | x    |     | x    |
+------+   +---+     +------+     +------+
| 1    |   | 2 |     | 1    |     | 1    |
| 2    |   | 3 |     | 2    |     | 2    |
| 3    |   | 4 |     | 3    |     | 3    |
| NULL |   | 5 |     | NULL |     | NULL |
+------+   +---+     +------+     +------+
*/
WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A FULL OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);

/*
Table A    Table B   Result ON    Result USING
+------+   +---+     +------+     +------+
| x    | * | x |  =  | x    |     | x    |
+------+   +---+     +------+     +------+
| 1    |   | 2 |     | 1    |     | 1    |
| 2    |   | 3 |     | 2    |     | 2    |
| 3    |   | 4 |     | 3    |     | 3    |
| NULL |   | 5 |     | NULL |     | NULL |
+------+   +---+     | NULL |     | 4    |
                     | NULL |     | 5    |
                     +------+     +------+
*/
WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT B.x FROM A FULL OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);

/*
Table A    Table B   Result ON    Result USING
+------+   +---+     +------+     +------+
| x    | * | x |  =  | x    |     | x    |
+------+   +---+     +------+     +------+
| 1    |   | 2 |     | 2    |     | 1    |
| 2    |   | 3 |     | 3    |     | 2    |
| 3    |   | 4 |     | NULL |     | 3    |
| NULL |   | 5 |     | NULL |     | NULL |
+------+   +---+     | 4    |     | 4    |
                     | 5    |     | 5    |
                     +------+     +------+
*/

In the following example, observe what is returned when COALESCE is used with the ON clause. It provides the same results as a query with the USING clause.

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT COALESCE(A.x, B.x) FROM A FULL OUTER JOIN B ON A.x = B.x;

WITH
  A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);

/*
Table A    Table B   Result ON    Result USING
+------+   +---+     +------+     +------+
| x    | * | x |  =  | x    |     | x    |
+------+   +---+     +------+     +------+
| 1    |   | 2 |     | 1    |     | 1    |
| 2    |   | 3 |     | 2    |     | 2    |
| 3    |   | 4 |     | 3    |     | 3    |
| NULL |   | 5 |     | NULL |     | NULL |
+------+   +---+     | 4    |     | 4    |
                     | 5    |     | 5    |
                     +------+     +------+
*/

Join operations in a sequence

The FROM clause can contain multiple JOIN operations in a sequence. JOINs are bound from left to right. For example:

FROM A JOIN B USING (x) JOIN C USING (x)

-- A JOIN B USING (x)        = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2                  = return value

You can also insert parentheses to group JOINs:

FROM ( (A JOIN B USING (x)) JOIN C USING (x) )

-- A JOIN B USING (x)        = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2                  = return value

With parentheses, you can group JOINs so that they are bound in a different order:

FROM ( A JOIN (B JOIN C USING (x)) USING (x) )

-- B JOIN C USING (x)       = result_1
-- A JOIN result_1          = result_2
-- result_2                 = return value

A FROM clause can have multiple joins. Provided there are no comma cross joins in the FROM clause, joins don't require parenthesis, though parenthesis can help readability:

FROM A JOIN B JOIN C JOIN D USING (w) ON B.x = C