Query statements scan one or more tables or expressions and return the computed result rows. This topic describes the syntax for SQL queries in GoogleSQL for Spanner.
SQL syntax
query_statement: [ statement_hint_expr ][ table_hint_expr ][ join_hint_expr ]
query_expr statement_hint_expr: '@{' statement_hint_key = statement_hint_value [, ...] '}' statement_hint_key: { USE_ADDITIONAL_PARALLELISM | OPTIMIZER_VERSION | OPTIMIZER_STATISTICS_PACKAGE | ALLOW_DISTRIBUTED_MERGE | LOCK_SCANNED_RANGES } query_expr: [ WITH cte[, ...] ] { select | ( query_expr ) | set_operation } [ ORDER BY expression [{ ASC | DESC }] [, ...] ] [ LIMIT count [ OFFSET skip_rows ] ] select: SELECT [ { ALL | DISTINCT } ] [ AS { STRUCT | VALUE } ] select_list [ FROM from_clause[, ...] ] [ WHERE bool_expression ] [ GROUP BY expression [, ...] ] [ HAVING bool_expression ] table_hint_expr: '@{' table_hint_key = table_hint_value [, ...] '}' table_hint_key: { FORCE_INDEX | GROUPBY_SCAN_OPTIMIZATION }
Notation rules
- Square brackets
[ ]
indicate optional clauses. - Parentheses
( )
indicate literal parentheses. - The vertical bar
|
indicates a logicalOR
. - Curly braces
{ }
enclose a set of options. - A comma followed by an ellipsis within square brackets
[, ... ]
indicates that the preceding item can repeat in a comma-separated list.
Statement hints
The following hints are supported for query statements:
Hint key | Possible values | Description |
---|---|---|
USE_ADDITIONAL_PARALLELISM |
TRUE FALSE (default)
|
If TRUE , the execution engine favors using more
parallelism when possible. Because this can reduce resources available
to other operations, you may want to avoid this hint if you run
latency-sensitive operations on the same instance.
|
OPTIMIZER_VERSION |
1 to N|latest_version|default_version
|
Executes the query using the specified optimizer version. Possible
values are In terms of version setting precedence, the value set by the client API takes precedence over the value in the database options and the value set by this hint takes precedence over everything else. For more information, see Query optimizer. |
OPTIMIZER_STATISTICS_PACKAGE |
package_name|latest |
Executes the query using the specified optimizer statistics package.
Possible values for
If the hint is not set, the optimizer executes against the package that is set in the database option or specified through the client API. If neither of those are set, the optimizer defaults to the latest package. The value set by the client API takes precedence over the value in the database options and the value set by this hint takes precedence over everything else.
The specified package needs to be pinned by the database option or
have For more information, see Query optimizer statistics packages. |
ALLOW_DISTRIBUTED_MERGE |
TRUE (default)FALSE
|
If This feature can increase parallelism of certain ORDER BY queries. This hint has been provided so that users can experiment with turning off the distributed merge algorithm if desired. |
LOCK_SCANNED_RANGES |
exclusive shared (default)
|
Use this hint to request an exclusive lock on a set of ranges scanned by a transaction. Acquiring an exclusive lock helps in scenarios when you observe high write contention, that is, you notice that multiple transactions are concurrently trying to read and write to the same data, resulting in a large number of aborts. Without the hint, it's possible that multiple simultaneous transactions will acquire shared locks, and then try to upgrade to exclusive locks. This will cause a deadlock, because each transaction's shared lock is preventing the other transaction(s) from upgrading to exclusive. Spanner aborts all but one of the transactions. When requesting an exclusive lock using this hint, one transaction acquires the lock and proceeds to execute, while other transactions wait their turn for the lock. Throughput is still limited because the conflicting transactions can only be performed one at a time, but in this case Spanner is always making progress on one transaction, saving time that would otherwise be spent aborting and retrying transactions. This hint is supported on all statement types, both query and DML. Spanner always enforces serializability. Lock mode hints can affect which transactions wait or abort in contended workloads, but don't change the isolation level. Because this is just a hint, it should not be considered equivalent to a mutex. In other words, you should not use Spanner exclusive locks as a mutual exclusion mechanism for the execution of code outside of Spanner. For more information, see Locking. |
Sample tables
The following tables are used to illustrate the behavior of different query clauses in this reference.
Roster table
The Roster
table includes a list of player names (LastName
) and the
unique ID assigned to their school (SchoolID
). It looks like this:
/*-----------------------*
| LastName | SchoolID |
+-----------------------+
| Adams | 50 |
| Buchanan | 52 |
| Coolidge | 52 |
| Davis | 51 |
| Eisenhower | 77 |
*-----------------------*/
You can use this WITH
clause to emulate a temporary table name for the
examples in this reference:
WITH Roster AS
(SELECT 'Adams' as LastName, 50 as SchoolID UNION ALL
SELECT 'Buchanan', 52 UNION ALL
SELECT 'Coolidge', 52 UNION ALL
SELECT 'Davis', 51 UNION ALL
SELECT 'Eisenhower', 77)
SELECT * FROM Roster
PlayerStats table
The PlayerStats
table includes a list of player names (LastName
) and the
unique ID assigned to the opponent they played in a given game (OpponentID
)
and the number of points scored by the athlete in that game (PointsScored
).
/*----------------------------------------*
| LastName | OpponentID | PointsScored |
+----------------------------------------+
| Adams | 51 | 3 |
| Buchanan | 77 | 0 |
| Coolidge | 77 | 1 |
| Adams | 52 | 4 |
| Buchanan | 50 | 13 |
*----------------------------------------*/
You can use this WITH
clause to emulate a temporary table name for the
examples in this reference:
WITH PlayerStats AS
(SELECT 'Adams' as LastName, 51 as OpponentID, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 77, 0 UNION ALL
SELECT 'Coolidge', 77, 1 UNION ALL
SELECT 'Adams', 52, 4 UNION ALL
SELECT 'Buchanan', 50, 13)
SELECT * FROM PlayerStats
TeamMascot table
The TeamMascot
table includes a list of unique school IDs (SchoolID
) and the
mascot for that school (Mascot
).
/*---------------------*
| SchoolID | Mascot |
+---------------------+
| 50 | Jaguars |
| 51 | Knights |
| 52 | Lakers |
| 53 | Mustangs |
*---------------------*/
You can use this WITH
clause to emulate a temporary table name for the
examples in this reference:
WITH TeamMascot AS
(SELECT 50 as SchoolID, 'Jaguars' as Mascot UNION ALL
SELECT 51, 'Knights' UNION ALL
SELECT 52, 'Lakers' UNION ALL
SELECT 53, 'Mustangs')
SELECT * FROM TeamMascot
SELECT
statement
SELECT [ { ALL | DISTINCT } ] [ AS { STRUCT | VALUE } ] select_list select_list: { select_all | select_expression } [, ...] select_all: [ expression. ]* [ EXCEPT ( column_name [, ...] ) ] [ REPLACE ( expression [ AS ] column_name [, ...] ) ] select_expression: expression [ [ AS ] alias ]
The SELECT
list defines the columns that the query will return. Expressions in
the SELECT
list can refer to columns in any of the from_item
s in its
corresponding FROM
clause.
Each item in the SELECT
list is one of:
*
expression
expression.*
SELECT *
SELECT *
, often referred to as select star, produces one output column for
each column that is visible after executing the full query.
SELECT * FROM (SELECT "apple" AS fruit, "carrot" AS vegetable);
/*-------+-----------*
| fruit | vegetable |
+-------+-----------+
| apple | carrot |
*-------+-----------*/
SELECT expression
Items in a SELECT
list can be expressions. These expressions evaluate to a
single value and produce one output column, with an optional explicit alias
.
If the expression does not have an explicit alias, it receives an implicit alias according to the rules for implicit aliases, if possible. Otherwise, the column is anonymous and you cannot refer to it by name elsewhere in the query.
SELECT expression.*
An item in a SELECT
list can also take the form of expression.*
. This
produces one output column for each column or top-level field of expression
.
The expression must either be a table alias or evaluate to a single value of a
data type with fields, such as a STRUCT.
The following query produces one output column for each column in the table
groceries
, aliased as g
.
WITH groceries AS
(SELECT "milk" AS dairy,
"eggs" AS protein,
"bread" AS grain)
SELECT g.*
FROM groceries AS g;
/*-------+---------+-------*
| dairy | protein | grain |
+-------+---------+-------+
| milk | eggs | bread |
*-------+---------+-------*/
More examples:
WITH locations AS
(SELECT STRUCT("Seattle" AS city, "Washington" AS state) AS location
UNION ALL
SELECT STRUCT("Phoenix" AS city, "Arizona" AS state) AS location)
SELECT l.location.*
FROM locations l;
/*---------+------------*
| city | state |
+---------+------------+
| Seattle | Washington |
| Phoenix | Arizona |
*---------+------------*/
WITH locations AS
(SELECT ARRAY<STRUCT<city STRING, state STRING>>[("Seattle", "Washington"),
("Phoenix", "Arizona")] AS location)
SELECT l.LOCATION[offset(0)].*
FROM locations l;
/*---------+------------*
| city | state |
+---------+------------+
| Seattle | Washington |
*---------+------------*/
SELECT * EXCEPT
A SELECT * EXCEPT
statement specifies the names of one or more columns to
exclude from the result. All matching column names are omitted from the output.
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * EXCEPT (order_id)
FROM orders;
/*-----------+----------*
| item_name | quantity |
+-----------+----------+
| sprocket | 200 |
*-----------+----------*/
SELECT * REPLACE
A SELECT * REPLACE
statement specifies one or more
expression AS identifier
clauses. Each identifier must match a column name
from the SELECT *
statement. In the output column list, the column that
matches the identifier in a REPLACE
clause is replaced by the expression in
that REPLACE
clause.
A SELECT * REPLACE
statement does not change the names or order of columns.
However, it can change the value and the value type.
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * REPLACE ("widget" AS item_name)
FROM orders;
/*----------+-----------+----------*
| order_id | item_name | quantity |
+----------+-----------+----------+
| 5 | widget | 200 |
*----------+-----------+----------*/
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * REPLACE (quantity/2 AS quantity)
FROM orders;
/*----------+-----------+----------*
| order_id | item_name | quantity |
+----------+-----------+----------+
| 5 | sprocket | 100 |
*----------+-----------+----------*/
SELECT DISTINCT
A SELECT DISTINCT
statement discards duplicate rows and returns only the
remaining rows. SELECT DISTINCT
cannot return columns of the following types:
STRUCT
ARRAY
SELECT ALL
A SELECT ALL
statement returns all rows, including duplicate rows.
SELECT ALL
is the default behavior of SELECT
.
Using STRUCTs with SELECT
Queries that return a
STRUCT
at the root of the return type are not supported in Spanner APIs. For example, the following query is supported only as a subquery:SELECT STRUCT(1, 2) FROM Users;
Returning an array of structs is supported. For example, the following queries are supported in Spanner APIs:
SELECT ARRAY(SELECT STRUCT(1 AS A, 2 AS B)) FROM Users;
SELECT ARRAY(SELECT AS STRUCT 1 AS a, 2 AS b) FROM Users;
However, query shapes that can return an
ARRAY<STRUCT<...>>
typedNULL
value or anARRAY<STRUCT<...>>
typed value with an element that isNULL
are not supported in Spanner APIs, so the following query is supported only as a subquery:SELECT ARRAY(SELECT IF(STARTS_WITH(Users.username, "a"), NULL, STRUCT(1, 2))) FROM Users;
See Querying STRUCT elements in an ARRAY for more examples on
how to query STRUCTs
inside an ARRAY
.
Also see notes about using STRUCTs
in subqueries.
SELECT AS STRUCT
SELECT AS STRUCT expr [[AS] struct_field_name1] [,...]
This produces a value table with a
STRUCT row type, where the
STRUCT field names and types match the column names
and types produced in the SELECT
list.
Example:
SELECT ARRAY(SELECT AS STRUCT 1 a, 2 b)
SELECT AS STRUCT
can be used in a scalar or array subquery to produce a single
STRUCT type grouping multiple values together. Scalar
and array subqueries (see Subqueries) are normally not
allowed to return multiple columns, but can return a single column with
STRUCT type.
Anonymous columns are allowed.
Example:
SELECT AS STRUCT 1 x, 2, 3
The query above produces STRUCT values of type
STRUCT<int64 x, int64, int64>.
The first field has the name x
while the
second and third fields are anonymous.
The example above produces the same result as this SELECT AS VALUE
query using
a struct constructor:
SELECT AS VALUE STRUCT(1 AS x, 2, 3)
Duplicate columns are allowed.
Example:
SELECT AS STRUCT 1 x, 2 y, 3 x
The query above produces STRUCT values of type
STRUCT<int64 x, int64 y, int64 x>.
The first and third fields have the same
name x
while the second field has the name y
.
The example above produces the same result as this SELECT AS VALUE
query
using a struct constructor:
SELECT AS VALUE STRUCT(1 AS x, 2 AS y, 3 AS x)
SELECT AS VALUE
SELECT AS VALUE
produces a value table from any
SELECT
list that produces exactly one column. Instead of producing an
output table with one column, possibly with a name, the output will be a
value table where the row type is just the value type that was produced in the
one SELECT
column. Any alias the column had will be discarded in the
value table.
Example:
SELECT AS VALUE 1
The query above produces a table with row type INT64.
Example:
SELECT AS VALUE STRUCT(1 AS a, 2 AS b) xyz
The query above produces a table with row type STRUCT<a int64, b int64>
.
Example:
SELECT AS VALUE v FROM (SELECT AS STRUCT 1 a, true b) v WHERE v.b
Given a value table v
as input, the query above filters out certain values in
the WHERE
clause, and then produces a value table using the exact same value
that was in the input table. If the query above did not use SELECT AS VALUE
,
then the output table schema would differ from the input table schema because
the output table would be a regular table with a column named v
containing the
input value.
FROM
clause
FROM from_clause[, ...] from_clause: from_item [ tablesample_operator ] from_item: { table_name [ table_hint_expr ] [ as_alias ] | { join_operation | ( join_operation ) } | ( query_expr ) [ table_hint_expr ] [ as_alias ] | field_path | unnest_operator | cte_name [ table_hint_expr ] [ as_alias ] } table_hint_expr: '@{' table_hint_key = table_hint_value [, ...] '}' table_hint_key: { FORCE_INDEX | GROUPBY_SCAN_OPTIMIZATION } as_alias: [ AS ] alias
The FROM
clause indicates the table or tables from which to retrieve rows,
and specifies how to join those rows together to produce a single stream of
rows for processing in the rest of the query.
tablesample_operator
See TABLESAMPLE operator.
table_name
The name of an existing table.
SELECT * FROM Roster;
Table hints
The following hints are supported for tables:
Hint key | Possible values | Description |
---|---|---|
FORCE_INDEX |
String. The name of an existing index in the database or
_BASE_TABLE to use the base table rather than an index.
|
Note: |
GROUPBY_SCAN_OPTIMIZATION |
TRUE FALSE |
The group by scan optimization can make queries faster if they use
The optimization is applied if the optimizer estimates that it will
make the query more efficient. The hint overrides that decision. If
the hint is set to |
The following example shows how to use a secondary index
when reading from a table, by appending an index directive of the form
@{FORCE_INDEX=index_name}
to the table name:
SELECT s.SingerId, s.FirstName, s.LastName, s.SingerInfo
FROM Singers@{FORCE_INDEX=SingersByFirstLastName} AS s
WHERE s.FirstName = "Catalina" AND s.LastName > "M";
You can include multiple indexes in a query, though only a single index is supported for each distinct table reference. Example:
SELECT s.SingerId, s.FirstName, s.LastName, s.SingerInfo, c.ConcertDate
FROM Singers@{FORCE_INDEX=SingersByFirstLastName} AS s JOIN
Concerts@{FORCE_INDEX=ConcertsBySingerId} AS c ON s.SingerId = c.SingerId
WHERE s.FirstName = "Catalina" AND s.LastName > "M";
Read more about index directives in Secondary Indexes.
join_operation
See Join operation.
query_expr
( query_expr ) [ [ AS ] alias ]
is a table subquery.
field_path
In the FROM
clause, field_path
is any path that
resolves to a field within a data type. field_path
can go
arbitrarily deep into a nested data structure.
Some examples of valid field_path
values include:
SELECT * FROM T1 t1, t1.array_column;
SELECT * FROM T1 t1, t1.struct_column.array_field;
SELECT (SELECT ARRAY_AGG(c) FROM t1.array_column c) FROM T1 t1;
SELECT a.struct_field1 FROM T1 t1, t1.array_of_structs a;
SELECT (SELECT STRING_AGG(a.struct_field1) FROM t1.array_of_structs a) FROM T1 t1;
Field paths in the FROM
clause must end in an
array field. In
addition, field paths cannot contain arrays before the end of the path. For example, the path
array_column.some_array.some_array_field
is invalid because it
contains an array before the end of the path.
unnest_operator
See UNNEST operator.
cte_name
Common table expressions (CTEs) in a WITH
Clause act like
temporary tables that you can reference anywhere in the FROM
clause.
In the example below, subQ1
and subQ2
are CTEs.
Example:
WITH
subQ1 AS (SELECT * FROM Roster WHERE SchoolID = 52),
subQ2 AS (SELECT SchoolID FROM subQ1)
SELECT DISTINCT * FROM subQ2;
The WITH
clause hides any permanent tables with the same name
for the duration of the query, unless you qualify the table name, for example:
db.Roster
.
UNNEST
operator
unnest_operator: { UNNEST( array_expression ) | UNNEST( array_path ) | array_path } [ table_hint_expr ] [ as_alias ] [ WITH OFFSET [ as_alias ] ] as_alias: [AS] alias
The UNNEST
operator takes an array and returns a table, with one row for each
element in the array. For input arrays of most element types, the output of
UNNEST
generally has one column. ARRAYS
with these element types return
multiple columns:
STRUCT
JSON
Input values:
array_expression
: An expression that produces an array.table_name
: The name of a table.array_path
: The path to anARRAY
type.array_path
must be prepended with a table in an implicitUNNEST
operation, but otherwise is optional.The
UNNEST
operation with a correlatedarray_path
must be on the right side of aCROSS JOIN
,LEFT JOIN
, orINNER JOIN
operation.WITH OFFSET
:UNNEST
destroys the order of elements in the input array. Use this optional clause to return an additional column with the array element indexes, or offsets. Offset counting starts at zero for each row produced by theUNNEST
operation. This column has an optional alias; If the optional alias is not used, the default column name isoffset
.Example:
SELECT * FROM UNNEST ([10,20,30]) as numbers WITH OFFSET; /*---------+--------* | numbers | offset | +---------+--------+ | 10 | 0 | | 20 | 1 | | 30 | 2 | *---------+--------*/
alias
: An alias for a value table. An input array that produces a single column can have an optional alias, which you can use to refer to the column elsewhere in the query. You can also use an additional alias with theWITH OFFSET
clause to rename theoffset
column.
You can also use UNNEST
outside of the FROM
clause with the
IN
operator.
For several ways to use UNNEST
, including construction, flattening, and
filtering, see Work with arrays.
To learn more about the ways you can use UNNEST
explicitly and implicitly,
see Explicit and implicit UNNEST
.
UNNEST
and structs
For an input array of structs, UNNEST
returns a row for each struct, with a separate column for each field in the
struct. The alias for each column is the name of the corresponding struct
field.
Example:
SELECT *
FROM UNNEST(
ARRAY<
STRUCT<
x INT64,
y STRING,
z STRUCT<a INT64, b INT64>>>[
(1, 'foo', (10, 11)),
(3, 'bar', (20, 21))]);
/*---+-----+----------*
| x | y | z |
+---+-----+----------+
| 1 | foo | {10, 11} |
| 3 | bar | {20, 21} |
*---+-----+----------*/
Because the UNNEST
operator returns a
value table,
you can alias UNNEST
to define a range variable that you can reference
elsewhere in the query. If you reference the range variable in the SELECT
list, the query returns a struct containing all of the fields of the original
struct in the input table.
Example:
SELECT *, struct_value
FROM UNNEST(
ARRAY<
STRUCT<
x INT64,
y STRING>>[
(1, 'foo'),
(3, 'bar')]) AS struct_value;
/*---+-----+--------------*
| x | y | struct_value |
+---+-----+--------------+
| 3 | bar | {3, bar} |
| 1 | foo | {1, foo} |
*---+-----+--------------*/
Explicit and implicit UNNEST
Array unnesting can be either explicit or implicit. To learn more, see the following sections.
Explicit unnesting
The UNNEST
keyword is required in explicit unnesting. For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, UNNEST(Coordinates.position) AS results;
In explicit unnesting, array_expression
must return an
array value but does not need to resolve to an array.
Implicit unnesting
The UNNEST
keyword is not used in implicit unnesting.
For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;
Tables and implicit unnesting
When you use array_path
with implicit UNNEST
, array_path
must be prepended
with the table. For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;
UNNEST
and NULL
values
UNNEST
treats NULL
values as follows:
NULL
and empty arrays produce zero rows.- An array containing
NULL
values produces rows containingNULL
values.
TABLESAMPLE
operator
tablesample_clause:
TABLESAMPLE sample_method (sample_size percent_or_rows )
sample_method:
{ BERNOULLI | RESERVOIR }
sample_size:
numeric_value_expression
percent_or_rows:
{ PERCENT | ROWS }
partition_by:
PARTITION BY partition_expression [, ...]
Description
You can use the TABLESAMPLE
operator to select a random sample of a dataset.
This operator is useful when you're working with tables that have large
amounts of data and you don't need precise answers.
sample_method
: When using theTABLESAMPLE
operator, you must specify the sampling algorithm to use:BERNOULLI
: Each row is independently selected with the probability given in thepercent
clause. As a result, you get approximatelyN * percent/100
rows.RESERVOIR
: Takes as parameter an actual sample size K (expressed as a number of rows). If the input is smaller than K, it outputs the entire input relation. If the input is larger than K, reservoir sampling outputs a sample of size exactly K, where any sample of size K is equally likely.
sample_size
: The size of the sample.percent_or_rows
: TheTABLESAMPLE
operator requires that you choose eitherROWS
orPERCENT
. If you choosePERCENT
, the value must be between 0 and 100. If you chooseROWS
, the value must be greater than or equal to 0.
Examples
The following examples illustrate the use of the TABLESAMPLE
operator.
Select from a table using the RESERVOIR
sampling method:
SELECT MessageId
FROM Messages TABLESAMPLE RESERVOIR (100 ROWS);
Select from a table using the BERNOULLI
sampling method:
SELECT MessageId
FROM Messages TABLESAMPLE BERNOULLI (0.1 PERCENT);
Use TABLESAMPLE
with a subquery:
SELECT Subject FROM
(SELECT MessageId, Subject FROM Messages WHERE ServerId="test")
TABLESAMPLE BERNOULLI(50 PERCENT)
WHERE MessageId > 3;
Use a TABLESAMPLE
operation with a join to another table.
SELECT S.Subject
FROM
(SELECT MessageId, ThreadId FROM Messages WHERE ServerId="test") AS R
TABLESAMPLE RESERVOIR(5 ROWS),
Threads AS S
WHERE S.ServerId="test" AND R.ThreadId = S.ThreadId;
Join operation
join_operation: { cross_join_operation | condition_join_operation } cross_join_operation: from_item cross_join_operator [ join_hint_expr ] from_item condition_join_operation: from_item condition_join_operator [ join_hint_expr ] from_item join_condition cross_join_operator: { CROSS JOIN | , } condition_join_operator: { [INNER] [ join_method ] JOIN | FULL [OUTER] [ join_method ] JOIN | LEFT [OUTER] [ join_method ] JOIN | RIGHT [OUTER] [ join_method ] JOIN } join_method: { HASH } join_hint_expr: '@{' join_hint_key = join_hint_value [, ...] '}' join_hint_key: { FORCE_JOIN_ORDER | JOIN_METHOD | HASH_JOIN_BUILD_SIDE | BATCH_MODE | HASH_JOIN_EXECUTION } join_condition: { on_clause | using_clause } on_clause: ON bool_expression using_clause: USING ( column_list )
The JOIN
operation merges two from_item
s so that the SELECT
clause can
query them as one source. The join operator and join condition specify how to
combine and discard rows from the two from_item
s to form a single source.
Join hints
The following hints are supported for JOIN
:
Hint key | Possible values | Description |
---|---|---|
FORCE_JOIN_ORDER |
TRUE FALSE (default)
|
If set to true, use the join order that's specified in the query. |
JOIN_METHOD |
HASH_JOIN APPLY_JOIN MERGE_JOIN PUSH_BROADCAST_HASH_JOIN |
When implementing a logical join, choose a specific alternative to
use for the underlying join method. Learn more in
Join methods.
To use a HASH join, either use
HASH JOIN or
JOIN@{JOIN_METHOD=HASH_JOIN} , but not both.
|
HASH_JOIN_BUILD_SIDE |
BUILD_LEFT BUILD_RIGHT |
Specifies which side of the hash join is used as the build
side. Can only be used with JOIN_METHOD=HASH_JOIN
|
BATCH_MODE |
TRUE (default)FALSE |
Used to disable batched apply join in favor of row-at-a-time apply join.
Can only be used with JOIN_METHOD=APPLY_JOIN .
|
HASH_JOIN_EXECUTION |
MULTI_PASS (default)ONE_PASS
|
For a hash join, specifies what should be done when the hash table size
reaches its memory limit.
Can only be used when JOIN_METHOD=HASH_JOIN .
See Hash Join Execution for more
details.
|
Join methods
Join methods are specific implementations of the various logical join types. Some join methods are available only for certain join types. The choice of which join method to use depends on the specifics of your query and of the data being queried. The best way to figure out if a particular join method helps with the performance of your query is to try the method and view the resulting query execution plan. See Query Execution Operators for more details.
Join Method | Description | Operands |
---|---|---|
HASH_JOIN |
The hash join operator builds a hash table out of one side (the build side), and probes in the hash table for all the elements in the other side (the probe side). | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Hash join operator. |
APPLY_JOIN |
The apply join operator gets each item from one side (the input side), and evaluates the subquery on other side (the map side) using the values of the item from the input side. | Different variants are used for various join types. Cross apply is used for inner join, and outer apply is used for left joins. Read more about the Cross apply and Outer apply operators. |
MERGE_JOIN |
The merge join operator joins two streams of sorted data. The optimizer will add Sort operators to the plan if the data is not already providing the required sort property for the given join condition. The engine provides a distributed merge sort by default, which when coupled with merge join may allow for larger joins, potentially avoiding disk spilling and improving scale and latency. | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Merge join operator. |
PUSH_BROADCAST_HASH_JOIN |
The push broadcast hash join operator builds a batch of data from the build side of the join. The batch is then sent in parallel to all the local splits of the probe side of the join. On each of the local servers, a hash join is executed between the batch and the local data. This join is most likely to be beneficial when the input can fit within one batch, but is not strict. Another potential area of benefit is when operations can be distributed to the local servers, such as an aggregation that occurs after a join. A push broadcast hash join can distribute some aggregation where a traditional hash join cannot. | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Push broadcast hash join operator. |
Hash Join Execution
To execute a hash join between two tables, Spanner first scans
rows from the build side and loads them into a hash table. Then it scans rows
from the probe side, while comparing them against the hash table. If the hash
table reaches its memory limit, depending on the value of the
HASH_JOIN_EXECUTION
query hint, the hash join has one of the following
behaviors:
HASH_JOIN_EXECUTION=MULTI_PASS
(default): The query engine splits the build side table into partitions in a way that the size of a hash table corresponding to each partition is less than the memory size limit. For every partition of the build side table, the probe side is scanned once.HASH_JOIN_EXECUTION=ONE_PASS
: The query engine writes both the build side table and the probe side table to disk in partitions in a way that the hash table of the build side table in each partition is less than the memory limit. The probe side is only scanned once.
[INNER] JOIN
An INNER JOIN
, or simply JOIN
, effectively calculates the Cartesian product
of the two from_item
s and discards all rows that do not meet the join
condition. Effectively means that it is possible to implement an INNER JOIN
without actually calculating the Cartesian product.
FROM A INNER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | k | | 2 | b | 2 | k |
| 2 | b | | 3 | m | | 3 | c | 3 | m |
| 3 | c | | 3 | n | | 3 | c | 3 | n |
| 3 | d | | 4 | p | | 3 | d | 3 | m |
+-------+ +-------+ | 3 | d | 3 | n |
+---------------+
*/
FROM A INNER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +-----------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +-----------+
| 1 | a | | 2 | k | | 2 | b | k |
| 2 | b | | 3 | m | | 3 | c | m |
| 3 | c | | 3 | n | | 3 | c | n |
| 3 | d | | 4 | p | | 3 | d | m |
+-------+ +-------+ | 3 | d | n |
+-----------+
*/
Example
This query performs an INNER JOIN
on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
*---------------------------*/
CROSS JOIN
CROSS JOIN
returns the Cartesian product of the two from_item
s. In other
words, it combines each row from the first from_item
with each row from the
second from_item
.
If the rows of the two from_item
s are independent, then the result has
M * N rows, given M rows in one from_item
and N in the other. Note that
this still holds for the case when either from_item
has zero rows.
In a FROM
clause, a CROSS JOIN
can be written like this:
FROM A CROSS JOIN B
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | c | | 1 | a | 2 | c |
| 2 | b | | 3 | d | | 1 | a | 3 | d |
+-------+ +-------+ | 2 | b | 2 | c |
| 2 | b | 3 | d |
+---------------+
*/
You can use a correlated cross join to convert or flatten an array into a set of rows. To learn more, see Convert elements in an array to rows in a table.
Examples
This query performs an CROSS JOIN
on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster CROSS JOIN TeamMascot;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Adams | Knights |
| Adams | Lakers |
| Adams | Mustangs |
| Buchanan | Jaguars |
| Buchanan | Knights |
| Buchanan | Lakers |
| Buchanan | Mustangs |
| ... |
*---------------------------*/
Comma cross join (,)
CROSS JOIN
s can be written implicitly with a comma. This is
called a comma cross join.
A comma cross join looks like this in a FROM
clause:
FROM A, B
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | c | | 1 | a | 2 | c |
| 2 | b | | 3 | d | | 1 | a | 3 | d |
+-------+ +-------+ | 2 | b | 2 | c |
| 2 | b | 3 | d |
+---------------+
*/
You cannot write comma cross joins inside parentheses. To learn more, see Join operations in a sequence.
FROM (A, B) // INVALID
You can use a correlated comma cross join to convert or flatten an array into a set of rows. To learn more, see Convert elements in an array to rows in a table.
Examples
This query performs a comma cross join on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster, TeamMascot;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Adams | Knights |
| Adams | Lakers |
| Adams | Mustangs |
| Buchanan | Jaguars |
| Buchanan | Knights |
| Buchanan | Lakers |
| Buchanan | Mustangs |
| ... |
*---------------------------*/
FULL [OUTER] JOIN
A FULL OUTER JOIN
(or simply FULL JOIN
) returns all fields for all matching
rows in both from_items
that meet the join condition. If a given row from one
from_item
does not join to any row in the other from_item
, the row returns
with NULL
values for all columns from the other from_item
.
FROM A FULL OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 1 | a | NULL | NULL |
| 2 | b | | 3 | m | | 2 | b | 2 | k |
| 3 | c | | 3 | n | | 3 | c | 3 | m |
| 3 | d | | 4 | p | | 3 | c | 3 | n |
+-------+ +-------+ | 3 | d | 3 | m |
| 3 | d | 3 | n |
| NULL | NULL | 4 | p |
+---------------------------+
*/
FROM A FULL OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 1 | a | NULL |
| 2 | b | | 3 | m | | 2 | b | k |
| 3 | c | | 3 | n | | 3 | c | m |
| 3 | d | | 4 | p | | 3 | c | n |
+-------+ +-------+ | 3 | d | m |
| 3 | d | n |
| 4 | NULL | p |
+--------------------+
*/
Example
This query performs a FULL JOIN
on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster FULL JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| Eisenhower | NULL |
| NULL | Mustangs |
*---------------------------*/
LEFT [OUTER] JOIN
The result of a LEFT OUTER JOIN
(or simply LEFT JOIN
) for two
from_item
s always retains all rows of the left from_item
in the
JOIN
operation, even if no rows in the right from_item
satisfy the join
predicate.
All rows from the left from_item
are retained;
if a given row from the left from_item
does not join to any row
in the right from_item
, the row will return with NULL
values for all
columns exclusively from the right from_item
. Rows from the right
from_item
that do not join to any row in the left from_item
are discarded.
FROM A LEFT OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 1 | a | NULL | NULL |
| 2 | b | | 3 | m | | 2 | b | 2 | k |
| 3 | c | | 3 | n | | 3 | c | 3 | m |
| 3 | d | | 4 | p | | 3 | c | 3 | n |
+-------+ +-------+ | 3 | d | 3 | m |
| 3 | d | 3 | n |
+---------------------------+
*/
FROM A LEFT OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 1 | a | NULL |
| 2 | b | | 3 | m | | 2 | b | k |
| 3 | c | | 3 | n | | 3 | c | m |
| 3 | d | | 4 | p | | 3 | c | n |
+-------+ +-------+ | 3 | d | m |
| 3 | d | n |
+--------------------+
*/
Example
This query performs a LEFT JOIN
on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster LEFT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| Eisenhower | NULL |
*---------------------------*/
RIGHT [OUTER] JOIN
The result of a RIGHT OUTER JOIN
(or simply RIGHT JOIN
) for two
from_item
s always retains all rows of the right from_item
in the
JOIN
operation, even if no rows in the left from_item
satisfy the join
predicate.
All rows from the right from_item
are returned;
if a given row from the right from_item
does not join to any row
in the left from_item
, the row will return with NULL
values for all
columns exclusively from the left from_item
. Rows from the left from_item
that do not join to any row in the right from_item
are discarded.
FROM A RIGHT OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 2 | b | 2 | k |
| 2 | b | | 3 | m | | 3 | c | 3 | m |
| 3 | c | | 3 | n | | 3 | c | 3 | n |
| 3 | d | | 4 | p | | 3 | d | 3 | m |
+-------+ +-------+ | 3 | d | 3 | n |
| NULL | NULL | 4 | p |
+---------------------------+
*/
FROM A RIGHT OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 2 | b | k |
| 2 | b | | 3 | m | | 3 | c | m |
| 3 | c | | 3 | n | | 3 | c | n |
| 3 | d | | 4 | p | | 3 | d | m |
+-------+ +-------+ | 3 | d | n |
| 4 | NULL | p |
+--------------------+
*/
Example
This query performs a RIGHT JOIN
on the Roster
and TeamMascot
tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster RIGHT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| NULL | Mustangs |
*---------------------------*/
Join conditions
In a join operation, a join condition helps specify how to
combine rows in two from_items
to form a single source.
The two types of join conditions are the ON
clause and
USING
clause. You must use a join condition when you perform a
conditional join operation. You can't use a join condition when you perform a
cross join operation.
ON
clause
A combined row (the result of joining two rows) meets the ON
join condition
if join condition returns TRUE
.
FROM A JOIN B ON A.x = B.x
/*
Table A Table B Result (A.x, B.x)
+---+ +---+ +-------+
| x | * | x | = | x | x |
+---+ +---+ +-------+
| 1 | | 2 | | 2 | 2 |
| 2 | | 3 | | 3 | 3 |
| 3 | | 4 | +-------+
+---+ +---+
*/
Example
This query performs an INNER JOIN
on the
Roster
and TeamMascot
table.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------*
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
*---------------------------*/
USING
clause
The USING
clause requires a column list of one or more columns which
occur in both input tables. It performs an equality comparison on that column,
and the rows meet the join condition if the equality comparison returns TRUE
.
FROM A JOIN B USING (x)
/*
Table A Table B Result
+---+ +---+ +---+
| x | * | x | = | x |
+---+ +---+ +---+
| 1 | | 2 | | 2 |
| 2 | | 3 | | 3 |
| 3 | | 4 | +---+
+---+ +---+
*/
Example
This query performs an INNER JOIN
on the
Roster
and TeamMascot
table.
This statement returns the rows from Roster
and TeamMascot
where
Roster.SchoolID
is the same as TeamMascot.SchoolID
. The results include a
single SchoolID
column.
SELECT * FROM Roster INNER JOIN TeamMascot USING (SchoolID);
/*----------------------------------------*
| SchoolID | LastName | Mascot |
+----------------------------------------+
| 50 | Adams | Jaguars |
| 52 | Buchanan | Lakers |
| 52 | Coolidge | Lakers |
| 51 | Davis | Knights |
*----------------------------------------*/
ON
and USING
equivalency
The ON
and USING
keywords are not equivalent, but they are similar.
ON
returns multiple columns, and USING
returns one.
FROM A JOIN B ON A.x = B.x
FROM A JOIN B USING (x)
/*
Table A Table B Result ON Result USING
+---+ +---+ +-------+ +---+
| x | * | x | = | x | x | | x |
+---+ +---+ +-------+ +---+
| 1 | | 2 | | 2 | 2 | | 2 |
| 2 | | 3 | | 3 | 3 | | 3 |
| 3 | | 4 | +-------+ +---+
+---+ +---+
*/
Although ON
and USING
are not equivalent, they can return the same results
if you specify the columns you want to return.
SELECT x FROM A JOIN B USING (x);
SELECT A.x FROM A JOIN B ON A.x = B.x;
/*
Table A Table B Result
+---+ +---+ +---+
| x | * | x | = | x |
+---+ +---+ +---+
| 1 | | 2 | | 2 |
| 2 | | 3 | | 3 |
| 3 | | 4 | +---+
+---+ +---+
*/
Join operations in a sequence
The FROM
clause can contain multiple JOIN
operations in a sequence.
JOIN
s are bound from left to right. For example:
FROM A JOIN B USING (x) JOIN C USING (x)
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 = return value
You can also insert parentheses to group JOIN
s:
FROM ( (A JOIN B USING (x)) JOIN C USING (x) )
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 = return value
With parentheses, you can group JOIN
s so that they are bound in a different
order:
FROM ( A JOIN (B JOIN C USING (x)) USING (x) )
-- B JOIN C USING (x) = result_1
-- A JOIN result_1 = result_2
-- result_2 = return value
When comma cross joins are present in a query with a sequence of JOINs, they
group from left to right like other JOIN
types:
FROM A JOIN B USING (x) JOIN C USING (x), D
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 CROSS JOIN D = return value
There cannot be a RIGHT JOIN
or FULL JOIN
after a comma cross join unless it
is parenthesized:
FROM A, B RIGHT JOIN C ON TRUE // INVALID
FROM A, B FULL JOIN C ON TRUE // INVALID
FROM A, B JOIN C ON TRUE // VALID
FROM A, (B RIGHT JOIN C ON TRUE) // VALID
FROM A, (B FULL JOIN C ON TRUE) // VALID
Correlated join operation
A join operation is correlated when the right from_item
contains a
reference to at least one range variable or
column name introduced by the left from_item
.
In a correlated join operation, rows from the right from_item
are determined
by a row from the left from_item
. Consequently, RIGHT OUTER
and FULL OUTER
joins cannot be correlated because right from_item
rows cannot be determined
in the case when there is no row from the left from_item
.
All correlated join operations must reference an array in the right from_item
.
This is a conceptual example of a correlated join operation that includes a correlated subquery:
FROM A JOIN UNNEST(ARRAY(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)) AS C
- Left
from_item
:A
- Right
from_item
:UNNEST(...) AS C
- A correlated subquery:
(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)
This is another conceptual example of a correlated join operation.
array_of_IDs
is part of the left from_item
but is referenced in the
right from_item
.
FROM A JOIN UNNEST(A.array_of_IDs) AS C
The UNNEST
operator can be explicit or implicit.
These are both allowed:
FROM A JOIN UNNEST(A.array_of_IDs) AS IDs
FROM A JOIN A.array_of_IDs AS IDs
In a correlated join operation, the right from_item
is re-evaluated
against each distinct row from the left from_item
. In the following
conceptual example, the correlated join operation first
evaluates A
and B
, then A
and C
:
FROM
A
JOIN
UNNEST(ARRAY(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)) AS C
ON A.Name = C.Name
Caveats
- In a correlated
LEFT JOIN
, when the input table on the right side is empty for some row from the left side, it is as if no rows from the right side satisfied the join condition in a regularLEFT JOIN
. When there are no joining rows, a row withNULL
values for all columns on the right side is generated to join with the row from the left side. - In a correlated
CROSS JOIN
, when the input table on the right side is empty for some row from the left side, it is as if no rows from the right side satisfied the join condition in a regular correlatedINNER JOIN
. This means that the row is dropped from the results.
Examples
This is an example of a correlated join, using the Roster and PlayerStats tables:
SELECT *
FROM
Roster
JOIN
UNNEST(
ARRAY(
SELECT AS STRUCT *
FROM PlayerStats
WHERE PlayerStats.OpponentID = Roster.SchoolID
)) AS PlayerMatches
ON PlayerMatches.LastName = 'Buchanan'
/*------------+----------+----------+------------+--------------*
| LastName | SchoolID | LastName | OpponentID | PointsScored |
+------------+----------+----------+------------+--------------+
| Adams | 50 | Buchanan | 50 | 13 |
| Eisenhower | 77 | Buchanan | 77 | 0 |
*------------+----------+----------+------------+--------------*/
A common pattern for a correlated LEFT JOIN
is to have an UNNEST
operation
on the right side that references an array from some column introduced by
input on the left side. For rows where that array is empty or NULL
,
the UNNEST
operation produces no rows on the right input. In that case, a row
with a NULL
entry in each column of the right input is created to join with
the row from the left input. For example:
SELECT A.name, item, ARRAY_LENGTH(A.items) item_count_for_name
FROM
UNNEST(
[
STRUCT(
'first' AS name,
[1, 2, 3, 4] AS items),
STRUCT(
'second' AS name,
[] AS items)]) AS A
LEFT JOIN
A.items AS item;
/*--------+------+---------------------*
| name | item | item_count_for_name |
+--------+------+---------------------+
| first | 1 | 4 |
| first | 2 | 4 |
| first | 3 | 4 |
| first | 4 | 4 |
| second | NULL | 0 |
*--------+------+---------------------*/
In the case of a correlated CROSS JOIN
, when the input on the right side
is empty for some row from the left side, the final row is dropped from the
results. For example:
SELECT A.name, item
FROM
UNNEST(
[
STRUCT(
'first' AS name,
[1, 2, 3, 4] AS items),
STRUCT(
'second' AS name,
[] AS items)]) AS A
CROSS JOIN
A.items AS item;
/*-------+------*
| name | item |
+-------+------+
| first | 1 |
| first | 2 |
| first | 3 |
| first | 4 |
*-------+------*/
WHERE
clause
WHERE bool_expression
The WHERE
clause filters the results of the FROM
clause.
Only rows whose bool_expression
evaluates to TRUE
are included. Rows
whose bool_expression
evaluates to NULL
or FALSE
are
discarded.
The evaluation of a query with a WHERE
clause is typically completed in this
order:
FROM
WHERE
GROUP BY
and aggregationHAVING
DISTINCT
ORDER BY
LIMIT
Evaluation order does not always match syntax order.
The WHERE
clause only references columns available via the FROM
clause;
it cannot reference SELECT
list aliases.
Examples
This query returns returns all rows from the Roster
table
where the SchoolID
column has the value 52
:
SELECT * FROM Roster
WHERE SchoolID = 52;
The bool_expression
can contain multiple sub-conditions:
SELECT * FROM Roster
WHERE STARTS_WITH(LastName, "Mc") OR STARTS_WITH(LastName, "Mac");
Expressions in an INNER JOIN
have an equivalent expression in the
WHERE
clause. For example, a query using INNER
JOIN
and ON
has an
equivalent expression using CROSS JOIN
and WHERE
. For example,
the following two queries are equivalent:
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster INNER JOIN TeamMascot
ON Roster.SchoolID = TeamMascot.SchoolID;
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster CROSS JOIN TeamMascot
WHERE Roster.SchoolID = TeamMascot.SchoolID;
GROUP BY
clause
GROUP BY expression [, ...]
The GROUP BY
clause groups together rows in a table with non-distinct values
for the expression
in the GROUP BY
clause. For multiple rows in the
source table with non-distinct values for expression
, the
GROUP BY
clause produces a single combined row. GROUP BY
is commonly used
when aggregate functions are present in the SELECT
list, or to eliminate
redundancy in the output. The data type of expression
must be
groupable.
Example:
SELECT SUM(PointsScored), LastName
FROM PlayerStats
GROUP BY LastName;
The GROUP BY
clause can refer to expression names in the SELECT
list. The
GROUP BY
clause also allows ordinal references to expressions in the SELECT
list using integer values. 1
refers to the first expression in the
SELECT
list, 2
the second, and so forth. The expression list can combine
ordinals and expression names.
Example:
SELECT SUM(PointsScored), LastName, FirstName
FROM PlayerStats
GROUP BY LastName, FirstName;
The query above is equivalent to:
SELECT SUM(PointsScored), LastName, FirstName
FROM PlayerStats
GROUP BY 2, FirstName;
GROUP BY
clauses may also refer to aliases. If a query contains aliases in
the SELECT
clause, those aliases override names in the corresponding FROM
clause.
Example:
SELECT SUM(PointsScored), LastName as last_name
FROM PlayerStats
GROUP BY last_name;
HAVING
clause
HAVING bool_expression
The HAVING
clause filters the results produced by GROUP BY
or
aggregation. GROUP BY
or aggregation must be present in the query. If
aggregation is present, the HAVING
clause is evaluated once for every
aggregated row in the result set.
Only rows whose bool_expression
evaluates to TRUE
are included. Rows
whose bool_expression
evaluates to NULL
or FALSE
are
discarded.
The evaluation of a query with a HAVING
clause is typically completed in this
order:
FROM
WHERE
GROUP BY
and aggregationHAVING
DISTINCT
ORDER BY
LIMIT
Evaluation order does not always match syntax order.
The HAVING
clause references columns available via the FROM
clause, as
well as SELECT
list aliases. Expressions referenced in the HAVING
clause
must either appear in the GROUP BY
clause or they must be the result of an
aggregate function:
SELECT LastName
FROM Roster
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
If a query contains aliases in the SELECT
clause, those aliases override names
in a FROM
clause.
SELECT LastName, SUM(PointsScored) AS ps
FROM Roster
GROUP BY LastName
HAVING ps > 0;
Mandatory aggregation
Aggregation does not have to be present in the HAVING
clause itself, but
aggregation must be present in at least one of the following forms:
Aggregation function in the SELECT
list.
SELECT LastName, SUM(PointsScored) AS total
FROM PlayerStats
GROUP BY LastName
HAVING total > 15;
Aggregation function in the HAVING
clause.
SELECT LastName
FROM PlayerStats
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
Aggregation in both the SELECT
list and HAVING
clause.
When aggregation functions are present in both the SELECT
list and HAVING
clause, the aggregation functions and the columns they reference do not need
to be the same. In the example below, the two aggregation functions,
COUNT()
and SUM()
, are different and also use different columns.
SELECT LastName, COUNT(*)
FROM PlayerStats
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
ORDER BY
clause
ORDER BY expression [COLLATE collation_specification] [{ ASC | DESC }] [, ...] collation_specification: language_tag[:collation_attribute]
The ORDER BY
clause specifies a column or expression as the sort criterion for
the result set. If an ORDER BY clause is not present, the order of the results
of a query is not defined. Column aliases from a FROM
clause or SELECT
list
are allowed. If a query contains aliases in the SELECT
clause, those aliases
override names in the corresponding FROM
clause. The data type of
expression
must be orderable.
Optional Clauses
COLLATE
: You can use theCOLLATE
clause to refine how data is ordered by anORDER BY
clause. Collation refers to a set of rules that determine how strings are compared according to the conventions and standards of a particular written language, region, or country. These rules might define the correct character sequence, with options for specifying case-insensitivity. You can useCOLLATE
only on columns of typeSTRING
.collation_specification
represents the collation specification for theCOLLATE
clause. The collation specification can be a string literal or a query parameter. To learn more see collation specification details.ASC | DESC
: Sort the results in ascending or descending order ofexpression
values.ASC
is the default value.
Examples
Use the default sort order (ascending).
SELECT x, y
FROM (SELECT 1 AS x, true AS y UNION ALL
SELECT 9, true)
ORDER BY x;
/*------+-------*
| x | y |
+------+-------+
| 1 | true |
| 9 | true |
*------+-------*/
Use descending sort order.
SELECT x, y
FROM (SELECT 1 AS x, true AS y UNION ALL
SELECT 9, true)
ORDER BY x DESC;
/*------+-------*
| x | y |
+------+-------+
| 9 | true |
| 1 | true |
*------+-------*/
It is possible to order by multiple columns. In the example below, the result
set is ordered first by SchoolID
and then by LastName
:
SELECT LastName, PointsScored, OpponentID
FROM PlayerStats
ORDER BY SchoolID, LastName;
When used in conjunction with
set operators,
the ORDER BY
clause applies to the result set of the entire query; it does not
apply only to the closest SELECT
statement. For this reason, it can be helpful
(though it is not required) to use parentheses to show the scope of the ORDER
BY
.
This query without parentheses:
SELECT * FROM Roster
UNION ALL
SELECT * FROM TeamMascot
ORDER BY SchoolID;
is equivalent to this query with parentheses:
( SELECT * FROM Roster
UNION ALL
SELECT * FROM TeamMascot )
ORDER BY SchoolID;
but is not equivalent to this query, where the ORDER BY
clause applies only to
the second SELECT
statement:
SELECT * FROM Roster
UNION ALL
( SELECT * FROM TeamMascot
ORDER BY SchoolID );
You can also use integer literals as column references in ORDER BY
clauses. An
integer literal becomes an ordinal (for example, counting starts at 1) into
the SELECT
list.
Example - the following two queries are equivalent:
SELECT SUM(PointsScored), LastName
FROM PlayerStats
GROUP BY LastName
ORDER BY LastName;
SELECT SUM(PointsScored), LastName
FROM PlayerStats
GROUP BY 2
ORDER BY 2;
Collate results using English - Canada:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "en_CA"
Collate results using a parameter:
#@collate_param = "arg_EG"
SELECT Place
FROM Locations
ORDER BY Place COLLATE @collate_param
Using multiple COLLATE
clauses in a statement:
SELECT APlace, BPlace, CPlace
FROM Locations
ORDER BY APlace COLLATE "en_US" ASC,
BPlace COLLATE "ar_EG" DESC,
CPlace COLLATE "en" DESC
Case insensitive collation:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "en_US:ci"
Default Unicode case-insensitive collation:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "und:ci"
Set operators
set_operation: query_expr set_operator query_expr set_operator: { UNION { ALL | DISTINCT } | INTERSECT { ALL | DISTINCT } | EXCEPT { ALL | DISTINCT } }
Set operators combine results from two or
more input queries into a single result set. If you specify ALL
, then all rows are
retained. If DISTINCT
is specified, duplicate rows are
discarded.
If a given row R appears exactly m times in the first input query and n times in the second input query (m >= 0, n >= 0):
- For
UNION ALL
, R appears exactly m + n times in the result. - For
INTERSECT ALL
, R will appear exactlyMIN(m, n)
in the result. - For
EXCEPT ALL
, R appears exactlyMAX(m - n, 0)
in the result. - For
UNION DISTINCT
, theDISTINCT
is computed after theUNION
is computed, so R appears exactly one time. - For
INTERSECT DISTINCT
, theDISTINCT
is computed after the result above is computed. - For
EXCEPT DISTINCT
, row R appears once in the output if m > 0 and n = 0. - If there are more than two input queries, the above operations generalize and the output is the same as if the input queries were combined incrementally from left to right.
The following rules apply:
- For set operations other than
UNION ALL
, all column types must support equality comparison. - The input queries on each side of the operator must return the same number of columns.
- The operators pair the columns returned by each input query
according to the columns' positions in their respective
SELECT
lists. That is, the first column in the first input query is paired with the first column in the second input query. - The results of the set operation always uses the column names from the first input query.
- The results of the set operation always uses the supertypes of input types in corresponding columns, so paired columns must also have either the same data type or a common supertype.
You must use parentheses to separate different set operations.
- Set operations such as
UNION ALL
andUNION DISTINCT
are different.
- Set operations such as
If the statement only repeats the same set operation, parentheses are not necessary.
Examples:
-- This works
query1 UNION ALL query2 UNION ALL query3
-- This works
query1 UNION ALL (query2 UNION DISTINCT query3)
-- This is invalid
query1 UNION ALL query2 UNION DISTINCT query3
-- This is invalid
query1 UNION ALL query2 INTERSECT ALL query3;
UNION
The UNION
operator combines the results of two or more input queries by
pairing columns from the results of each query and vertically concatenating
them.
INTERSECT
The INTERSECT
operator returns rows that are found in the results of both
the left and right input queries. Unlike EXCEPT
, the positioning of the input
queries (to the left versus right of the INTERSECT
operator) does not matter.
EXCEPT
The EXCEPT
operator returns rows from the left input query that are
not present in the right input query.
Example:
SELECT * FROM UNNEST(ARRAY<int64>[1, 2, 3]) AS number
EXCEPT DISTINCT SELECT 1;
/*--------*
| number |
+--------+
| 2 |
| 3 |
*--------*/
LIMIT
and OFFSET
clauses
LIMIT count [ OFFSET skip_rows ]
LIMIT
specifies a non-negative count
of type INT64,
and no more than count
rows will be returned. LIMIT
0
returns 0 rows.
If there is a set operation, LIMIT
is applied after the set operation is
evaluated.
OFFSET
specifies a non-negative number of rows to skip before applying
LIMIT
. skip_rows
is of type INT64.
These clauses accept only literal or parameter values. The rows that are
returned by LIMIT
and OFFSET
are unspecified unless these
operators are used after ORDER BY
.
Examples:
SELECT *
FROM UNNEST(ARRAY<STRING>['a', 'b', 'c', 'd', 'e']) AS letter
ORDER BY letter ASC LIMIT 2
/*---------*
| letter |
+---------+
| a |
| b |
*---------*/
SELECT *
FROM UNNEST(ARRAY<STRING>['a', 'b', 'c', 'd', 'e']) AS letter
ORDER BY letter ASC LIMIT 3 OFFSET 1
/*---------*
| letter |
+---------+
| b |
| c |
| d |
*---------*/
WITH
clause
WITH cte[, ...]
A WITH
clause contains one or more common table expressions (CTEs).
A CTE acts like a temporary table that you can reference within a single
query expression. Each CTE binds the results of a subquery
to a table name, which can be used elsewhere in the same query expression,
but rules apply.
CTEs
cte: cte_name AS ( query_expr )
A common table expression (CTE) contains a subquery and a name associated with the CTE.
- A CTE cannot reference itself.
- A CTE can be referenced by the query expression that
contains the
WITH
clause, but rules apply.
Examples
In this example, a WITH
clause defines two CTEs that
are referenced in the related set operation, where one CTE is referenced by
each of the set operation's input query expressions:
WITH subQ1 AS (SELECT SchoolID FROM Roster),
subQ2 AS (SELECT OpponentID FROM PlayerStats)
SELECT * FROM subQ1
UNION ALL
SELECT * FROM subQ2
WITH
is not supported in a subquery. This returns an error:
SELECT account
FROM (
WITH result AS (SELECT * FROM NPCs)
SELECT *
FROM result)
WITH
clause is not supported in DML statements.
Temporary tables defined by the WITH
clause are stored in memory.
Spanner dynamically allocates memory for all temporary tables
created by a query. If the available resources are not sufficient then the query
will fail.
CTE rules and constraints
Common table expressions (CTEs) can be referenced inside the query expression
that contains the WITH
clause.
Here are some general rules and constraints to consider when working with CTEs:
- Each CTE in the same
WITH
clause must have a unique name. - A CTE defined in a
WITH
clause is only visible to other CTEs in the sameWITH
clause that were defined after it. - A local CTE overrides an outer CTE or table with the same name.
- A CTE on a subquery may not reference correlated columns from the outer query.
CTE visibility
References between common table expressions (CTEs) in the WITH
clause can go
backward but not forward.
This is what happens when you have two CTEs that reference
themselves or each other in a WITH
clause. Assume that A
is the first CTE and B
is the second CTE in the clause:
- A references A = Invalid
- A references B = Invalid
- B references A = Valid
- A references B references A = Invalid (cycles are not allowed)
This produces an error. A
cannot reference itself because self-references are
not supported:
WITH
A AS (SELECT 1 AS n UNION ALL (SELECT n + 1 FROM A WHERE n < 3))
SELECT * FROM A
-- Error
This produces an error. A
cannot reference B
because references between
CTEs can go backwards but not forwards:
WITH
A AS (SELECT * FROM B),
B AS (SELECT 1 AS n)
SELECT * FROM B
-- Error
B
can reference A
because references between CTEs can go backwards:
WITH
A AS (SELECT 1 AS n),
B AS (SELECT * FROM A)
SELECT * FROM B
/*---*
| n |
+---+
| 1 |
*---*/
This produces an error. A
and B
reference each other, which creates a
cycle:
WITH
A AS (SELECT * FROM B),
B AS (SELECT * FROM A)
SELECT * FROM B
-- Error
Using aliases
An alias is a temporary name given to a table, column, or expression present in
a query. You can introduce explicit aliases in the SELECT
list or FROM
clause, or GoogleSQL will infer an implicit alias for some expressions.
Expressions with neither an explicit nor implicit alias are anonymous and the
query cannot reference them by name.
Explicit aliases
You can introduce explicit aliases in either the FROM
clause or the SELECT
list.
In a FROM
clause, you can introduce explicit aliases for any item, including
tables, arrays, subqueries, and UNNEST
clauses, using [AS] alias
. The AS
keyword is optional.
Example:
SELECT s.FirstName, s2.SongName
FROM Singers AS s, (SELECT * FROM Songs) AS s2;
You can introduce explicit aliases for any expression in the SELECT
list using
[AS] alias
. The AS
keyword is optional.
Example:
SELECT s.FirstName AS name, LOWER(s.FirstName) AS lname
FROM Singers s;
Implicit aliases
In the SELECT
list, if there is an expression that does not have an explicit
alias, GoogleSQL assigns an implicit alias according to the following
rules. There can be multiple columns with the same alias in the SELECT
list.
- For identifiers, the alias is the identifier. For example,
SELECT abc
impliesAS abc
. - For path expressions, the alias is the last identifier in the path. For
example,
SELECT abc.def.ghi
impliesAS ghi
. - For field access using the "dot" member field access operator, the alias is
the field name. For example,
SELECT (struct_function()).fname
impliesAS fname
.
In all other cases, there is no implicit alias, so the column is anonymous and cannot be referenced by name. The data from that column will still be returned and the displayed query results may have a generated label for that column, but the label cannot be used like an alias.
In a FROM
clause, from_item
s are not required to have an alias. The
following rules apply:
-
If there is an expression that does not have an explicit alias,
GoogleSQL assigns an implicit alias in these cases:
-
For identifiers, the alias is the identifier. For example,
FROM abc
impliesAS abc
. -
For path expressions, the alias is the last identifier in the path. For
example,
FROM abc.def.ghi
impliesAS ghi
-
The column produced using
WITH OFFSET
has the implicit aliasoffset
.
-
For identifiers, the alias is the identifier. For example,
- Table subqueries do not have implicit aliases.
-
FROM UNNEST(x)
does not have an implicit alias.
Alias visibility
After you introduce an explicit alias in a query, there are restrictions on where else in the query you can reference that alias. These restrictions on alias visibility are the result of GoogleSQL name scoping rules.
Visibility in the FROM
clause
GoogleSQL processes aliases in a FROM
clause from left to right,
and aliases are visible only to subsequent path expressions in a FROM
clause.
Example:
Assume the Singers
table had a Concerts
column of ARRAY
type.
SELECT FirstName
FROM Singers AS s, s.Concerts;
Invalid:
SELECT FirstName
FROM s.Concerts, Singers AS s; // INVALID.
FROM
clause aliases are not visible to subqueries in the same FROM
clause. Subqueries in a FROM
clause cannot contain correlated references to
other tables in the same FROM
clause.
Invalid:
SELECT FirstName
FROM Singers AS s, (SELECT (2020 - ReleaseDate) FROM s) // INVALID.
You can use any column name from a table in the FROM
as an alias anywhere in
the query, with or without qualification with the table name.
Example:
SELECT FirstName, s.ReleaseDate
FROM Singers s WHERE ReleaseDate = 1975;
If the FROM
clause contains an explicit alias, you must use the explicit alias
instead of the implicit alias for the remainder of the query (see
Implicit Aliases). A table alias is useful for brevity or
to eliminate ambiguity in cases such as self-joins, where the same table is
scanned multiple times during query processing.
Example:
SELECT * FROM Singers as s, Songs as s2
ORDER BY s.LastName
Invalid — ORDER BY
does not use the table alias:
SELECT * FROM Singers as s, Songs as s2
ORDER BY Singers.LastName; // INVALID.
Visibility in the SELECT
list
Aliases in the SELECT
list are visible only to the following clauses:
GROUP BY
clauseORDER BY
clauseHAVING
clause
Example:
SELECT LastName AS last, SingerID
FROM Singers
ORDER BY last;
Visibility in the GROUP BY
, ORDER BY
, and HAVING
clauses
These three clauses, GROUP BY
, ORDER BY
, and HAVING
, can refer to only the
following values:
- Tables in the
FROM
clause and any of their columns. - Aliases from the
SELECT
list.
GROUP BY
and ORDER BY
can also refer to a third group:
- Integer literals, which refer to items in the
SELECT
list. The integer1
refers to the first item in theSELECT
list,2
refers to the second item, etc.
Example:
SELECT SingerID AS sid, COUNT(Songid) AS s2id
FROM Songs
GROUP BY 1
ORDER BY 2 DESC;
The previous query is equivalent to:
SELECT SingerID AS sid, COUNT(Songid) AS s2id
FROM Songs
GROUP BY sid
ORDER BY s2id DESC;
Duplicate aliases
A SELECT
list or subquery containing multiple explicit or implicit aliases
of the same name is allowed, as long as the alias name is not referenced
elsewhere in the query, since the reference would be
ambiguous.
Example:
SELECT 1 AS a, 2 AS a;
/*---+---*
| a | a |
+---+---+
| 1 | 2 |
*---+---*/
Ambiguous aliases
GoogleSQL provides an error if accessing a name is ambiguous, meaning it can resolve to more than one unique object in the query or in a table schema, including the schema of a destination table.
The following query contains column names that conflict between tables, since
both Singers
and Songs
have a column named SingerID
:
SELECT SingerID
FROM Singers, Songs;
The following query contains aliases that are ambiguous in the GROUP BY
clause
because they are duplicated in the SELECT
list:
SELECT FirstName AS name, LastName AS name,
FROM Singers
GROUP BY name;
The following query contains aliases that are ambiguous in the SELECT
list and
FROM
clause because they share a column and field with same name.
- Assume the
Person
table has three columns:FirstName
,LastName
, andPrimaryContact
. - Assume the
PrimaryContact
column represents a struct with these fields:FirstName
andLastName
.
The alias P
is ambiguous and will produce an error because P.FirstName
in
the GROUP BY
clause could refer to either Person.FirstName
or
Person.PrimaryContact.FirstName
.
SELECT FirstName, LastName, PrimaryContact AS P
FROM Person AS P
GROUP BY P.FirstName;
A name is not ambiguous in GROUP BY
, ORDER BY
or HAVING
if it is both
a column name and a SELECT
list alias, as long as the name resolves to the
same underlying object. In the following example, The alias BirthYear
is not
ambiguous because it resolves to the same underlying column,
Singers.BirthYear
.
SELECT LastName, BirthYear AS BirthYear
FROM Singers
GROUP BY BirthYear;
Range variables
In GoogleSQL, a range variable is a table expression alias in the
FROM
clause. Sometimes a range variable is known as a table alias
. A
range variable lets you reference rows being scanned from a table expression.
A table expression represents an item in the FROM
clause that returns a table.
Common items that this expression can represent include
tables, value tables, subqueries,
joins, and parenthesized joins.
In general, a range variable provides a reference to the rows of a table
expression. A range variable can be used to qualify a column reference and
unambiguously identify the related table, for example range_variable.column_1
.
When referencing a range variable on its own without a specified column suffix, the result of a table expression is the row type of the related table. Value tables have explicit row types, so for range variables related to value tables, the result type is the value table's row type. Other tables do not have explicit row types, and for those tables, the range variable type is a dynamically defined struct that includes all of the columns in the table.
Examples
In these examples, the WITH
clause is used to emulate a temporary table
called Grid
. This table has columns x
and y
. A range variable called
Coordinate
refers to the current row as the table is scanned. Coordinate
can be used to access the entire row or columns in the row.
The following example selects column x
from range variable Coordinate
,
which in effect selects column x
from table Grid
.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate.x FROM Grid AS Coordinate;
/*---*
| x |
+---+
| 1 |
*---*/
The following example selects all columns from range variable Coordinate
,
which in effect selects all columns from table Grid
.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate.* FROM Grid AS Coordinate;
/*---+---*
| x | y |
+---+---+
| 1 | 2 |
*---+---*/
The following example selects the range variable Coordinate
, which is a
reference to rows in table Grid
. Since Grid
is not a value table,
the result type of Coordinate
is a struct that contains all the columns
from Grid
.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate FROM Grid AS Coordinate;
/*--------------*
| Coordinate |
+--------------+
| {x: 1, y: 2} |
*--------------*/
Value tables
In addition to standard SQL tables, GoogleSQL supports value tables. In a value table, rather than having rows made up of a list of columns, each row is a single value of a specific type, and there are no column names.
In the following example, a value table for a STRUCT
is produced with the
SELECT AS VALUE
statement:
SELECT * FROM (SELECT AS VALUE STRUCT(123 AS a, FALSE AS b))
/*-----+-------*
| a | b |
+-----+-------+
| 123 | FALSE |
*-----+-------*/
Value tables are often but not exclusively used with compound data types. A value table can consist of any supported GoogleSQL data type, although value tables consisting of scalar types occur less frequently than structs.
Return query results as a value table
Spanner does not support value tables as base tables in database schemas and does not support returning value tables in query results. As a consequence, value table producing queries are not supported as top-level queries.
Value tables can also occur as the output of the UNNEST
operator or a subquery. The WITH
clause
introduces a value table if the subquery used produces a value table.
In contexts where a query with exactly one column is expected, a value table query can be used instead. For example, scalar and array subqueries normally require a single-column query, but in GoogleSQL, they also allow using a value table query.
Create a table with a value table
Value tables are not supported as top-level queries in the
CREATE TABLE
statement, but they can be included in subqueries and
UNNEST
operations. For example, you can create a table from a
value table with this query:
CREATE TABLE Reviews AS
SELECT * FROM (SELECT AS VALUE STRUCT(5 AS star_rating, FALSE AS up_down_rating))
Column Name | Data Type |
---|---|
star_rating | INT64 |
up_down_rating | BOOL |
Use a set operation on a value table
In SET
operations like UNION ALL
you can combine tables with value tables,
provided that the table consists of a single column with a type that matches the
value table's type. The result of these operations is always a value table.
Appendix A: examples with sample data
These examples include statements which perform queries on the
Roster
and TeamMascot
,
and PlayerStats
tables.
GROUP BY
clause
Example:
SELECT LastName, SUM(PointsScored)
FROM PlayerStats
GROUP BY LastName;
LastName | SUM |
---|---|
Adams | 7 |
Buchanan | 13 |
Coolidge | 1 |
UNION
The UNION
operator combines the result sets of two or more SELECT
statements
by pairing columns from the result set of each SELECT
statement and vertically
concatenating them.
Example:
SELECT Mascot AS X, SchoolID AS Y
FROM TeamMascot
UNION ALL
SELECT LastName, PointsScored
FROM PlayerStats;
Results:
X | Y |
---|---|
Jaguars | 50 |
Knights | 51 |
Lakers | 52 |
Mustangs | 53 |
Adams | 3 |
Buchanan | 0 |
Coolidge | 1 |
Adams | 4 |
Buchanan | 13 |
INTERSECT
This query returns the last names that are present in both Roster and PlayerStats.
SELECT LastName
FROM Roster
INTERSECT ALL
SELECT LastName
FROM PlayerStats;
Results:
LastName |
---|
Adams |
Coolidge |
Buchanan |
EXCEPT
The query below returns last names in Roster that are not present in PlayerStats.
SELECT LastName
FROM Roster
EXCEPT DISTINCT
SELECT LastName
FROM PlayerStats;
Results:
LastName |
---|
Eisenhower |
Davis |
Reversing the order of the SELECT
statements will return last names in
PlayerStats that are not present in Roster:
SELECT LastName
FROM PlayerStats
EXCEPT DISTINCT
SELECT LastName
FROM Roster;
Results:
(empty)