Working with Arrays

In Cloud Spanner SQL, an array is an ordered list consisting of zero or more values of the same data type. You can construct arrays of simple data types, such as INT64, and complex data types, such as STRUCTs. The current exception to this is the ARRAY data type: arrays of arrays are not supported.

With Cloud Spanner SQL, you can construct array literals, build arrays from subqueries using the ARRAY function, and aggregate values into an array using the ARRAY_AGG function.

Constructing arrays

Using array literals

You can build an array literal in Cloud Spanner SQL using brackets ([ and ]). Each element in an array is separated by a comma.

SELECT [1, 2, 3] as numbers;

SELECT ["apple", "pear", "orange"] as fruit;

SELECT [true, false, true] as booleans;

You can also create arrays from any expressions that have compatible types. For example:

SELECT [a, b, c]
FROM
  (SELECT 5 AS a,
          37 AS b,
          406 AS c);

SELECT [a, b, c]
FROM
  (SELECT CAST(5 AS INT64) AS a,
          CAST(37 AS FLOAT64) AS b,
          406 AS c);

Notice that the second example contains three expressions: one that returns an INT64, one that returns a FLOAT64, and one that declares a literal. This expression works because all three expressions share FLOAT64 as a supertype.

To declare a specific data type for an array, use angle brackets (< and >). For example:

SELECT ARRAY<FLOAT64>[1, 2, 3] as floats;

Arrays of most data types, such as INT64 or STRING, don't require that you declare them first.

SELECT [1, 2, 3] as numbers;

You can write an empty array of a specific type using ARRAY<type>[]. You can also write an untyped empty array using [], in which case Cloud Spanner SQL attempts to infer the array type from the surrounding context. If Cloud Spanner SQL cannot infer a type, the default type ARRAY<INT64> is used.

Accessing Array Elements

Consider the following table, sequences:

CREATE TABLE sequences (
  id INT64 NOT NULL,
  some_numbers ARRAY<INT64> NOT NULL
) PRIMARY KEY(id);

Assume the table is populated with the following rows:

+----+---------------------+
| id | some_numbers        |
+----+---------------------+
| 1  | [0, 1, 1, 2, 3, 5]  |
| 2  | [2, 4, 8, 16, 32]   |
| 3  | [5, 10]             |
+----+---------------------+

This table contains the column some_numbers of the ARRAY data type. To access elements from the arrays in this column, you must specify which type of indexing you want to use: either OFFSET, for zero-based indexes, or ORDINAL, for one-based indexes.

This query shows how to use OFFSET() and ORDINAL():

SELECT some_numbers,
       some_numbers[OFFSET(1)] AS offset_1,
       some_numbers[ORDINAL(1)] AS ordinal_1
FROM sequences;

+---------------+----------+-----------+
| some_numbers  | offset_1 | ordinal_1 |
+---------------+----------+-----------+
| [0,1,1,2,3,5] |        1 |         0 |
+---------------+----------+-----------+
| [2,4,8,16,32] |        4 |         2 |
+---------------+----------+-----------+
| [5,10]        |       10 |         5 |
+---------------+----------+-----------+

Finding Lengths

The ARRAY_LENGTH() function returns the length of an array.

Here's an example query, assuming the same definition of the sequences table as above, with the same sample rows:

SELECT some_numbers,
       ARRAY_LENGTH(some_numbers) AS len
FROM sequences;

+---------------+------+
| some_numbers  | len  |
+---------------+------+
| [0,1,1,2,3,5] |    6 |
+---------------+------+
| [2,4,8,16,32] |    5 |
+---------------+------+
| [5,10]        |    2 |
+---------------+------+

Flattening arrays

To convert an ARRAY into a set of rows, also known as "flattening," use the UNNEST operator. UNNEST takes an ARRAY and returns a table with a single row for each element in the ARRAY.

Because UNNEST destroys the order of the ARRAY elements, you may wish to restore order to the table. To do so, use the optional WITH OFFSET clause to return an additional column with the offset for each array element, then use the ORDER BY clause to order the rows by their offset.

Example

SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
  AS element
WITH OFFSET AS offset
ORDER BY offset;

+----------+--------+
| element  | offset |
+----------+--------+
| foo      | 0      |
| bar      | 1      |
| baz      | 2      |
| qux      | 3      |
| corge    | 4      |
| garply   | 5      |
| waldo    | 6      |
| fred     | 7      |
+----------+--------+

To flatten an entire column of ARRAYs while preserving the values of the other columns in each row, use a CROSS JOIN to join the table containing the ARRAY column to the UNNEST output of that ARRAY column.

This is a correlated cross join: the UNNEST operator references the column of ARRAYs from each row in the source table, which appears previously in the FROM clause. For each row N in the source table, UNNEST flattens the ARRAY from row N into a set of rows containing the ARRAY elements, and then the CROSS JOIN joins this new set of rows with the single row N from the source table.

Example

The following example uses UNNEST to return a row for each element in the array column. Because of the CROSS JOIN, the id column contains the id values for the row in sequences that contains each number.

SELECT id, flattened_numbers FROM sequences
CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers;

+------+-------------------+
| id   | flattened_numbers |
+------+-------------------+
|    1 |                 0 |
|    1 |                 1 |
|    1 |                 1 |
|    1 |                 2 |
|    1 |                 3 |
|    1 |                 5 |
|    2 |                 2 |
|    2 |                 4 |
|    2 |                 8 |
|    2 |                16 |
|    2 |                32 |
|    3 |                 5 |
|    3 |                10 |
+------+-------------------+

Querying Nested Arrays

If a table contains an ARRAY of STRUCTs, you can flatten the ARRAY to query the fields of the STRUCT. You can also flatten ARRAY type fields of STRUCT values.

Querying STRUCT elements in an ARRAY

The following example uses UNNEST with CROSS JOIN to flatten an ARRAY of STRUCTs.

SELECT race,
       participant.name,
       participant.splits
FROM
  (SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
     STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
     STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
     STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
     STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
     STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
     STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
     STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
     AS participants
  ) AS r
CROSS JOIN UNNEST(r.participants) AS participant;

+------+-------------+-----------------------+
| race | name        | splits                |
+------+-------------+-----------------------+
| 800M | Rudisha     | [23.4,26.3,26.4,26.1] |
+------+-------------+-----------------------+
| 800M | Makhloufi   | [24.5,25.4,26.6,26.1] |
+------+-------------+-----------------------+
| 800M | Murphy      | [23.9,26,27,26]       |
+------+-------------+-----------------------+
| 800M | Bosse       | [23.6,26.2,26.5,27.1] |
+------+-------------+-----------------------+
| 800M | Rotich      | [24.7,25.6,26.9,26.4] |
+------+-------------+-----------------------+
| 800M | Lewandowski | [25,25.7,26.3,27.2]   |
+------+-------------+-----------------------+
| 800M | Kipketer    | [23.2,26.1,27.3,29.4] |
+------+-------------+-----------------------+
| 800M | Berian      | [23.7,26.1,27,29.3]   |
+------+-------------+-----------------------+

You can find specific information from repeated fields. For example, the following query returns the fastest racer in an 800M race.

This example does not involve flattening an array, but does represent a common way to get information from a repeated field.

Example

SELECT race,
       (SELECT name
        FROM UNNEST(participants)
        ORDER BY (
          SELECT SUM(duration)
          FROM UNNEST(splits) AS duration) ASC
          LIMIT 1) AS fastest_racer
FROM
  (SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
     STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
     STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
     STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
     STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
     STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
     STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
     STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
     AS participants
  ) AS r;

+------+---------------+
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha       |
+------+---------------+

Querying ARRAY-type fields in a STRUCT

You can also get information from nested repeated fields. For example, the following statement returns the runner who had the fastest lap in an 800M race.

SELECT race,
       (SELECT name
        FROM UNNEST(participants),
          UNNEST(splits) AS duration
        ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM
  (SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
     STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
     STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
     STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
     STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
     STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
     STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
     STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
     AS participants
  ) AS r;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

Notice that the preceding query uses the comma operator (,) to perform an implicit CROSS JOIN. It is equivalent to the following example, which uses an explicit CROSS JOIN.

SELECT race,
       (SELECT name
        FROM UNNEST(participants)
        CROSS JOIN UNNEST(splits) AS duration
        ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM
  (SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
     STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
     STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
     STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
     STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
     STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
     STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
     STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
     AS participants
  ) AS r;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

Note that flattening arrays with a CROSS JOIN excludes rows that have empty or NULL arrays. If you want to include these rows, use a LEFT JOIN.

SELECT
  name, sum(duration) as duration
FROM
  (SELECT "800M" AS race,
    [STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS splits),
     STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
     STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
     STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
     STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
     STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
     STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
     STRUCT("Nathan" as name, ARRAY<FLOAT64>[] as splits),
     STRUCT("David" as name, NULL as splits)]
     AS participants) AS races,
  races.participants LEFT JOIN participants.splits duration
GROUP BY name;

+-------------+--------------------+
| name        | duration           |
+-------------+--------------------+
| Murphy      | 102.9              |
| Rudisha     | 102.19999999999999 |
| David       | NULL               |
| Rotich      | 103.6              |
| Makhloufi   | 102.6              |
| Bosse       | 103.4              |
| Kipketer    | 106                |
| Nathan      | NULL               |
| Lewandowski | 104.2              |
+-------------+--------------------+

Creating Arrays From Subqueries

A common task when working with arrays is turning a subquery result into an array. In Cloud Spanner SQL, you can accomplish this using the

ARRAY() function.

For example, consider the following operation on the sequences table:

SELECT some_numbers,
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x) AS doubled
FROM sequences;

+---------------+----------------+
| some_numbers  | doubled        |
+---------------+----------------+
| [0,1,1,2,3,5] | [0,2,2,4,6,10] |
+---------------+----------------+
| [2,4,8,16,32] | [4,8,16,32,64] |
+---------------+----------------+
| [5,10]        | [10,20]        |
+---------------+----------------+

This example starts with a table named sequences. This table contains a column, some_numbers, of type ARRAY<INT64>.

The query itself contains a subquery. This subquery selects each row in the some_numbers column and uses UNNEST to return the array as a set of rows. Next, it multiplies each value by two, and then recombines the rows back into an array using the ARRAY() operator.

Filtering Arrays

The following example uses a WHERE clause in the ARRAY() operator's subquery to filter the returned rows.

Note: In the following examples, the resulting rows are not ordered.

SELECT
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x
        WHERE x < 5) AS doubled_less_than_five
FROM sequences;

+------------------------+
| doubled_less_than_five |
+------------------------+
| [0,2,2,4,6]            |
+------------------------+
| [4,8]                  |
+------------------------+
| []                     |
+------------------------+

Notice that the third row contains an empty array, because the elements in the corresponding original row ([5, 10]) did not meet the filter requirement of x < 5.

You can also filter arrays by using SELECT DISTINCT to return only unique elements within an array.

SELECT ARRAY(SELECT DISTINCT x
             FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM sequences
WHERE id = 1;

+----------------+
| unique_numbers |
+----------------+
| [0,1,2,3,5]    |
+----------------+

You can also filter rows of arrays by using the IN keyword. This keyword filters rows containing arrays by determining if a specific value matches an element in the array.

SELECT
   ARRAY(SELECT x
         FROM UNNEST(some_numbers) AS x
         WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM sequences;

+---------------+
| contains_two  |
+---------------+
| [0,1,1,2,3,5] |
+---------------+
| [2,4,8,16,32] |
+---------------+
| []            |
+---------------+

Notice again that the third row contains an empty array, because the array in the corresponding original row ([5, 10]) did not contain 2.

Scanning Arrays

To check if an array contains a specific value, use the IN operator with UNNEST. To check if an array contains a value matching a condition, use the EXISTS function with UNNEST.

Scanning for specific values

To scan an array for a specific value, use the IN operator with UNNEST.

Example

The following example returns true if the array contains the number 2.

SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;

+----------------+
| contains_value |
+----------------+
| true           |
+----------------+

To return the rows of a table where the array column contains a specific value, filter the results of IN UNNEST using the WHERE clause.

Example

The following example returns the id value for the rows where the array column contains the value 2.

SELECT id AS matching_rows
FROM (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
      UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
      UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
WHERE 2 IN UNNEST(some_numbers)
ORDER BY matching_rows;

+---------------+
| matching_rows |
+---------------+
| 1             |
| 2             |
+---------------+

Scanning for values that satisfy a condition

To scan an array for values that match a condition, use UNNEST to return a table of the elements in the array, use WHERE to filter the resulting table in a subquery, and use EXISTS to check if the filtered table contains any rows.

Example

The following example returns the id value for the rows where the array column contains values greater than 5.

SELECT id AS matching_rows
FROM (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
      UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
      UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
WHERE EXISTS (SELECT *
              FROM UNNEST(some_numbers) AS x
              WHERE x > 5);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

Scanning for STRUCT field values that satisfy a condition

To search an array of STRUCTs for a field whose value matches a condition, use UNNEST to return a table with a column for each STRUCT field, then filter non-matching rows from the table using WHERE EXISTS.

Example

The following example returns the rows where the array column contains a STRUCT whose field b has a value greater than 3.

SELECT id AS matching_rows
FROM (SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
      UNION ALL SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
      UNION ALL SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT(7 AS a, 4 AS b)]
        AS some_numbers)
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(some_numbers)
  WHERE b > 3
);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

Arrays and Aggregation

With Cloud Spanner SQL, you can aggregate values into an array using ARRAY_AGG().

Consider the following table, fruits:

CREATE TABLE fruits (
  fruit STRING(MAX),
  id INT64 NOT NULL
) PRIMARY KEY(id);

Assume the table is populated with the following data:

+----+--------------+
| id | fruit        |
+----+--------------+
| 1  | "apple"      |
| 2  | "pear"       |
| 3  | "banana"     |
+----+--------------+

This query shows how to use ARRAY_AGG():

SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM fruits;

+---------------------+
| fruit_basket        |
+---------------------+
| [apple,pear,banana] |
+---------------------+

The array returned by ARRAY_AGG() is in an arbitrary order, since the order in which the function concatenates values is not guaranteed.

You can also apply aggregate functions such as SUM() to the elements in an array. For example, the following query returns the sum of array elements for each row of the sequences table.

SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;

+---------------+------+
| some_numbers  | sums |
+---------------+------+
| [0,1,1,2,3,5] |   12 |
+---------------+------+
| [2,4,8,16,32] |   62 |
+---------------+------+
| [5,10]        |   15 |
+---------------+------+

Building arrays of arrays

Cloud Spanner SQL does not support building arrays of arrays directly. Instead, you must create an array of structs, with each struct containing a field of type ARRAY. To illustrate this, consider the following points table:

CREATE TABLE points (
  point ARRAY<INT64>,
  id INT64 NOT NULL
) PRIMARY KEY(id);

Assume the table is populated with the following rows:

+----+----------+
| id | point    |
+----+----------+
| 1  | [1, 5]   |
| 2  | [2, 8]   |
| 3  | [3, 7]   |
| 4  | [4, 1]   |
| 5  | [5, 7]   |
+----+----------+

Now, let's say you wanted to create an array consisting of each point in the points table. To accomplish this, wrap the array returned from each row in a STRUCT, as shown below.

SELECT ARRAY(
  SELECT STRUCT(point)
  FROM points)
  AS coordinates;

+--------------+
| coordinates  |
+--------------+
| point: [1,5] |
| point: [2,8] |
| point: [3,7] |
| point: [4,1] |
| point: [5,7] |
+--------------+
Was this page helpful? Let us know how we did:

Send feedback about...

Cloud Spanner Documentation