Work with arrays in GoogleSQL

In GoogleSQL for Spanner, an array is an ordered list consisting of zero or more values of the same data type. You can construct arrays of simple data types, such as INT64, and complex data types, such as STRUCTs. The current exception to this is the ARRAY data type because arrays of arrays are not supported. To learn more about the ARRAY data type, see Array type.

With GoogleSQL, you can construct array literals, build arrays from subqueries using the ARRAY function, and aggregate values into an array using the ARRAY_AGG function.

You can combine arrays using functions like ARRAY_CONCAT(), and convert arrays to strings using ARRAY_TO_STRING().

Accessing array elements

Consider the following emulated table called Sequences. This table contains the column some_numbers of the ARRAY data type.

WITH
  Sequences AS (
    SELECT [0, 1, 1, 2, 3, 5] AS some_numbers UNION ALL
    SELECT [2, 4, 8, 16, 32] UNION ALL
    SELECT [5, 10]
  )
SELECT * FROM Sequences

/*---------------------*
 | some_numbers        |
 +---------------------+
 | [0, 1, 1, 2, 3, 5]  |
 | [2, 4, 8, 16, 32]   |
 | [5, 10]             |
 *---------------------*/

To access array elements in the some_numbers column, specify which type of indexing you want to use: either OFFSET(index) for zero-based indexes, or ORDINAL(index) for one-based indexes.

For example:

SELECT
  some_numbers,
  some_numbers[OFFSET(1)] AS offset_1,
  some_numbers[ORDINAL(1)] AS ordinal_1
FROM Sequences

/*--------------------+----------+-----------*
 | some_numbers       | offset_1 | ordinal_1 |
 +--------------------+----------+-----------+
 | [0, 1, 1, 2, 3, 5] | 1        | 0         |
 | [2, 4, 8, 16, 32]  | 4        | 2         |
 | [5, 10]            | 10       | 5         |
 *--------------------+----------+-----------*/

Finding lengths

The ARRAY_LENGTH function returns the length of an array.

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       ARRAY_LENGTH(some_numbers) AS len
FROM Sequences;

/*--------------------+--------*
 | some_numbers       | len    |
 +--------------------+--------+
 | [0, 1, 1, 2, 3, 5] | 6      |
 | [2, 4, 8, 16, 32]  | 5      |
 | [5, 10]            | 2      |
 *--------------------+--------*/

Converting elements in an array to rows in a table

To convert an ARRAY into a set of rows, also known as "flattening," use the UNNEST operator. UNNEST takes an ARRAY and returns a table with a single row for each element in the ARRAY.

Because UNNEST destroys the order of the ARRAY elements, you may wish to restore order to the table. To do so, use the optional WITH OFFSET clause to return an additional column with the offset for each array element, then use the ORDER BY clause to order the rows by their offset.

Example

SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
  AS element
WITH OFFSET AS offset
ORDER BY offset;

/*----------+--------*
 | element  | offset |
 +----------+--------+
 | foo      | 0      |
 | bar      | 1      |
 | baz      | 2      |
 | qux      | 3      |
 | corge    | 4      |
 | garply   | 5      |
 | waldo    | 6      |
 | fred     | 7      |
 *----------+--------*/

To flatten an entire column of ARRAYs while preserving the values of the other columns in each row, use a correlated INNER JOIN to join the table containing the ARRAY column to the UNNEST output of that ARRAY column.

With a correlated join, the UNNEST operator references the ARRAY typed column from each row in the source table, which appears previously in the FROM clause. For each row N in the source table, UNNEST flattens the ARRAY from row N into a set of rows containing the ARRAY elements, and then a correlated INNER JOIN or CROSS JOIN combines this new set of rows with the single row N from the source table.

Examples

The following example uses UNNEST to return a row for each element in the array column. Because of the INNER JOIN, the id column contains the id values for the row in Sequences that contains each number.

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id, flattened_numbers
FROM Sequences
INNER JOIN UNNEST(Sequences.some_numbers) AS flattened_numbers;

/*------+-------------------*
 | id   | flattened_numbers |
 +------+-------------------+
 |    1 |                 0 |
 |    1 |                 1 |
 |    1 |                 1 |
 |    1 |                 2 |
 |    1 |                 3 |
 |    1 |                 5 |
 |    2 |                 2 |
 |    2 |                 4 |
 |    2 |                 8 |
 |    2 |                16 |
 |    2 |                32 |
 |    3 |                 5 |
 |    3 |                10 |
 *------+-------------------*/

Note that for correlated joins the UNNEST operator is optional and the INNER JOIN can be expressed as a CROSS JOIN or a comma cross join. Using the comma cross join shorthand notation, the previous example is consolidated as follows:

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id, flattened_numbers
FROM Sequences, Sequences.some_numbers AS flattened_numbers;

/*------+-------------------*
 | id   | flattened_numbers |
 +------+-------------------+
 |    1 |                 0 |
 |    1 |                 1 |
 |    1 |                 1 |
 |    1 |                 2 |
 |    1 |                 3 |
 |    1 |                 5 |
 |    2 |                 2 |
 |    2 |                 4 |
 |    2 |                 8 |
 |    2 |                16 |
 |    2 |                32 |
 |    3 |                 5 |
 |    3 |                10 |
 *------+-------------------*/

Querying nested and repeated fields

If a table contains an ARRAY of STRUCTs or PROTOs, you can flatten the ARRAY to query the fields of the STRUCT or PROTO. You can also flatten ARRAY type fields of STRUCT values and repeated fields of PROTO values. GoogleSQL treats repeated PROTO fields as ARRAYs.

Querying STRUCT elements in an array

The following example uses UNNEST with INNER JOIN to flatten an ARRAY of STRUCTs.

SELECT
  race,
  participant.name,
  participant.laps
FROM
  (
    SELECT
    "800M" AS race,
    [
      STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
      STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
      STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
      STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
      STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
      STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
      STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
      STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
    ] AS participants
  ) AS R
INNER JOIN UNNEST(R.participants) AS participant;

/*------+-------------+-----------------------*
 | race | name        | laps                  |
 +------+-------------+-----------------------+
 | 800M | Rudisha     | [23.4,26.3,26.4,26.1] |
 +------+-------------+-----------------------+
 | 800M | Makhloufi   | [24.5,25.4,26.6,26.1] |
 +------+-------------+-----------------------+
 | 800M | Murphy      | [23.9,26,27,26]       |
 +------+-------------+-----------------------+
 | 800M | Bosse       | [23.6,26.2,26.5,27.1] |
 +------+-------------+-----------------------+
 | 800M | Rotich      | [24.7,25.6,26.9,26.4] |
 +------+-------------+-----------------------+
 | 800M | Lewandowski | [25,25.7,26.3,27.2]   |
 +------+-------------+-----------------------+
 | 800M | Kipketer    | [23.2,26.1,27.3,29.4] |
 +------+-------------+-----------------------+
 | 800M | Berian      | [23.7,26.1,27,29.3]   |
 *------+-------------+-----------------------*/

You can find specific information from repeated fields. For example, the following query returns the fastest racer in an 800M race.

Example

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ] AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants)
    ORDER BY (SELECT SUM(duration) FROM UNNEST(laps) AS duration) ASC
    LIMIT 1
  ) AS fastest_racer
FROM Races;

/*------+---------------*
 | race | fastest_racer |
 +------+---------------+
 | 800M | Rudisha       |
 *------+---------------*/

Querying PROTO elements in an array

To query the fields of PROTO elements in an ARRAY, use UNNEST and INNER JOIN.

Example

The following query shows the contents of a table where one row contains an ARRAY of PROTOs. All of the PROTO field values in the ARRAY appear in a single row.

WITH
  Albums AS (
  SELECT
    'Let It Be' AS album_name,
    [
      NEW googlesql.examples.music.Chart(1 AS rank, 'US 100' AS chart_name),
      NEW googlesql.examples.music.Chart(1 AS rank, 'UK 40' AS chart_name),
      NEW googlesql.examples.music.Chart(2 AS rank, 'Oricon' AS chart_name)
    ] AS charts
  UNION ALL
  SELECT
    'Rubber Soul' AS album_name,
    [
      NEW googlesql.examples.music.Chart(1 AS rank, 'US 100' AS chart_name),
      NEW googlesql.examples.music.Chart(1 AS rank, 'UK 40' AS chart_name),
      NEW googlesql.examples.music.Chart(24 AS rank, 'Oricon' AS chart_name)
    ] AS charts
  )
SELECT *
FROM Albums;

/*-------------+---------------------------------*
 | album_name  | charts                          |
 +-------------+---------------------------------+
 | Let It Be   | [chart_name: "US 100", rank: 1, |
 |             | chart_name: "UK 40", rank: 1,   |
 |             | chart_name: "Oricon" rank: 2]   |
 +-------------+---------------------------------+
 | Rubber Soul | [chart_name: "US 100", rank: 1, |
 |             | chart_name: "UK 40", rank: 1,   |
 |             | chart_name: "Oricon" rank: 24]  |
 *-------------+---------------------------------*/

To return the value of the individual fields of the PROTOs inside an ARRAY, use UNNEST to flatten the ARRAY, then use a INNER JOIN to apply the UNNEST operator to each row of the ARRAY column. The INNER JOIN also joins the duplicated values of other columns to the result of UNNEST, so you can query these columns together with the fields of the PROTOs in the ARRAY.

Example

The following example uses UNNEST to flatten the ARRAY charts. The INNER JOIN applies the UNNEST operator to every row in the charts column and joins the duplicated value of table.album_name to the chart table. This allows the query to include the table.album_name column in the SELECT list together with the PROTO fields chart.chart_name and chart.rank.

WITH
  Albums AS (
    SELECT
      'Let It Be' AS album_name,
      [
        NEW googlesql.examples.music.Chart(1 AS rank, 'US 100' AS chart_name),
        NEW googlesql.examples.music.Chart(1 AS rank, 'UK 40' AS chart_name),
        NEW googlesql.examples.music.Chart(2 AS rank, 'Oricon' AS chart_name)
      ] AS charts
    UNION ALL
    SELECT
      'Rubber Soul' AS album_name,
      [
        NEW googlesql.examples.music.Chart(1 AS rank, 'US 100' AS chart_name),
        NEW googlesql.examples.music.Chart(1 AS rank, 'UK 40' AS chart_name),
        NEW googlesql.examples.music.Chart(24 AS rank, 'Oricon' AS chart_name)
      ] AS charts
  )
SELECT Albums.album_name, chart.chart_name, chart.rank
FROM Albums
INNER JOIN UNNEST(charts) AS chart;

/*-------------+------------+------*
 | album_name  | chart_name | rank |
 +-------------+------------+------+
 | Let It Be   | US 100     |    1 |
 | Let It Be   | UK 40      |    1 |
 | Let It Be   | Oricon     |    2 |
 | Rubber Soul | US 100     |    1 |
 | Rubber Soul | UK 40      |    1 |
 | Rubber Soul | Oricon     |   24 |
 *-------------+------------+------*/

Querying ARRAY-type fields in a struct

You can also get information from nested repeated fields. For example, the following statement returns the runner who had the fastest lap in an 800M race.

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ]AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants), UNNEST(laps) AS duration
    ORDER BY duration ASC
    LIMIT 1
  ) AS runner_with_fastest_lap
FROM Races;

/*------+-------------------------*
 | race | runner_with_fastest_lap |
 +------+-------------------------+
 | 800M | Kipketer                |
 *------+-------------------------*/

Notice that the preceding query uses the comma operator (,) to perform a cross join and flatten the array. This is equivalent to using an explicit CROSS JOIN, or the following example which uses an explicit INNER JOIN:

WITH
  Races AS (
    SELECT "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ] AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants)
    INNER JOIN UNNEST(laps) AS duration
    ORDER BY duration ASC LIMIT 1
  ) AS runner_with_fastest_lap
FROM Races;

/*------+-------------------------*
 | race | runner_with_fastest_lap |
 +------+-------------------------+
 | 800M | Kipketer                |
 *------+-------------------------*/

Flattening arrays with a INNER JOIN excludes rows that have empty or NULL arrays. If you want to include these rows, use a LEFT JOIN.

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps),
        STRUCT("Nathan" AS name, ARRAY<FLOAT64>[] AS laps),
        STRUCT("David" AS name, NULL AS laps)
      ] AS participants
  )
SELECT
  Participant.name,
  SUM(duration) AS finish_time
FROM Races
INNER JOIN Races.participants AS Participant
LEFT JOIN Participant.laps AS duration
GROUP BY name;

/*-------------+--------------------*
 | name        | finish_time        |
 +-------------+--------------------+
 | Murphy      | 102.9              |
 | Rudisha     | 102.19999999999999 |
 | David       | NULL               |
 | Rotich      | 103.6              |
 | Makhloufi   | 102.6              |
 | Berian      | 106.1              |
 | Bosse       | 103.4              |
 | Kipketer    | 106                |
 | Nathan      | NULL               |
 | Lewandowski | 104.2              |
 *-------------+--------------------*/

Querying repeated fields

GoogleSQL represents repeated fields of PROTOs as ARRAYs. You can query these ARRAYs using UNNEST and INNER JOIN.

The following example queries a table containing a column of PROTOs with the alias album and the repeated field song. All values of song for each album appear on the same row.

Example

WITH
  Bands AS (
    SELECT
      'The Beatles' AS band_name,
      NEW googlesql.examples.music.Album(
        'Let It Be' AS album_name,
        ['Across the Universe', 'Get Back', 'Dig It'] AS song) AS album
    UNION ALL
    SELECT
      'The Beatles' AS band_name,
      NEW googlesql.examples.music.Album(
        'Rubber Soul' AS album_name,
        ['Drive My Car', 'The Word', 'Michelle'] AS song) AS album
  )
SELECT band_name, album.album_name, album.song
FROM Bands;

/*-------------+------------------+-----------------------------------------*
 | band_name   | album_name       | song                                    |
 +-------------+------------------+-----------------------------------------+
 | The Beatles | Let It Be        | [Across the Universe, Get Back, Dig It] |
 | The Beatles | Rubber Soul      | [Drive My Car, The Word, Michelle]      |
 *-------------+------------------+-----------------------------------------*/

To query the individual values of a repeated field, reference the field name using dot notation to return an ARRAY, and flatten the ARRAY using UNNEST. Use INNER JOIN to apply the UNNEST operator to each row and join the flattened ARRAY to the duplicated value of any non-repeated fields or columns in that row.

Example

The following example queries the table from the previous example and returns the values of the repeated field as an ARRAY. The UNNEST operator flattens the ARRAY that represents the repeated field song. INNER JOIN applies the UNNEST operator to each row and joins the output of UNNEST to the duplicated value of the column band_name and the non-repeated field album_name within that row.

WITH Bands AS (
  SELECT
    'The Beatles' AS band_name,
    NEW googlesql.examples.music.Album(
      'Let It Be' AS album_name,
      ['Across the Universe', 'Get Back', 'Dig It'] AS song
    ) AS album
    UNION ALL
  SELECT
    'The Beatles' AS band_name,
    NEW googlesql.examples.music.Album(
      'Rubber Soul' AS album_name,
      ['Drive My Car', 'The Word', 'Michelle'] AS song
    ) AS album
)
SELECT band_name, album.album_name, song_name
FROM Bands
INNER JOIN UNNEST(album.song) AS song_name;

/*-------------+-------------+---------------------*
 | band_name   | album_name  | song_name           |
 +-------------+-------------+---------------------+
 | The Beatles | Let It Be   | Across the Universe |
 | The Beatles | Let It Be   | Get Back            |
 | The Beatles | Let It Be   | Dig It              |
 | The Beatles | Rubber Soul | Drive My Car        |
 | The Beatles | Rubber Soul | The Word            |
 | The Beatles | Rubber Soul | Michelle            |
 *-------------+-------------+---------------------*/

Constructing arrays

You can construct an array using array literals or array functions. To learn more about constructing arrays, see Array type.

Creating arrays from subqueries

A common task when working with arrays is turning a subquery result into an array. In GoogleSQL, you can accomplish this using the ARRAY() function.

For example, consider the following operation on the Sequences table:

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
  UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
  UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x) AS doubled
FROM Sequences;

/*--------------------+---------------------*
 | some_numbers       | doubled             |
 +--------------------+---------------------+
 | [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
 | [2, 4, 8, 16, 32]  | [4, 8, 16, 32, 64]  |
 | [5, 10]            | [10, 20]            |
 *--------------------+---------------------*/

This example starts with a table named Sequences. This table contains a column, some_numbers, of type ARRAY<INT64>.

The query itself contains a subquery. This subquery selects each row in the some_numbers column and uses UNNEST to return the array as a set of rows. Next, it multiplies each value by two, and then re-combines the rows back into an array using the ARRAY() operator.

Filtering arrays

The following example uses a WHERE clause in the ARRAY() operator's subquery to filter the returned rows.

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x
        WHERE x < 5) AS doubled_less_than_five
FROM Sequences;

/*------------------------*
 | doubled_less_than_five |
 +------------------------+
 | [0, 2, 2, 4, 6]        |
 | [4, 8]                 |
 | []                     |
 *------------------------*/

Notice that the third row contains an empty array, because the elements in the corresponding original row ([5, 10]) did not meet the filter requirement of x < 5.

You can also filter arrays by using SELECT DISTINCT to return only unique elements within an array.

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
             FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM Sequences;

/*-----------------*
 | unique_numbers  |
 +-----------------+
 | [0, 1, 2, 3, 5] |
 *-----------------*/

You can also filter rows of arrays by using the IN keyword. This keyword filters rows containing arrays by determining if a specific value matches an element in the array.

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
   ARRAY(SELECT x
         FROM UNNEST(some_numbers) AS x
         WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM Sequences;

/*--------------------*
 | contains_two       |
 +--------------------+
 | [0, 1, 1, 2, 3, 5] |
 | [2, 4, 8, 16, 32]  |
 | []                 |
 *--------------------*/

Notice again that the third row contains an empty array, because the array in the corresponding original row ([5, 10]) did not contain 2.

Scanning arrays

To check if an array contains a specific value, use the IN operator with UNNEST. To check if an array contains a value matching a condition, use the EXISTS operator with UNNEST.

Scanning for specific values

To scan an array for a specific value, use the IN operator with UNNEST.

Example

The following example returns true if the array contains the number 2.

SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;

/*----------------*
 | contains_value |
 +----------------+
 | true           |
 *----------------*/

To return the rows of a table where the array column contains a specific value, filter the results of IN UNNEST using the WHERE clause.

Example

The following example returns the id value for the rows where the array column contains the value 2.

WITH Sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM Sequences
WHERE 2 IN UNNEST(Sequences.some_numbers)
ORDER BY matching_rows;

/*---------------*
 | matching_rows |
 +---------------+
 | 1             |
 | 2             |
 *---------------*/

Scanning for values that satisfy a condition

To scan an array for values that match a condition, use UNNEST to return a table of the elements in the array, use WHERE to filter the resulting table in a subquery, and use EXISTS to check if the filtered table contains any rows.

Example

The following example returns the id value for the rows where the array column contains values greater than 5.

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL
    SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL
    SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT * FROM UNNEST(some_numbers) AS x WHERE x > 5);

/*---------------*
 | matching_rows |
 +---------------+
 | 2             |
 | 3             |
 *---------------*/

Scanning for STRUCT field values that satisfy a condition

To search an array of STRUCTs for a field whose value matches a condition, use UNNEST to return a table with a column for each STRUCT field, then filter non-matching rows from the table using WHERE EXISTS.

Example

The following example returns the rows where the array column contains a STRUCT whose field b has a value greater than 3.

WITH
  Sequences AS (
    SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
    UNION ALL
    SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
    UNION ALL
    SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT(7 AS a, 4 AS b)] AS some_numbers
  )
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT 1 FROM UNNEST(some_numbers) WHERE b > 3);

/*---------------*
 | matching_rows |
 +---------------+
 | 2             |
 | 3             |
 *---------------*/

Arrays and aggregation

With GoogleSQL, you can aggregate values into an array using ARRAY_AGG().

WITH Fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM Fruits;

/*-----------------------*
 | fruit_basket          |
 +-----------------------+
 | [apple, pear, banana] |
 *-----------------------*/

The array returned by ARRAY_AGG() is in an arbitrary order, since the order in which the function concatenates values is not guaranteed.

You can also apply aggregate functions such as SUM() to the elements in an array. For example, the following query returns the sum of array elements for each row of the Sequences table.

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) AS x) AS sums
FROM Sequences AS s;

/*--------------------+------*
 | some_numbers       | sums |
 +--------------------+------+
 | [0, 1, 1, 2, 3, 5] | 12   |
 | [2, 4, 8, 16, 32]  | 62   |
 | [5, 10]            | 15   |
 *--------------------+------*/

GoogleSQL also supports an aggregate function, ARRAY_CONCAT_AGG(), which concatenates the elements of an array column across rows.

WITH Aggregates AS
  (SELECT [1,2] AS numbers
   UNION ALL SELECT [3,4] AS numbers
   UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM Aggregates;

/*--------------------------------------------------*
 | count_to_six_agg                                 |
 +--------------------------------------------------+
 | [1, 2, 3, 4, 5, 6]                               |
 *--------------------------------------------------*/

Converting arrays to strings

The ARRAY_TO_STRING() function allows you to convert an ARRAY<STRING> to a single STRING value or an ARRAY<BYTES> to a single BYTES value where the resulting value is the ordered concatenation of the array elements.

The second argument is the separator that the function will insert between inputs to produce the output; this second argument must be of the same type as the elements of the first argument.

Example:

WITH Words AS
  (SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM Words;

/*-------------*
 | greetings   |
 +-------------+
 | Hello World |
 *-------------*/

The optional third argument takes the place of NULL values in the input array.

  • If you omit this argument, then the function ignores NULL array elements.

  • If you provide an empty string, the function inserts a separator for NULL array elements.

Example:

SELECT
  ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
  ARRAY_TO_STRING(arr, ".", "") AS empty_string,
  ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);

/*------------------+--------------+---------*
 | non_empty_string | empty_string | omitted |
 +------------------+--------------+---------+
 | a.N.b.N.c.N      | a..b..c.     | a.b.c   |
 *------------------+--------------+---------*/

Combining arrays

In some cases, you might want to combine multiple arrays into a single array. You can accomplish this using the ARRAY_CONCAT() function.

SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) AS count_to_six;

/*--------------------------------------------------*
 | count_to_six                                     |
 +--------------------------------------------------+
 | [1, 2, 3, 4, 5, 6]                               |
 *--------------------------------------------------*/

Updating arrays

Consider the following table called arrays_table. The first column in the table is an array of integers and the second column contains two nested arrays of integers.

WITH arrays_table AS (
  SELECT
    [1, 2] AS regular_array,
    STRUCT([10, 20] AS first_array, [100, 200] AS second_array) AS nested_arrays
  UNION ALL SELECT
    [3, 4] AS regular_array,
    STRUCT([30, 40] AS first_array, [300, 400] AS second_array) AS nested_arrays
)
SELECT * FROM arrays_table;

/*---------------*---------------------------*----------------------------*
 | regular_array | nested_arrays.first_array | nested_arrays.second_array |
 +---------------+---------------------------+----------------------------+
 | [1, 2]        | [10, 20]                  | [100, 200]                 |
 | [3, 4]        | [30, 40]                  | [130, 400]                 |
 *---------------*---------------------------*----------------------------*/

You can update arrays in a table by using the UPDATE statement. The following example inserts the number 5 into the regular_array column, and inserts the elements from the first_array field of the nested_arrays column into the second_array field:

UPDATE
  arrays_table
SET
  regular_array = ARRAY_CONCAT(regular_array, [5]),
  nested_arrays.second_array = ARRAY_CONCAT(nested_arrays.second_array,
                                            nested_arrays.first_array)
WHERE TRUE;
SELECT * FROM arrays_table;

/*---------------*---------------------------*----------------------------*
 | regular_array | nested_arrays.first_array | nested_arrays.second_array |
 +---------------+---------------------------+----------------------------+
 | [1, 2, 5]     | [10, 20]                  | [100, 200, 10, 20]         |
 | [3, 4, 5]     | [30, 40]                  | [130, 400, 30, 40]         |
 *---------------*---------------------------*----------------------------*/

Building arrays of arrays

GoogleSQL does not support building arrays of arrays directly. Instead, you must create an array of structs, with each struct containing a field of type ARRAY. To illustrate this, consider the following Points table:

CREATE TABLE Points (
  point ARRAY<INT64>,
  id INT64 NOT NULL
) PRIMARY KEY(id);

Assume the table is populated with the following rows:

/*----+----------*
 | id | point    |
 +----+----------+
 | 1  | [1, 5]   |
 | 2  | [2, 8]   |
 | 3  | [3, 7]   |
 | 4  | [4, 1]   |
 | 5  | [5, 7]   |
 *----+----------*/

Now, let's say you wanted to create an array consisting of each point in the Points table. To accomplish this, wrap the array returned from each row in a STRUCT, as shown below.

WITH Points AS
  (SELECT [1, 5] AS point
   UNION ALL SELECT [2, 8] AS point
   UNION ALL SELECT [3, 7] AS point
   UNION ALL SELECT [4, 1] AS point
   UNION ALL SELECT [5, 7] AS point)
SELECT ARRAY(
  SELECT STRUCT(point)
  FROM Points)
  AS coordinates;

/*-------------------*
 | coordinates       |
 +-------------------+
 | [{point: [1,5]},  |
 |  {point: [2,8]},  |
 |  {point: [5,7]},  |
 |  {point: [3,7]},  |
 |  {point: [4,1]}]  |
 *-------------------*/
SELECT ARRAY(
  SELECT STRUCT(point)
  FROM Points)
  AS coordinates;

/*--------------*
 | coordinates  |
 +--------------+
 | point: [1,5] |
 | point: [2,8] |
 | point: [3,7] |
 | point: [4,1] |
 | point: [5,7] |
 *--------------*/