使用数组

在 GoogleSQL for BigQuery 中,数组是由零个或更多相同数据类型的值组成的有序列表。您可以构建简单数据类型(如 INT64)的数组和复杂数据类型(如 STRUCT)的数组。目前,ARRAY 数据类型除外,因为不支持数组的数组。如需详细了解 ARRAY 数据类型(包括 NULL 处理),请参阅数组类型

借助 GoogleSQL,您可以构建数组字面量、使用 ARRAY 函数基于子查询构建数组,还可以使用 ARRAY_AGG 函数将值聚合到数组中。

您可以使用 ARRAY_CONCAT() 等函数合并数组,也可以使用 ARRAY_TO_STRING() 将数组转换为字符串。

访问数组元素

假设有以下名为 Sequences 的模拟表。该表包含 ARRAY 数据类型的列 some_numbers

WITH
  Sequences AS (
    SELECT [0, 1, 1, 2, 3, 5] AS some_numbers UNION ALL
    SELECT [2, 4, 8, 16, 32] UNION ALL
    SELECT [5, 10]
  )
SELECT * FROM Sequences

/*---------------------*
 | some_numbers        |
 +---------------------+
 | [0, 1, 1, 2, 3, 5]  |
 | [2, 4, 8, 16, 32]   |
 | [5, 10]             |
 *---------------------*/

如需访问 some_numbers 列中的数组元素,请指定要使用的索引类型:indexOFFSET(index)(用于从 0 开始的索引),或 ORDINAL(index)(用于从 1 开始的索引)。

例如:

SELECT
  some_numbers,
  some_numbers[0] AS index_0,
  some_numbers[OFFSET(1)] AS offset_1,
  some_numbers[ORDINAL(1)] AS ordinal_1
FROM Sequences

/*--------------------+---------+----------+-----------*
 | some_numbers       | index_0 | offset_1 | ordinal_1 |
 +--------------------+---------+----------+-----------+
 | [0, 1, 1, 2, 3, 5] | 0       | 1        | 0         |
 | [2, 4, 8, 16, 32]  | 2       | 4        | 2         |
 | [5, 10]            | 5       | 10       | 5         |
 *--------------------+---------+----------+-----------*/

查找长度

ARRAY_LENGTH 函数用于返回数组的长度。

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       ARRAY_LENGTH(some_numbers) AS len
FROM Sequences;

/*--------------------+--------*
 | some_numbers       | len    |
 +--------------------+--------+
 | [0, 1, 1, 2, 3, 5] | 6      |
 | [2, 4, 8, 16, 32]  | 5      |
 | [5, 10]            | 2      |
 *--------------------+--------*/

将数组中的元素转换为表中的行

如需将 ARRAY 转换为一组行(即执行展平操作),请使用 UNNEST 运算符。UNNEST 获取一个 ARRAY,然后返回一个表,ARRAY 中的每个元素均占该表的一行。

由于 UNNEST 破坏了 ARRAY 元素的顺序,您可能需要恢复表中的顺序。为此,请使用可选的 WITH OFFSET 子句返回另一个包含各数组元素偏移量的列,然后使用 ORDER BY 子句按偏移量对行进行排序。

示例

SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
  AS element
WITH OFFSET AS offset
ORDER BY offset;

/*----------+--------*
 | element  | offset |
 +----------+--------+
 | foo      | 0      |
 | bar      | 1      |
 | baz      | 2      |
 | qux      | 3      |
 | corge    | 4      |
 | garply   | 5      |
 | waldo    | 6      |
 | fred     | 7      |
 *----------+--------*/

如需展平一整列 ARRAY,同时保留每行中其他列的值,请使用相互关联的 INNER JOIN 将包含 ARRAY 列的表联接到该 ARRAY 列的 UNNEST 输出。

通过相互关联的联接,UNNEST 运算符会引用源表每一行中的 ARRAY 类型列,该列之前曾出现在 FROM 子句中。对于源表中的每一行 NUNNEST 将行 NARRAY 展平成一组包含 ARRAY 元素的行,然后相关 INNER JOINCROSS JOIN 会将这组新行与源表的单行 N 组合起来。

示例

下述示例使用 UNNEST 为数组列中的每个元素返回一行。使用 INNER JOIN 后,id 列显示 Sequences 中包含每个数字的行的 id 值。

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id, flattened_numbers
FROM Sequences
INNER JOIN UNNEST(Sequences.some_numbers) AS flattened_numbers;

/*------+-------------------*
 | id   | flattened_numbers |
 +------+-------------------+
 |    1 |                 0 |
 |    1 |                 1 |
 |    1 |                 1 |
 |    1 |                 2 |
 |    1 |                 3 |
 |    1 |                 5 |
 |    2 |                 2 |
 |    2 |                 4 |
 |    2 |                 8 |
 |    2 |                16 |
 |    2 |                32 |
 |    3 |                 5 |
 |    3 |                10 |
 *------+-------------------*/

请注意,对于相关联接,UNNEST 运算符为可选运算符,INNER JOIN 可表示为 CROSS JOIN 或逗号交叉联接。使用英文逗号交叉联接简写表示法,上述示例合并如下:

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id, flattened_numbers
FROM Sequences, Sequences.some_numbers AS flattened_numbers;

/*------+-------------------*
 | id   | flattened_numbers |
 +------+-------------------+
 |    1 |                 0 |
 |    1 |                 1 |
 |    1 |                 1 |
 |    1 |                 2 |
 |    1 |                 3 |
 |    1 |                 5 |
 |    2 |                 2 |
 |    2 |                 4 |
 |    2 |                 8 |
 |    2 |                16 |
 |    2 |                32 |
 |    3 |                 5 |
 |    3 |                10 |
 *------+-------------------*/

查询嵌套数组

如果一个表包含 STRUCTARRAY,则可展平 ARRAY 以查询 STRUCT 的字段。您也可以展平 STRUCT 值的 ARRAY 类型字段。

查询 ARRAY 中的 STRUCT 元素

以下示例将 UNNESTINNER JOIN 结合使用,以展平 STRUCTARRAY

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ] AS participants
    )
SELECT
  race,
  participant
FROM Races AS r
INNER JOIN UNNEST(r.participants) AS participant;

/*------+---------------------------------------*
 | race | participant                           |
 +------+---------------------------------------+
 | 800M | {Rudisha, [23.4, 26.3, 26.4, 26.1]}   |
 | 800M | {Makhloufi, [24.5, 25.4, 26.6, 26.1]} |
 | 800M | {Murphy, [23.9, 26, 27, 26]}          |
 | 800M | {Bosse, [23.6, 26.2, 26.5, 27.1]}     |
 | 800M | {Rotich, [24.7, 25.6, 26.9, 26.4]}    |
 | 800M | {Lewandowski, [25, 25.7, 26.3, 27.2]} |
 | 800M | {Kipketer, [23.2, 26.1, 27.3, 29.4]}  |
 | 800M | {Berian, [23.7, 26.1, 27, 29.3]}      |
 *------+---------------------------------------*/

您可以从重复字段中找到特定信息。例如,以下查询用于返回 800 米赛跑中跑得最快的参赛选手。

示例

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ] AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants)
    ORDER BY (SELECT SUM(duration) FROM UNNEST(laps) AS duration) ASC
    LIMIT 1
  ) AS fastest_racer
FROM Races;

/*------+---------------*
 | race | fastest_racer |
 +------+---------------+
 | 800M | Rudisha       |
 *------+---------------*/

查询 STRUCT 中的 ARRAY 类型字段

您也可以从嵌套的重复字段中获取信息。例如,以下语句用于返回在 800 米赛跑中圈速最快的参赛选手。

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ]AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants), UNNEST(laps) AS duration
    ORDER BY duration ASC
    LIMIT 1
  ) AS runner_with_fastest_lap
FROM Races;

/*------+-------------------------*
 | race | runner_with_fastest_lap |
 +------+-------------------------+
 | 800M | Kipketer                |
 *------+-------------------------*/

请注意,前面的查询使用英文逗号运算符 (,) 来执行交叉联接并展平数组。这相当于使用显式 CROSS JOIN,或使用以下使用显式 INNER JOIN 的示例:

WITH
  Races AS (
    SELECT "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
      ] AS participants
  )
SELECT
  race,
  (
    SELECT name
    FROM UNNEST(participants)
    INNER JOIN UNNEST(laps) AS duration
    ORDER BY duration ASC LIMIT 1
  ) AS runner_with_fastest_lap
FROM Races;

/*------+-------------------------*
 | race | runner_with_fastest_lap |
 +------+-------------------------+
 | 800M | Kipketer                |
 *------+-------------------------*/

使用 INNER JOIN 展平数组会排除具有空数组或 NULL 数组的行。如果要包含这些行,请使用 LEFT JOIN

WITH
  Races AS (
    SELECT
      "800M" AS race,
      [
        STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
        STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
        STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
        STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
        STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
        STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
        STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
        STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps),
        STRUCT("Nathan" AS name, ARRAY<FLOAT64>[] AS laps),
        STRUCT("David" AS name, NULL AS laps)
      ] AS participants
  )
SELECT
  Participant.name,
  SUM(duration) AS finish_time
FROM Races
INNER JOIN Races.participants AS Participant
LEFT JOIN Participant.laps AS duration
GROUP BY name;

/*-------------+--------------------*
 | name        | finish_time        |
 +-------------+--------------------+
 | Murphy      | 102.9              |
 | Rudisha     | 102.19999999999999 |
 | David       | NULL               |
 | Rotich      | 103.6              |
 | Makhloufi   | 102.6              |
 | Berian      | 106.1              |
 | Bosse       | 103.4              |
 | Kipketer    | 106                |
 | Nathan      | NULL               |
 | Lewandowski | 104.2              |
 *-------------+--------------------*/

构建数组

您可以使用数组字面量或数组函数构建 ARRAY。如需详细了解如何构建 ARRAY,请参阅 ARRAY 类型

基于子查询创建数组

使用数组时,一个常见的任务是将子查询结果转换为数组。在 GoogleSQL 中,您可以使用 ARRAY() 函数来实现此目的。

例如,对 Sequences 表执行以下操作:

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
  UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
  UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x) AS doubled
FROM Sequences;

/*--------------------+---------------------*
 | some_numbers       | doubled             |
 +--------------------+---------------------+
 | [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
 | [2, 4, 8, 16, 32]  | [4, 8, 16, 32, 64]  |
 | [5, 10]            | [10, 20]            |
 *--------------------+---------------------*/

此示例以一个名为 Sequences 的表开始。该表包含一个类型为 ARRAY<INT64> 的列 some_numbers

该查询本身包含一个子查询。该子查询会选择 some_numbers 列中的每一行,并使用 UNNEST 将数组返回为一组行。接下来,它将每个值乘以 2,然后使用 ARRAY() 运算符将这些行重新组合到一个数组中。

过滤数组

以下示例在 ARRAY() 运算符的子查询中使用 WHERE 子句,以过滤返回的行。

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x
        WHERE x < 5) AS doubled_less_than_five
FROM Sequences;

/*------------------------*
 | doubled_less_than_five |
 +------------------------+
 | [0, 2, 2, 4, 6]        |
 | [4, 8]                 |
 | []                     |
 *------------------------*/

请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]) 中的元素不符合 x < 5 的过滤条件。

您也可以使用 SELECT DISTINCT 来过滤数组,这样可以仅返回数组中的唯一元素。

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
             FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM Sequences;

/*-----------------*
 | unique_numbers  |
 +-----------------+
 | [0, 1, 2, 3, 5] |
 *-----------------*/

您还可以使用 IN 关键字过滤数组行。此关键字可过滤包含数组的行,方法是确定特定值是否与数组中的元素匹配。

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
   ARRAY(SELECT x
         FROM UNNEST(some_numbers) AS x
         WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM Sequences;

/*--------------------*
 | contains_two       |
 +--------------------+
 | [0, 1, 1, 2, 3, 5] |
 | [2, 4, 8, 16, 32]  |
 | []                 |
 *--------------------*/

另请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]) 中的数组不包含 2

扫描数组

如需检查数组是否包含特定值,请结合使用 IN 运算符和 UNNEST。如需检查数组是否包含与某条件匹配的值,请结合使用 EXISTS 运算符和 UNNEST

扫描特定值

如需扫描数组中的特定值,请将 IN 运算符与 UNNEST 结合使用。

示例

以下示例在数组中包含数字 2 时返回 true

SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;

/*----------------*
 | contains_value |
 +----------------+
 | true           |
 *----------------*/

对于一个表,要返回数组列中包含特定值的行,请使用 WHERE 子句过滤 IN UNNEST 的结果。

示例

以下示例返回数组列中包含值 2 的行的 id 值。

WITH Sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM Sequences
WHERE 2 IN UNNEST(Sequences.some_numbers)
ORDER BY matching_rows;

/*---------------*
 | matching_rows |
 +---------------+
 | 1             |
 | 2             |
 *---------------*/

扫描满足条件的值

如需扫描数组以查找与条件相匹配的值,请使用 UNNEST 返回包含数组中元素的表,使用 WHERE 过滤子查询中的结果表,然后使用 EXISTS 检查过滤后的表是否包含任何行。

示例

以下示例返回包含大于 5 的值的数组列中行的 id 值。

WITH
  Sequences AS (
    SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
    UNION ALL
    SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
    UNION ALL
    SELECT 3 AS id, [5, 10] AS some_numbers
  )
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT * FROM UNNEST(some_numbers) AS x WHERE x > 5);

/*---------------*
 | matching_rows |
 +---------------+
 | 2             |
 | 3             |
 *---------------*/

扫描满足条件的 STRUCT 字段值

如需搜索 STRUCT 数组中值与条件相匹配的字段,请使用 UNNEST 返回一个每列均与一个 STRUCT 字段相对应的表,然后使用 WHERE EXISTS 从表中过滤掉不匹配的行。

示例

以下示例返回数组列中包含 STRUCT 且其字段 b 的值大于 3 的行。

WITH
  Sequences AS (
    SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
    UNION ALL
    SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
    UNION ALL
    SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT(7 AS a, 4 AS b)] AS some_numbers
  )
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT 1 FROM UNNEST(some_numbers) WHERE b > 3);

/*---------------*
 | matching_rows |
 +---------------+
 | 2             |
 | 3             |
 *---------------*/

数组和聚合

借助 GoogleSQL,您可以使用 ARRAY_AGG() 将值聚合到数组中。

WITH Fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM Fruits;

/*-----------------------*
 | fruit_basket          |
 +-----------------------+
 | [apple, pear, banana] |
 *-----------------------*/

ARRAY_AGG() 返回的数组以任意顺序排列,这是因为该函数连接值的顺序无法保证。如需对数组元素进行排序,请使用 ORDER BY。例如:

WITH Fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit ORDER BY fruit) AS fruit_basket
FROM Fruits;

/*-----------------------*
 | fruit_basket          |
 +-----------------------+
 | [apple, banana, pear] |
 *-----------------------*/

您还可以将聚合函数(如 SUM())应用于数组中的元素。例如,以下查询返回 Sequences 表中每一行的数组元素的总和。

WITH Sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) AS x) AS sums
FROM Sequences AS s;

/*--------------------+------*
 | some_numbers       | sums |
 +--------------------+------+
 | [0, 1, 1, 2, 3, 5] | 12   |
 | [2, 4, 8, 16, 32]  | 62   |
 | [5, 10]            | 15   |
 *--------------------+------*/

GoogleSQL 还支持聚合函数 ARRAY_CONCAT_AGG(),该函数可以跨行连接数组列的元素。

WITH Aggregates AS
  (SELECT [1,2] AS numbers
   UNION ALL SELECT [3,4] AS numbers
   UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM Aggregates;

/*--------------------------------------------------*
 | count_to_six_agg                                 |
 +--------------------------------------------------+
 | [1, 2, 3, 4, 5, 6]                               |
 *--------------------------------------------------*/

将数组转换为字符串

利用 ARRAY_TO_STRING() 函数,您可以将 ARRAY<STRING> 转换为单个 STRING 值,或者将 ARRAY<BYTES> 转换为单个 BYTES 值,其中,生成的值是有序连接的数组元素。

第二个参数是分隔符,相应函数将在输入之间插入该分隔符以生成输出;第二个参数必须与第一个参数的元素具有相同类型。

示例:

WITH Words AS
  (SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM Words;

/*-------------*
 | greetings   |
 +-------------+
 | Hello World |
 *-------------*/

第三个参数(可选)取代输入数组中的 NULL 值。

  • 如果您省略此参数,则该函数将忽略 NULL 数组元素。

  • 如果您提供空字符串,则该函数会为 NULL 数组元素插入分隔符。

示例:

SELECT
  ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
  ARRAY_TO_STRING(arr, ".", "") AS empty_string,
  ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);

/*------------------+--------------+---------*
 | non_empty_string | empty_string | omitted |
 +------------------+--------------+---------+
 | a.N.b.N.c.N      | a..b..c.     | a.b.c   |
 *------------------+--------------+---------*/

合并数组

在某些情况下,您可能需要将多个数组合并为单个数组。 您可以使用 ARRAY_CONCAT() 函数来完成此操作。

SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) AS count_to_six;

/*--------------------------------------------------*
 | count_to_six                                     |
 +--------------------------------------------------+
 | [1, 2, 3, 4, 5, 6]                               |
 *--------------------------------------------------*/

更新数组

假设以下名为 arrays_table 的表。表格中的第一列是整数数组,而第二列包含两个嵌套的整数数组。

WITH arrays_table AS (
  SELECT
    [1, 2] AS regular_array,
    STRUCT([10, 20] AS first_array, [100, 200] AS second_array) AS nested_arrays
  UNION ALL SELECT
    [3, 4] AS regular_array,
    STRUCT([30, 40] AS first_array, [300, 400] AS second_array) AS nested_arrays
)
SELECT * FROM arrays_table;

/*---------------*---------------------------*----------------------------*
 | regular_array | nested_arrays.first_array | nested_arrays.second_array |
 +---------------+---------------------------+----------------------------+
 | [1, 2]        | [10, 20]                  | [100, 200]                 |
 | [3, 4]        | [30, 40]                  | [130, 400]                 |
 *---------------*---------------------------*----------------------------*/

您可以使用 UPDATE 语句更新表中的数组。以下示例将数字 5 插入 regular_array 列中,并将 nested_arrays 列的 first_array 字段中的元素插入 second_array 字段中:

UPDATE
  arrays_table
SET
  regular_array = ARRAY_CONCAT(regular_array, [5]),
  nested_arrays.second_array = ARRAY_CONCAT(nested_arrays.second_array,
                                            nested_arrays.first_array)
WHERE TRUE;
SELECT * FROM arrays_table;

/*---------------*---------------------------*----------------------------*
 | regular_array | nested_arrays.first_array | nested_arrays.second_array |
 +---------------+---------------------------+----------------------------+
 | [1, 2, 5]     | [10, 20]                  | [100, 200, 10, 20]         |
 | [3, 4, 5]     | [30, 40]                  | [130, 400, 30, 40]         |
 *---------------*---------------------------*----------------------------*/

压缩数组

假定两个大小相同的数组,您可以将其合并到一个数组中,该数组由输入数组中的元素对(取自其相应的位置)组成。此操作有时称为压缩

您可以使用 UNNESTWITH OFFSET 压缩数组。在此示例中,每个值对都以 STRUCT 形式存储在数组中。

WITH
  Combinations AS (
    SELECT
      ['a', 'b'] AS letters,
      [1, 2, 3] AS numbers
  )
SELECT
  ARRAY(
    SELECT AS STRUCT
      letters[SAFE_OFFSET(index)] AS letter,
      numbers[SAFE_OFFSET(index)] AS number
    FROM Combinations
    INNER JOIN
      UNNEST(
        GENERATE_ARRAY(
          0,
          LEAST(ARRAY_LENGTH(letters), ARRAY_LENGTH(numbers)) - 1)) AS index
    ORDER BY index
  ) AS pairs;

/*------------------------------*
 | pairs                        |
 +------------------------------+
 | [{ letter: "a", number: 1 }, |
 |  { letter: "b", number: 2 }] |
 *------------------------------*/

您可以使用长度不同的输入数组,只要第一个数组的长度等于或小于第二个数组的长度。压缩后的数组将是最短输入数组的长度。

要获取一个包含所有元素的压缩数组(即使输入数组的长度不同),请将 LEAST 更改为 GREATEST。如果一个数组中的元素没有另一个数组中的关联元素,则这些元素将与 NULL 配对。

WITH
  Combinations AS (
    SELECT
      ['a', 'b'] AS letters,
      [1, 2, 3] AS numbers
  )
SELECT
  ARRAY(
    SELECT AS STRUCT
      letters[SAFE_OFFSET(index)] AS letter,
      numbers[SAFE_OFFSET(index)] AS number
    FROM Combinations
    INNER JOIN
      UNNEST(
        GENERATE_ARRAY(
          0,
          GREATEST(ARRAY_LENGTH(letters), ARRAY_LENGTH(numbers)) - 1)) AS index
    ORDER BY index
  ) AS pairs;

/*-------------------------------*
 | pairs                         |
 +-------------------------------+
 | [{ letter: "a", number: 1 },  |
 |  { letter: "b", number: 2 },  |
 |  { letter: null, number: 3 }] |
 *-------------------------------*/

构建数组的数组

GoogleSQL 不支持直接构建数组的数组。正确做法是,您必须创建一个结构体数组,每个结构体包含一个 ARRAY 类型的字段。为便于说明,请参考以下 Points 表:

/*----------*
 | point    |
 +----------+
 | [1, 5]   |
 | [2, 8]   |
 | [3, 7]   |
 | [4, 1]   |
 | [5, 7]   |
 *----------*/

现在,假设您要创建一个由 Points 表中的每个 point 组成的数组。如需完成此操作,请将各行返回的数组封装在一个 STRUCT 中,如下所示。

WITH Points AS
  (SELECT [1, 5] AS point
   UNION ALL SELECT [2, 8] AS point
   UNION ALL SELECT [3, 7] AS point
   UNION ALL SELECT [4, 1] AS point
   UNION ALL SELECT [5, 7] AS point)
SELECT ARRAY(
  SELECT STRUCT(point)
  FROM Points)
  AS coordinates;

/*-------------------*
 | coordinates       |
 +-------------------+
 | [{point: [1,5]},  |
 |  {point: [2,8]},  |
 |  {point: [5,7]},  |
 |  {point: [3,7]},  |
 |  {point: [4,1]}]  |
 *-------------------*/