在标准 SQL 中使用数组

在 BigQuery 中,数组是由零个或多个数据类型相同的值组成的有序列表。您可以构建简单数据类型(如 INT64)的数组和复杂数据类型(如 STRUCT)的数组。目前,ARRAY 数据类型是个例外:不支持数组的数组。

借助 BigQuery,您可以构建数组字面量,使用 ARRAY 函数基于子查询构建数组,还可以使用 ARRAY_AGG 函数将值聚合到数组中。

您可以使用 ARRAY_CONCAT() 等函数合并数组,也可以使用 ARRAY_TO_STRING() 将数组转换为字符串。

构建数组

使用数组字面量

您可以使用英文方括号([])在 BigQuery 中构建数组字面量。数组中的每个元素都用英文逗号分隔。

SELECT [1, 2, 3] as numbers;

SELECT ["apple", "pear", "orange"] as fruit;

SELECT [true, false, true] as booleans;

您也可以基于任何具有兼容类型的表达式创建数组。例如:

SELECT [a, b, c]
FROM
  (SELECT 5 AS a,
          37 AS b,
          406 AS c);

SELECT [a, b, c]
FROM
  (SELECT CAST(5 AS INT64) AS a,
          CAST(37 AS FLOAT64) AS b,
          406 AS c);

注意,第二个示例包含三个表达式:第一个返回 INT64,第二个返回 FLOAT64,第三个声明一个字面量。此表达式之所以有效,是因为所有这三个表达式都具有 FLOAT64 这一超类型。

要为数组声明特定数据类型,请使用英文尖括号(<>)。例如:

SELECT ARRAY<FLOAT64>[1, 2, 3] as floats;

大多数数据类型(例如 INT64STRING)的数组不需要事先进行声明。

SELECT [1, 2, 3] as numbers;

您可以使用 ARRAY<type>[] 编写一个特定类型的空数组。此外,您还可以使用 [] 编写一个无类型的空数组,在这种情况下,BigQuery 会尝试通过上下文环境推断数组类型。如果 BigQuery 无法推断出类型,则会使用默认类型 ARRAY<INT64>

使用生成的值

您还可以使用生成的值构造 ARRAY

生成整数数组

GENERATE_ARRAY 基于起始值和结束值以及步进值生成值数组。例如,以下查询用于生成一个包含从 11 到 33(含边界值)的所有奇整数的数组:

SELECT GENERATE_ARRAY(11, 33, 2) AS odds;

+--------------------------------------------------+
| odds                                             |
+--------------------------------------------------+
| [11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33] |
+--------------------------------------------------+

您还可以通过提供负步进值按照降序生成值数组:

SELECT GENERATE_ARRAY(21, 14, -1) AS countdown;

+----------------------------------+
| countdown                        |
+----------------------------------+
| [21, 20, 19, 18, 17, 16, 15, 14] |
+----------------------------------+

生成日期数组

GENERATE_DATE_ARRAY 基于起始和结束 DATE 以及步进 INTERVAL 生成 DATE 数组。

您可以使用 GENERATE_DATE_ARRAY 生成一组 DATE 值。例如,以下查询最多在 1 WEEK 间隔内返回当前 DATE 和以下 DATE,并且包括以后的某一 DATE

SELECT
  GENERATE_DATE_ARRAY('2017-11-21', '2017-12-31', INTERVAL 1 WEEK)
    AS date_array;
+--------------------------------------------------------------------------+
| date_array                                                               |
+--------------------------------------------------------------------------+
| [2017-11-21, 2017-11-28, 2017-12-05, 2017-12-12, 2017-12-19, 2017-12-26] |
+--------------------------------------------------------------------------+

访问数组元素

请考虑下面的 sequences 表:

+---------------------+
| some_numbers        |
+---------------------+
| [0, 1, 1, 2, 3, 5]  |
| [2, 4, 8, 16, 32]   |
| [5, 10]             |
+---------------------+

该表包含 ARRAY 数据类型的列 some_numbers。要访问此列中数组的元素,您必须指定要使用的索引类型:OFFSET(用于从 0 开始的索引)或 ORDINAL(用于从 1 开始的索引)。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       some_numbers[OFFSET(1)] AS offset_1,
       some_numbers[ORDINAL(1)] AS ordinal_1
FROM sequences;

+--------------------+----------+-----------+
| some_numbers       | offset_1 | ordinal_1 |
+--------------------+----------+-----------+
| [0, 1, 1, 2, 3, 5] | 1        | 0         |
| [2, 4, 8, 16, 32]  | 4        | 2         |
| [5, 10]            | 10       | 5         |
+--------------------+----------+-----------+

查找长度

ARRAY_LENGTH() 函数用于返回数组的长度。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       ARRAY_LENGTH(some_numbers) AS len
FROM sequences;

+--------------------+--------+
| some_numbers       | len    |
+--------------------+--------+
| [0, 1, 1, 2, 3, 5] | 6      |
| [2, 4, 8, 16, 32]  | 5      |
| [5, 10]            | 2      |
+--------------------+--------+

平展数组

要将 ARRAY 转换为一组行(即执行展平操作),请使用 UNNEST 运算符。UNNEST 获取一个 ARRAY,然后返回一个表,ARRAY 中的每个元素均占该表的一行。

由于 UNNEST 破坏了 ARRAY 元素的顺序,您可能需要恢复表中的顺序。为此,请使用可选的 WITH OFFSET 子句返回另一个包含各数组元素偏移量的列,然后使用 ORDER BY 子句按偏移量对行进行排序。

示例

SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
  AS element
WITH OFFSET AS offset
ORDER BY offset;

+----------+--------+
| element  | offset |
+----------+--------+
| foo      | 0      |
| bar      | 1      |
| baz      | 2      |
| qux      | 3      |
| corge    | 4      |
| garply   | 5      |
| waldo    | 6      |
| fred     | 7      |
+----------+--------+

要展平一整列 ARRAY,同时保留每行中其他列的值,请使用 CROSS JOIN 将含有 ARRAY 列的表连接到该 ARRAY 列的 UNNEST 输出。

这是一种相互关联的交叉联接:UNNEST 运算符引用了源表中每一行的 ARRAY 列,该列之前曾出现在 FROM 子句中。对于源表中的每一行 NUNNEST 将行 NARRAY 展平成一组包含 ARRAY 元素的行,然后 CROSS JOIN 将这组新行与源表的单行 N 连接起来。

示例

以下示例使用 UNNEST 将数组列中的每个元素返回为一行。使用 CROSS JOIN 语句后,id 列显示 sequences 中包含每个数字的行的 id 值。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences
CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers;

+------+-------------------+
| id   | flattened_numbers |
+------+-------------------+
|    1 |                 0 |
|    1 |                 1 |
|    1 |                 1 |
|    1 |                 2 |
|    1 |                 3 |
|    1 |                 5 |
|    2 |                 2 |
|    2 |                 4 |
|    2 |                 8 |
|    2 |                16 |
|    2 |                32 |
|    3 |                 5 |
|    3 |                10 |
+------+-------------------+

查询嵌套数组

如果一个表包含 STRUCTARRAY,您可以展平 ARRAY 以查询 STRUCT 的字段。您也可以展平 STRUCT 值的 ARRAY 类型字段。

查询 ARRAY 中的 STRUCT 元素

以下示例将 UNNESTCROSS JOIN 结合使用,用来展平 STRUCTARRAY

WITH races AS (
  SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
     STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
     STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
     STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
     STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
     STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
     STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
     STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
       AS participants)
SELECT
  race,
  participant
FROM races r
CROSS JOIN UNNEST(r.participants) as participant;

+------+---------------------------------------+
| race | participant                           |
+------+---------------------------------------+
| 800M | {Rudisha, [23.4, 26.3, 26.4, 26.1]}   |
| 800M | {Makhloufi, [24.5, 25.4, 26.6, 26.1]} |
| 800M | {Murphy, [23.9, 26, 27, 26]}          |
| 800M | {Bosse, [23.6, 26.2, 26.5, 27.1]}     |
| 800M | {Rotich, [24.7, 25.6, 26.9, 26.4]}    |
| 800M | {Lewandowski, [25, 25.7, 26.3, 27.2]} |
| 800M | {Kipketer, [23.2, 26.1, 27.3, 29.4]}  |
| 800M | {Berian, [23.7, 26.1, 27, 29.3]}      |
+------+---------------------------------------+

您可以从重复字段中找到特定信息。例如,以下查询用于返回 800 米赛跑中跑得最快的参赛选手。

此示例并不涉及数组展平操作,但展示了从重复字段中获取信息的一种常用方法。

示例

WITH races AS (
  SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
     STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
     STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
     STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
     STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
     STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
     STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
     STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
       AS participants)
SELECT
  race,
  (SELECT name
   FROM UNNEST(participants)
   ORDER BY (
     SELECT SUM(duration)
     FROM UNNEST(splits) AS duration) ASC
   LIMIT 1) AS fastest_racer
FROM races;

+------+---------------+
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha       |
+------+---------------+

查询 STRUCT 中的 ARRAY 类型字段

您也可以从嵌套的重复字段中获取信息。例如,以下语句用于返回在 800 米赛跑中圈速最快的参赛选手。

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
    AS participants)
SELECT
race,
(SELECT name
 FROM UNNEST(participants),
   UNNEST(splits) AS duration
 ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

请注意,前面的查询使用英文逗号运算符 (,) 来执行隐式 CROSS JOIN。这一做法的效果与以下示例相同,后者使用了显式 CROSS JOIN

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
    AS participants)
SELECT
race,
(SELECT name
 FROM UNNEST(participants)
 CROSS JOIN UNNEST(splits) AS duration
 ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

请注意,使用 CROSS JOIN 展平数组会排除具有空或 NULL 数组的行。如果要包含这些行,请使用 LEFT JOIN

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits),
    STRUCT("Nathan" as name, ARRAY<FLOAT64>[] as splits),
    STRUCT("David" as name, NULL as splits)]
    AS participants)
SELECT
  name, sum(duration) AS finish_time
FROM races, races.participants LEFT JOIN participants.splits duration
GROUP BY name;

+-------------+--------------------+
| name        | finish_time        |
+-------------+--------------------+
| Murphy      | 102.9              |
| Rudisha     | 102.19999999999999 |
| David       | NULL               |
| Rotich      | 103.6              |
| Makhloufi   | 102.6              |
| Berian      | 106.1              |
| Bosse       | 103.4              |
| Kipketer    | 106                |
| Nathan      | NULL               |
| Lewandowski | 104.2              |
+-------------+--------------------+

基于子查询创建数组

使用数组时,一个常见的任务是将子查询结果转换为数组。在 BigQuery 中,您可以使用 ARRAY() 函数来实现此目的。

例如,对 sequences 表执行以下操作:

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
  UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
  UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x) AS doubled
FROM sequences;

+--------------------+---------------------+
| some_numbers       | doubled             |
+--------------------+---------------------+
| [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
| [2, 4, 8, 16, 32]  | [4, 8, 16, 32, 64]  |
| [5, 10]            | [10, 20]            |
+--------------------+---------------------+

此示例以一个名为 sequences 的表开始。该表包含一个类型为 ARRAY<INT64> 的列 some_numbers

该查询本身包含一个子查询。该子查询会选择 some_numbers 列中的每一行,并使用 UNNEST 将数组返回为一组行。接下来,它将每个值乘以 2,然后使用 ARRAY() 运算符将这些行重新组合到一个数组中。

过滤数组

以下示例在 WHERE 运算符的子查询中使用 ARRAY() 子句,以过滤返回的行。

注意:在下列示例中,返回的结果行未进行排序。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x
        WHERE x < 5) AS doubled_less_than_five
FROM sequences;

+------------------------+
| doubled_less_than_five |
+------------------------+
| [0, 2, 2, 4, 6]        |
| [4, 8]                 |
| []                     |
+------------------------+

请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]) 中的元素不符合 x < 5 的过滤条件。

您也可以使用 SELECT DISTINCT 来过滤数组,这样可以返回数组中具有唯一性的元素。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
             FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM sequences;

+-----------------+
| unique_numbers  |
+-----------------+
| [0, 1, 2, 3, 5] |
+-----------------+

您还可以使用 IN 关键字过滤数组行。此关键字可过滤包含数组的行,方法是确定特定值是否与数组中的元素匹配。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
   ARRAY(SELECT x
         FROM UNNEST(some_numbers) AS x
         WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM sequences;

+--------------------+
| contains_two       |
+--------------------+
| [0, 1, 1, 2, 3, 5] |
| [2, 4, 8, 16, 32]  |
| []                 |
+--------------------+

另请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]) 中的数组不包含 2

扫描数组

如需检查数组中是否包含特定值,请将 IN 运算符与 UNNEST 结合使用。如需检查数组中是否包含与条件相匹配的值,请将 EXISTS 函数与 UNNEST 结合使用。

扫描特定值

要扫描数组中的特定值,请将 IN 运算符与 UNNEST 结合使用。

示例

以下示例在数组中包含数字 2 时返回 true

SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;

+----------------+
| contains_value |
+----------------+
| true           |
+----------------+

对于一个表,要返回数组列中包含特定值的行,请使用 IN UNNEST 子句过滤 WHERE 的结果。

示例

以下示例返回数组列中包含值 2 的行的 id 值。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE 2 IN UNNEST(sequences.some_numbers)
ORDER BY matching_rows;

+---------------+
| matching_rows |
+---------------+
| 1             |
| 2             |
+---------------+

扫描满足条件的值

要扫描数组以查找与条件相匹配的值,请使用 UNNEST 返回包含数组中元素的表,使用 WHERE 过滤子查询中的结果表,然后使用 EXISTS 检查过滤后的表是否包含任何行。

示例

以下示例返回数组列中值大于 5 的行的 id 值。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows FROM sequences
WHERE EXISTS (SELECT *
              FROM UNNEST(some_numbers) AS x
              WHERE x > 5);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

扫描满足条件的 STRUCT 字段值

要搜索 STRUCT 数组中值与条件相匹配的字段,请使用 UNNEST 返回一个每列均与一个 STRUCT 字段相对应的表,然后使用 WHERE EXISTS 从表中过滤掉不匹配的行。

示例

以下示例返回数组列中包含 STRUCT 且其字段 b 的值大于 3 的行。

WITH sequences AS
  (SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
   UNION ALL SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
   UNION ALL SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT (7 AS a, 4 AS b)]
     AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE EXISTS (SELECT 1
              FROM UNNEST(some_numbers)
              WHERE b > 3);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

数组和聚合

借助 BigQuery,您可以使用 ARRAY_AGG() 将值聚合到数组中。

WITH fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM fruits;

+-----------------------+
| fruit_basket          |
+-----------------------+
| [apple, pear, banana] |
+-----------------------+

ARRAY_AGG() 返回的数组以任意顺序排列,这是因为该函数连接值的顺序无法保证。要对数组元素进行排序,请使用 ORDER BY。例如:

WITH fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit ORDER BY fruit) AS fruit_basket
FROM fruits;

+-----------------------+
| fruit_basket          |
+-----------------------+
| [apple, banana, pear] |
+-----------------------+

您还可以将聚合函数(如 SUM())应用于数组中的元素。例如,以下查询返回 sequences 表中每一行的数组元素的总和。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;

+--------------------+------+
| some_numbers       | sums |
+--------------------+------+
| [0, 1, 1, 2, 3, 5] | 12   |
| [2, 4, 8, 16, 32]  | 62   |
| [5, 10]            | 15   |
+--------------------+------+

BigQuery 还支持聚合函数 ARRAY_CONCAT_AGG(),该函数可以跨行连接数组列的元素。

WITH aggregate_example AS
  (SELECT [1,2] AS numbers
   UNION ALL SELECT [3,4] AS numbers
   UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM aggregate_example;

+--------------------------------------------------+
| count_to_six_agg                                 |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                               |
+--------------------------------------------------+

注意:由 ARRAY_CONCAT_AGG() 返回的数组是不确定的,因为该函数连接值的顺序无法保证。

将数组转换为字符串

利用 ARRAY_TO_STRING() 函数,您可以将 ARRAY<STRING> 转换为单个 STRING 值,或者将 ARRAY<BYTES> 转换为单个 BYTES 值,其中,结果值是有序连接的数组元素。

第二个参数是分隔符,相应函数将在输入之间插入该分隔符以生成输出;第二个参数必须与第一个参数的元素具有相同类型。

示例:

WITH greetings AS
  (SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM greetings;

+-------------+
| greetings   |
+-------------+
| Hello World |
+-------------+

第三个参数(可选)取代输入数组中的 NULL 值。

  • 如果您省略此参数,则该函数将忽略 NULL 数组元素。

  • 如果您提供空字符串,则该函数会为 NULL 数组元素插入分隔符。

示例:

SELECT
  ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
  ARRAY_TO_STRING(arr, ".", "") AS empty_string,
  ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);

+------------------+--------------+---------+
| non_empty_string | empty_string | omitted |
+------------------+--------------+---------+
| a.N.b.N.c.N      | a..b..c.     | a.b.c   |
+------------------+--------------+---------+

合并数组

在某些情况下,您可能需要将多个数组合并为单个数组。为此,您可以使用 ARRAY_CONCAT() 函数。

SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) as count_to_six;

+--------------------------------------------------+
| count_to_six                                     |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                               |
+--------------------------------------------------+

构建数组的数组

BigQuery 不支持直接构建数组的数组。正确做法是,您必须创建一个结构体数组,每个结构体包含一个 ARRAY 类型的字段。为了说明这一点,请考虑以下 points 表:

+----------+
| point    |
+----------+
| [1, 5]   |
| [2, 8]   |
| [3, 7]   |
| [4, 1]   |
| [5, 7]   |
+----------+

现在,假设您要创建一个由 points 表中的每个 point 组成的数组。要完成此操作,请封装从 STRUCT 中的每行返回的数组,如下所示。

WITH points AS
  (SELECT [1, 5] as point
   UNION ALL SELECT [2, 8] as point
   UNION ALL SELECT [3, 7] as point
   UNION ALL SELECT [4, 1] as point
   UNION ALL SELECT [5, 7] as point)
SELECT ARRAY(
  SELECT STRUCT(point)
  FROM points)
  AS coordinates;

+----------------------------------------------------+
| coordinates                                        |
+----------------------------------------------------+
| [{[1, 5]}, {[2, 8]}, {[3, 7]}, {[4, 1]}, {[5, 7]}] |
+----------------------------------------------------+
此页内容是否有用?请给出您的反馈和评价:

发送以下问题的反馈:

此网页
需要帮助?请访问我们的支持页面