在標準 SQL 中使用陣列

在 BigQuery 中,陣列是一種排序清單,由資料類型相同的 0 個以上值組成。您可以建立簡易型資料類型 (例如 INT64) 的陣列,以及複雜型資料類型 (例如 STRUCT) 的陣列。ARRAY 資料類型目前屬於例外,系統不支援陣列的陣列。

您可以透過 BigQuery 建立陣列常值,運用 ARRAY 函式從子查詢建立陣列,然後使用 ARRAY_AGG 函式將值匯總至陣列。

您可以採用 ARRAY_CONCAT() 等函式來合併陣列,然後以 ARRAY_TO_STRING() 將陣列轉換為字串。

建立陣列

使用陣列常值

您可以使用方括號 ([]) 在 BigQuery 中建立陣列常值。陣列中的每項元素都是以半形逗號分隔。

SELECT [1, 2, 3] as numbers;

SELECT ["apple", "pear", "orange"] as fruit;

SELECT [true, false, true] as booleans;

您也可以從任何具備相容類型的運算式建立陣列。例如:

SELECT [a, b, c]
FROM
  (SELECT 5 AS a,
          37 AS b,
          406 AS c);

SELECT [a, b, c]
FROM
  (SELECT CAST(5 AS INT64) AS a,
          CAST(37 AS FLOAT64) AS b,
          406 AS c);

請注意,第二個範例含有三個運算式:一個傳回 INT64,一個傳回 FLOAT64,而一個宣告文字。這個運算式之所以能作用,是因為這三個運算式均共用 FLOAT64 作為超級類型。

如要宣告陣列的特定資料類型,請使用角括號 (<>)。例如:

SELECT ARRAY<FLOAT64>[1, 2, 3] as floats;

大部分資料類型的陣列 (例如 INT64STRING) 不需要先宣告。

SELECT [1, 2, 3] as numbers;

您可以使用 ARRAY<type>[] 寫入特定類型的空陣列。也可以使用 [] 寫入未設定類型的空陣列,在此情況下,BigQuery 會嘗試從上下文推斷陣列類型。如果 BigQuery 無法推斷出類型,則會使用預設類型 ARRAY<INT64>

使用產生的值

您也可以使用系統產生的值來建立 ARRAY

產生整數陣列

GENERATE_ARRAY 會從開始值、結束值和步驟值產生值陣列。舉例來說,以下查詢產生的陣列含有 11 至 33 (含) 之間的所有奇數整數。

SELECT GENERATE_ARRAY(11, 33, 2) AS odds;

+--------------------------------------------------+
| odds                                             |
+--------------------------------------------------+
| [11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33] |
+--------------------------------------------------+

如果您提供的是負步驟值,也可以產生採用遞減順序的值陣列:

SELECT GENERATE_ARRAY(21, 14, -1) AS countdown;

+----------------------------------+
| countdown                        |
+----------------------------------+
| [21, 20, 19, 18, 17, 16, 15, 14] |
+----------------------------------+

產生日期陣列

GENERATE_DATE_ARRAY 會從開始和結束的 DATE 與步驟 INTERVAL 產生 DATE 陣列。

您可以使用 GENERATE_DATE_ARRAY 產生一組 DATE 值。舉例來說,這項查詢會傳回目前的 DATE 和後續的 DATE,間隔為 1 WEEK ,一直到其後的 DATE (含) 為止:

SELECT
  GENERATE_DATE_ARRAY('2017-11-21', '2017-12-31', INTERVAL 1 WEEK)
    AS date_array;
+--------------------------------------------------------------------------+
| date_array                                                               |
+--------------------------------------------------------------------------+
| [2017-11-21, 2017-11-28, 2017-12-05, 2017-12-12, 2017-12-19, 2017-12-26] |
+--------------------------------------------------------------------------+

存取陣列元素

請查看下表 sequences

+---------------------+
| some_numbers        |
+---------------------+
| [0, 1, 1, 2, 3, 5]  |
| [2, 4, 8, 16, 32]   |
| [5, 10]             |
+---------------------+

這個資料表含有 ARRAY 資料類型的資料欄 some_numbers。如要從陣列中存取這個資料欄的元素,您必須指定要使用的索引類型:OFFSET 代表索引從 0 開始,ORDINAL 則代表索引從 1 開始。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       some_numbers[OFFSET(1)] AS offset_1,
       some_numbers[ORDINAL(1)] AS ordinal_1
FROM sequences;

+--------------------+----------+-----------+
| some_numbers       | offset_1 | ordinal_1 |
+--------------------+----------+-----------+
| [0, 1, 1, 2, 3, 5] | 1        | 0         |
| [2, 4, 8, 16, 32]  | 4        | 2         |
| [5, 10]            | 10       | 5         |
+--------------------+----------+-----------+

瞭解長度

ARRAY_LENGTH() 函式會傳回陣列長度。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
       ARRAY_LENGTH(some_numbers) AS len
FROM sequences;

+--------------------+--------+
| some_numbers       | len    |
+--------------------+--------+
| [0, 1, 1, 2, 3, 5] | 6      |
| [2, 4, 8, 16, 32]  | 5      |
| [5, 10]            | 2      |
+--------------------+--------+

整併陣列

如要將 ARRAY 轉換為一組資料列 (也稱為「整併」),請使用 UNNEST 運算子。UNNEST 會使用 ARRAY,然後傳回代表 ARRAY 中每個元素的個別單列資料表。.

因為 UNNEST 會毀損 ARRAY 元素的順序,您也許會想要還原資料表的順序。為此,請使用選用的 WITH OFFSET 子句以傳回其他資料欄和每個陣列元素的偏移,然後使用 ORDER BY 子句以按照偏移排序列。

範例

SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
  AS element
WITH OFFSET AS offset
ORDER BY offset;

+----------+--------+
| element  | offset |
+----------+--------+
| foo      | 0      |
| bar      | 1      |
| baz      | 2      |
| qux      | 3      |
| corge    | 4      |
| garply   | 5      |
| waldo    | 6      |
| fred     | 7      |
+----------+--------+

如要整併 ARRAY 的整個資料欄,同時保留每個資料列中其他資料欄的值,請使用 CROSS JOIN,將含有 ARRAY 資料欄的資料表連結至這個 ARRAY 資料欄的 UNNEST 輸出內容。

這是一個關聯交叉聯結:UNNEST 運算子會從先前出現在 FROM 子句中來源資料表的每一個資料列參照 ARRAY 的資料欄。對於來源資料表中的每一列 NUNNEST 會將 ARRAY 從列 N 整併成一組含有 ARRAY 元素的列,然後 CROSS JOIN 會將新的這組列與來源資料表中的單列 N 聯結在一起。

範例

下例範例使用 UNNEST,讓陣列資料欄中的每項元素都傳回一個資料列。因為使用 CROSS JOIN,所以 id 資料欄所含的 id 值代表 sequences 中每一個數字所在的列。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences
CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers;

+------+-------------------+
| id   | flattened_numbers |
+------+-------------------+
|    1 |                 0 |
|    1 |                 1 |
|    1 |                 1 |
|    1 |                 2 |
|    1 |                 3 |
|    1 |                 5 |
|    2 |                 2 |
|    2 |                 4 |
|    2 |                 8 |
|    2 |                16 |
|    2 |                32 |
|    3 |                 5 |
|    3 |                10 |
+------+-------------------+

查詢巢狀陣列

如果資料表含有 ARRAYSTRUCT,您可以整併 STRUCT 來查詢 ARRAY 欄位,也可以整併 STRUCT 值的 ARRAY 欄位類型。

查詢 ARRAY 中的 STRUCT 元素

以下範例使用 UNNESTCROSS JOIN 來整併 STRUCTARRAY

WITH races AS (
  SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
     STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
     STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
     STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
     STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
     STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
     STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
     STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
       AS participants)
SELECT
  race,
  participant
FROM races r
CROSS JOIN UNNEST(r.participants) as participant;

+------+---------------------------------------+
| race | participant                           |
+------+---------------------------------------+
| 800M | {Rudisha, [23.4, 26.3, 26.4, 26.1]}   |
| 800M | {Makhloufi, [24.5, 25.4, 26.6, 26.1]} |
| 800M | {Murphy, [23.9, 26, 27, 26]}          |
| 800M | {Bosse, [23.6, 26.2, 26.5, 27.1]}     |
| 800M | {Rotich, [24.7, 25.6, 26.9, 26.4]}    |
| 800M | {Lewandowski, [25, 25.7, 26.3, 27.2]} |
| 800M | {Kipketer, [23.2, 26.1, 27.3, 29.4]}  |
| 800M | {Berian, [23.7, 26.1, 27, 29.3]}      |
+------+---------------------------------------+

您可以在重複的欄位中發現特定資訊。舉例來說,以下查詢會傳回 800 公尺競賽中速度最快的跑者。

此範例不需要整併陣列,但卻代表從重複欄位中取得資訊的一種常見方法。

範例

WITH races AS (
  SELECT "800M" AS race,
    [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
     STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
     STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
     STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
     STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
     STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
     STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
     STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
       AS participants)
SELECT
  race,
  (SELECT name
   FROM UNNEST(participants)
   ORDER BY (
     SELECT SUM(duration)
     FROM UNNEST(splits) AS duration) ASC
   LIMIT 1) AS fastest_racer
FROM races;

+------+---------------+
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha       |
+------+---------------+

查詢 STRUCT 中的 ARRAY 類型欄位

您也可以從巢狀的重複欄位取得資訊。舉例來說,下列陳述式會傳回 800 公尺競賽中圈速最快的跑者。

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
    AS participants)
SELECT
race,
(SELECT name
 FROM UNNEST(participants),
   UNNEST(splits) AS duration
 ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

請注意,之前的查詢使用逗號運算子 (,) 來執行隱含的 CROSS JOIN。作用與使用明確 CROSS JOIN 的下列範例相等。

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
    AS participants)
SELECT
race,
(SELECT name
 FROM UNNEST(participants)
 CROSS JOIN UNNEST(splits) AS duration
 ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;

+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer                |
+------+-------------------------+

請注意,整併陣列與 CROSS JOIN 會排除含有空陣列或 NULL 陣列的列。如果要併入這些列,請使用 LEFT JOIN

WITH races AS (
 SELECT "800M" AS race,
   [STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
    STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
    STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
    STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
    STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
    STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
    STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
    STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits),
    STRUCT("Nathan" as name, ARRAY<FLOAT64>[] as splits),
    STRUCT("David" as name, NULL as splits)]
    AS participants)
SELECT
  name, sum(duration) AS finish_time
FROM races, races.participants LEFT JOIN participants.splits duration
GROUP BY name;

+-------------+--------------------+
| name        | finish_time        |
+-------------+--------------------+
| Murphy      | 102.9              |
| Rudisha     | 102.19999999999999 |
| David       | NULL               |
| Rotich      | 103.6              |
| Makhloufi   | 102.6              |
| Berian      | 106.1              |
| Bosse       | 103.4              |
| Kipketer    | 106                |
| Nathan      | NULL               |
| Lewandowski | 104.2              |
+-------------+--------------------+

從子查詢建立陣列

處理陣列的常見工作是將子查詢結果轉成陣列。您可以在 BigQuery 中使用 ARRAY() 函式完成這項工作。

舉例來說,請細想對 sequences 資料表執行的下列作業:

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
  UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
  UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x) AS doubled
FROM sequences;

+--------------------+---------------------+
| some_numbers       | doubled             |
+--------------------+---------------------+
| [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
| [2, 4, 8, 16, 32]  | [4, 8, 16, 32, 64]  |
| [5, 10]            | [10, 20]            |
+--------------------+---------------------+

此範例從名稱為 sequences 的資料表開始。這個資料表含有類型 ARRAY<INT64> 的資料欄 some_numbers

查詢本身含有子查詢。這項子查詢會選取 some_numbers 資料欄中的每一個資料列,然後使用 UNNEST 以一組資料列的形式傳回陣列。接下來,將每個值乘以 2,然後使用 ARRAY() 運算子將這些列重新合併到陣列。

篩選陣列

以下範例在 ARRAY() 運算子的子查詢中使用 WHERE 子句來篩選傳回的列。

注意事項:在下例中,結果列不會排序。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
  ARRAY(SELECT x * 2
        FROM UNNEST(some_numbers) AS x
        WHERE x < 5) AS doubled_less_than_five
FROM sequences;

+------------------------+
| doubled_less_than_five |
+------------------------+
| [0, 2, 2, 4, 6]        |
| [4, 8]                 |
| []                     |
+------------------------+

請注意,第三列含有空陣列,因為對應原始列 ([5, 10]) 中的元素不符合 x < 5 的篩選需求。

您也可以使用 SELECT DISTINCT,只傳回陣列中非重複的元素,藉以篩選陣列。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
             FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM sequences;

+-----------------+
| unique_numbers  |
+-----------------+
| [0, 1, 2, 3, 5] |
+-----------------+

您也可以使用 IN 關鍵字來篩選陣列的資料列。這個關鍵字藉由判定特定值是否符合陣列中的元素,篩選含有陣列的列。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
   ARRAY(SELECT x
         FROM UNNEST(some_numbers) AS x
         WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM sequences;

+--------------------+
| contains_two       |
+--------------------+
| [0, 1, 1, 2, 3, 5] |
| [2, 4, 8, 16, 32]  |
| []                 |
+--------------------+

請再次注意,第三列含有空陣列,因為對應原始列 ([5, 10]) 中的陣列不含 2

掃描陣列

如要檢查陣列是否含有特定值,請使用 IN 運算子與 UNNEST。如要檢查陣列是否含有符合條件的值,請使用 EXISTS 函式與 UNNEST

掃描特定值

如要掃描陣列是否含有特定值,請使用 IN 運算子與 UNNEST

範例

以下範例顯示陣列含有數字 2 時會傳回 true

SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;

+----------------+
| contains_value |
+----------------+
| true           |
+----------------+

如要傳回陣列資料欄含有特定值的資料表列,請使用 WHERE 子句來篩選 IN UNNEST 的結果。

範例

以下範例傳回的 id 值代表陣列資料欄含有值 2 的列。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE 2 IN UNNEST(sequences.some_numbers)
ORDER BY matching_rows;

+---------------+
| matching_rows |
+---------------+
| 1             |
| 2             |
+---------------+

掃描符合條件的值

如要掃描陣列尋找符合條件的值,請使用 UNNEST 傳回陣列中的元素資料表,使用 WHERE 來篩選子查詢中的結果資料表,並使用 EXISTS 來檢查篩選的資料表是否含有任何列。

範例

以下範例傳回的 id 值代表陣列資料欄所含值大於 5 的列。

WITH sequences AS
  (SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows FROM sequences
WHERE EXISTS (SELECT *
              FROM UNNEST(some_numbers) AS x
              WHERE x > 5);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

掃描符合條件的 STRUCT 欄位值

如要掃描 STRUCT 的陣列尋找值符合條件的欄位,請使用 UNNEST 傳回每一個 STRUCT 欄位所在資料欄的資料表,然後使用 WHERE EXISTS 篩選資料表中不符條件的列。

範例

以下範例傳回陣列資料欄所含的 STRUCT 中欄位 b 的值大於 3 的列。

WITH sequences AS
  (SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
   UNION ALL SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
   UNION ALL SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT (7 AS a, 4 AS b)]
     AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE EXISTS (SELECT 1
              FROM UNNEST(some_numbers)
              WHERE b > 3);

+---------------+
| matching_rows |
+---------------+
| 2             |
| 3             |
+---------------+

陣列與匯總

使用 BigQuery 時,您可以使用 ARRAY_AGG() 將值匯總到陣列。

WITH fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM fruits;

+-----------------------+
| fruit_basket          |
+-----------------------+
| [apple, pear, banana] |
+-----------------------+

ARRAY_AGG() 傳回的陣列採任意順序,這是因為系統無法保證函式串連值所採用的順序。如要排序陣列元素,請使用 ORDER BY。例如:

WITH fruits AS
  (SELECT "apple" AS fruit
   UNION ALL SELECT "pear" AS fruit
   UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit ORDER BY fruit) AS fruit_basket
FROM fruits;

+-----------------------+
| fruit_basket          |
+-----------------------+
| [apple, banana, pear] |
+-----------------------+

您也可以將匯總函式 (如 SUM()) 套用於陣列中的元素。舉例來說,下列查詢會傳回 sequences 資料表中每一列的陣列元素總和。

WITH sequences AS
  (SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
   UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;

+--------------------+------+
| some_numbers       | sums |
+--------------------+------+
| [0, 1, 1, 2, 3, 5] | 12   |
| [2, 4, 8, 16, 32]  | 62   |
| [5, 10]            | 15   |
+--------------------+------+

BigQuery 也支援匯總函式 ARRAY_CONCAT_AGG(),它會串連不同列中陣列資料欄的元素。

WITH aggregate_example AS
  (SELECT [1,2] AS numbers
   UNION ALL SELECT [3,4] AS numbers
   UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM aggregate_example;

+--------------------------------------------------+
| count_to_six_agg                                 |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                               |
+--------------------------------------------------+

注意事項:ARRAY_CONCAT_AGG() 傳回的陣列具備不確定性,因為系統無法保證函式串連值採用的順序為何。

將陣列轉換成字串

ARRAY_TO_STRING() 函式可讓您將 ARRAY<STRING> 轉換成一個 STRING 值或將 ARRAY<BYTES> 轉換成一個 BYTES 值,其中的結果值是排序串連的陣列元素。

第二個引數是函式會在輸入之間插入以產生輸出的分隔符;第二個引數的類型必須與第一個引數的元素相同。

範例:

WITH greetings AS
  (SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM greetings;

+-------------+
| greetings   |
+-------------+
| Hello World |
+-------------+

選用的第三個引數會取代輸入陣列中的 NULL 值。

  • 如果省略這個引數,則函式會忽略 NULL 陣列元素。

  • 如果您提供空字串,則函式會插入 NULL 陣列元素的分隔符。

範例:

SELECT
  ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
  ARRAY_TO_STRING(arr, ".", "") AS empty_string,
  ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);

+------------------+--------------+---------+
| non_empty_string | empty_string | omitted |
+------------------+--------------+---------+
| a.N.b.N.c.N      | a..b..c.     | a.b.c   |
+------------------+--------------+---------+

合併陣列

在某些情況下,您可能想要將多個陣列合併成一個陣列。您可以使用 ARRAY_CONCAT() 函式來完成此工作。

SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) as count_to_six;

+--------------------------------------------------+
| count_to_six                                     |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                               |
+--------------------------------------------------+

建立陣列的陣列

BigQuery 不支援直接建立陣列的陣列。您必須先建立 struct 的陣列,每一個 struct 含有類型 ARRAY 的欄位。為了說明此工作,請細想下列 points 資料表。

+----------+
| point    |
+----------+
| [1, 5]   |
| [2, 8]   |
| [3, 7]   |
| [4, 1]   |
| [5, 7]   |
+----------+

現在,我們假設您要建立一個陣列,該陣列由 points 資料表中的每一個 point 組成。如要完成這項工作,請將每一列傳回的陣列加入 STRUCT,如下所示。

WITH points AS
  (SELECT [1, 5] as point
   UNION ALL SELECT [2, 8] as point
   UNION ALL SELECT [3, 7] as point
   UNION ALL SELECT [4, 1] as point
   UNION ALL SELECT [5, 7] as point)
SELECT ARRAY(
  SELECT STRUCT(point)
  FROM points)
  AS coordinates;

+----------------------------------------------------+
| coordinates                                        |
+----------------------------------------------------+
| [{[1, 5]}, {[2, 8]}, {[3, 7]}, {[4, 1]}, {[5, 7]}] |
+----------------------------------------------------+
本頁內容對您是否有任何幫助?請提供意見:

傳送您對下列選項的寶貴意見...

這個網頁