Halaman ini menjelaskan pemetaan jenis data dari berbagai database sumber ke jenis data BigQuery yang sesuai. Pahami cara berbagai jenis data dikonversi saat memigrasikan data ke BigQuery, cara BigQuery merepresentasikan dokumen JSON biner MongoDB, dan cara membuat kueri data array PostgreSQL sebagai jenis data ARRAY
BigQuery.
Memetakan jenis data
Tabel berikut mencantumkan konversi jenis data dari database sumber yang didukung ke tujuan BigQuery.
Database sumber | Jenis data sumber | Jenis data BigQuery |
---|---|---|
MySQL | BIGINT(size) |
INT64 |
MySQL | BIGINT (unsigned) |
DECIMAL |
MySQL | BINARY(size) |
STRING (hex encoded) |
MySQL | BIT(size) |
INT64 |
MySQL | BLOB(size) |
STRING (hex encoded) |
MySQL | BOOL |
INT64 |
MySQL | CHAR(size) |
STRING |
MySQL | DATE |
DATE |
MySQL | DATETIME(fsp) |
DATETIME |
MySQL | DECIMAL(precision, scale) |
Jika nilai presisi <=38, dan nilai skala <=9, maka NUMERIC . Atau BIGNUMERIC |
MySQL | DOUBLE(size, d) |
FLOAT64 |
MySQL | ENUM(val1, val2, val3, ...) |
STRING |
MySQL | FLOAT(precision) |
FLOAT64 |
MySQL | FLOAT(size, d) |
FLOAT64 |
MySQL | INTEGER(size) |
INT64 |
MySQL | INTEGER (unsigned) |
INT64 |
MySQL |
|
JSON
Jenis data |
MySQL | LONGBLOB |
STRING (hex encoded) |
MySQL | LONGTEXT |
STRING |
MySQL | MEDIUMBLOB |
STRING (hex encoded) |
MySQL | MEDIUMINT(size) |
INT64 |
MySQL | MEDIUMTEXT |
STRING |
MySQL | SET(val1, val2, val3, ...) |
STRING |
MySQL | SMALLINT(size) |
INT64 |
MySQL | TEXT(size) |
STRING |
MySQL | TIME(fsp) |
INTERVAL |
MySQL | TIMESTAMP(fsp) |
TIMESTAMP |
MySQL | TINYBLOB |
STRING (hex encoded) |
MySQL | TINYINT(size) |
INT64 |
MySQL | TINYTEXT |
STRING |
MySQL | VARBINARY(size) |
STRING (hex encoded) |
MySQL | VARCHAR |
STRING |
MySQL | YEAR |
INT64 |
Oracle | ANYDATA |
UNSUPPORTED |
Oracle | BFILE |
STRING |
Oracle | BINARY DOUBLE |
FLOAT64 |
Oracle | BINARY FLOAT |
FLOAT64 |
Oracle | BLOB |
BYTES |
Oracle | CHAR |
STRING |
Oracle | CLOB |
STRING |
Oracle | DATE |
DATETIME
|
Oracle | DOUBLE PRECISION |
FLOAT64 |
Oracle | FLOAT(p) |
FLOAT64 |
Oracle | INTERVAL DAY TO SECOND |
UNSUPPORTED |
Oracle | INTERVAL YEAR TO MONTH |
UNSUPPORTED |
Oracle | LONG /LONG RAW |
STRING |
Oracle | NCHAR |
STRING |
Oracle | NCLOB |
STRING |
Oracle | NUMBER(precision, scale>0) |
Jika 0<p=<78, petakan ke jenis desimal berparameter. Jika p>=79, petakan ke STRING |
Oracle | NVARCHAR2 |
STRING |
Oracle | RAW |
STRING |
Oracle | ROWID |
STRING |
Oracle | SDO_GEOMETRY |
UNSUPPORTED |
Oracle | SMALLINT |
INT64 |
Oracle | TIMESTAMP |
TIMESTAMP
|
Oracle | TIMESTAMP WITH TIME ZONE |
TIMESTAMP
|
Oracle | UDT (user-defined type) |
UNSUPPORTED |
Oracle | UROWID |
STRING |
Oracle | VARCHAR |
STRING |
Oracle | VARCHAR2 |
STRING |
Oracle | XMLTYPE |
UNSUPPORTED |
PostgreSQL | ARRAY |
JSON
|
PostgreSQL | BIGINT |
INT64 |
PostgreSQL | BIT |
BYTES |
PostgreSQL | BIT_VARYING |
BYTES |
PostgreSQL | BOOLEAN |
BOOLEAN |
PostgreSQL | BOX |
UNSUPPORTED |
PostgreSQL | BYTEA |
BYTES |
PostgreSQL | CHARACTER |
STRING |
PostgreSQL | CHARACTER_VARYING |
STRING |
PostgreSQL | CIDR |
STRING |
PostgreSQL | CIRCLE |
UNSUPPORTED |
PostgreSQL | DATE |
DATE |
PostgreSQL | DOUBLE_PRECISION |
FLOAT64 |
PostgreSQL | ENUM |
STRING |
PostgreSQL | INET |
STRING |
PostgreSQL | INTEGER |
INT64 |
PostgreSQL | INTERVAL |
INTERVAL |
PostgreSQL | JSON |
JSON |
PostgreSQL | JSONB |
JSON |
PostgreSQL | LINE |
UNSUPPORTED |
PostgreSQL | LSEG |
UNSUPPORTED |
PostgreSQL | MACADDR |
STRING |
PostgreSQL | MONEY |
FLOAT64 |
PostgreSQL | NUMERIC |
Jika presisi = -1 , maka STRING (jenis NUMERIC BigQuery memerlukan presisi tetap). Jika tidak, BIGNUMERIC /NUMERIC . Untuk mengetahui informasi selengkapnya, lihat bagian Angka presisi arbitrer dalam dokumentasi PostgreSQL. |
PostgreSQL | OID |
INT64 |
PostgreSQL | PATH |
UNSUPPORTED |
PostgreSQL | POINT |
UNSUPPORTED |
PostgreSQL | POLYGON |
UNSUPPORTED |
PostgreSQL | REAL |
FLOAT64 |
PostgreSQL | SMALLINT |
INT64 |
PostgreSQL | SMALLSERIAL |
INT64 |
PostgreSQL | SERIAL |
INT64 |
PostgreSQL | TEXT |
STRING |
PostgreSQL | TIME |
TIME |
PostgreSQL | TIMESTAMP |
TIMESTAMP |
PostgreSQL | TIMESTAMP_WITH_TIMEZONE |
TIMESTAMP |
PostgreSQL | TIME_WITH_TIMEZONE |
TIME |
PostgreSQL | TSQUERY |
STRING |
PostgreSQL | TSVECTOR |
STRING |
PostgreSQL | TXID_SNAPSHOT |
STRING |
PostgreSQL | UUID |
STRING |
PostgreSQL | XML |
STRING |
SQL Server | BIGINT |
INT64 |
SQL Server | BINARY |
BYTES |
SQL Server | BIT |
BOOL |
SQL Server | CHAR |
STRING |
SQL Server | DATE |
DATE |
SQL Server | DATETIME2 |
DATETIME |
SQL Server | DATETIME |
DATETIME |
SQL Server | DATETIMEOFFSET |
TIMESTAMP |
SQL Server | DECIMAL |
BIGNUMERIC |
SQL Server | FLOAT |
FLOAT64 |
SQL Server | IMAGE |
BYTES |
SQL Server | INT |
INT64 |
SQL Server | MONEY |
BIGNUMERIC |
SQL Server | NCHAR |
STRING |
SQL Server | NTEXT |
STRING |
SQL Server | NUMERIC |
BIGNUMERIC |
SQL Server | NVARCHAR |
STRING |
SQL Server | NVARCHAR(MAX) |
STRING |
SQL Server | REAL |
FLOAT64 |
SQL Server | SMALLDATETIME |
DATETIME |
SQL Server | SMALLINT |
INT64 |
SQL Server | SMALLMONEY |
NUMERIC |
SQL Server | TEXT |
STRING |
SQL Server | TIME |
TIME |
SQL Server | TIMESTAMP /ROWVERSION |
BYTES |
SQL Server | TINYINT |
INT64 |
SQL Server | UNIQUEIDENTIFIER |
STRING |
SQL Server | VARBINARY |
BYTES |
SQL Server | VARBINARY(MAX) |
BYTES |
SQL Server | VARCHAR |
STRING |
SQL Server | VARCHAR(MAX) |
STRING |
SQL Server | XML |
STRING |
Salesforce | BOOLEAN |
BOOLEAN |
Salesforce | BYTE |
BYTES |
Salesforce | DATE |
DATE |
Salesforce | DATETIME |
DATETIME |
Salesforce | DOUBLE |
BIGNUMERIC |
Salesforce | INT |
INT64 |
Salesforce | STRING |
STRING |
Salesforce | TIME |
TIME |
Salesforce | ANYTYPE (dapat berupa STRING , DATE , NUMBER , atau BOOLEAN ) |
STRING |
Salesforce | COMBOBOX |
STRING |
Salesforce | CURRENCY |
FLOAT64
Panjang maksimum yang diizinkan adalah 18 digit. |
Salesforce | DATACATEGORYGROUPREFERENCE |
STRING |
Salesforce | EMAIL |
STRING |
Salesforce | ENCRYPTEDSTRING |
STRING |
Salesforce | ID |
STRING |
Salesforce | JUNCTIONIDLIST |
STRING |
Salesforce | MASTERRECORD |
STRING |
Salesforce | MULTIPICKLIST |
STRING |
Salesforce | PERCENT |
FLOAT64
Panjang maksimum yang diizinkan adalah 18 digit. |
Salesforce | PHONE |
STRING |
Salesforce | PICKLIST |
STRING |
Salesforce | REFERENCE |
STRING |
Salesforce | TEXTAREA |
STRING
Panjang maksimum yang diizinkan adalah 255 karakter. |
Salesforce | URL |
STRING |
Jenis data MongoDB
Dokumen JSON biner (BSON) MongoDB ditulis ke BigQuery dalam format mode ketat MongoDB Extended JSON (v1). Tabel ini menunjukkan cara jenis data direpresentasikan di BigQuery, beserta contoh nilainya.
Jenis data sumber | Nilai contoh | Nilai jenis JSON BigQuery |
---|---|---|
DOUBLE |
3.1415926535
|
3.1415926535 |
STRING | "Hello, MongoDB!" | "Hello, MongoDB!" |
ARRAY |
| ["item1",123,true,{"subItem":"object in array"}] |
BINARY DATA |
new BinData(0, "SGVsbG8gQmluYXJ5IERhdGE=") |
{"$binary":"SGVsbG8gQmluYXJ5IERhdGE=","$type":"00"} |
BOOLEAN | true | true |
DATE |
2024-12-25T10:30:00.000+00:00
|
{"$date": 1735122600000}
|
NULL | null | null |
REGEX | /^mongo(db)?$/i | {"$options":"i","$regex":"^mongo(db)?$"} |
JAVASCRIPT | function() {return this.stringField.length;} | {"$code":"function() {\n return this.stringField.length;\n }"} |
DECIMAL128 | NumberDecimal("1234567890.1234567890") | {"$numberDecimal":"1234567890.1234567890"} |
OBJECTID | ObjectId('673c5d8dbfe2e51808cc2c3d') | {"$oid": "673c5d8dbfe2e51808cc2c3d"} |
LONG | 3567587327 | {"$numberLong": "3567587327"} |
INT32 | 42 | 42 |
INT64 | 1864712049423024127 | {"$numberLong": "1864712049423024127"} |
TIMESTAMP | new Timestamp(1747888877, 1) | {"$timestamp":{"i":1,"t":1747888877}} |
Mengkueri array PostgreSQL sebagai jenis data array BigQuery
Jika Anda lebih suka membuat kueri array PostgreSQL sebagai jenis data ARRAY
BigQuery,
Anda dapat mengonversi nilai JSON
menjadi array BigQuery menggunakan fungsi JSON_VALUE_ARRAY
BigQuery:
SELECT ARRAY(SELECT CAST(element AS TYPE) FROM UNNEST(JSON_VALUE_ARRAY(BQ_COLUMN_NAME,'$')) AS element)AS array_col
Ganti kode berikut:
TYPE: jenis BigQuery yang cocok dengan jenis elemen dalam array sumber PostgreSQL. Misalnya, jika jenis sumber adalah array nilai
BIGINT
, ganti TYPE denganINT64
.Untuk mengetahui informasi selengkapnya tentang cara memetakan jenis data, lihat Memetakan jenis data.
BQ_COLUMN_NAME: nama kolom yang relevan dalam tabel BigQuery.
Ada 2 pengecualian untuk cara Anda mengonversi nilai:
Untuk array nilai
BIT
,BIT_VARYING
, atauBYTEA
di kolom sumber, jalankan kueri berikut:SELECT ARRAY(SELECT FROM_BASE64(element) FROM UNNEST(JSON_VALUE_ARRAY(BQ_COLUMN_NAME,'$')) AS element)
AS array_of_bytes Untuk array nilai
JSON
atauJSONB
di kolom sumber, gunakan fungsiJSON_QUERY_ARRAY
:SELECT ARRAY(SELECT element FROM UNNEST(JSON_QUERY_ARRAY(BQ_COLUMN_NAME,'$')) AS element)
AS array_of_jsons