Unified types are the data types that appear in the Avro or JSON events. They are a Datastream-specific, unified representation of a data type across multiple data sources and destinations, such as a source Oracle database, MySQL database, or PostgreSQL database, and a BigQuery or Cloud Storage destination.
The unified types are the superset of all type representations across all supported source types, which represent the original source type in a generic but lossless way.
The following tables list:
- The unified types associated with Datastream
- The mappings between the data types for an Oracle database, a MySQL database, or a PostgreSQL database, and the Datastream unified types
Refer to the BigQuery destination documentation for information about the mappings between the data types for the different sources and BigQuery.
Datastream unified types
Type name | Info | Avro definition | JSON definition |
---|---|---|---|
BOOLEAN |
boolean | boolean | boolean |
BYTES |
A sequence of unsigned bytes | bytes | string |
DATE |
Days since the epoch | A date logical type | string [ISO-8601] |
DATETIME |
The date (in days since the epoch) and time (in microseconds since midnight) | A custom type { "type": "record", "name": "datetime", "fields": [ {"name": "date", "type": "int", "logicalType": "date"}, {"name": "time", "type": "long", "logicalType": "time-micros"} ] } |
string [ISO-8601] |
DECIMAL (p,s) |
An arbitrary-precision signed decimal number | A decimal logical type | number |
DOUBLE |
64-bit floating point numbers | double | number |
FLOAT |
32-bit floating point numbers | float | number |
INTEGER |
A 32-bit integer | int | number |
INTERVAL |
Duration between two events (in months, hours, and microseconds) | A custom type { "type": "record", "name": "interval", "fields": [ {"name": "months", "type": "int"} {"name": "hours", "type": "int"}, {"name": "micros", "type": "long"} ] } |
string [ISO-8601] |
JSON |
A JSON object | A custom logical type { "type": "string", "logicalType": "json" } |
nested JSON |
LONG |
A 64-bit integer | long | number |
NUMBER |
A numeric data type | A custom logical type { "type": "string", "logicalType": "number" } |
string |
STRING |
An unlimited string length | string | string |
TIME |
How many microseconds elapsed since midnight, regardless of timezone. | A time-micros logical type | string [ISO-8601] |
TIME_INTERVAL |
How many microseconds elapsed between two events | A custom logical type { "type": "long", "logicalType": "time-interval-micros" } |
long |
TIMESTAMP |
How many microseconds elapsed since the epoch, regardless of timezone | A timestamp logical type | string [ISO-8601] |
TIMESTAMP WITH TIME ZONE |
How many microseconds elapsed since the epoch with a specific timezone offset in milliseconds | A custom type { "type": "record", "name": "timestampTz", "fields": [ {"name": "timestamp", "type": "long" "logicalType": "timestamp-micros"}, {"name": "offset", "type": "int" "logicalType": "time-millis"} ] } |
string [ISO-8601] |
TIME WITH TIME ZONE |
How many microseconds elapsed since midnight with a specific timezone offset | A custom type { "type": "record", "name": "timeTz", "fields": [ {"name": "time", "type": "long" "logicalType": "time-micros"}, {"name": "offset", "type": "int", "logicalType": "time-millis"} ] } |
string [ISO-8601] |
UNION |
A varying data type | Union | array |
UNSUPPORTED |
An unsupported data type | A custom logical type { "type": "null", "logicalType": "unsupported" } |
null |
VARCHAR |
A string with a maximum length of n characters | A custom logical type { "type": "string", "logicalType": "varchar" "length": N } |
string |
Map Oracle data types to Datastream unified types
Oracle data type | Datastream unified type |
---|---|
ANYDATA |
UNSUPPORTED |
BFILE |
STRING |
BINARY DOUBLE |
DOUBLE |
BINARY FLOAT |
FLOAT |
BLOB |
BYTES |
CHAR |
VARCHAR |
CLOB |
STRING |
DATE |
DATETIME
|
DOUBLE PRECISION |
DOUBLE |
FLOAT(p) |
DOUBLE |
INTERVAL DAY TO SECOND |
UNSUPPORTED |
INTERVAL YEAR TO MONTH |
UNSUPPORTED |
LONG/LONG RAW |
UNSUPPORTED |
NCHAR |
STRING |
NCLOB |
STRING |
NUMBER |
NUMBER |
NUMBER(p,s<=0) |
If p<=18, then |
NUMBER(p,s>0) |
If p= |
NVARCHAR2 |
STRING |
RAW |
STRING |
ROWID |
STRING |
SDO_GEOMETRY |
UNSUPPORTED |
SMALLINT |
INTEGER |
TIMESTAMP |
TIMESTAMP
|
TIMESTAMP WITH TIME ZONE |
TIMESTAMP WITH TIME ZONE
|
UDT (user-defined type) |
UNSUPPORTED |
UROWID |
UNSUPPORTED |
VARCHAR |
VARCHAR |
VARCHAR2 |
VARCHAR |
XMLTYPE |
UNSUPPORTED |
Map MySQL data types to Datastream unified types
MySQL data type | Datastream unified type |
---|---|
BIGINT(size) SIGNED |
LONG |
BIGINT(size) UNSIGNED |
If the destination is BigQuery, then DECIMAL , if Cloud Storage, then NUMBER |
BINARY(size) |
STRING (hex encoded) |
BIT(size) |
LONG |
BLOB(size) |
STRING (hex encoded) |
BOOL |
INTEGER |
CHAR(size) |
STRING |
DATE |
If the destination is BigQuery, then DATE , if Cloud Storage, then TIMESTAMP |
DATETIME(fsp) |
If the destination is BigQuery, then DATETIME , if Cloud Storage, then TIMESTAMP |
DECIMAL(size, d) |
DECIMAL(size, d) |
DOUBLE(size, d) |
DOUBLE |
ENUM(val1, val2, val3, ...) |
STRING |
FLOAT(p) |
FLOAT |
FLOAT(size, d) |
FLOAT |
GEOMETRY |
UNSUPPORTED |
INTEGER(size) SIGNED |
INTEGER |
INTEGER(size) UNSIGNED |
LONG |
JSON |
If the destination is BigQuery, then JSON , if Cloud Storage, then STRING
|
LONGBLOB |
STRING (hex encoded) |
LONGTEXT |
STRING |
MEDIUMBLOB |
STRING (hex encoded) |
MEDIUMINT(size) |
INTEGER |
MEDIUMTEXT |
STRING |
SET(val1, val2, val3, ...) |
STRING |
SMALLINT(size) |
INTEGER |
TEXT(size) |
STRING |
TIME(fsp) |
If the destination is BigQuery, then INTERVAL , if Cloud Storage, then TIME_INTERVAL |
TIMESTAMP(fsp) |
TIMESTAMP |
TINYBLOB |
STRING (hex encoded) |
TINYINT(size) |
INTEGER |
TINYTEXT |
STRING |
VARBINARY(size) |
STRING (hex encoded) |
VARCHAR |
STRING |
YEAR |
INTEGER |
Map PostgreSQL data types to Datastream unified types
PostgreSQL data type | Datastream unified type |
---|---|
ARRAY |
JSON
|
BIGINT |
LONG |
BIT |
BYTES |
BIT_VARYING |
BYTES |
BOOLEAN |
BOOLEAN |
BOX |
UNSUPPORTED |
BYTEA |
BYTES |
CHARACTER |
|
CHARACTER_VARYING |
|
CIDR |
STRING |
CIRCLE |
UNSUPPORTED |
CITEXT |
STRING |
DATE |
DATE |
DOUBLE_PRECISION |
DOUBLE |
ENUM |
STRING |
INET |
STRING |
INTEGER |
INTEGER |
INTERVAL |
INTERVAL |
JSON |
JSON |
JSONB |
JSON |
LINE |
UNSUPPORTED |
LSEG |
UNSUPPORTED |
MACADDR |
STRING |
MONEY |
DOUBLE |
NUMERIC |
|
OID |
LONG |
PATH |
UNSUPPORTED |
POINT |
UNSUPPORTED |
POLYGON |
UNSUPPORTED |
REAL |
FLOAT |
SMALLINT |
INTEGER |
SMALLSERIAL |
INTEGER |
SERIAL |
INTEGER |
TEXT |
STRING |
TIME |
TIME |
TIMESTAMP |
TIMESTAMP |
TIMESTAMP_WITH_TIMEZONE |
TIMESTAMP_WITH_TIMEZONE |
TIME_WITH_TIMEZONE |
TIME_WITH_TIMEZONE |
TSQUERY |
STRING |
TSVECTOR |
STRING |
TXID_SNAPSHOT |
STRING |
UUID |
STRING |
XML |
STRING |
Map SQL Server data types to Datastream unified types
SQL Server data type | Datastream unified type |
---|---|
BIGINT |
LONG |
BINARY |
BYTES |
BIT |
BOOLEAN |
CHAR |
STRING |
DATE |
DATE |
DATETIME2 |
DATETIME |
DATETIME |
DATETIME |
DATETIMEOFFSET |
TIMESTAMP WITH TIMEZONE |
DECIMAL |
DECIMAL (p, s) |
FLOAT |
DOUBLE |
INT |
INTEGER |
IMAGE |
BYTES |
MONEY |
DECIMAL |
NCHAR |
STRING |
NVARCHAR |
STRING |
NVARCHAR(MAX) |
STRING |
NTEXT |
STRING |
NUMERIC |
DECIMAL (p, s) |
REAL |
FLOAT |
SMALLDATETIME |
DATETIME |
SMALLINT |
INTEGER |
SMALLMONEY |
DECIMAL |
TEXT |
STRING |
TINYINT |
INTEGER |
TIME |
TIME |
TIMESTAMP/ROWVERSION |
BYTES |
UNIQUEIDENTIFIER |
STRING |
VARCHAR |
STRING |
VARCHAR(MAX) |
STRING |
VARBINARY |
BYTES |
VARBINARY(MAX) |
BYTES |
XML |
STRING |
What's next
- For more information about configuring an Oracle source, see Configure a source Oracle database.
- For more information about configuring a MySQL source, see Configure a source MySQL database.
- For more information about configuring a PostgreSQL source, see Configure a source PostgreSQL database.
- For more information about configuring a SQL Server source, see Configure a source SQL Server database.