Problem
While loading TIMESTAMP / DATETIME fields into BigQuery from a JSON file in Cloud Storage, you get the following error:
Error in query string: Error processing job '<JOB_ID>': Cannot return an invalid timestamp value of -5748696663385964544 microseconds relative to the Unix epoch. The range of valid timestamp values is [0001-01-1 00:00:00, 9999-12-31 23:59:59.999999]; error in writing field time
Environment
- Google Cloud Storage to BigQuery
- NEWLINE_DELIMITED_JSON file
- DATETIME / TIMESTAMP field supplied as an epoch value in any unit other than seconds
Solution
Workaround
- Convert microsecond or millisecond epoch values to seconds, since epoch input must be in one of the formats below (see the sketch after this list).
- For example, convert 1602720000000000 microseconds to 1602720000 seconds, or to 1602720000.000000 seconds, where the decimal part carries the microseconds.
- Alternatively, convert the value to YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DDDDDD]] and update the corresponding JSON schema. This also preserves precision, since [.DDDDDD] denotes microseconds.
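The conversion itself is simple arithmetic. Below is a minimal JavaScript sketch, assuming the input is a Unix epoch value in microseconds; divide by 1,000 instead if the input is in milliseconds.

// Minimal sketch: convert an epoch value in microseconds (assumed unit)
// into the two formats described above.
var micros = 1602720000000000;

// Option 1: epoch seconds, keeping sub-second precision as decimals.
var seconds = micros / 1000000;                 // 1602720000.000000

// Option 2: a DATETIME/TIMESTAMP string YYYY-MM-DD HH:MM:SS[.DDDDDD].
// JavaScript Date takes milliseconds; toISOString() keeps only
// millisecond precision, which is enough for this illustration.
var formatted = new Date(micros / 1000)
    .toISOString()                              // "2020-10-15T00:00:00.000Z"
    .replace('T', ' ')
    .replace('Z', '');                          // "2020-10-15 00:00:00.000"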
The Dataflow Google Cloud Storage to BigQuery template can be modified to perform the above transformation with a JavaScript user-defined function (UDF) that reads the relevant fields and converts them to one of the formats above for DATETIME / TIMESTAMP, as sketched below.
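A minimal UDF sketch follows. It uses the standard transform(line) pattern that the template invokes for each input line; the field name time is taken from the error message above, and the assumption that the incoming value is in microseconds is for illustration only.

function transform(line) {
  var row = JSON.parse(line);
  if (row.time !== undefined) {
    // Assumed: the incoming value is a Unix epoch in microseconds,
    // e.g. 1602720000000000 -> 1602720000 seconds.
    // The decimal part, if any, carries the sub-second precision.
    row.time = Number(row.time) / 1000000;
  }
  return JSON.stringify(row);
}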
Cause
According to the public documentation, BigQuery interprets Unix epoch time in seconds, so epoch time inputs must always be expressed in seconds. This is also documented as a limitation for JSON and CSV files.
Note: The same issue can also arise with the DATE type, which must be formatted as YYYY-MM-DD.