Problem
When attempting to load a Hive Parquet file to the BigQuery table using the interface the following error message is thrown where <column_name> is the name of one of the BigQuery columns.
Invalid timestamp microseconds value -62135769600000000 converted from INT96 for column '<column_name>'.
The error can be found in Cloud logging using the following filter:
resource.type="bigquery_resource" severity=ERROR "Invalid timestamp microseconds value"
Environment
- BigQuery
- Google Cloud console
Solution
- Open the parquet file in a parquet file reader (any software that will show human-readable content).
- Verify the date-and-time value of the column mentioned in the error message, matches any of the following required formats:
- Int64 - TIMESTAMP, precision=MILLIS (TIMESTAMP_MILLIS) or TIMESTAMP, precision=MICROS (TIMESTAMP_MICROS)
- Int96
- The valid format for exporting from the hive is for example yyyy-MM-dd HH:mm:ss.SSS z. For more details see: BigQuery - Parquet conversions.
- If you prefer not to modify the parquet file, you can consider using a connector instead of the interface.
Cause
The affected column in the Parquet file has an incorrect date-time format and this string is misinterpreted (thus negative timestamp miliseconds value).
Useful links