Problem loading parquet file to BigQuery

Problem

When attempting to load a Hive Parquet file to the BigQuery table using the interface the following error message is thrown where <column_name> is the name of one of the BigQuery columns.

Invalid timestamp microseconds value -62135769600000000 converted from INT96 for column '<column_name>'.

The error can be found in Cloud logging using the following filter:

resource.type="bigquery_resource"
severity=ERROR
"Invalid timestamp microseconds value"

Environment

  • BigQuery
  • Google Cloud console

Solution

  1. Open the parquet file in a parquet file reader (any software that will show human-readable content).
  2. Verify the date-and-time value of the column mentioned in the error message, matches any of the following required formats:
    • Int64 -  TIMESTAMP, precision=MILLIS (TIMESTAMP_MILLIS) or TIMESTAMP, precision=MICROS (TIMESTAMP_MICROS) 
    • Int96
  3. The valid format for exporting from the hive is for example yyyy-MM-dd HH:mm:ss.SSS z. For more details see: BigQuery - Parquet conversions.
  4. If you prefer not to modify the parquet file, you can consider using a connector instead of the interface.

Cause

The affected column in the Parquet file has an incorrect date-time format and this string is misinterpreted (thus negative timestamp miliseconds value).

Useful links