BigQueryIO fails to load large messages into a nested structure with Beam SDK
Problem
BigQueryIO fails to load large messages into a nested structure.
Environment
- Dataflow
- BigQuery
- Beam SDK 2.35 and 2.36.
Solution
- In BigQueryOptions, set the size-limit pipeline option to a larger value, roughly between 1 MB and 5 MB.
- gRPC should support request sizes of up to 10 MB, but raising the limit all the way to 10 MB is not advisable.
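The size guidance above can be sketched as a small standalone check. This is only an illustration under stated assumptions: `RequestSizeLimits` and its methods are hypothetical names, not part of the Beam SDK, and the UTF-8 length of a serialized row is used as a rough stand-in for the real request size, which may differ once the row is encoded for BigQuery.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of the Beam SDK. */
final class RequestSizeLimits {
    static final long MB = 1024L * 1024L;

    // Range suggested in the Solution section (1 MB to 5 MB).
    static final long MIN_RECOMMENDED_BYTES = 1 * MB;
    static final long MAX_RECOMMENDED_BYTES = 5 * MB;

    // gRPC request-size ceiling mentioned in the Solution section.
    static final long GRPC_MAX_REQUEST_BYTES = 10 * MB;

    /** Clamps a requested limit into the recommended 1 MB to 5 MB range. */
    static long clampToRecommendedRange(long requestedBytes) {
        return Math.max(MIN_RECOMMENDED_BYTES,
                Math.min(MAX_RECOMMENDED_BYTES, requestedBytes));
    }

    /** Returns the UTF-8 encoded size of a serialized row, in bytes. */
    static long utf8Size(String serializedRow) {
        return serializedRow.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Returns true if the serialized row fits within the given limit. */
    static boolean fitsWithin(String serializedRow, long limitBytes) {
        return utf8Size(serializedRow) <= limitBytes;
    }

    public static void main(String[] args) {
        // A requested limit of 8 MB is below the gRPC ceiling but above
        // the recommended range, so it is clamped down to 5 MB.
        long limit = clampToRecommendedRange(8 * MB);

        // Assumed example: a nested row with a payload of about 3 MB.
        String row = "{\"id\": 1, \"nested\": {\"payload\": \""
                + "x".repeat(3 * 1024 * 1024) + "\"}}";

        System.out.println("limit bytes: " + limit);
        System.out.println("row bytes:   " + utf8Size(row));
        System.out.println("fits:        " + fitsWithin(row, limit));
    }
}
```

A check like this could be run against a sample of the largest messages before choosing a value, so that the limit is set no higher than the data actually requires.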
Workaround
- Upgrade to Beam 2.37.0.
Cause
Because of a bug in Beam 2.36.0 and earlier, the data insert was treated as failed and the rows were output to the failed inserts PCollection, which is part of the return value given by BigQueryIO. This issue was fixed in Beam 2.37.0.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-12-12 UTC.