Class JsonStreamWriter (3.8.0)

public class JsonStreamWriter implements AutoCloseable

A StreamWriter that can write JSON data (JSONObjects) to BigQuery tables. The JsonStreamWriter is built on top of a StreamWriter, and it simply converts all JSON data to protobuf messages then calls StreamWriter's append() method to write to BigQuery tables. It maintains all StreamWriter functions, but also provides an additional feature: schema update support, where if the BigQuery table schema is updated, users will be able to ingest data on the new schema after some time (in order of minutes).

Inheritance

Object > JsonStreamWriter

Implements

AutoCloseable

Static Methods

newBuilder(String streamOrTableName, BigQueryWriteClient client)

public static JsonStreamWriter.Builder newBuilder(String streamOrTableName, BigQueryWriteClient client)

newBuilder that constructs a JsonStreamWriter builder with TableSchema being initialized by StreamWriter by default.

Parameters
Name Description
streamOrTableName String

name of the stream that must follow "projects/[^/]+/datasets/[^/]+/tables/[^/]+/streams/[^/]+"

client BigQueryWriteClient

BigQueryWriteClient

Returns
Type Description
JsonStreamWriter.Builder

Builder

newBuilder(String streamOrTableName, TableSchema tableSchema)

public static JsonStreamWriter.Builder newBuilder(String streamOrTableName, TableSchema tableSchema)

newBuilder that constructs a JsonStreamWriter builder with BigQuery client being initialized by StreamWriter by default.

The table schema passed in will be updated automatically when there is a schema update event. When used for Writer creation, it should be the latest schema. So when you are trying to reuse a stream, you should use Builder newBuilder( String streamOrTableName, BigQueryWriteClient client) instead, so the created Writer will be based on a fresh schema.

Parameters
Name Description
streamOrTableName String

name of the stream that must follow "projects/[^/]+/datasets/[^/]+/tables/[^/]+/streams/[^/]+" or table name "projects/[^/]+/datasets/[^/]+/tables/[^/]+"

tableSchema TableSchema

The schema of the table when the stream was created, which is passed back through WriteStream

Returns
Type Description
JsonStreamWriter.Builder

Builder

newBuilder(String streamOrTableName, TableSchema tableSchema, BigQueryWriteClient client)

public static JsonStreamWriter.Builder newBuilder(String streamOrTableName, TableSchema tableSchema, BigQueryWriteClient client)

newBuilder that constructs a JsonStreamWriter builder.

The table schema passed in will be updated automatically when there is a schema update event. When used for Writer creation, it should be the latest schema. So when you are trying to reuse a stream, you should use Builder newBuilder( String streamOrTableName, BigQueryWriteClient client) instead, so the created Writer will be based on a fresh schema.

Parameters
Name Description
streamOrTableName String

name of the stream that must follow "projects/[^/]+/datasets/[^/]+/tables/[^/]+/streams/[^/]+"

tableSchema TableSchema

The schema of the table when the stream was created, which is passed back through WriteStream

client BigQueryWriteClient
Returns
Type Description
JsonStreamWriter.Builder

Builder

setMaxRequestCallbackWaitTime(Duration waitTime)

public static void setMaxRequestCallbackWaitTime(Duration waitTime)

Sets the maximum time a request is allowed to be waiting in request waiting queue. Under very low chance, it's possible for append request to be waiting indefintely for request callback when Google networking SDK does not detect the networking breakage. The default timeout is 15 minutes. We are investigating the root cause for callback not triggered by networking SDK.

Parameter
Name Description
waitTime Duration

Methods

append(JSONArray jsonArr)

public ApiFuture<AppendRowsResponse> append(JSONArray jsonArr)

Writes a JSONArray that contains JSONObjects to the BigQuery table by first converting the JSON data to protobuf messages, then using StreamWriter's append() to write the data at current end of stream. If there is a schema update, the current StreamWriter is closed. A new StreamWriter is created with the updated TableSchema.

Parameter
Name Description
jsonArr org.json.JSONArray

The JSON array that contains JSONObjects to be written

Returns
Type Description
ApiFuture<AppendRowsResponse>

ApiFuture<AppendRowsResponse> returns an AppendRowsResponse message wrapped in an ApiFuture

Exceptions
Type Description
IOException
DescriptorValidationException

append(JSONArray jsonArr, long offset)

public ApiFuture<AppendRowsResponse> append(JSONArray jsonArr, long offset)

Writes a JSONArray that contains JSONObjects to the BigQuery table by first converting the JSON data to protobuf messages, then using StreamWriter's append() to write the data at the specified offset. If there is a schema update, the current StreamWriter is closed. A new StreamWriter is created with the updated TableSchema.

Parameters
Name Description
jsonArr org.json.JSONArray

The JSON array that contains JSONObjects to be written

offset long

Offset for deduplication

Returns
Type Description
ApiFuture<AppendRowsResponse>

ApiFuture<AppendRowsResponse> returns an AppendRowsResponse message wrapped in an ApiFuture

Exceptions
Type Description
IOException
DescriptorValidationException

close()

public void close()

getDescriptor()

public Descriptors.Descriptor getDescriptor()

Gets current descriptor

Returns
Type Description
Descriptor

Descriptor

getInflightWaitSeconds()

public long getInflightWaitSeconds()

Returns the wait of a request in Client side before sending to the Server. Request could wait in Client because it reached the client side inflight request limit (adjustable when constructing the Writer). The value is the wait time for the last sent request. A constant high wait value indicates a need for more throughput, you can create a new Stream for to increase the throughput in exclusive stream case, or create a new Writer in the default stream case.

Returns
Type Description
long

getLocation()

public String getLocation()

Gets the location of the destination

Returns
Type Description
String

Descriptor

getMissingValueInterpretationMap()

public Map<String,AppendRowsRequest.MissingValueInterpretation> getMissingValueInterpretationMap()
Returns
Type Description
Map<String,MissingValueInterpretation>

the missing value interpretation map used for the writer.

getStreamName()

public String getStreamName()
Returns
Type Description
String

getWriterId()

public String getWriterId()
Returns
Type Description
String

A unique Id for this writer.

isClosed()

public boolean isClosed()
Returns
Type Description
boolean

if a Json writer can no longer be used for writing. It is due to either the JsonStreamWriter is explicitly closed or the underlying connection is broken when connection pool is not used. Client should recreate JsonStreamWriter in this case.

isUserClosed()

public boolean isUserClosed()
Returns
Type Description
boolean

if user explicitly closed the writer.

setMissingValueInterpretationMap(Map<String,AppendRowsRequest.MissingValueInterpretation> missingValueInterpretationMap)

public void setMissingValueInterpretationMap(Map<String,AppendRowsRequest.MissingValueInterpretation> missingValueInterpretationMap)

Sets the missing value interpretation map for the JsonStreamWriter. The input missingValueInterpretationMap is used for all append requests unless otherwise changed.

Parameter
Name Description
missingValueInterpretationMap Map<String,MissingValueInterpretation>

the missing value interpretation map used by the JsonStreamWriter.