Interface GcsSourceOrBuilder (0.41.0)

public interface GcsSourceOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getDataSchema()

public abstract String getDataSchema()

The schema to use when parsing the data from the source.

Supported values for document imports:

  • document (default): One JSON Document per line. Each document must have a valid Document.id.
  • content: Unstructured data (e.g. PDF, HTML). Each file matched by input_uris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
  • custom: One custom data JSON object per row, in an arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
  • csv: A CSV file with a header row conforming to the defined Schema of the data store. Each row after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.

Supported values for user event imports:

  • user_event (default): One JSON UserEvent per line.

string data_schema = 2;

Returns
Type    Description
String  The dataSchema.
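
The ID rule for the content schema (first 128 bits of SHA256(URI), hex encoded) can be illustrated with the JDK alone. This is a minimal sketch, not the service's implementation: the class and method names are hypothetical, and it assumes the URI is hashed as UTF-8 bytes and that the hex string is lowercase, neither of which is stated above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; not part of the client library.
final class ContentDocumentIdSketch {

    // Returns the first 128 bits (16 bytes) of SHA-256(uri) as 32 hex characters.
    // Assumption: the URI is encoded as UTF-8 and the hex digits are lowercase.
    static String documentIdFor(String uri) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(uri.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(32);
            for (int i = 0; i < 16; i++) {
                // Mask to treat the signed byte as an unsigned value.
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(documentIdFor("gs://bucket/directory/object.json"));
    }
}
```

Because the ID depends only on the URI, re-importing the same object path would yield the same ID under these assumptions.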

getDataSchemaBytes()

public abstract ByteString getDataSchemaBytes()

The schema to use when parsing the data from the source.

Supported values for document imports:

  • document (default): One JSON Document per line. Each document must have a valid Document.id.
  • content: Unstructured data (e.g. PDF, HTML). Each file matched by input_uris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
  • custom: One custom data JSON object per row, in an arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
  • csv: A CSV file with a header row conforming to the defined Schema of the data store. Each row after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.

Supported values for user event imports:

  • user_event (default): One JSON UserEvent per line.

string data_schema = 2;

Returns
Type        Description
ByteString  The bytes for dataSchema.
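
Protobuf string fields are UTF-8 encoded, so the bytes accessor exposes the same value as getDataSchema() in raw form. The sketch below shows that relationship and checks a decoded value against the supported values listed above. To stay dependency-free it uses a plain byte[] as a stand-in for ByteString; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;

// Hypothetical helper; byte[] stands in for com.google.protobuf.ByteString.
final class DataSchemaSketch {

    // Values taken from the field description above.
    static final Set<String> DOCUMENT_IMPORT_SCHEMAS =
            Set.of("document", "content", "custom", "csv");
    static final Set<String> USER_EVENT_IMPORT_SCHEMAS = Set.of("user_event");

    // Encodes a schema name the way a protobuf string field is stored.
    static byte[] toUtf8(String dataSchema) {
        return dataSchema.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the raw bytes back into the schema name.
    static String fromUtf8(byte[] dataSchemaBytes) {
        return new String(dataSchemaBytes, StandardCharsets.UTF_8);
    }

    static boolean isDocumentImportSchema(byte[] dataSchemaBytes) {
        return DOCUMENT_IMPORT_SCHEMAS.contains(fromUtf8(dataSchemaBytes));
    }

    public static void main(String[] args) {
        byte[] raw = toUtf8("csv");
        System.out.println(fromUtf8(raw) + " is a document import schema: "
                + isDocumentImportSchema(raw));
    }
}
```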

getInputUris(int index)

public abstract String getInputUris(int index)

Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.

A request can contain at most 100 files (or 100,000 files if data_schema is content). Each file can be up to 2 GB (or 100 MB if data_schema is content).

repeated string input_uris = 1 [(.google.api.field_behavior) = REQUIRED];

Parameter
Name   Type  Description
index  int   The index of the element to return.

Returns
Type    Description
String  The inputUris at the given index.
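
A client may want to check a URI before adding it to a request. The following is a minimal sketch of such a pre-check, with hypothetical names. It assumes that every input URI starts with gs:// (inferred from the examples above), that "characters" can be approximated by String.length(), and that a pattern is any URI containing '*'. The service remains the authority on what it accepts.

```java
// Hypothetical client-side pre-check; not part of the client library.
final class InputUriCheckSketch {

    // Limit taken from the field description above.
    static final int MAX_URI_LENGTH = 2000;
    static final String SCHEME = "gs://";

    // True if the URI has the gs:// scheme, a non-empty remainder,
    // and is at most 2000 characters long (assumption: String.length()).
    static boolean looksValid(String uri) {
        return uri != null
                && uri.startsWith(SCHEME)
                && uri.length() > SCHEME.length()
                && uri.length() <= MAX_URI_LENGTH;
    }

    // True if the URI is a pattern rather than a full object path.
    // Assumption: '*' is the only wildcard of interest here.
    static boolean isPattern(String uri) {
        return uri.indexOf('*') >= 0;
    }

    public static void main(String[] args) {
        String exact = "gs://bucket/directory/object.json";
        String pattern = "gs://bucket/directory/*.json";
        System.out.println(exact + " valid=" + looksValid(exact) + " pattern=" + isPattern(exact));
        System.out.println(pattern + " valid=" + looksValid(pattern) + " pattern=" + isPattern(pattern));
    }
}
```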

getInputUrisBytes(int index)

public abstract ByteString getInputUrisBytes(int index)

Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.

A request can contain at most 100 files (or 100,000 files if data_schema is content). Each file can be up to 2 GB (or 100 MB if data_schema is content).

repeated string input_uris = 1 [(.google.api.field_behavior) = REQUIRED];

Parameter
Name   Type  Description
index  int   The index of the value to return.

Returns
Type        Description
ByteString  The bytes of the inputUris at the given index.

getInputUrisCount()

public abstract int getInputUrisCount()

Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.

A request can contain at most 100 files (or 100,000 files if data_schema is content). Each file can be up to 2 GB (or 100 MB if data_schema is content).

repeated string input_uris = 1 [(.google.api.field_behavior) = REQUIRED];

Returns
Type  Description
int   The count of inputUris.
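
Note that getInputUrisCount() counts URI entries, not files: a single pattern entry can match many files. The per-request limits above depend on data_schema, which the sketch below captures as a small lookup. The names are hypothetical, and the byte values assume binary units (1 MB = 1024 * 1024 bytes), which the description does not specify.

```java
// Hypothetical lookup of the documented limits; not part of the client library.
final class ImportLimitsSketch {

    static final String CONTENT_SCHEMA = "content";

    // At most 100 files per request, or 100,000 if data_schema is content.
    static int maxFiles(String dataSchema) {
        return CONTENT_SCHEMA.equals(dataSchema) ? 100_000 : 100;
    }

    // Each file up to 2 GB, or 100 MB if data_schema is content.
    // Assumption: binary units; the source does not say which units apply.
    static long maxFileBytes(String dataSchema) {
        return CONTENT_SCHEMA.equals(dataSchema)
                ? 100L * 1024 * 1024
                : 2L * 1024 * 1024 * 1024;
    }

    // matchedFileCount is the number of files the URIs resolve to,
    // which the caller must determine (for example by listing the bucket).
    static boolean withinFileLimit(String dataSchema, int matchedFileCount) {
        return matchedFileCount <= maxFiles(dataSchema);
    }

    public static void main(String[] args) {
        System.out.println("document: " + maxFiles("document") + " files, "
                + maxFileBytes("document") + " bytes each");
        System.out.println("content: " + maxFiles("content") + " files, "
                + maxFileBytes("content") + " bytes each");
    }
}
```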

getInputUrisList()

public abstract List<String> getInputUrisList()

Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.

A request can contain at most 100 files (or 100,000 files if data_schema is content). Each file can be up to 2 GB (or 100 MB if data_schema is content).

repeated string input_uris = 1 [(.google.api.field_behavior) = REQUIRED];

Returns
Type          Description
List<String>  A list containing the inputUris.
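
Lists returned by generated protobuf accessors are generally read-only, so a caller that wants to filter or regroup the URIs should copy them first. The sketch below assumes that behavior and splits a list into full object paths and patterns. The class and method names are hypothetical, and treating any URI containing '*' as a pattern is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper operating on a plain List<String>,
// such as the value returned by getInputUrisList().
final class InputUrisListSketch {

    // Copies out the URIs that name a single object (no '*').
    static List<String> exactPaths(List<String> inputUris) {
        List<String> result = new ArrayList<>();
        for (String uri : inputUris) {
            if (uri.indexOf('*') < 0) {
                result.add(uri);
            }
        }
        return result;
    }

    // Copies out the URIs that are patterns (contain '*').
    static List<String> patterns(List<String> inputUris) {
        List<String> result = new ArrayList<>();
        for (String uri : inputUris) {
            if (uri.indexOf('*') >= 0) {
                result.add(uri);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> uris = List.of(
                "gs://bucket/directory/object.json",
                "gs://bucket/directory/*.json");
        System.out.println("exact: " + exactPaths(uris));
        System.out.println("patterns: " + patterns(uris));
    }
}
```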