google-cloud-bigquery - Class Google::Cloud::Bigquery::External::ParquetSource (v1.38.1)

Reference documentation and code samples for the google-cloud-bigquery class Google::Cloud::Bigquery::External::ParquetSource.

ParquetSource

ParquetSource is a subclass of DataSource and represents a Parquet external data source that can be queried directly, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

parquet_url = "gs://bucket/path/to/data.parquet"
parquet_table = bigquery.external parquet_url do |parquet|
  parquet.enable_list_inference = true
end

data = bigquery.query "SELECT * FROM my_ext_table",
                      external: { my_ext_table: parquet_table }

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?
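
The example above fetches only one additional page. To visit every page, the same `next?` / `next` calls can be wrapped in a loop. The following is a minimal sketch rather than part of the library: `each_row_across_pages` and `FakePage` are hypothetical names introduced here, and the helper assumes only that a page responds to `each`, `next?`, and `next` as the query results do in the example above. `FakePage` stands in for real query results so the sketch runs without credentials.

```ruby
# Hypothetical helper (not part of google-cloud-bigquery): yields every row
# from every page. Assumes only the interface used in the example above:
# #each, #next?, and #next.
def each_row_across_pages(page)
  loop do
    page.each { |row| yield row }
    break unless page.next?
    page = page.next
  end
end

# Hypothetical stand-in for one page of query results.
class FakePage
  def initialize(rows, following)
    @rows = rows
    @following = following
  end

  # Yields each row on this page.
  def each(&block)
    @rows.each(&block)
  end

  # True when another page follows this one.
  def next?
    !@following.nil?
  end

  # Returns the following page.
  def next
    @following
  end
end

last_page  = FakePage.new([{ name: "carol" }], nil)
first_page = FakePage.new([{ name: "alice" }, { name: "bob" }], last_page)

names = []
each_row_across_pages(first_page) { |row| names << row[:name] }
puts names.inspect
```

With real query results, the object returned by `bigquery.query` would be passed in place of `first_page`.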

Methods

#enable_list_inference

def enable_list_inference() -> Boolean

Indicates whether to use schema inference specifically for the Parquet LIST logical type.

Returns
  • (Boolean)
Example
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

parquet_url = "gs://bucket/path/to/data.parquet"
parquet_table = bigquery.external parquet_url do |parquet|
  parquet.enable_list_inference = true
end

parquet_table.enable_list_inference #=> true

#enable_list_inference=

def enable_list_inference=(new_enable_list_inference)

Sets whether to use schema inference specifically for the Parquet LIST logical type.

Parameter
  • new_enable_list_inference (Boolean) — The new enable_list_inference value.
Example
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

parquet_url = "gs://bucket/path/to/data.parquet"
parquet_table = bigquery.external parquet_url do |parquet|
  parquet.enable_list_inference = true
end

parquet_table.enable_list_inference #=> true

#enum_as_string

def enum_as_string() -> Boolean

Indicates whether to infer the Parquet ENUM logical type as STRING rather than as BYTES, which is the default.

Returns
  • (Boolean)
Example
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

parquet_url = "gs://bucket/path/to/data.parquet"
parquet_table = bigquery.external parquet_url do |parquet|
  parquet.enum_as_string = true
end

parquet_table.enum_as_string #=> true

#enum_as_string=

def enum_as_string=(new_enum_as_string)

Sets whether to infer the Parquet ENUM logical type as STRING rather than as BYTES, which is the default.

Parameter
  • new_enum_as_string (Boolean) — The new enum_as_string value.
Example
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

parquet_url = "gs://bucket/path/to/data.parquet"
parquet_table = bigquery.external parquet_url do |parquet|
  parquet.enum_as_string = true
end

parquet_table.enum_as_string #=> true