Reference documentation and code samples for the google-cloud-bigquery module Google::Cloud::Bigquery::External.
External
Creates a new DataSource (or subclass) object that represents an external data source. The source can be queried directly, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
See DataSource, CsvSource, JsonSource, SheetsSource, and BigtableSource.
Examples
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new

    csv_url = "gs://bucket/path/to/data.csv"
    csv_table = bigquery.external csv_url do |csv|
      csv.autodetect = true
      csv.skip_leading_rows = 1
    end

    data = bigquery.query "SELECT * FROM my_ext_table",
                          external: { my_ext_table: csv_table }

    # Iterate over the first page of results
    data.each do |row|
      puts row[:name]
    end
    # Retrieve the next page of results
    data = data.next if data.next?
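The example above reads the first page of results and then fetches at most one more. To walk every page, the same `each` / `next?` / `next` calls can be wrapped in a loop. The following is a minimal sketch, not part of the library: `each_row_across_pages` is a hypothetical helper name, and it only assumes that the result object responds to the three methods used in the example above.

```ruby
# Hypothetical helper (not part of google-cloud-bigquery).
# Yields every row from every page of a paged result object.
# Assumes `data` responds to #each, #next? and #next, as the
# query result in the example above does.
def each_row_across_pages(data)
  raise ArgumentError, "a block is required" unless block_given?

  loop do
    # Hand each row on the current page to the caller's block.
    data.each { |row| yield row }

    # Stop once there are no further pages.
    break unless data.next?

    # Otherwise fetch the next page and continue.
    data = data.next
  end
end

# Usage with the query result from the example above:
#   each_row_across_pages(data) { |row| puts row[:name] }
```

Each call to `next` requests another page, so a large result set means many round trips. The library's `Data` class may also offer its own way to iterate over all pages; check its reference before relying on a hand-written loop.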
Hive partitioning options:
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new

    gcs_uri = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/*"
    source_uri_prefix = "gs://cloud-samples-data/bigquery/hive-partitioning-samples/autolayout/"
    external_data = bigquery.external gcs_uri, format: :parquet do |ext|
      ext.hive_partitioning_mode = :auto
      ext.hive_partitioning_require_partition_filter = true
      ext.hive_partitioning_source_uri_prefix = source_uri_prefix
    end

    external_data.hive_partitioning? #=> true
    external_data.hive_partitioning_mode #=> "AUTO"
    external_data.hive_partitioning_require_partition_filter? #=> true
    external_data.hive_partitioning_source_uri_prefix #=> source_uri_prefix
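The source URI prefix in the example above marks the point in the object path where the partition key encoding begins. As a rough illustration, assuming the common Hive `key=value` directory convention, the sketch below extracts partition keys from a single object URI. The helper `hive_partition_keys`, the bucket name, and the paths are all hypothetical and are not part of google-cloud-bigquery. BigQuery performs its own detection when the mode is `:auto`.

```ruby
# Hypothetical helper (not part of google-cloud-bigquery).
# Extracts Hive-style partition keys from an object URI, assuming
# "key=value" directory segments follow the source URI prefix.
def hive_partition_keys(uri, source_uri_prefix)
  # URIs outside the prefix carry no partition information.
  return {} unless uri.start_with?(source_uri_prefix)

  relative = uri.delete_prefix(source_uri_prefix)
  relative.split("/").each_with_object({}) do |segment, keys|
    key, value = segment.split("=", 2)
    # Skip segments without "=" (such as the file name) or with an empty key.
    keys[key] = value if value && !key.empty?
  end
end

prefix = "gs://example-bucket/tables/events/"              # hypothetical
uri    = "#{prefix}dt=2020-01-01/lang=en/part-0000.parquet" # hypothetical

hive_partition_keys(uri, prefix)
# => { "dt" => "2020-01-01", "lang" => "en" }
```

The values come back as strings here. Any type inference for partition columns is left to BigQuery.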