Reference documentation and code samples for the google-cloud-bigquery class Google::Cloud::Bigquery::External::CsvSource.
CsvSource
CsvSource is a subclass of DataSource and represents a CSV external data source that can be queried directly, such as data in Google Cloud Storage or Google Drive, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
Example
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.autodetect = true csv.skip_leading_rows = 1 end data = bigquery.query "SELECT * FROM my_ext_table", external: { my_ext_table: csv_table } # Iterate over the first page of results data.each do |row| puts row[:name] end # Retrieve the next page of results data = data.next if data.next?
Methods
#delimiter
def delimiter() -> String
The separator for fields in a CSV file.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.delimiter = "|" end csv_table.delimiter #=> "|"
#delimiter=
def delimiter=(new_delimiter)
Set the separator for fields in a CSV file.
- new_delimiter (String) — New delimiter value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.delimiter = "|" end csv_table.delimiter #=> "|"
#encoding
def encoding() -> String
The character encoding of the data.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8"
#encoding=
def encoding=(new_encoding)
Set the character encoding of the data.
- new_encoding (String) — New encoding value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8"
#fields
def fields() -> Array<Schema::Field>
The fields of the schema.
- (Array<Schema::Field>) — An array of field objects.
#headers
def headers() -> Array<Symbol>
The names of the columns in the schema.
- (Array<Symbol>) — An array of column names.
#iso8859_1?
def iso8859_1?() -> Boolean
Checks if the character encoding of the data is "ISO-8859-1".
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "ISO-8859-1" end csv_table.encoding #=> "ISO-8859-1" csv_table.iso8859_1? #=> true
#jagged_rows
def jagged_rows() -> Boolean
Indicates if BigQuery should accept rows that are missing trailing optional columns.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.jagged_rows = true end csv_table.jagged_rows #=> true
#jagged_rows=
def jagged_rows=(new_jagged_rows)
Set whether BigQuery should accept rows that are missing trailing optional columns.
- new_jagged_rows (Boolean) — New jagged_rows value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.jagged_rows = true end csv_table.jagged_rows #=> true
#param_types
def param_types() -> Hash
The types of the fields in the data in the schema, using the same format as the optional query parameter types.
- (Hash) — A hash with field names as keys, and types as values.
#quote
def quote() -> String
The value that is used to quote data sections in a CSV file.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quote = "'" end csv_table.quote #=> "'"
#quote=
def quote=(new_quote)
Set the value that is used to quote data sections in a CSV file.
- new_quote (String) — New quote value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quote = "'" end csv_table.quote #=> "'"
#quoted_newlines
def quoted_newlines() -> Boolean
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quoted_newlines = true end csv_table.quoted_newlines #=> true
#quoted_newlines=
def quoted_newlines=(new_quoted_newlines)
Set whether BigQuery should allow quoted data sections that contain newline characters in a CSV file.
- new_quoted_newlines (Boolean) — New quoted_newlines value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quoted_newlines = true end csv_table.quoted_newlines #=> true
#schema
def schema(replace: false) { |schema| ... } -> Google::Cloud::Bigquery::Schema
The schema for the data.
- replace (Boolean) (defaults to: false) — Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. The default value is false.
- (schema) — a block for setting the schema
- schema (Schema) — the object accepting the schema
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.schema do |schema| schema.string "name", mode: :required schema.string "email", mode: :required schema.integer "age", mode: :required schema.boolean "active", mode: :required end end
#schema=
def schema=(new_schema)
Set the schema for the data.
- new_schema (Schema) — The schema object.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_shema = bigquery.schema do |schema| schema.string "name", mode: :required schema.string "email", mode: :required schema.integer "age", mode: :required schema.boolean "active", mode: :required end csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url csv_table.schema = csv_shema
#skip_leading_rows
def skip_leading_rows() -> Integer
The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
- (Integer)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.skip_leading_rows = 1 end csv_table.skip_leading_rows #=> 1
#skip_leading_rows=
def skip_leading_rows=(row_count)
Set the number of rows at the top of a CSV file that BigQuery will skip when reading the data.
- row_count (Integer) — New skip_leading_rows value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.skip_leading_rows = 1 end csv_table.skip_leading_rows #=> 1
#utf8?
def utf8?() -> Boolean
Checks if the character encoding of the data is "UTF-8". This is the default.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8" csv_table.utf8? #=> true