BigQuery API - Class Google::Cloud::Bigquery::Table (v1.41.0)

Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::Table.

Table

A named resource representing a BigQuery table that holds zero or more records. Every table is defined by a schema that may contain nested and repeated fields.

The Table class can also represent a logical view, which is a virtual table defined by a SQL query (see #view? and Dataset#create_view); or a materialized view, which is a precomputed view that periodically caches results of a query for increased performance and efficiency (see #materialized_view? and Dataset#create_materialized_view).

Inherits

Object

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

row = {
  "first_name" => "Alice",
  "cities_lived" => [
    {
      "place" => "Seattle",
      "number_of_years" => 5
    },
    {
      "place" => "Stockholm",
      "number_of_years" => 6
    }
  ]
}
table.insert row

Creating a logical view:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.create_view "my_view",
         "SELECT name, age FROM `my_project.my_dataset.my_table`"
view.view? # true

Creating a materialized view:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.create_materialized_view "my_materialized_view",
                                        "SELECT name, age FROM `my_project.my_dataset.my_table`"
view.materialized_view? # true

Methods

#api_url

def api_url() -> String, nil

A URL that can be used to access the table using the REST API.

Returns

(String, nil) — A REST URL for the resource, or nil if the object is a reference (see #reference?).

#buffer_bytes

def buffer_bytes() -> Integer, nil

A lower-bound estimate of the number of bytes currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.

Returns

(Integer, nil) — The estimated number of bytes in the buffer, or nil if not present or the object is a reference (see #reference?).

#buffer_oldest_at

def buffer_oldest_at() -> Time, nil

The time of the oldest entry currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.

Returns

(Time, nil) — The oldest entry time, or nil if not present or the object is a reference (see #reference?).

#buffer_rows

def buffer_rows() -> Integer, nil

A lower-bound estimate of the number of rows currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.

Returns

(Integer, nil) — The estimated number of rows in the buffer, or nil if not present or the object is a reference (see #reference?).

#bytes_count

def bytes_count() -> Integer, nil

The number of bytes in the table.

Returns

(Integer, nil) — The count of bytes in the table, or nil if the object is a reference (see #reference?).

#clone

def clone(destination_table, &block) { |job| ... } -> Boolean

Clones the data from the table to another table using a synchronous method that blocks for a response. The source and destination table have the same table type, but only bill for unique data. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.

The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.

Parameter

destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.

Returns

(Boolean) — Returns true if the copy operation succeeded.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.clone destination_table

Passing a string identifier for the destination table:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.clone "other-project:other_dataset.other_table"

#clone?

def clone?() -> Boolean, nil

Checks if the table's type is CLONE, indicating that the table represents a BigQuery table clone.

Returns

(Boolean, nil) — true when the type is CLONE, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#clone_definition

def clone_definition() -> Google::Apis::BigqueryV2::CloneDefinition, nil

The Information about base table and clone time of the table.

Returns

(Google::Apis::BigqueryV2::CloneDefinition, nil) — Clone definition of table clone, or nil if not clone or the object is a reference (see #reference?).

#clustering?

def clustering?() -> Boolean, nil

Checks if the table is clustered.

See Updater#clustering_fields=, #clustering_fields and #clustering_fields=.

Returns

(Boolean, nil) — true when the table is clustered, or false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#clustering_fields

def clustering_fields() -> Array<String>, nil

One or more fields on which data should be clustered. Must be specified with time partitioning, data in the table will be first partitioned and subsequently clustered. The order of the returned fields determines the sort order of the data.

BigQuery supports clustering for both partitioned and non-partitioned tables.

See Updater#clustering_fields=, #clustering_fields= and #clustering?.

Returns

(Array<String>, nil) — The clustering fields, or nil if the table is not clustered or if the table is a reference (see #reference?).

#clustering_fields=

def clustering_fields=(fields)

Updates the list of fields on which data should be clustered.

Only top-level, non-repeated, simple-type fields are supported. When you cluster a table using multiple columns, the order of columns you specify is important. The order of the specified columns determines the sort order of the data.

BigQuery supports clustering for both partitioned and non-partitioned tables.

See Updater#clustering_fields=, #clustering_fields and #clustering?.

Parameter

fields (Array<String>, nil) — The clustering fields, or nil to remove the clustering configuration. Only top-level, non-repeated, simple-type fields are supported.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.clustering_fields = ["last_name", "first_name"]

#copy

def copy(destination_table, create: nil, write: nil, &block) { |job| ... } -> Boolean

Copies the data from the table to another table using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.

Parameters

destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.
create (String) (defaults to: nil) —
Specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A 'notFound' error is raised if the table does not exist.
write (String) (defaults to: nil) —
Specifies how to handle data already present in the destination table. The default value is empty.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the destination table already contains data.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.

Returns

(Boolean) — Returns true if the copy operation succeeded.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.copy destination_table

Passing a string identifier for the destination table:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.copy "other-project:other_dataset.other_table"

#copy_job

def copy_job(destination_table, create: nil, write: nil, job_id: nil, prefix: nil, labels: nil, dryrun: nil, operation_type: nil) { |job| ... } -> Google::Cloud::Bigquery::CopyJob

Copies the data from the table to another table using an asynchronous method. In this method, a CopyJob is immediately returned. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling #Job#wait_until_done!. See also #copy.

Parameters

destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.
create (String) (defaults to: nil) —
Specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A 'notFound' error is raised if the table does not exist.
write (String) (defaults to: nil) —
Specifies how to handle data already present in the destination table. The default value is empty.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the destination table already contains data.
job_id (String) (defaults to: nil) — A user-defined ID for the copy job. The ID must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length is 1,024 characters. If job_id is provided, then prefix will not be used.

See Generating a job ID.
prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix daily_import_job_ can be given to generate a job ID such as daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh. The prefix must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length of the entire ID is 1,024 characters. If job_id is provided, then prefix will not be used.
labels (Hash) (defaults to: nil) —
A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.

The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
dryrun (Boolean) (defaults to: nil) — If set, don't actually run this job. Behavior is undefined however for non-query jobs and may result in an error. Deprecated.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.

Returns

(Google::Cloud::Bigquery::CopyJob)

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

copy_job = table.copy_job destination_table

Passing a string identifier for the destination table:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

copy_job = table.copy_job "other-project:other_dataset.other_table"

copy_job.wait_until_done!
copy_job.done? #=> true

#created_at

def created_at() -> Time, nil

The time when this table was created.

Returns

(Time, nil) — The creation time, or nil if the object is a reference (see #reference?).

#data

def data(token: nil, max: nil, start: nil) -> Google::Cloud::Bigquery::Data

Retrieves data from the table.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the data retrieval.

Parameters

token (String) (defaults to: nil) — Page token, returned by a previous call, identifying the result set.
max (Integer) (defaults to: nil) — Maximum number of results to return.
start (Integer) (defaults to: nil) — Zero-based index of the starting row to read.

Returns

(Google::Cloud::Bigquery::Data)

Examples

Paginate rows of data: (See Data#next)

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?

Retrieve all rows of data: (See Data#all)

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

data.all do |row|
  puts row[:name]
end

#dataset_id

def dataset_id() -> String

The ID of the Dataset containing this table.

Returns

(String) — The ID must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_). The maximum length is 1,024 characters.

#delete

def delete() -> Boolean

Permanently deletes the table.

Returns

(Boolean) — Returns true if the table was deleted.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.delete

#description

def description() -> String, nil

A user-friendly description of the table.

Returns

(String, nil) — The description, or nil if the object is a reference (see #reference?).

#description=

def description=(new_description)

Updates the user-friendly description of the table.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

new_description (String) — The new user-friendly description.

#enable_refresh=

def enable_refresh=(new_enable_refresh)

Sets whether automatic refresh of the materialized view is enabled. When true, the materialized view is updated when the base table is updated. See #materialized_view?.

Parameter

new_enable_refresh (Boolean) — true when automatic refresh is enabled, false otherwise.

#enable_refresh?

def enable_refresh?() -> Boolean, nil

Whether automatic refresh of the materialized view is enabled. When true, the materialized view is updated when the base table is updated. The default value is true. See #materialized_view?.

Returns

(Boolean, nil) — true when automatic refresh is enabled, false otherwise; or nil if not a materialized view.

#encryption

def encryption() -> EncryptionConfiguration, nil

The EncryptionConfiguration object that represents the custom encryption method used to protect the table. If not set, Dataset#default_encryption is used.

Present only if the table is using custom encryption.

Returns

(EncryptionConfiguration, nil) — The encryption configuration.

@!group Attributes

#encryption=

def encryption=(value)

Set the EncryptionConfiguration object that represents the custom encryption method used to protect the table. If not set, Dataset#default_encryption is used.

Present only if the table is using custom encryption.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

value (EncryptionConfiguration) — The new encryption config.

#etag

def etag() -> String, nil

The ETag hash of the table.

Returns

(String, nil) — The ETag hash, or nil if the object is a reference (see #reference?).

#exists?

def exists?(force: false) -> Boolean

Determines whether the table exists in the BigQuery service. The result is cached locally. To refresh state, set force to true.

Parameter

force (Boolean) (defaults to: false) — Force the latest resource representation to be retrieved from the BigQuery service when true. Otherwise the return value of this method will be memoized to reduce the number of API calls made to the BigQuery service. The default is false.

Returns

(Boolean) — true when the table exists in the BigQuery service, false otherwise.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true
table.exists? # true

#expires_at

def expires_at() -> Time, nil

The time when this table expires. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.

Returns

(Time, nil) — The expiration time, or nil if not present or the object is a reference (see #reference?).

#external

def external() -> External::DataSource, nil

The External::DataSource (or subclass) object that represents the external data source that the table represents. Data can be queried the table, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Present only if the table represents an External Data Source. See #external? and External::DataSource.

Returns

(External::DataSource, nil) — The external data source.

@!group Attributes

#external=

def external=(external)

Set the External::DataSource (or subclass) object that represents the external data source that the table represents. Data can be queried the table, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Use only if the table represents an External Data Source. See #external? and External::DataSource.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

external (External::DataSource) — An external data source.

#external?

def external?() -> Boolean, nil

Checks if the table's type is EXTERNAL, indicating that the table represents an External Data Source. See #external? and External::DataSource.

Returns

(Boolean, nil) — true when the type is EXTERNAL, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#extract

def extract(extract_url, format: nil, compression: nil, delimiter: nil, header: nil, &block) { |job| ... } -> Boolean

Extracts the data from the table to a Google Cloud Storage file using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #extract_job.

The geographic location for the job ("US", "EU", etc.) can be set via ExtractJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.

Parameters

extract_url (Google::Cloud::Storage::File, String, Array<String>) — The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data.
format (String) (defaults to: nil) —
The exported file format. The default value is csv.

The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
compression (String) (defaults to: nil) — The compression type to use for exported files. Possible values include GZIP and NONE. The default value is NONE.
delimiter (String) (defaults to: nil) — Delimiter to use between fields in the exported data. Default is ,.
header (Boolean) (defaults to: nil) — Whether to print out a header row in the results. Default is true.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::ExtractJob::Updater) — a job configuration object for setting additional options.

Returns

(Boolean) — Returns true if the extract operation succeeded.

Examples

Extract to a JSON file:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.extract "gs://my-bucket/file-name.json", format: "json"

Extract to a CSV file, attaching labels to the job:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.extract "gs://my-bucket/file-name.csv" do |extract|
  extract.labels = { "custom-label" => "custom-value" }
end

#extract_job

def extract_job(extract_url, format: nil, compression: nil, delimiter: nil, header: nil, job_id: nil, prefix: nil, labels: nil, dryrun: nil) { |job| ... } -> Google::Cloud::Bigquery::ExtractJob

Extracts the data from the table to a Google Cloud Storage file using an asynchronous method. In this method, an ExtractJob is immediately returned. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling #Job#wait_until_done!. See also #extract.

Parameters

extract_url (Google::Cloud::Storage::File, String, Array<String>) — The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data.
format (String) (defaults to: nil) —
The exported file format. The default value is csv.

The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
compression (String) (defaults to: nil) — The compression type to use for exported files. Possible values include GZIP and NONE. The default value is NONE.
delimiter (String) (defaults to: nil) — Delimiter to use between fields in the exported data. Default is ,.
header (Boolean) (defaults to: nil) — Whether to print out a header row in the results. Default is true.
job_id (String) (defaults to: nil) — A user-defined ID for the extract job. The ID must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length is 1,024 characters. If job_id is provided, then prefix will not be used.

See Generating a job ID.
prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix daily_import_job_ can be given to generate a job ID such as daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh. The prefix must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length of the entire ID is 1,024 characters. If job_id is provided, then prefix will not be used.
labels (Hash) (defaults to: nil) —
A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.

The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
dryrun (Boolean) (defaults to: nil) — If set, don't actually run this job. Behavior is undefined however for non-query jobs and may result in an error. Deprecated.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::ExtractJob::Updater) — a job configuration object for setting additional options.

Returns

(Google::Cloud::Bigquery::ExtractJob)

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true

#fields

def fields() -> Array<Schema::Field>, nil

The fields of the table, obtained from its schema.

Returns

(Array<Schema::Field>, nil) — An array of field objects.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.fields.each do |field|
  puts field.name
end

#headers

def headers() -> Array<Symbol>, nil

The names of the columns in the table, obtained from its schema.

Returns

(Array<Symbol>, nil) — An array of column names.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.headers.each do |header|
  puts header
end

#id

def id() -> String, nil

The combined Project ID, Dataset ID, and Table ID for this table, in the format specified by the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets. To use this value in queries see #query_id.

Returns

(String, nil) — The combined ID, or nil if the object is a reference (see #reference?).

#insert

def insert(rows, insert_ids: nil, skip_invalid: nil, ignore_unknown: nil) -> Google::Cloud::Bigquery::InsertResponse

Inserts data into the table for near-immediate querying, without the need to complete a load operation before the data can appear in query results.

Simple Ruby types are generally accepted per JSON rules, along with the following support for BigQuery's more complex types:

| BigQuery | Ruby | Notes | |--------------|--------------------------------------|----------------------------------------------------| | NUMERIC | BigDecimal | BigDecimal values will be rounded to scale 9. | | BIGNUMERIC | String | Pass as String to avoid rounding to scale 9. | | DATETIME | DateTime | DATETIME does not support time zone. | | DATE | Date | | | GEOGRAPHY | String | Well-known text (WKT) or GeoJSON. | | TIMESTAMP | Time | | | TIME | Google::Cloud::BigQuery::Time | | | BYTES | File, IO, StringIO, or similar | | | ARRAY | Array | Nested arrays, nil values are not supported. | | STRUCT | Hash | Hash keys may be strings or symbols. |

For GEOGRAPHY data, see Working with BigQuery GIS data.

Because BigQuery's streaming API is designed for high insertion rates, modifications to the underlying table metadata are eventually consistent when interacting with the streaming system. In most cases metadata changes are propagated within minutes, but during this period API responses may reflect the inconsistent state of the table.

The value :skip can be provided to skip the generation of IDs for all rows, or to skip the generation of an ID for a specific row in the array.

Parameters

rows (Hash, Array<Hash>) — A hash object or array of hash objects containing the data. Required. BigDecimal values will be rounded to scale 9 to conform with the BigQuery NUMERIC data type. To avoid rounding BIGNUMERIC type values with scale greater than 9, use String instead of BigDecimal.
insert_ids (Array<String|Symbol>, Symbol) (defaults to: nil) — A unique ID for each row. BigQuery uses this property to detect duplicate insertion requests on a best-effort basis. For more information, see data consistency. Optional. If not provided, the client library will assign a UUID to each row before the request is sent.
skip_invalid (Boolean) (defaults to: nil) — Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist.
ignore_unknown (Boolean) (defaults to: nil) — Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors.

Returns

(Google::Cloud::Bigquery::InsertResponse)

Raises

(ArgumentError)

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows

Avoid retrieving the dataset and table with skip_lookup:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset", skip_lookup: true
table = dataset.table "my_table", skip_lookup: true

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows

Pass BIGNUMERIC value as a string to avoid rounding to scale 9 in the conversion from BigDecimal:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

row = {
  "my_numeric" => BigDecimal("123456798.987654321"),
  "my_bignumeric" => "123456798.98765432100001" # BigDecimal would be rounded, use String instead!
}
table.insert row

#insert_async

def insert_async(skip_invalid: nil, ignore_unknown: nil, max_bytes: 10_000_000, max_rows: 500, interval: 10, threads: 4, &block) { |result| ... } -> Table::AsyncInserter

Create an asynchronous inserter object used to insert rows in batches.

Parameters

skip_invalid (Boolean) (defaults to: nil) — Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist.
ignore_unknown (Boolean) (defaults to: nil) — Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors.
max_rows (Integer) (defaults to: 500) — The maximum number of rows to be collected before the batch is published. Default is 500.

Yields

(response) — the callback for when a batch of rows is inserted

Yield Parameter

result (Table::AsyncInserter::Result) — the result of the asynchronous insert

Returns

(Table::AsyncInserter) — Returns inserter object.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
inserter = table.insert_async do |result|
  if result.error?
    log_error result.error
  else
    log_insert "inserted #{result.insert_count} rows " \
      "with #{result.error_count} errors"
  end
end

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
inserter.insert rows

inserter.stop.wait!

#labels

def labels() -> Hash<String, String>, nil

A hash of user-provided labels associated with this table. Labels are used to organize and group tables. See Using Labels.

The returned hash is frozen and changes are not allowed. Use #labels= to replace the entire hash.

Returns

(Hash<String, String>, nil) — A hash containing key/value pairs.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

labels = table.labels
labels["department"] #=> "shipping"

#labels=

def labels=(labels)

Updates the hash of user-provided labels associated with this table. Labels are used to organize and group tables. See Using Labels.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

labels (Hash<String, String>) —
A hash containing key/value pairs.

The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.labels = { "department" => "shipping" }

#last_refresh_time

def last_refresh_time() -> Time, nil

The time when the materialized view was last modified. See #materialized_view?.

Returns

(Time, nil) — The time, or nil if not present or not a materialized view.

#load

def load(files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, autodetect: nil, null_marker: nil, &block) { |updater| ... } -> Boolean

Loads data into the table. You can pass a google-cloud storage file path or a google-cloud storage file instance. Or, you can upload a file directly. See Loading Data with a POST Request.

The geographic location for the job ("US", "EU", etc.) can be set via LoadJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.

Parameters

files (File, Google::Cloud::Storage::File, String, URI, Array<Google::Cloud::Storage::File, String, URI>) — A file or the URI of a Google Cloud Storage file, or an Array of those, containing data to load into the table.
format (String) (defaults to: nil) —
The exported file format. The default value is csv.

The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
- orc - ORC
- parquet - Parquet
- datastore_backup - Cloud Datastore backup
create (String) (defaults to: nil) —
Specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A 'notFound' error is raised if the table does not exist.
write (String) (defaults to: nil) —
Specifies how to handle data already present in the table. The default value is append.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the table already contains data.
projection_fields (Array<String>) (defaults to: nil) — If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned.
jagged_rows (Boolean) (defaults to: nil) — Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
quoted_newlines (Boolean) (defaults to: nil) — Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
autodetect (Boolean) (defaults to: nil) — Indicates if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default value is false.
encoding (String) (defaults to: nil) — The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.
delimiter (String) (defaults to: nil) — Specifices the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. Default is ,.
ignore_unknown (Boolean) (defaults to: nil) —
Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false.

The format property determines what BigQuery treats as an extra value:
- CSV: Trailing columns
- JSON: Named values that don't match any column names
max_bad_records (Integer) (defaults to: nil) — The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
null_marker (String) (defaults to: nil) — Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
quote (String) (defaults to: nil) — The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ". If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
skip_leading (Integer) (defaults to: nil) — The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.

Yields

(updater) — A block for setting the schema of the destination table and other options for the load job. The schema can be omitted if the destination table already exists, or if you're loading data from a Google Cloud Datastore backup.

Yield Parameter

updater (Google::Cloud::Bigquery::LoadJob::Updater) — An updater to modify the load job and its schema.

Returns

(Boolean) — Returns true if the load job was successful.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

success = table.load "gs://my-bucket/file-name.csv"

Pass a google-cloud-storage File instance:

require "google/cloud/bigquery"
require "google/cloud/storage"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
success = table.load file

Pass a list of google-cloud-storage files:

require "google/cloud/bigquery"
require "google/cloud/storage"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
table.load [file, "gs://my-bucket/file-name2.csv"]

Upload a file directly:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

file = File.open "my_data.json"
success = table.load file do |j|
  j.format = "newline_delimited_json"
end

#load_job

def load_job(files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, job_id: nil, prefix: nil, labels: nil, autodetect: nil, null_marker: nil, dryrun: nil) { |load_job| ... } -> Google::Cloud::Bigquery::LoadJob

Loads data into the table. You can pass a google-cloud storage file path or a google-cloud storage file instance. Or, you can upload a file directly. See Loading Data with a POST Request.

Parameters

files (File, Google::Cloud::Storage::File, String, URI, Array<Google::Cloud::Storage::File, String, URI>) — A file or the URI of a Google Cloud Storage file, or an Array of those, containing data to load into the table.
format (String) (defaults to: nil) —
The exported file format. The default value is csv.

The following values are supported:
- csv - CSV
- json - Newline-delimited JSON
- avro - Avro
- orc - ORC
- parquet - Parquet
- datastore_backup - Cloud Datastore backup
create (String) (defaults to: nil) —
Specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A 'notFound' error is raised if the table does not exist.
write (String) (defaults to: nil) —
Specifies how to handle data already present in the table. The default value is append.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the table already contains data.
projection_fields (Array<String>) (defaults to: nil) — If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned.
jagged_rows (Boolean) (defaults to: nil) — Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
quoted_newlines (Boolean) (defaults to: nil) — Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
autodetect (Boolean) (defaults to: nil) — Indicates if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default value is false.
encoding (String) (defaults to: nil) — The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.
delimiter (String) (defaults to: nil) — Specifices the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. Default is ,.
ignore_unknown (Boolean) (defaults to: nil) —
Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false.

The format property determines what BigQuery treats as an extra value:
- CSV: Trailing columns
- JSON: Named values that don't match any column names
max_bad_records (Integer) (defaults to: nil) — The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
null_marker (String) (defaults to: nil) — Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
quote (String) (defaults to: nil) — The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ". If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
skip_leading (Integer) (defaults to: nil) — The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.
job_id (String) (defaults to: nil) — A user-defined ID for the load job. The ID must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length is 1,024 characters. If job_id is provided, then prefix will not be used.

See Generating a job ID.
prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix daily_import_job_ can be given to generate a job ID such as daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh. The prefix must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length of the entire ID is 1,024 characters. If job_id is provided, then prefix will not be used.
labels (Hash) (defaults to: nil) —
A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.

The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
dryrun (Boolean) (defaults to: nil) — If set, don't actually run this job. Behavior is undefined however for non-query jobs and may result in an error. Deprecated.

Yields

(load_job) — a block for setting the load job

Yield Parameter

load_job (LoadJob) — the load job object to be updated

Returns

(Google::Cloud::Bigquery::LoadJob)

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

load_job = table.load_job "gs://my-bucket/file-name.csv"

Pass a google-cloud-storage File instance:

require "google/cloud/bigquery"
require "google/cloud/storage"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
load_job = table.load_job file

Pass a list of google-cloud-storage files:

require "google/cloud/bigquery"
require "google/cloud/storage"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
load_job = table.load_job [file, "gs://my-bucket/file-name2.csv"]

Upload a file directly:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

file = File.open "my_data.csv"
load_job = table.load_job file

#location

def location() -> String, nil

The geographic location where the table should reside. Possible values include EU and US. The default value is US.

Returns

(String, nil) — The location code.

#materialized_view?

def materialized_view?() -> Boolean, nil

Checks if the table's type is MATERIALIZED_VIEW, indicating that the table represents a BigQuery materialized view. See Dataset#create_materialized_view.

Returns

(Boolean, nil) — true when the type is MATERIALIZED_VIEW, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#modified_at

def modified_at() -> Time, nil

The date when this table was last modified.

Returns

(Time, nil) — The last modified time, or nil if not present or the object is a reference (see #reference?).

#name

def name() -> String, nil

The name of the table.

Returns

(String, nil) — The friendly name, or nil if the object is a reference (see #reference?).

#name=

def name=(new_name)

Updates the name of the table.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

new_name (String) — The new friendly name.

#param_types

def param_types() -> Hash

The types of the fields in the table, obtained from its schema. Types use the same format as the optional query parameter types.

Returns

(Hash) — A hash with field names as keys, and types as values.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.param_types

#policy

def policy() -> Policy

Gets the Cloud IAM access control policy for the table. The latest policy will be read from the service. See also #update_policy.

Returns

(Policy) — The frozen policy for the table.

Raises

(ArgumentError)

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

policy = table.policy

policy.frozen? #=> true
binding_owner = policy.bindings.find { |b| b.role == "roles/owner" }
binding_owner.role #=> "roles/owner"
binding_owner.members #=> ["user:owner@example.com"]
binding_owner.frozen? #=> true
binding_owner.members.frozen? #=> true

#project_id

def project_id() -> String

The ID of the Project containing this table.

Returns

(String) — The project ID.

#query

def query() -> String, nil

The query that defines the view or materialized view. See #view? and #materialized_view?.

Returns

(String, nil) — The query that defines the view or materialized_view; or nil if not a view or materialized view.

#query=

def query=(new_query)

Updates the query that defines the view. (See #view?.) Not supported for materialized views.

This method sets the query using standard SQL. To specify legacy SQL or to use user-defined function resources for a view, use (#set_query) instead.

Parameter

new_query (String) — The query that defines the view.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.table "my_view"

view.query = "SELECT first_name FROM " \
             "`my_project.my_dataset.my_table`"

#query_id

def query_id(standard_sql: nil, legacy_sql: nil) -> String

The value returned by #id, wrapped in backticks (Standard SQL) or s quare brackets (Legacy SQL) to accommodate project IDs containing dashes. Useful in queries.

Parameters

standard_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's standard SQL dialect. Optional. The default value is true.
legacy_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's legacy SQL dialect. Optional. The default value is false.

Returns

(String) — The appropriate table ID for use in queries, depending on SQL type.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = bigquery.query "SELECT first_name FROM #{table.query_id}"

#query_legacy_sql?

def query_legacy_sql?() -> Boolean

Checks if the view's query is using legacy sql. See #view?.

Returns

(Boolean) — true when legacy sql is used, false otherwise; or nil if not a logical view.

#query_standard_sql?

def query_standard_sql?() -> Boolean

Checks if the view's query is using standard sql. See #view?.

Returns

(Boolean) — true when standard sql is used, false otherwise.

#query_udfs

def query_udfs() -> Array<String>, nil

The user-defined function resources used in the view's query. May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code. See User-Defined Functions. See #view?.

Returns

(Array<String>, nil) — An array containing Google Cloud Storage URIs and/or inline source code, or nil if not a logical view.

#range_partitioning?

def range_partitioning?() -> Boolean, nil

Checks if the table is range partitioned. See Creating and using integer range partitioned tables.

Returns

(Boolean, nil) — true when the table is range partitioned, or false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#range_partitioning_end

def range_partitioning_end() -> Integer, nil

The end of range partitioning, exclusive. See Creating and using integer range partitioned tables.

Returns

(Integer, nil) — The end of range partitioning, exclusive, or nil if not range partitioned or the object is a reference (see #reference?).

#range_partitioning_field

def range_partitioning_field() -> Integer, nil

The field on which the table is range partitioned, if any. The field must be a top-level NULLABLE/REQUIRED field. The only supported type is INTEGER/INT64. See Creating and using integer range partitioned tables.

Returns

(Integer, nil) — The range partition field, or nil if not range partitioned or the object is a reference (see #reference?).

#range_partitioning_interval

def range_partitioning_interval() -> Integer, nil

The width of each interval. See Creating and using integer range partitioned tables.

Returns

(Integer, nil) — The width of each interval, for data in range partitions, or nil if not range partitioned or the object is a reference (see #reference?).

#range_partitioning_start

def range_partitioning_start() -> Integer, nil

The start of range partitioning, inclusive. See Creating and using integer range partitioned tables.

Returns

(Integer, nil) — The start of range partitioning, inclusive, or nil if not range partitioned or the object is a reference (see #reference?).

#reference?

def reference?() -> Boolean

Whether the table was created without retrieving the resource representation from the BigQuery service.

Returns

(Boolean) — true when the table is just a local reference object, false otherwise.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true

table.reference? # true
table.reload!
table.reference? # false

#refresh!

def refresh!() -> Google::Cloud::Bigquery::Table

Alias Of: #reload!

Reloads the table with current data from the BigQuery service.

Returns

(Google::Cloud::Bigquery::Table) — Returns the reloaded table.

Example

Skip retrieving the table from the service, then load it:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true

table.reload!

#refresh_interval_ms

def refresh_interval_ms() -> Integer, nil

The maximum frequency in milliseconds at which the materialized view will be refreshed. See #materialized_view?.

Returns

(Integer, nil) — The maximum frequency in milliseconds; or nil if not a materialized view.

#refresh_interval_ms=

def refresh_interval_ms=(new_refresh_interval_ms)

Sets the maximum frequency at which the materialized view will be refreshed. See #materialized_view?.

Parameter

new_refresh_interval_ms (Integer) — The maximum frequency in milliseconds.

#reload!

def reload!() -> Google::Cloud::Bigquery::Table

Aliases

#refresh!

Reloads the table with current data from the BigQuery service.

Returns

(Google::Cloud::Bigquery::Table) — Returns the reloaded table.

Example

Skip retrieving the table from the service, then load it:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true

table.reload!

#require_partition_filter

def require_partition_filter() -> Boolean, nil

Whether queries over this table require a partition filter that can be used for partition elimination to be specified. See Partitioned Tables.

Returns

(Boolean, nil) — true when a partition filter will be required, false otherwise, or nil if the object is a reference (see #reference?).

#require_partition_filter=

def require_partition_filter=(new_require)

Sets whether queries over this table require a partition filter. See Partitioned Tables.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

new_require (Boolean) — Whether queries over this table require a partition filter.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table" do |t|
  t.require_partition_filter = true
end

#resource?

def resource?() -> Boolean

Whether the table was created with a resource representation from the BigQuery service.

Returns

(Boolean) — true when the table was created with a resource representation, false otherwise.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true

table.resource? # false
table.reload!
table.resource? # true

#resource_full?

def resource_full?() -> Boolean

Whether the table was created with a full resource representation from the BigQuery service.

Returns

(Boolean) — true when the table was created with a full resource representation, false otherwise.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.resource_full? # true

#resource_partial?

def resource_partial?() -> Boolean

Whether the table was created with a partial resource representation from the BigQuery service by retrieval through Dataset#tables. See Tables: list response for the contents of the partial representation. Accessing any attribute outside of the partial representation will result in loading the full representation.

Returns

(Boolean) — true when the table was created with a partial resource representation, false otherwise.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
table = dataset.tables.first

table.resource_partial? # true
table.description # Loads the full resource.
table.resource_partial? # false

#restore

def restore(destination_table, create: nil, write: nil, &block) { |job| ... } -> Boolean

Restore the data from the table to another table using a synchronous method that blocks for a response. The source table type is SNAPSHOT and the destination table type is TABLE. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.

Parameters

destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.
create (String) (defaults to: nil) —
Specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:
- needed - Create the table if it does not exist.
- never - The table must already exist. A 'notFound' error is raised if the table does not exist.
write (String) (defaults to: nil) —
Specifies how to handle data already present in the destination table. The default value is empty.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - An error will be returned if the destination table already contains data.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.

Returns

(Boolean) — Returns true if the copy operation succeeded.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.restore destination_table

Passing a string identifier for the destination table:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.restore "other-project:other_dataset.other_table"

#rows_count

def rows_count() -> Integer, nil

The number of rows in the table.

Returns

(Integer, nil) — The count of rows in the table, or nil if the object is a reference (see #reference?).

#schema

def schema(replace: false) { |schema| ... } -> Google::Cloud::Bigquery::Schema, nil

Returns the table's schema. If the table is not a view (See #view?), this method can also be used to set, replace, or add to the schema by passing a block. See Schema for available methods.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved.

Parameter

replace (Boolean) (defaults to: false) — Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. When a table already contains data, schema changes must be additive. Thus, the default value is false. When loading from a file this will always replace the schema, no matter what replace is set to. You can update the schema (for example, for a table that already contains data) by providing a schema file that includes the existing schema plus any new fields.

Yields

(schema) — a block for setting the schema

Yield Parameter

schema (Schema) — the object accepting the schema

Returns

(Google::Cloud::Bigquery::Schema, nil) — A frozen schema object.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

Load the schema from a file

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"
table.schema do |schema|
  schema.load File.open("schema.json")
end

#set_query

def set_query(query, standard_sql: nil, legacy_sql: nil, udfs: nil)

Updates the query that defines the view. (See #view?.) Not supported for materialized views.

Allows setting of standard vs. legacy SQL and user-defined function resources.

Parameters

query (String) — The query that defines the view.
standard_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's standard SQL dialect. Optional. The default value is true.
legacy_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's legacy SQL dialect. Optional. The default value is false.
udfs (Array<String>, String) (defaults to: nil) — User-defined function resources used in a legacy SQL query. Optional.

May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code.

This parameter is used for defining User Defined Function (UDF) resources only when using legacy SQL. Users of standard SQL should leverage either DDL (e.g. CREATE [TEMPORARY] FUNCTION ...) or the Routines API to define UDF resources.

For additional information on migrating, see: Migrating to standard SQL - Differences in user-defined JavaScript functions

Example

Update a view:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.table "my_view"

view.set_query "SELECT first_name FROM " \
               "`my_project.my_dataset.my_table`",
               standard_sql: true

#snapshot

def snapshot(destination_table, &block) { |job| ... } -> Boolean

Takes snapshot of the data from the table to another table using a synchronous method that blocks for a response. The source table type is TABLE and the destination table type is SNAPSHOT. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.

Parameter

destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.

Yields

(job) — a job configuration object

Yield Parameter

job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.

Returns

(Boolean) — Returns true if the copy operation succeeded.

Examples

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.snapshot destination_table

Passing a string identifier for the destination table:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.snapshot "other-project:other_dataset.other_table"

#snapshot?

def snapshot?() -> Boolean, nil

Checks if the table's type is SNAPSHOT, indicating that the table represents a BigQuery table snapshot.

Returns

(Boolean, nil) — true when the type is SNAPSHOT, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#snapshot_definition

def snapshot_definition() -> Google::Apis::BigqueryV2::SnapshotDefinition, nil

The Information about base table and snapshot time of the table.

Returns

(Google::Apis::BigqueryV2::SnapshotDefinition, nil) — Snapshot definition of table snapshot, or nil if not snapshot or the object is a reference (see #reference?).

#table?

def table?() -> Boolean, nil

Checks if the table's type is TABLE.

Returns

(Boolean, nil) — true when the type is TABLE, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#table_id

def table_id() -> String

A unique ID for this table.

Returns

(String) — The ID must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_). The maximum length is 1,024 characters.

#test_iam_permissions

def test_iam_permissions(*permissions) -> Array<String>

Tests the specified permissions against the Cloud IAM access control policy.

Parameter

permissions (String, Array<String>) — The set of permissions against which to check access. Permissions must be of the format bigquery.resource.capability. See https://cloud.google.com/bigquery/docs/access-control#bigquery.

Returns

(Array<String>) — The frozen array of permissions held by the caller.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

permissions = table.test_iam_permissions "bigquery.tables.get",
                                         "bigquery.tables.delete"
permissions.include? "bigquery.tables.get"    #=> true
permissions.include? "bigquery.tables.delete" #=> false

#time_partitioning?

def time_partitioning?() -> Boolean, nil

Checks if the table is time partitioned. See Partitioned Tables.

Returns

(Boolean, nil) — true when the table is time partitioned, or false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).

#time_partitioning_expiration

def time_partitioning_expiration() -> Integer, nil

The expiration for the time partitions, if any, in seconds. See Partitioned Tables.

Returns

(Integer, nil) — The expiration time, in seconds, for data in time partitions, or nil if not present or the object is a reference (see #reference?).

#time_partitioning_expiration=

def time_partitioning_expiration=(expiration)

Sets the time partition expiration for the table. See Partitioned Tables. The table must also be time partitioned.

See #time_partitioning_type=.

If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameter

expiration (Integer, nil) — An expiration time, in seconds, for data in time partitions, , or nil to indicate no expiration time.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type = "DAY"
  t.time_partitioning_field = "dob"
  t.time_partitioning_expiration = 86_400
end

#time_partitioning_field

def time_partitioning_field() -> String, nil

The field on which the table is time partitioned, if any. If not set, the destination table is time partitioned by pseudo column _PARTITIONTIME; if set, the table is time partitioned by this field. See Partitioned Tables.

Returns

(String, nil) — The time partition field, if a field was configured. nil if not time partitioned, not set (time partitioned by pseudo column '_PARTITIONTIME') or the object is a reference (see #reference?).

#time_partitioning_field=

def time_partitioning_field=(field)

Sets the field on which to time partition the table. If not set, the destination table is time partitioned by pseudo column _PARTITIONTIME; if set, the table is time partitioned by this field. See Partitioned Tables. The table must also be time partitioned.

See #time_partitioning_type=.

You can only set the time partitioning field while creating a table as in the example below. BigQuery does not allow you to change time partitioning on an existing table.

Parameter

field (String) — The time partition field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type  = "DAY"
  t.time_partitioning_field = "dob"
end

#time_partitioning_type

def time_partitioning_type() -> String, nil

The period for which the table is time partitioned, if any. See Partitioned Tables.

Returns

(String, nil) — The time partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively; or nil if not set or the object is a reference (see #reference?).

#time_partitioning_type=

def time_partitioning_type=(type)

Sets the time partitioning type for the table. See Partitioned Tables. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

You can only set time partitioning when creating a table as in the example below. BigQuery does not allow you to change time partitioning on an existing table.

Parameter

type (String) — The time partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type  = "DAY"
  t.time_partitioning_field = "dob"
end

#type

def type() -> String, nil

The type of the table like if its a TABLE, VIEW or SNAPSHOT etc.,

Returns

(String, nil) — Type of the table, or nil if the object is a reference (see #reference?).

#update_policy

def update_policy() { |policy| ... } -> Policy

Updates the Cloud IAM access control policy for the table. The latest policy will be read from the service. See also #policy.

Yields

(policy) — A block for updating the policy. The latest policy will be read from the service and passed to the block. After the block completes, the modified policy will be written to the service.

Yield Parameter

policy (Policy) — The mutable Policy for the table.

Returns

(Policy) — The updated and frozen policy for the table.

Raises

(ArgumentError)

Example

Update the policy by passing a block.

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.update_policy do |p|
  p.grant role: "roles/viewer", members: "user:viewer@example.com"
  p.revoke role: "roles/editor", members: "user:editor@example.com"
  p.revoke role: "roles/owner"
end # 2 API calls

#view?

def view?() -> Boolean, nil

Checks if the table's type is VIEW, indicating that the table represents a BigQuery logical view. See Dataset#create_view.

Returns

(Boolean, nil) — true when the type is VIEW, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).