Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::Table.
Table
A named resource representing a BigQuery table that holds zero or more records. Every table is defined by a schema that may contain nested and repeated fields.
The Table class can also represent a logical view, which is a virtual table defined by a SQL query (see #view? and Dataset#create_view); or a materialized view, which is a precomputed view that periodically caches results of a query for increased performance and efficiency (see #materialized_view? and Dataset#create_materialized_view).
Inherits
- Object
Examples
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table" do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

row = {
  "first_name" => "Alice",
  "cities_lived" => [
    { "place" => "Seattle", "number_of_years" => 5 },
    { "place" => "Stockholm", "number_of_years" => 6 }
  ]
}
table.insert row
```
Creating a logical view:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.create_view "my_view",
                           "SELECT name, age FROM `my_project.my_dataset.my_table`"
view.view? # true
```
Creating a materialized view:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.create_materialized_view "my_materialized_view",
                                        "SELECT name, age FROM `my_project.my_dataset.my_table`"
view.materialized_view? # true
```
Methods
#api_url
def api_url() -> String, nil
A URL that can be used to access the table using the REST API.
- (String, nil) — A REST URL for the resource, or `nil` if the object is a reference (see #reference?).
#buffer_bytes
def buffer_bytes() -> Integer, nil
A lower-bound estimate of the number of bytes currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.
- (Integer, nil) — The estimated number of bytes in the buffer, or `nil` if not present or the object is a reference (see #reference?).
#buffer_oldest_at
def buffer_oldest_at() -> Time, nil
The time of the oldest entry currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.
- (Time, nil) — The oldest entry time, or `nil` if not present or the object is a reference (see #reference?).
#buffer_rows
def buffer_rows() -> Integer, nil
A lower-bound estimate of the number of rows currently in this table's streaming buffer, if one is present. This field will be absent if the table is not being streamed to or if there is no data in the streaming buffer.
- (Integer, nil) — The estimated number of rows in the buffer, or `nil` if not present or the object is a reference (see #reference?).
#bytes_count
def bytes_count() -> Integer, nil
The number of bytes in the table.
- (Integer, nil) — The count of bytes in the table, or `nil` if the object is a reference (see #reference?).
#clone
def clone(destination_table, &block) { |job| ... } -> Boolean
Clones the data from the table to another table using a synchronous method that blocks for a response. The source and destination table have the same table type, but only bill for unique data. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.
The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- destination_table (Table, String) — The destination for the cloned data. This can also be a string identifier as specified by the Standard SQL Query Reference (`project-name.dataset_id.table_id`) or the Legacy SQL Query Reference (`project-name:dataset_id.table_id`). This is useful for referencing tables in other projects and datasets.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.
- (Boolean) — Returns `true` if the clone operation succeeded.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.clone destination_table
```
Passing a string identifier for the destination table:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.clone "other-project:other_dataset.other_table"
```
#clone?
def clone?() -> Boolean, nil
Checks if the table's type is `CLONE`, indicating that the table represents a BigQuery table clone.
- (Boolean, nil) — `true` when the type is `CLONE`, `false` otherwise, if the object is a resource (see #resource?); `nil` if the object is a reference (see #reference?).
#clone_definition
def clone_definition() -> Google::Apis::BigqueryV2::CloneDefinition, nil
Information about the base table and the clone time of a table clone.
- (Google::Apis::BigqueryV2::CloneDefinition, nil) — The clone definition of the table clone, or `nil` if the table is not a clone or the object is a reference (see #reference?).
#clustering?
def clustering?() -> Boolean, nil
Checks if the table is clustered.
See Updater#clustering_fields=, #clustering_fields and #clustering_fields=.
- (Boolean, nil) — `true` when the table is clustered, or `false` otherwise, if the object is a resource (see #resource?); `nil` if the object is a reference (see #reference?).
#clustering_fields
def clustering_fields() -> Array<String>, nil
One or more fields on which data should be clustered. When specified together with time partitioning, data in the table is first partitioned and subsequently clustered. The order of the returned fields determines the sort order of the data.
BigQuery supports clustering for both partitioned and non-partitioned tables.
See Updater#clustering_fields=, #clustering_fields= and #clustering?.
- (Array<String>, nil) — The clustering fields, or `nil` if the table is not clustered or if the table is a reference (see #reference?).
#clustering_fields=
def clustering_fields=(fields)
Updates the list of fields on which data should be clustered.
Only top-level, non-repeated, simple-type fields are supported. When you cluster a table using multiple columns, the order of columns you specify is important. The order of the specified columns determines the sort order of the data.
BigQuery supports clustering for both partitioned and non-partitioned tables.
See Updater#clustering_fields=, #clustering_fields and #clustering?.
- fields (Array<String>, nil) — The clustering fields, or `nil` to remove the clustering configuration. Only top-level, non-repeated, simple-type fields are supported.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.clustering_fields = ["last_name", "first_name"]
```
#copy
def copy(destination_table, create: nil, write: nil, &block) { |job| ... } -> Boolean
Copies the data from the table to another table using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.
The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (`project-name.dataset_id.table_id`) or the Legacy SQL Query Reference (`project-name:dataset_id.table_id`). This is useful for referencing tables in other projects and datasets.
- create (String) (defaults to: nil) — Specifies whether the job is allowed to create new tables. The default value is `needed`. The following values are supported:
  - `needed` - Create the table if it does not exist.
  - `never` - The table must already exist. A 'notFound' error is raised if the table does not exist.
- write (String) (defaults to: nil) — Specifies how to handle data already present in the destination table. The default value is `empty`. The following values are supported:
  - `truncate` - BigQuery overwrites the table data.
  - `append` - BigQuery appends the data to the table.
  - `empty` - An error will be returned if the destination table already contains data.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.
- (Boolean) — Returns `true` if the copy operation succeeded.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.copy destination_table
```
Passing a string identifier for the destination table:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.copy "other-project:other_dataset.other_table"
```
#copy_job
def copy_job(destination_table, create: nil, write: nil, job_id: nil, prefix: nil, labels: nil, dryrun: nil, operation_type: nil) { |job| ... } -> Google::Cloud::Bigquery::CopyJob
Copies the data from the table to another table using an asynchronous method. In this method, a CopyJob is immediately returned. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling Job#wait_until_done!. See also #copy.
The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (`project-name.dataset_id.table_id`) or the Legacy SQL Query Reference (`project-name:dataset_id.table_id`). This is useful for referencing tables in other projects and datasets.
- create (String) (defaults to: nil) — Specifies whether the job is allowed to create new tables. The default value is `needed`. The following values are supported:
  - `needed` - Create the table if it does not exist.
  - `never` - The table must already exist. A 'notFound' error is raised if the table does not exist.
- write (String) (defaults to: nil) — Specifies how to handle data already present in the destination table. The default value is `empty`. The following values are supported:
  - `truncate` - BigQuery overwrites the table data.
  - `append` - BigQuery appends the data to the table.
  - `empty` - An error will be returned if the destination table already contains data.
- job_id (String) (defaults to: nil) — A user-defined ID for the copy job. The ID must contain only letters (`[A-Za-z]`), numbers (`[0-9]`), underscores (`_`), or dashes (`-`). The maximum length is 1,024 characters. If `job_id` is provided, then `prefix` will not be used. See Generating a job ID.
- prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix `daily_import_job_` can be given to generate a job ID such as `daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh`. The prefix must contain only letters (`[A-Za-z]`), numbers (`[0-9]`), underscores (`_`), or dashes (`-`). The maximum length of the entire ID is 1,024 characters. If `job_id` is provided, then `prefix` will not be used.
- labels (Hash) (defaults to: nil) — A hash of user-provided labels associated with the job. You can use these to organize and group your jobs. The labels applied to a resource must meet the following requirements:
  - Each resource can have multiple labels, up to a maximum of 64.
  - Each label must be a key-value pair.
  - Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
  - Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
  - The key portion of a label must be unique. However, you can use the same key with multiple resources.
  - Keys must start with a lowercase letter or international character.
- dryrun (Boolean) (defaults to: nil) — If set, don't actually run this job. Behavior is undefined for non-query jobs, however, and may result in an error. Deprecated.
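The job ID and label rules above can be checked client-side before a job is submitted. The sketch below is a hypothetical illustration of those documented constraints; the helper names are not part of the library, and for brevity the label patterns omit the allowance for international characters:

```ruby
# Hypothetical validators mirroring the documented rules (not library API).
JOB_ID_PATTERN = /\A[A-Za-z0-9_-]{1,1024}\z/ # letters, numbers, _ or -, <= 1,024 chars

def valid_job_id?(job_id)
  !!(job_id =~ JOB_ID_PATTERN)
end

# Simplified: real label keys and values may also contain international characters.
LABEL_KEY_PATTERN   = /\A[a-z][a-z0-9_-]{0,62}\z/ # 1-63 chars, starts with a lowercase letter
LABEL_VALUE_PATTERN = /\A[a-z0-9_-]{0,63}\z/      # may be empty, up to 63 chars

def valid_labels?(labels)
  labels.size <= 64 && # at most 64 labels per resource
    labels.all? { |key, value| key =~ LABEL_KEY_PATTERN && value =~ LABEL_VALUE_PATTERN }
end
```

A prefix such as `daily_import_job_` passes the same check, since the generated suffix also draws from the allowed character set.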
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

copy_job = table.copy_job destination_table
```
Passing a string identifier for the destination table:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

copy_job = table.copy_job "other-project:other_dataset.other_table"
copy_job.wait_until_done!
copy_job.done? #=> true
```
#created_at
def created_at() -> Time, nil
The time when this table was created.
- (Time, nil) — The creation time, or `nil` if the object is a reference (see #reference?).
#data
def data(token: nil, max: nil, start: nil) -> Google::Cloud::Bigquery::Data
Retrieves data from the table.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the data retrieval.
- token (String) (defaults to: nil) — Page token, returned by a previous call, identifying the result set.
- max (Integer) (defaults to: nil) — Maximum number of results to return.
- start (Integer) (defaults to: nil) — Zero-based index of the starting row to read.
Paginate rows of data: (See Data#next)
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?
```
Retrieve all rows of data: (See Data#all)
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data
data.all do |row|
  puts row[:name]
end
```
#dataset_id
def dataset_id() -> String
The ID of the Dataset containing this table.
- (String) — The ID must contain only letters (`[A-Za-z]`), numbers (`[0-9]`), or underscores (`_`). The maximum length is 1,024 characters.
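As a quick illustration of that format, a hypothetical validator (not part of the library) could be:

```ruby
# Hypothetical check reflecting the documented ID constraints:
# letters, numbers, or underscores only, at most 1,024 characters.
def valid_dataset_id?(id)
  !!(id =~ /\A[A-Za-z0-9_]{1,1024}\z/)
end

valid_dataset_id? "my_dataset" # true
valid_dataset_id? "my-dataset" # false (dashes are not allowed here)
```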
#delete
def delete() -> Boolean
Permanently deletes the table.
- (Boolean) — Returns `true` if the table was deleted.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.delete
```
#description
def description() -> String, nil
A user-friendly description of the table.
- (String, nil) — The description, or `nil` if the object is a reference (see #reference?).
#description=
def description=(new_description)
Updates the user-friendly description of the table.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- new_description (String) — The new user-friendly description.
#enable_refresh=
def enable_refresh=(new_enable_refresh)
Sets whether automatic refresh of the materialized view is enabled. When true, the materialized view is updated when the base table is updated. See #materialized_view?.
- new_enable_refresh (Boolean) — `true` when automatic refresh is enabled, `false` otherwise.
#enable_refresh?
def enable_refresh?() -> Boolean, nil
Whether automatic refresh of the materialized view is enabled. When true, the materialized view is updated when the base table is updated. The default value is true. See #materialized_view?.
- (Boolean, nil) — `true` when automatic refresh is enabled, `false` otherwise; or `nil` if not a materialized view.
#encryption
def encryption() -> EncryptionConfiguration, nil
The EncryptionConfiguration object that represents the custom encryption method used to protect the table. If not set, Dataset#default_encryption is used.
Present only if the table is using custom encryption.
- (EncryptionConfiguration, nil) — The encryption configuration.
#encryption=
def encryption=(value)
Set the EncryptionConfiguration object that represents the custom encryption method used to protect the table. If not set, Dataset#default_encryption is used.
Present only if the table is using custom encryption.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- value (EncryptionConfiguration) — The new encryption config.
#etag
def etag() -> String, nil
The ETag hash of the table.
- (String, nil) — The ETag hash, or `nil` if the object is a reference (see #reference?).
#exists?
def exists?(force: false) -> Boolean
Determines whether the table exists in the BigQuery service. The result is cached locally. To refresh state, set `force` to `true`.
- force (Boolean) (defaults to: false) — Force the latest resource representation to be retrieved from the BigQuery service when `true`. Otherwise the return value of this method will be memoized to reduce the number of API calls made to the BigQuery service. The default is `false`.
- (Boolean) — `true` when the table exists in the BigQuery service, `false` otherwise.
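The caching behavior described for `force` is a common memoization pattern. A minimal standalone sketch of that pattern (not the library's implementation; the class name is illustrative):

```ruby
# Cache the first lookup; hit the "service" again only when force: true.
class CachedExistenceCheck
  def initialize(&lookup)
    @lookup = lookup # e.g. an API call returning true or false
    @exists = nil
  end

  def exists?(force: false)
    return @exists unless force || @exists.nil?
    @exists = @lookup.call
  end
end

calls = 0
check = CachedExistenceCheck.new { calls += 1; true }
check.exists?               # first call invokes the lookup
check.exists?               # served from the cached value
check.exists?(force: true)  # bypasses the cache
# calls == 2
```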
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table", skip_lookup: true

table.exists? # true
```
#expires_at
def expires_at() -> Time, nil
The time when this table expires. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.
- (Time, nil) — The expiration time, or `nil` if not present or the object is a reference (see #reference?).
#external
def external() -> External::DataSource, nil
The External::DataSource (or subclass) object that represents the external data source that the table represents. Data can be queried through the table, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
Present only if the table represents an External Data Source. See #external? and External::DataSource.
- (External::DataSource, nil) — The external data source.
#external=
def external=(external)
Set the External::DataSource (or subclass) object that represents the external data source that the table represents. Data can be queried through the table, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
Use only if the table represents an External Data Source. See #external? and External::DataSource.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- external (External::DataSource) — An external data source.
#external?
def external?() -> Boolean, nil
Checks if the table's type is `EXTERNAL`, indicating that the table represents an External Data Source. See #external? and External::DataSource.
- (Boolean, nil) — `true` when the type is `EXTERNAL`, `false` otherwise, if the object is a resource (see #resource?); `nil` if the object is a reference (see #reference?).
#extract
def extract(extract_url, format: nil, compression: nil, delimiter: nil, header: nil, &block) { |job| ... } -> Boolean
Extracts the data from the table to a Google Cloud Storage file using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #extract_job.
The geographic location for the job ("US", "EU", etc.) can be set via ExtractJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- extract_url (Google::Cloud::Storage::File, String, Array<String>) — The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data.
- format (String) (defaults to: nil) — The exported file format. The default value is `csv`. The following values are supported:
  - `csv` - CSV
  - `json` - Newline-delimited JSON
  - `avro` - Avro
- compression (String) (defaults to: nil) — The compression type to use for exported files. Possible values include `GZIP` and `NONE`. The default value is `NONE`.
- delimiter (String) (defaults to: nil) — Delimiter to use between fields in the exported data. Default is `,`.
- header (Boolean) (defaults to: nil) — Whether to print out a header row in the results. Default is `true`.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::ExtractJob::Updater) — a job configuration object for setting additional options.
- (Boolean) — Returns `true` if the extract operation succeeded.
Extract to a JSON file:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.extract "gs://my-bucket/file-name.json", format: "json"
```
Extract to a CSV file, attaching labels to the job:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.extract "gs://my-bucket/file-name.csv" do |extract|
  extract.labels = { "custom-label" => "custom-value" }
end
```
#extract_job
def extract_job(extract_url, format: nil, compression: nil, delimiter: nil, header: nil, job_id: nil, prefix: nil, labels: nil, dryrun: nil) { |job| ... } -> Google::Cloud::Bigquery::ExtractJob
Extracts the data from the table to a Google Cloud Storage file using an asynchronous method. In this method, an ExtractJob is immediately returned. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling Job#wait_until_done!. See also #extract.
The geographic location for the job ("US", "EU", etc.) can be set via ExtractJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will automatically be set to the location of the table.
- extract_url (Google::Cloud::Storage::File, String, Array<String>) — The Google Storage file or file URI pattern(s) to which BigQuery should extract the table data.
- format (String) (defaults to: nil) — The exported file format. The default value is `csv`. The following values are supported:
  - `csv` - CSV
  - `json` - Newline-delimited JSON
  - `avro` - Avro
- compression (String) (defaults to: nil) — The compression type to use for exported files. Possible values include `GZIP` and `NONE`. The default value is `NONE`.
- delimiter (String) (defaults to: nil) — Delimiter to use between fields in the exported data. Default is `,`.
- header (Boolean) (defaults to: nil) — Whether to print out a header row in the results. Default is `true`.
- job_id (String) (defaults to: nil) — A user-defined ID for the extract job. The ID must contain only letters (`[A-Za-z]`), numbers (`[0-9]`), underscores (`_`), or dashes (`-`). The maximum length is 1,024 characters. If `job_id` is provided, then `prefix` will not be used. See Generating a job ID.
- prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix `daily_import_job_` can be given to generate a job ID such as `daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh`. The prefix must contain only letters (`[A-Za-z]`), numbers (`[0-9]`), underscores (`_`), or dashes (`-`). The maximum length of the entire ID is 1,024 characters. If `job_id` is provided, then `prefix` will not be used.
- labels (Hash) (defaults to: nil) — A hash of user-provided labels associated with the job. You can use these to organize and group your jobs. The labels applied to a resource must meet the following requirements:
  - Each resource can have multiple labels, up to a maximum of 64.
  - Each label must be a key-value pair.
  - Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
  - Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
  - The key portion of a label must be unique. However, you can use the same key with multiple resources.
  - Keys must start with a lowercase letter or international character.
- dryrun (Boolean) (defaults to: nil) — If set, don't actually run this job. Behavior is undefined for non-query jobs, however, and may result in an error. Deprecated.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::ExtractJob::Updater) — a job configuration object for setting additional options.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json", format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true
```
#fields
def fields() -> Array<Schema::Field>, nil
The fields of the table, obtained from its schema.
- (Array<Schema::Field>, nil) — An array of field objects.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.fields.each do |field|
  puts field.name
end
```
#headers
def headers() -> Array<Symbol>, nil
The names of the columns in the table, obtained from its schema.
- (Array<Symbol>, nil) — An array of column names.
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.headers.each do |header|
  puts header
end
```
#id
def id() -> String, nil
The combined Project ID, Dataset ID, and Table ID for this table, in the format specified by the Legacy SQL Query Reference (`project-name:dataset_id.table_id`). This is useful for referencing tables in other projects and datasets. To use this value in queries see #query_id.
- (String, nil) — The combined ID, or `nil` if the object is a reference (see #reference?).
#insert
def insert(rows, insert_ids: nil, skip_invalid: nil, ignore_unknown: nil) -> Google::Cloud::Bigquery::InsertResponse
Inserts data into the table for near-immediate querying, without the need to complete a load operation before the data can appear in query results.
Simple Ruby types are generally accepted per JSON rules, along with the following support for BigQuery's more complex types:
| BigQuery     | Ruby                                 | Notes                                             |
|--------------|--------------------------------------|---------------------------------------------------|
| `NUMERIC`    | `BigDecimal`                         | `BigDecimal` values will be rounded to scale 9.   |
| `BIGNUMERIC` | `String`                             | Pass as `String` to avoid rounding to scale 9.    |
| `DATETIME`   | `DateTime`                           | `DATETIME` does not support time zone.            |
| `DATE`       | `Date`                               |                                                   |
| `GEOGRAPHY`  | `String`                             | Well-known text (WKT) or GeoJSON.                 |
| `TIMESTAMP`  | `Time`                               |                                                   |
| `TIME`       | `Google::Cloud::BigQuery::Time`      |                                                   |
| `BYTES`      | `File`, `IO`, `StringIO`, or similar |                                                   |
| `ARRAY`      | `Array`                              | Nested arrays and `nil` values are not supported. |
| `STRUCT`     | `Hash`                               | Hash keys may be strings or symbols.              |

For `GEOGRAPHY` data, see Working with BigQuery GIS data.
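The `NUMERIC` rounding rule in the table above can be observed with plain `BigDecimal`, with no BigQuery call involved. This standalone sketch shows why a value with scale greater than 9 should be passed as a `String` for `BIGNUMERIC`:

```ruby
require "bigdecimal"

value = BigDecimal("123456798.987654321987")

# NUMERIC: BigDecimal values are rounded to scale 9 on insert.
rounded = value.round(9)
rounded.to_s("F") # "123456798.987654322" -- the extra digits are lost

# BIGNUMERIC: pass a String to preserve scale greater than 9.
bignumeric = "123456798.987654321987"
```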
Because BigQuery's streaming API is designed for high insertion rates, modifications to the underlying table metadata are eventually consistent when interacting with the streaming system. In most cases metadata changes are propagated within minutes, but during this period API responses may reflect the inconsistent state of the table.
For the `insert_ids` option, the value `:skip` can be provided to skip the generation of IDs for all rows, or to skip the generation of an ID for a specific row in the array.
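For illustration, a hypothetical `insert_ids` array might pair an explicitly generated UUID for one row (a UUID per row is the client's default behavior) with `:skip` for a row that should get no ID:

```ruby
require "securerandom"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob",   "age" => 22 }
]

# One entry per row: a UUID for Alice's row, no ID generated for Bob's row.
insert_ids = [SecureRandom.uuid, :skip]

# table.insert rows, insert_ids: insert_ids
```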
- rows (Hash, Array<Hash>) — A hash object or array of hash objects containing the data. Required. `BigDecimal` values will be rounded to scale 9 to conform with the BigQuery `NUMERIC` data type. To avoid rounding `BIGNUMERIC` type values with scale greater than 9, use `String` instead of `BigDecimal`.
- insert_ids (Array<String|Symbol>, Symbol) (defaults to: nil) — A unique ID for each row. BigQuery uses this property to detect duplicate insertion requests on a best-effort basis. For more information, see data consistency. Optional. If not provided, the client library will assign a UUID to each row before the request is sent.
- skip_invalid (Boolean) (defaults to: nil) — Insert all valid rows of a request, even if invalid rows exist. The default value is `false`, which causes the entire request to fail if any invalid rows exist.
- ignore_unknown (Boolean) (defaults to: nil) — Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is `false`, which treats unknown values as errors.
- (ArgumentError)
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows
```
Avoid retrieving the dataset and table with `skip_lookup`:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset", skip_lookup: true
table = dataset.table "my_table", skip_lookup: true

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows
```
Pass a `BIGNUMERIC` value as a string to avoid rounding to scale 9 in the conversion from `BigDecimal`:
```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

row = {
  "my_numeric" => BigDecimal("123456798.987654321"),
  "my_bignumeric" => "123456798.98765432100001" # BigDecimal would be rounded, use String instead!
}
table.insert row
```
#insert_async
def insert_async(skip_invalid: nil, ignore_unknown: nil, max_bytes: 10_000_000, max_rows: 500, interval: 10, threads: 4, &block) { |result| ... } -> Table::AsyncInserter
Create an asynchronous inserter object used to insert rows in batches.
- skip_invalid (Boolean) (defaults to: nil) — Insert all valid rows of a request, even if invalid rows exist. The default value is `false`, which causes the entire request to fail if any invalid rows exist.
- ignore_unknown (Boolean) (defaults to: nil) — Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is `false`, which treats unknown values as errors.
- max_rows (Integer) (defaults to: 500) — The maximum number of rows to be collected before the batch is published. Default is 500.
- (response) — the callback for when a batch of rows is inserted
- result (Table::AsyncInserter::Result) — the result of the asynchronous insert
- (Table::AsyncInserter) — Returns the inserter object.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" inserter = table.insert_async do |result| if result.error? log_error result.error else log_insert "inserted #{result.insert_count} rows " \ "with #{result.error_count} errors" end end rows = [ { "first_name" => "Alice", "age" => 21 }, { "first_name" => "Bob", "age" => 22 } ] inserter.insert rows inserter.stop.wait!
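The batching behavior controlled by max_rows and max_bytes can be sketched as a local flush check. This is a hypothetical helper illustrating the documented thresholds, not the AsyncInserter implementation:

```ruby
# Decide whether a pending batch should be published, mirroring the
# max_rows / max_bytes defaults documented above (500 rows, 10 MB).
def flush_batch?(row_count, byte_size, max_rows: 500, max_bytes: 10_000_000)
  row_count >= max_rows || byte_size >= max_bytes
end

flush_batch?(499, 1_000)     #=> false
flush_batch?(500, 1_000)     #=> true (row threshold reached)
flush_batch?(10, 10_000_000) #=> true (byte threshold reached)
```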
#labels
def labels() -> Hash<String, String>, nil
A hash of user-provided labels associated with this table. Labels are used to organize and group tables. See Using Labels.
The returned hash is frozen and changes are not allowed. Use #labels= to replace the entire hash.
- (Hash<String, String>, nil) — A hash containing key/value pairs.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" labels = table.labels labels["department"] #=> "shipping"
#labels=
def labels=(labels)
Updates the hash of user-provided labels associated with this table. Labels are used to organize and group tables. See Using Labels.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
-
labels (Hash<String, String>) —
A hash containing key/value pairs.
The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" table.labels = { "department" => "shipping" }
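The label requirements above can be checked client-side before calling #labels=. A hypothetical sketch of the rules, simplified to ASCII (the service also allows international characters); this is not part of the library:

```ruby
# Check a labels hash against the documented rules: at most 64 labels,
# keys of 1-63 characters starting with a lowercase letter, values of
# 0-63 characters, both limited here to lowercase letters, digits,
# underscores, and dashes.
def valid_labels?(labels)
  return false if labels.size > 64
  key_re   = /\A[a-z][a-z0-9_-]{0,62}\z/
  value_re = /\A[a-z0-9_-]{0,63}\z/
  labels.all? { |k, v| k.match?(key_re) && v.match?(value_re) }
end

valid_labels?("department" => "shipping") #=> true
valid_labels?("Department" => "shipping") #=> false (uppercase key)
```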
#last_refresh_time
def last_refresh_time() -> Time, nil
The time when the materialized view was last modified. See #materialized_view?.
- (Time, nil) — The time, or nil if not present or not a materialized view.
#load
def load(files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, autodetect: nil, null_marker: nil, &block) { |updater| ... } -> Boolean
Loads data into the table. You can pass a google-cloud storage file path or a google-cloud storage file instance. Or, you can upload a file directly. See Loading Data with a POST Request.
The geographic location for the job ("US", "EU", etc.) can be set via LoadJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- files (File, Google::Cloud::Storage::File, String, URI, Array<Google::Cloud::Storage::File, String, URI>) — A file or the URI of a Google Cloud Storage file, or an Array of those, containing data to load into the table.
- format (String) (defaults to: nil) — The format of the data to load. The default value is csv. The following values are supported:
  - csv - CSV
  - json - Newline-delimited JSON
  - avro - Avro
  - orc - ORC
  - parquet - Parquet
  - datastore_backup - Cloud Datastore backup
- create (String) (defaults to: nil) — Specifies whether the job is allowed to create new tables. The default value is needed. The following values are supported:
  - needed - Create the table if it does not exist.
  - never - The table must already exist. A 'notFound' error is raised if the table does not exist.
- write (String) (defaults to: nil) — Specifies how to handle data already present in the table. The default value is append. The following values are supported:
  - truncate - BigQuery overwrites the table data.
  - append - BigQuery appends the data to the table.
  - empty - An error will be returned if the table already contains data.
- projection_fields (Array<String>) (defaults to: nil) — If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned.
- jagged_rows (Boolean) (defaults to: nil) — Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV; ignored for other formats.
- quoted_newlines (Boolean) (defaults to: nil) — Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
- autodetect (Boolean) (defaults to: nil) — Indicates if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default value is false.
- encoding (String) (defaults to: nil) — The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.
- delimiter (String) (defaults to: nil) — Specifies the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. Default is ,.
- ignore_unknown (Boolean) (defaults to: nil) — Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. The format property determines what BigQuery treats as an extra value:
  - CSV: Trailing columns
  - JSON: Named values that don't match any column names
- max_bad_records (Integer) (defaults to: nil) — The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
- null_marker (String) (defaults to: nil) — Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
- quote (String) (defaults to: nil) — The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ("). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
- skip_leading (Integer) (defaults to: nil) — The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.
- (updater) — A block for setting the schema of the destination table and other options for the load job. The schema can be omitted if the destination table already exists, or if you're loading data from a Google Cloud Datastore backup.
- updater (Google::Cloud::Bigquery::LoadJob::Updater) — An updater to modify the load job and its schema.
- (Boolean) — Returns true if the load job was successful.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" success = table.load "gs://my-bucket/file-name.csv"
Pass a google-cloud-storage File instance:
require "google/cloud/bigquery" require "google/cloud/storage" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" storage = Google::Cloud::Storage.new bucket = storage.bucket "my-bucket" file = bucket.file "file-name.csv" success = table.load file
Pass a list of google-cloud-storage files:
require "google/cloud/bigquery" require "google/cloud/storage" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" storage = Google::Cloud::Storage.new bucket = storage.bucket "my-bucket" file = bucket.file "file-name.csv" table.load [file, "gs://my-bucket/file-name2.csv"]
Upload a file directly:
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" file = File.open "my_data.json" success = table.load file do |j| j.format = "newline_delimited_json" end
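Both the delimiter and quote options use the first byte of the ISO-8859-1 encoding of the given string. A local illustration of that rule, drawn from the descriptions above (this helper is hypothetical, not library code):

```ruby
# BigQuery encodes the option string as ISO-8859-1 and splits the raw
# binary data on the first byte of the result.
def first_byte_for(option_string)
  option_string.encode(Encoding::ISO_8859_1).bytes.first
end

first_byte_for(",")  #=> 44
first_byte_for("|")  #=> 124
first_byte_for("\"") #=> 34
```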
#load_job
def load_job(files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, job_id: nil, prefix: nil, labels: nil, autodetect: nil, null_marker: nil, dryrun: nil) { |load_job| ... } -> Google::Cloud::Bigquery::LoadJob
Loads data into the table. You can pass a google-cloud storage file path or a google-cloud storage file instance. Or, you can upload a file directly. See Loading Data with a POST Request.
The geographic location for the job ("US", "EU", etc.) can be set via LoadJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- files (File, Google::Cloud::Storage::File, String, URI, Array<Google::Cloud::Storage::File, String, URI>) — A file or the URI of a Google Cloud Storage file, or an Array of those, containing data to load into the table.
- format (String) (defaults to: nil) — The format of the data to load. The default value is csv. The following values are supported:
  - csv - CSV
  - json - Newline-delimited JSON
  - avro - Avro
  - orc - ORC
  - parquet - Parquet
  - datastore_backup - Cloud Datastore backup
- create (String) (defaults to: nil) — Specifies whether the job is allowed to create new tables. The default value is needed. The following values are supported:
  - needed - Create the table if it does not exist.
  - never - The table must already exist. A 'notFound' error is raised if the table does not exist.
- write (String) (defaults to: nil) — Specifies how to handle data already present in the table. The default value is append. The following values are supported:
  - truncate - BigQuery overwrites the table data.
  - append - BigQuery appends the data to the table.
  - empty - An error will be returned if the table already contains data.
- projection_fields (Array<String>) (defaults to: nil) — If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned.
- jagged_rows (Boolean) (defaults to: nil) — Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV; ignored for other formats.
- quoted_newlines (Boolean) (defaults to: nil) — Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
- autodetect (Boolean) (defaults to: nil) — Indicates if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default value is false.
- encoding (String) (defaults to: nil) — The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.
- delimiter (String) (defaults to: nil) — Specifies the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. Default is ,.
- ignore_unknown (Boolean) (defaults to: nil) — Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. The format property determines what BigQuery treats as an extra value:
  - CSV: Trailing columns
  - JSON: Named values that don't match any column names
- max_bad_records (Integer) (defaults to: nil) — The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
- null_marker (String) (defaults to: nil) — Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
- quote (String) (defaults to: nil) — The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ("). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
- skip_leading (Integer) (defaults to: nil) — The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.
- job_id (String) (defaults to: nil) — A user-defined ID for the load job. The ID must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length is 1,024 characters. If job_id is provided, then prefix will not be used. See Generating a job ID.
- prefix (String) (defaults to: nil) — A string, usually human-readable, that will be prepended to a generated value to produce a unique job ID. For example, the prefix daily_import_job_ can be given to generate a job ID such as daily_import_job_12vEDtMQ0mbp1Mo5Z7mzAFQJZazh. The prefix must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length of the entire ID is 1,024 characters. If job_id is provided, then prefix will not be used.
- labels (Hash) (defaults to: nil) — A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.
The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
- dryrun (Boolean) (defaults to: nil) — If set, the job is not actually run. However, behavior is undefined for non-query jobs and may result in an error. Deprecated.
- (load_job) — a block for setting the load job
- load_job (LoadJob) — the load job object to be updated
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" load_job = table.load_job "gs://my-bucket/file-name.csv"
Pass a google-cloud-storage File instance:
require "google/cloud/bigquery" require "google/cloud/storage" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" storage = Google::Cloud::Storage.new bucket = storage.bucket "my-bucket" file = bucket.file "file-name.csv" load_job = table.load_job file
Pass a list of google-cloud-storage files:
require "google/cloud/bigquery" require "google/cloud/storage" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" storage = Google::Cloud::Storage.new bucket = storage.bucket "my-bucket" file = bucket.file "file-name.csv" load_job = table.load_job [file, "gs://my-bucket/file-name2.csv"]
Upload a file directly:
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" file = File.open "my_data.csv" load_job = table.load_job file
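The job_id and prefix character rules above can be checked locally before submitting a job. A hypothetical pre-check (the service performs its own validation; this helper is not part of the library):

```ruby
# Validate a user-supplied job ID against the documented constraints:
# only letters, digits, underscores, or dashes, at most 1,024 characters.
def valid_job_id?(job_id)
  job_id.length <= 1024 && job_id.match?(/\A[A-Za-z0-9_-]+\z/)
end

valid_job_id?("daily_import_job_123") #=> true
valid_job_id?("bad id!")              #=> false (space and "!")
```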
#location
def location() -> String, nil
The geographic location where the table should reside. Possible values include EU and US. The default value is US.
- (String, nil) — The location code.
#materialized_view?
def materialized_view?() -> Boolean, nil
Checks if the table's type is MATERIALIZED_VIEW, indicating that the table represents a BigQuery materialized view.
See Dataset#create_materialized_view.
- (Boolean, nil) — true when the type is MATERIALIZED_VIEW, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).
#modified_at
def modified_at() -> Time, nil
The time when this table was last modified.
- (Time, nil) — The last modified time, or nil if not present or the object is a reference (see #reference?).
#name
def name() -> String, nil
The name of the table.
- (String, nil) — The friendly name, or nil if the object is a reference (see #reference?).
#name=
def name=(new_name)
Updates the name of the table.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- new_name (String) — The new friendly name.
#param_types
def param_types() -> Hash
The types of the fields in the table, obtained from its schema. Types use the same format as the optional query parameter types.
- (Hash) — A hash with field names as keys, and types as values.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" table.param_types
#policy
def policy() -> Policy
Gets the Cloud IAM access control policy for the table. The latest policy will be read from the service. See also #update_policy.
- (Policy) — The frozen policy for the table.
- (ArgumentError)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" policy = table.policy policy.frozen? #=> true binding_owner = policy.bindings.find { |b| b.role == "roles/owner" } binding_owner.role #=> "roles/owner" binding_owner.members #=> ["user:owner@example.com"] binding_owner.frozen? #=> true binding_owner.members.frozen? #=> true
#project_id
def project_id() -> String
The ID of the Project
containing this table.
- (String) — The project ID.
#query
def query() -> String, nil
The query that defines the view or materialized view. See #view? and #materialized_view?.
- (String, nil) — The query that defines the view or materialized view; or nil if not a view or materialized view.
#query=
def query=(new_query)
Updates the query that defines the view. (See #view?.) Not supported for materialized views.
This method sets the query using standard SQL. To specify legacy SQL or to use user-defined function resources for a view, use #set_query instead.
- new_query (String) — The query that defines the view.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" view = dataset.table "my_view" view.query = "SELECT first_name FROM " \ "`my_project.my_dataset.my_table`"
#query_id
def query_id(standard_sql: nil, legacy_sql: nil) -> String
The value returned by #id, wrapped in backticks (Standard SQL) or square brackets (Legacy SQL) to accommodate project IDs containing dashes. Useful in queries.
- standard_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's standard SQL dialect. Optional. The default value is true.
- legacy_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's legacy SQL dialect. Optional. The default value is false.
- (String) — The appropriate table ID for use in queries, depending on SQL type.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" data = bigquery.query "SELECT first_name FROM #{table.query_id}"
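The wrapping rule can be sketched locally. A hypothetical helper mirroring the behavior described above, not the library's implementation:

```ruby
# Standard SQL wraps the fully qualified ID in backticks; legacy SQL
# wraps it in square brackets and separates the project with a colon.
def query_id_sketch(project, dataset, table, legacy_sql: false)
  if legacy_sql
    "[#{project}:#{dataset}.#{table}]"
  else
    "`#{project}.#{dataset}.#{table}`"
  end
end

query_id_sketch("my-project", "my_dataset", "my_table")
#=> "`my-project.my_dataset.my_table`"
query_id_sketch("my-project", "my_dataset", "my_table", legacy_sql: true)
#=> "[my-project:my_dataset.my_table]"
```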
#query_legacy_sql?
def query_legacy_sql?() -> Boolean
Checks if the view's query uses legacy SQL. See #view?.
- (Boolean) — true when legacy SQL is used, false otherwise; or nil if not a logical view.
#query_standard_sql?
def query_standard_sql?() -> Boolean
Checks if the view's query uses standard SQL. See #view?.
- (Boolean) — true when standard SQL is used, false otherwise.
#query_udfs
def query_udfs() -> Array<String>, nil
The user-defined function resources used in the view's query. May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code. See User-Defined Functions.
See #view?.
- (Array<String>, nil) — An array containing Google Cloud Storage URIs and/or inline source code, or nil if not a logical view.
#range_partitioning?
def range_partitioning?() -> Boolean, nil
Checks if the table is range partitioned. See Creating and using integer range partitioned tables.
- (Boolean, nil) — true when the table is range partitioned, or false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).
#range_partitioning_end
def range_partitioning_end() -> Integer, nil
The end of range partitioning, exclusive. See Creating and using integer range partitioned tables.
- (Integer, nil) — The end of range partitioning, exclusive, or nil if not range partitioned or the object is a reference (see #reference?).
#range_partitioning_field
def range_partitioning_field() -> String, nil
The field on which the table is range partitioned, if any. The field must be a top-level NULLABLE/REQUIRED field. The only supported type is INTEGER/INT64. See Creating and using integer range partitioned tables.
- (String, nil) — The range partition field, or nil if not range partitioned or the object is a reference (see #reference?).
#range_partitioning_interval
def range_partitioning_interval() -> Integer, nil
The width of each interval. See Creating and using integer range partitioned tables.
- (Integer, nil) — The width of each interval, for data in range partitions, or nil if not range partitioned or the object is a reference (see #reference?).
#range_partitioning_start
def range_partitioning_start() -> Integer, nil
The start of range partitioning, inclusive. See Creating and using integer range partitioned tables.
- (Integer, nil) — The start of range partitioning, inclusive, or nil if not range partitioned or the object is a reference (see #reference?).
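Together, the start (inclusive), end (exclusive), and interval describe how a row's integer value maps to a partition. A local sketch of that mapping, for illustration only (not BigQuery's code):

```ruby
# Map an integer value to its range partition index, or nil when the
# value falls outside [start, end) and lands in the unpartitioned bucket.
def range_partition_index(value, start_value, end_value, interval)
  return nil if value < start_value || value >= end_value
  (value - start_value) / interval
end

range_partition_index(25, 0, 100, 10)  #=> 2 (partition [20, 30))
range_partition_index(100, 0, 100, 10) #=> nil (end is exclusive)
```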
#reference?
def reference?() -> Boolean
Whether the table was created without retrieving the resource representation from the BigQuery service.
- (Boolean) — true when the table is just a local reference object, false otherwise.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table", skip_lookup: true table.reference? # true table.reload! table.reference? # false
#refresh!
def refresh!() -> Google::Cloud::Bigquery::Table
Reloads the table with current data from the BigQuery service.
- (Google::Cloud::Bigquery::Table) — Returns the reloaded table.
Skip retrieving the table from the service, then load it:
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table", skip_lookup: true table.reload!
#refresh_interval_ms
def refresh_interval_ms() -> Integer, nil
The maximum frequency in milliseconds at which the materialized view will be refreshed. See #materialized_view?.
- (Integer, nil) — The maximum frequency in milliseconds, or nil if not a materialized view.
#refresh_interval_ms=
def refresh_interval_ms=(new_refresh_interval_ms)
Sets the maximum frequency at which the materialized view will be refreshed. See #materialized_view?.
- new_refresh_interval_ms (Integer) — The maximum frequency in milliseconds.
#reload!
def reload!() -> Google::Cloud::Bigquery::Table
Reloads the table with current data from the BigQuery service.
- (Google::Cloud::Bigquery::Table) — Returns the reloaded table.
Skip retrieving the table from the service, then load it:
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table", skip_lookup: true table.reload!
#require_partition_filter
def require_partition_filter() -> Boolean, nil
Whether queries over this table must specify a partition filter that can be used for partition elimination. See Partitioned Tables.
- (Boolean, nil) — true when a partition filter will be required, false otherwise, or nil if the object is a reference (see #reference?).
#require_partition_filter=
def require_partition_filter=(new_require)
Sets whether queries over this table require a partition filter. See Partitioned Tables.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- new_require (Boolean) — Whether queries over this table require a partition filter.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.create_table "my_table" do |t| t.require_partition_filter = true end
#resource?
def resource?() -> Boolean
Whether the table was created with a resource representation from the BigQuery service.
- (Boolean) — true when the table was created with a resource representation, false otherwise.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table", skip_lookup: true table.resource? # false table.reload! table.resource? # true
#resource_full?
def resource_full?() -> Boolean
Whether the table was created with a full resource representation from the BigQuery service.
- (Boolean) — true when the table was created with a full resource representation, false otherwise.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" table.resource_full? # true
#resource_partial?
def resource_partial?() -> Boolean
Whether the table was created with a partial resource representation from the BigQuery service by retrieval through Dataset#tables. See Tables: list response for the contents of the partial representation. Accessing any attribute outside of the partial representation will result in loading the full representation.
- (Boolean) — true when the table was created with a partial resource representation, false otherwise.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.tables.first table.resource_partial? # true table.description # Loads the full resource. table.resource_partial? # false
#restore
def restore(destination_table, create: nil, write: nil, &block) { |job| ... } -> Boolean
Restore the data from the table to another table using a synchronous method that blocks for a response. The source table type is SNAPSHOT and the destination table type is TABLE. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.
The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.
- create (String) (defaults to: nil) — Specifies whether the job is allowed to create new tables. The default value is needed. The following values are supported:
  - needed - Create the table if it does not exist.
  - never - The table must already exist. A 'notFound' error is raised if the table does not exist.
- write (String) (defaults to: nil) — Specifies how to handle data already present in the destination table. The default value is empty. The following values are supported:
  - truncate - BigQuery overwrites the table data.
  - append - BigQuery appends the data to the table.
  - empty - An error will be returned if the destination table already contains data.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.
- (Boolean) — Returns true if the copy operation succeeded.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" destination_table = dataset.table "my_destination_table" table.restore destination_table
Passing a string identifier for the destination table:
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new dataset = bigquery.dataset "my_dataset" table = dataset.table "my_table" table.restore "other-project:other_dataset.other_table"
#rows_count
def rows_count() -> Integer, nil
The number of rows in the table.
- (Integer, nil) — The count of rows in the table, or nil if the object is a reference (see #reference?).
#schema
def schema(replace: false) { |schema| ... } -> Google::Cloud::Bigquery::Schema, nil
Returns the table's schema. If the table is not a view (see #view?), this method can also be used to set, replace, or add to the schema by passing a block. See Schema for available methods.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved.
- replace (Boolean) (defaults to: false) — Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. When a table already contains data, schema changes must be additive; thus, the default value is false. When loading from a file this will always replace the schema, no matter what replace is set to. You can update the schema (for example, for a table that already contains data) by providing a schema file that includes the existing schema plus any new fields.
- (schema) — a block for setting the schema
- schema (Schema) — the object accepting the schema
- (Google::Cloud::Bigquery::Schema, nil) — A frozen schema object.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end
Loading the schema from a file:
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.load File.open("schema.json")
end
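The file passed to Schema#load above follows the standard BigQuery schema JSON format: an array of field objects, each with a name, type, and mode (nested records add a "fields" array). A minimal sketch of producing such a file with plain Ruby, with illustrative field names:

```ruby
require "json"

# Illustrative schema definition in the standard BigQuery schema JSON
# shape: an array of { name, type, mode } field objects.
schema_fields = [
  { "name" => "first_name", "type" => "STRING",  "mode" => "REQUIRED" },
  { "name" => "age",        "type" => "INTEGER", "mode" => "NULLABLE" }
]

# Write the file that Schema#load can then read back.
File.write "schema.json", JSON.pretty_generate(schema_fields)

JSON.parse(File.read("schema.json")).map { |f| f["name"] }
# => ["first_name", "age"]
```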
#set_query
def set_query(query, standard_sql: nil, legacy_sql: nil, udfs: nil)
Updates the query that defines the view. (See #view?.) Not supported for materialized views.
Allows setting of standard vs. legacy SQL and user-defined function resources.
- query (String) — The query that defines the view.
- standard_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's standard SQL dialect. Optional. The default value is true.
- legacy_sql (Boolean) (defaults to: nil) — Specifies whether to use BigQuery's legacy SQL dialect. Optional. The default value is false.
- udfs (Array<String>, String) (defaults to: nil) — User-defined function resources used in a legacy SQL query. Optional. May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code. This parameter is used for defining User Defined Function (UDF) resources only when using legacy SQL. Users of standard SQL should leverage either DDL (e.g. CREATE [TEMPORARY] FUNCTION ...) or the Routines API to define UDF resources. For additional information on migrating, see: Migrating to standard SQL - Differences in user-defined JavaScript functions.
Update a view:
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
view = dataset.table "my_view"

view.set_query "SELECT first_name FROM `my_project.my_dataset.my_table`",
               standard_sql: true
#snapshot
def snapshot(destination_table, &block) { |job| ... } -> Boolean
Takes a snapshot of the data from the table to another table using a synchronous method that blocks for a response. The source table type is TABLE and the destination table type is SNAPSHOT. Timeouts and transient errors are generally handled as needed to complete the job. See also #copy_job.
The geographic location for the job ("US", "EU", etc.) can be set via CopyJob::Updater#location= in a block passed to this method. If the table is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the table.
- destination_table (Table, String) — The destination for the copied data. This can also be a string identifier as specified by the Standard SQL Query Reference (project-name.dataset_id.table_id) or the Legacy SQL Query Reference (project-name:dataset_id.table_id). This is useful for referencing tables in other projects and datasets.
- (job) — a job configuration object
- job (Google::Cloud::Bigquery::CopyJob::Updater) — a job configuration object for setting additional options.
- (Boolean) — Returns true if the copy operation succeeded.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

table.snapshot destination_table
Passing a string identifier for the destination table:
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.snapshot "other-project:other_dataset.other_table"
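The string identifier form is just the project, dataset, and table IDs joined as described above. A small helper (hypothetical, not part of the library) that assembles the Standard SQL form:

```ruby
# Builds a Standard SQL table identifier of the form
# project-name.dataset_id.table_id, as accepted by #snapshot and #copy.
# Helper name is illustrative, not part of the library.
def standard_sql_table_id project_id, dataset_id, table_id
  "#{project_id}.#{dataset_id}.#{table_id}"
end

standard_sql_table_id "other-project", "other_dataset", "other_table"
# => "other-project.other_dataset.other_table"
```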
#snapshot?
def snapshot?() -> Boolean, nil
Checks if the table's type is SNAPSHOT, indicating that the table represents a BigQuery table snapshot.
- (Boolean, nil) — true when the type is SNAPSHOT, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).
#snapshot_definition
def snapshot_definition() -> Google::Apis::BigqueryV2::SnapshotDefinition, nil
Information about the base table and the snapshot time of the table.
- (Google::Apis::BigqueryV2::SnapshotDefinition, nil) — The snapshot definition of the table snapshot, or nil if the table is not a snapshot or the object is a reference (see #reference?).
#table?
def table?() -> Boolean, nil
Checks if the table's type is TABLE.
- (Boolean, nil) — true when the type is TABLE, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).
#table_id
def table_id() -> String
A unique ID for this table.
- (String) — The ID must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_). The maximum length is 1,024 characters.
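The constraints above translate directly into a character class and a length bound. A hedged helper (not part of the library) that checks a candidate ID against them:

```ruby
# Checks a candidate table ID against the documented constraints:
# only letters, numbers, or underscores, at most 1,024 characters.
# Helper name is illustrative, not part of the library.
def valid_table_id? id
  id.match?(/\A[A-Za-z0-9_]{1,1024}\z/)
end

valid_table_id? "my_table_2024"  # => true
valid_table_id? "my-table"       # => false (hyphen not allowed)
valid_table_id? "a" * 1025       # => false (too long)
```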
#test_iam_permissions
def test_iam_permissions(*permissions) -> Array<String>
Tests the specified permissions against the Cloud IAM access control policy.
- permissions (String, Array<String>) — The set of permissions against which to check access. Permissions must be of the format bigquery.resource.capability. See https://cloud.google.com/bigquery/docs/access-control#bigquery.
- (Array<String>) — The frozen array of permissions held by the caller.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

permissions = table.test_iam_permissions "bigquery.tables.get",
                                         "bigquery.tables.delete"
permissions.include? "bigquery.tables.get"    #=> true
permissions.include? "bigquery.tables.delete" #=> false
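Because the returned array contains only the subset of requested permissions that the caller actually holds, a set difference yields whatever is missing. A minimal sketch (helper name is illustrative, and the granted array stands in for a real #test_iam_permissions result):

```ruby
# Computes which of the requested permissions were not granted,
# given the array returned by #test_iam_permissions.
def missing_permissions requested, granted
  requested - granted
end

requested = ["bigquery.tables.get", "bigquery.tables.delete"]
granted   = ["bigquery.tables.get"] # stand-in for the API result

missing_permissions requested, granted
# => ["bigquery.tables.delete"]
```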
#time_partitioning?
def time_partitioning?() -> Boolean, nil
Checks if the table is time partitioned. See Partitioned Tables.
- (Boolean, nil) — true when the table is time partitioned, or false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).
#time_partitioning_expiration
def time_partitioning_expiration() -> Integer, nil
The expiration for the time partitions, if any, in seconds. See Partitioned Tables.
- (Integer, nil) — The expiration time, in seconds, for data in time partitions, or nil if not present or the object is a reference (see #reference?).
#time_partitioning_expiration=
def time_partitioning_expiration=(expiration)
Sets the time partition expiration for the table. See Partitioned Tables. The table must also be time partitioned.
If the table is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
- expiration (Integer, nil) — An expiration time, in seconds, for data in time partitions, or nil to indicate no expiration time.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type = "DAY"
  t.time_partitioning_field = "dob"
  t.time_partitioning_expiration = 86_400
end
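Since the expiration is expressed in seconds, retention periods stated in days need converting first. A hedged sketch (helper name is illustrative, not part of the library):

```ruby
# Converts a retention period in days into the seconds value expected
# by #time_partitioning_expiration=.
SECONDS_PER_DAY = 24 * 60 * 60 # 86_400

def partition_expiration_for days
  days * SECONDS_PER_DAY
end

partition_expiration_for 1   # => 86_400 (one day)
partition_expiration_for 90  # => 7_776_000 (a 90-day retention window)
```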
#time_partitioning_field
def time_partitioning_field() -> String, nil
The field on which the table is time partitioned, if any. If not set, the destination table is time partitioned by the pseudo column _PARTITIONTIME; if set, the table is time partitioned by this field. See Partitioned Tables.
- (String, nil) — The time partition field, if a field was configured; nil if the table is not time partitioned, the field is not set (time partitioned by the pseudo column _PARTITIONTIME), or the object is a reference (see #reference?).
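A nil return from a table that is nonetheless time partitioned means partitioning falls back to the pseudo column. A minimal sketch (not part of the library) of resolving the effective column:

```ruby
# Resolves the column a time-partitioned table is partitioned by:
# the configured field, or the _PARTITIONTIME pseudo column when
# no field was set. Helper name is illustrative.
def effective_partition_column field
  field || "_PARTITIONTIME"
end

effective_partition_column "dob" # => "dob"
effective_partition_column nil   # => "_PARTITIONTIME"
```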
#time_partitioning_field=
def time_partitioning_field=(field)
Sets the field on which to time partition the table. If not set, the destination table is time partitioned by the pseudo column _PARTITIONTIME; if set, the table is time partitioned by this field. See Partitioned Tables.
The table must also be time partitioned.
You can only set the time partitioning field while creating a table as in the example below. BigQuery does not allow you to change time partitioning on an existing table.
- field (String) — The time partition field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type = "DAY"
  t.time_partitioning_field = "dob"
end
#time_partitioning_type
def time_partitioning_type() -> String, nil
The period for which the table is time partitioned, if any. See Partitioned Tables.
- (String, nil) — The time partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively; or nil if not set or the object is a reference (see #reference?).
#time_partitioning_type=
def time_partitioning_type=(type)
Sets the time partitioning type for the table. See Partitioned Tables.
The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.
You can only set time partitioning when creating a table as in the example below. BigQuery does not allow you to change time partitioning on an existing table.
- type (String) — The time partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |t|
  t.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
  t.time_partitioning_type = "DAY"
  t.time_partitioning_field = "dob"
end
#type
def type() -> String, nil
The type of the table, such as TABLE, VIEW, or SNAPSHOT.
- (String, nil) — The type of the table, or nil if the object is a reference (see #reference?).
#update_policy
def update_policy() { |policy| ... } -> Policy
Updates the Cloud IAM access control policy for the table. The latest policy will be read from the service. See also #policy.
- (policy) — A block for updating the policy. The latest policy will be read from the service and passed to the block. After the block completes, the modified policy will be written to the service.
- policy (Policy) — The mutable Policy for the table.
- (Policy) — The updated and frozen policy for the table.
- (ArgumentError)
Update the policy by passing a block.
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.update_policy do |p|
  p.grant role: "roles/viewer", members: "user:viewer@example.com"
  p.revoke role: "roles/editor", members: "user:editor@example.com"
  p.revoke role: "roles/owner"
end # 2 API calls
#view?
def view?() -> Boolean, nil
Checks if the table's type is VIEW, indicating that the table represents a BigQuery logical view. See Dataset#create_view.
- (Boolean, nil) — true when the type is VIEW, false otherwise, if the object is a resource (see #resource?); nil if the object is a reference (see #reference?).