BigQuery API - Class Google::Cloud::Bigquery::QueryJob::Updater (v1.59.0)

Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::QueryJob::Updater.

Yielded to a block to accumulate changes for a patch request.

Inherits

Google::Cloud::Bigquery::QueryJob

Methods

#cache=

def cache=(value)

Specifies to look in the query cache for results.

Parameter

value (Boolean) — Whether to look for the result in the query cache. The query cache is a best-effort cache that will be flushed whenever tables in the query are modified. The default value is true. For more information, see query caching.

#cancel

def cancel()

#clustering_fields=

def clustering_fields=(fields)

Sets the list of fields on which data should be clustered.

Only top-level, non-repeated, simple-type fields are supported. When you cluster a table using multiple columns, the order of columns you specify is important. The order of the specified columns determines the sort order of the data.

BigQuery supports clustering for both partitioned and non-partitioned tables.

See #clustering_fields, Table#clustering_fields and Table#clustering_fields=.

Parameter

fields (Array<String>) — The clustering fields. Only top-level, non-repeated, simple-type fields are supported.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = dataset.query_job "SELECT * FROM my_table" do |job|
  job.table = destination_table
  job.time_partitioning_type = "DAY"
  job.time_partitioning_field = "dob"
  job.clustering_fields = ["last_name", "first_name"]
end

job.wait_until_done!
job.done? #=> true

#create=

def create=(value)

Sets the create disposition for creating the query results table.

The following values are supported:

needed - Create the table if it does not exist.
never - The table must already exist. A 'notFound' error is raised if the table does not exist.

Parameter

value (String) — Specifies whether the job is allowed to create new tables. The default value is needed.

#create_session=

def create_session=(value)

Sets the create_session property. If true, creates a new session with a server-generated random session ID. If false, the query runs in the existing session given by #session_id=, or in non-session mode if no session ID is set. The default value is false.

Parameter

value (Boolean) — The create_session property. The default value is false.

#dataset=

def dataset=(value)

Sets the default dataset of tables referenced in the query.

Parameter

value (Dataset) — The default dataset to use for unqualified table names in the query.

#dry_run=

def dry_run=(value)

Alias Of: #dryrun=

Sets the dry run flag for the query job.

Parameter

value (Boolean) — If set, don't actually run this job. A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run.

#dryrun=

def dryrun=(value)

Aliases

#dry_run=

Sets the dry run flag for the query job.

Parameter

value (Boolean) — If set, don't actually run this job. A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run.

#encryption=

def encryption=(val)

Sets the encryption configuration of the destination table.

Parameter

val (Google::Cloud::Bigquery::EncryptionConfiguration) — Custom encryption configuration (e.g., Cloud KMS keys).

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

key_name = "projects/a/locations/b/keyRings/c/cryptoKeys/d"
encrypt_config = bigquery.encryption kms_key: key_name
job = bigquery.query_job "SELECT 1;" do |job|
  job.table = dataset.table "my_table", skip_lookup: true
  job.encryption = encrypt_config
end

#external=

def external=(value)

Sets definitions for external tables used in the query.

Parameter

value (Hash<String|Symbol, External::DataSource>) — A Hash that represents the mapping of the external tables to the table names used in the SQL query. The hash keys are the table names, and the hash values are the external table objects.

#flatten=

def flatten=(value)

Flatten nested and repeated fields in legacy SQL queries.

Parameter

value (Boolean) — This option is specific to Legacy SQL. Flattens all nested and repeated fields in the query results. The default value is true. The large_results parameter must be true if this is set to false.

#labels=

def labels=(value)

Sets the labels to use for the job.

Parameter

value (Hash) —
A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.

The labels applied to a resource must meet the following requirements:
- Each resource can have multiple labels, up to a maximum of 64.
- Each label must be a key-value pair.
- Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
- Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
- The key portion of a label must be unique. However, you can use the same key with multiple resources.
- Keys must start with a lowercase letter or international character.
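
The requirements above can be checked on the client before a job is submitted. The following is a minimal sketch in plain Ruby, not part of the library: the names LABEL_KEY_PATTERN, LABEL_VALUE_PATTERN, MAX_LABELS and label_errors are hypothetical, and the Unicode property classes only approximate "lowercase letters, numeric characters and international characters". The service remains the authority on what it accepts.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# Approximates the label rules listed above with Unicode property classes:
#   \p{Ll} = lowercase letters, \p{Lo} = letters without case, \p{N} = numerics.
LABEL_KEY_PATTERN   = /\A[\p{Ll}\p{Lo}][\p{Ll}\p{Lo}\p{N}_-]{0,62}\z/
LABEL_VALUE_PATTERN = /\A[\p{Ll}\p{Lo}\p{N}_-]{0,63}\z/
MAX_LABELS = 64

# Returns an array of human-readable problems; empty when the hash looks valid.
def label_errors(labels)
  errors = []
  errors << "too many labels (#{labels.size} > #{MAX_LABELS})" if labels.size > MAX_LABELS
  labels.each do |key, value|
    # Keys: 1 to 63 characters, starting with a lowercase or international letter.
    errors << "invalid key: #{key.inspect}" unless key.to_s.match?(LABEL_KEY_PATTERN)
    # Values: 0 to 63 characters, so an empty value is allowed.
    unless value.to_s.match?(LABEL_VALUE_PATTERN)
      errors << "invalid value for #{key.inspect}: #{value.inspect}"
    end
  end
  errors
end
```

A caller could run label_errors on a hash and assign it with job.labels = only when the returned array is empty.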

#large_results=

def large_results=(value)

Allow large results for a legacy SQL query.

Parameter

value (Boolean) — This option is specific to Legacy SQL. If true, allows the query to produce arbitrarily large result tables at a slight cost in performance. Requires the table parameter to be set.

#legacy_sql=

def legacy_sql=(value)

Sets the query syntax to legacy SQL.

Parameter

value (Boolean) — Specifies whether to use BigQuery's legacy SQL dialect for this query. If set to false, the query will use BigQuery's standard SQL dialect. Optional. The default value is false.
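
The legacy SQL options on this page depend on one another: setting flatten to false requires large_results to be true, and large_results requires a destination table. The sketch below shows one way to apply them together. The helper name apply_unflattened_legacy_sql is hypothetical, and it only assumes an object that responds to the setters documented here, such as the updater yielded to a query_job block.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# `job` is assumed to respond to the setters documented on this page.
def apply_unflattened_legacy_sql(job, destination_table)
  # large_results needs a destination table, so fail early without one.
  raise ArgumentError, "a destination table is required" if destination_table.nil?

  job.legacy_sql    = true               # flatten and large_results are legacy SQL options
  job.table         = destination_table  # required by large_results
  job.large_results = true               # required when flatten is false
  job.flatten       = false              # keep nested and repeated fields as they are
  job
end
```

Inside a query_job block the helper would be called with the yielded updater and a table object, for example one returned by dataset.table with skip_lookup: true.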

#location=

def location=(value)

Sets the geographic location where the job should run. Required except for US and EU.

Parameter

value (String) — A geographic location, such as "US", "EU" or "asia-northeast1". Required except for US and EU.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

job = bigquery.query_job "SELECT 1;" do |query|
  query.table = dataset.table "my_table", skip_lookup: true
  query.location = "EU"
end

#maximum_bytes_billed=

def maximum_bytes_billed=(value)

Sets the maximum bytes billed for the query.

Parameter

value (Integer) — Limits the bytes billed for this job. Queries that will have bytes billed beyond this limit will fail (without incurring a charge). Optional. If unspecified, this will be set to your project default.
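
Because the limit is expressed in bytes, it can help to derive it from a more readable unit. The following sketch combines #maximum_bytes_billed= with #cache= and #dryrun=. The helper apply_cost_controls and the constant BYTES_PER_GIB are hypothetical names, and the helper assumes only an object that responds to those three setters.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
BYTES_PER_GIB = 1024**3

# `job` is assumed to respond to cache=, maximum_bytes_billed= and dryrun=.
def apply_cost_controls(job, max_gib:, estimate_only: false)
  job.cache = true                                  # look in the query cache (the default)
  job.maximum_bytes_billed = max_gib * BYTES_PER_GIB # the limit is given in bytes
  job.dryrun = estimate_only                        # true: validate only, do not run the job
  job
end
```

Called as apply_cost_controls(job, max_gib: 10) inside a query_job block, this sets a limit of 10 GiB for that job.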

#params=

def params=(params)

Sets the query parameters. Standard SQL only.

Use #set_params_and_types to set both params and types.

Parameter

params (Array, Hash) — Standard SQL only. Used to pass query arguments when the query string contains either positional (?) or named (@myparam) query parameters. If the value passed is an array, such as ["foo"], the query must use positional query parameters. If the value passed is a hash, such as { myparam: "foo" }, the query must use named query parameters. When set, legacy_sql will automatically be set to false and standard_sql to true.

BigQuery types are converted from Ruby types as follows:

| BigQuery   | Ruby                           | Notes                                             |
|------------|--------------------------------|---------------------------------------------------|
| BOOL       | true/false                     |                                                   |
| INT64      | Integer                        |                                                   |
| FLOAT64    | Float                          |                                                   |
| NUMERIC    | BigDecimal                     | BigDecimal values will be rounded to scale 9.     |
| BIGNUMERIC | BigDecimal                     | NOT AUTOMATIC: Must be mapped using types.        |
| STRING     | String                         |                                                   |
| DATETIME   | DateTime                       | DATETIME does not support time zone.              |
| DATE       | Date                           |                                                   |
| GEOGRAPHY  | String (WKT or GeoJSON)        | NOT AUTOMATIC: Must be mapped using types.        |
| JSON       | String (Stringified JSON)      | String, as JSON does not have a schema to verify. |
| TIMESTAMP  | Time                           |                                                   |
| TIME       | Google::Cloud::Bigquery::Time  |                                                   |
| BYTES      | File, IO, StringIO, or similar |                                                   |
| ARRAY      | Array                          | Nested arrays, nil values are not supported.      |
| STRUCT     | Hash                           | Hash keys may be strings or symbols.              |

See Data Types for an overview of each BigQuery data type, including allowed values. For the GEOGRAPHY type, see Working with BigQuery GIS data.
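
Since an array requires positional placeholders and a hash requires named ones, a quick client-side check can catch a mismatch before the job is created. This is a rough sketch only: params_mismatch is a hypothetical helper, and its scan is naive (it also counts ? and @name occurrences inside string literals or comments), so treat it as an illustration of the rule rather than a SQL parser.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# Returns nil when params fit the placeholders found in sql, else a message.
def params_mismatch(sql, params)
  case params
  when Array
    # Positional parameters: one value per "?" placeholder.
    expected = sql.count("?")
    return "expected #{expected} positional values, got #{params.size}" unless expected == params.size
  when Hash
    # Named parameters: hash keys must match the @name placeholders.
    names = sql.scan(/@(\w+)/).flatten.uniq.sort
    given = params.keys.map(&:to_s).sort
    return "named parameters #{names} do not match keys #{given}" unless names == given
  else
    return "params must be an Array or a Hash"
  end
  nil
end
```

For instance, params_mismatch("SELECT * FROM t WHERE name = @myparam", { myparam: "foo" }) returns nil, while passing a hash to a query that uses ? placeholders returns a message.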

#priority=

def priority=(value)

Sets the priority of the query.

Parameter

value (String) — Specifies a priority for the query. Possible values include INTERACTIVE and BATCH.

#range_partitioning_end=

def range_partitioning_end=(range_end)

Sets the end of range partitioning, exclusive, for the destination table. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_field=.

Parameter

range_end (Integer) — The end of range partitioning, exclusive.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
  job.table = destination_table
  job.range_partitioning_field = "num"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

job.wait_until_done!
job.done? #=> true

#range_partitioning_field=

def range_partitioning_field=(field)

Sets the field on which to range partition the table. See Creating and using integer range partitioned tables.

See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_end=.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

Parameter

field (String) — The range partition field. The destination table is partitioned by this field. The field must be a top-level NULLABLE/REQUIRED field. The only supported type is INTEGER/INT64.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
  job.table = destination_table
  job.range_partitioning_field = "num"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

job.wait_until_done!
job.done? #=> true

#range_partitioning_interval=

def range_partitioning_interval=(range_interval)

Sets the width of each interval for data in range partitions. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_field=, #range_partitioning_start= and #range_partitioning_end=.

Parameter

range_interval (Integer) — The width of each interval, for data in partitions.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
  job.table = destination_table
  job.range_partitioning_field = "num"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

job.wait_until_done!
job.done? #=> true

#range_partitioning_start=

def range_partitioning_start=(range_start)

Sets the start of range partitioning, inclusive, for the destination table. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_field=, #range_partitioning_interval= and #range_partitioning_end=.

Parameter

range_start (Integer) — The start of range partitioning, inclusive.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
  job.table = destination_table
  job.range_partitioning_field = "num"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

job.wait_until_done!
job.done? #=> true
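
To make the relationship between the three range settings concrete, the sketch below computes the lower bound of the interval that a value falls into, given an inclusive start, an exclusive end and an interval width. It is plain arithmetic for illustration, not BigQuery's implementation; range_partition_start is a hypothetical name, and how the service stores rows whose value lies outside the range is not covered here.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# Returns the inclusive lower bound of the interval containing `value`,
# or nil when the value is outside [range_start, range_end).
def range_partition_start(value, range_start:, range_end:, interval:)
  return nil if value < range_start || value >= range_end # end is exclusive

  # Integer division gives the zero-based index of the interval.
  index = (value - range_start) / interval
  range_start + index * interval
end
```

With the settings used in the examples above (start 0, interval 10, end 100), the value 42 maps to the interval that begins at 40, and 100 is out of range.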

#refresh!

def refresh!()

Alias Of: #reload!

#reload!

def reload!()

Aliases

#refresh!

#rerun!

def rerun!()

#session_id=

def session_id=(value)

Sets the session ID for a query run in session mode. See #create_session=.

Parameter

value (String) — The session ID. The default value is nil.
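
The two session setters are typically used as alternatives: either ask for a new session or join an existing one. A minimal sketch of that choice follows. The helper apply_session is hypothetical and assumes only an object that responds to #create_session= and #session_id=.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# `job` is assumed to respond to create_session= and session_id=.
def apply_session(job, session_id = nil)
  if session_id.nil?
    job.create_session = true     # ask the server to create a new session
  else
    job.create_session = false    # the documented default, set explicitly for clarity
    job.session_id = session_id   # run the query in the existing session
  end
  job
end
```

A first query could call apply_session(job) and later queries apply_session(job, id), where id is the session ID obtained from the first job. How that ID is read back is outside the scope of this page.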

#set_params_and_types

def set_params_and_types(params, types = nil)

Sets the query parameters. Standard SQL only.

Parameters

params (Array, Hash) — Standard SQL only. Used to pass query arguments when the query string contains either positional (?) or named (@myparam) query parameters. If the value passed is an array, such as ["foo"], the query must use positional query parameters. If the value passed is a hash, such as { myparam: "foo" }, the query must use named query parameters. When set, legacy_sql will automatically be set to false and standard_sql to true.

BigQuery types are converted from Ruby types as follows:

| BigQuery   | Ruby                           | Notes                                             |
|------------|--------------------------------|---------------------------------------------------|
| BOOL       | true/false                     |                                                   |
| INT64      | Integer                        |                                                   |
| FLOAT64    | Float                          |                                                   |
| NUMERIC    | BigDecimal                     | BigDecimal values will be rounded to scale 9.     |
| BIGNUMERIC | BigDecimal                     | NOT AUTOMATIC: Must be mapped using types.        |
| STRING     | String                         |                                                   |
| DATETIME   | DateTime                       | DATETIME does not support time zone.              |
| DATE       | Date                           |                                                   |
| GEOGRAPHY  | String (WKT or GeoJSON)        | NOT AUTOMATIC: Must be mapped using types.        |
| JSON       | String (Stringified JSON)      | String, as JSON does not have a schema to verify. |
| TIMESTAMP  | Time                           |                                                   |
| TIME       | Google::Cloud::Bigquery::Time  |                                                   |
| BYTES      | File, IO, StringIO, or similar |                                                   |
| ARRAY      | Array                          | Nested arrays, nil values are not supported.      |
| STRUCT     | Hash                           | Hash keys may be strings or symbols.              |

See Data Types for an overview of each BigQuery data type, including allowed values. For the GEOGRAPHY type, see Working with BigQuery GIS data.

types (Array, Hash) — Standard SQL only. Types of the SQL parameters in params. It is not always possible to infer the right SQL type from a value in params. In these cases, types must be used to specify the SQL type for these values.

Arguments must match the value type passed to params. This must be an Array when the query uses positional query parameters. This must be a Hash when the query uses named query parameters. The values should be BigQuery type codes from the following list:
- :BOOL
- :INT64
- :FLOAT64
- :NUMERIC
- :BIGNUMERIC
- :STRING
- :DATETIME
- :DATE
- :GEOGRAPHY
- :JSON
- :TIMESTAMP
- :TIME
- :BYTES
- Array - Lists are specified by providing the type code in an array. For example, an array of integers is specified as [:INT64].
- Hash - Types for STRUCT values (Hash objects) are specified using a Hash object, where the keys match the params hash, and the values are the type codes that match the data.
Types are optional.

Raises

(ArgumentError)
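
As an illustration of how params and types line up, the sketch below pairs a hash of named parameters with a types hash of the same shape: an array type written as [:INT64], a GEOGRAPHY string that cannot be inferred automatically, and a STRUCT described by a nested hash. The constants, the helper name apply_typed_params and the sample values are hypothetical. The helper assumes only an object that responds to set_params_and_types with the signature shown above.

```ruby
# Hypothetical example data, not part of google-cloud-bigquery.
EXAMPLE_PARAMS = {
  ids:   [1, 2, 3],                       # ARRAY of INT64
  spot:  "POINT(-122.35 47.62)",          # WKT string, intended as GEOGRAPHY
  owner: { name: "Alice", age: 30 }       # STRUCT
}.freeze

# Same keys as EXAMPLE_PARAMS; values are type codes from the list above.
EXAMPLE_TYPES = {
  ids:   [:INT64],                        # list: type code wrapped in an array
  spot:  :GEOGRAPHY,                      # not automatic, so it is stated explicitly
  owner: { name: :STRING, age: :INT64 }   # struct: hash with matching keys
}.freeze

# `job` is assumed to respond to set_params_and_types(params, types).
def apply_typed_params(job)
  job.set_params_and_types EXAMPLE_PARAMS, EXAMPLE_TYPES
  job
end
```

A query using these values would refer to them with the named placeholders @ids, @spot and @owner.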

#standard_sql=

def standard_sql=(value)

Sets the query syntax to standard SQL.

Parameter

value (Boolean) — Specifies whether to use BigQuery's standard SQL dialect for this query. If set to true, the query will use standard SQL rather than the legacy SQL dialect. Optional. The default value is true.

#table=

def table=(value)

Sets the destination for the query results table.

Parameter

value (Table) — The destination table where the query results should be stored. If not present, a new table will be created according to the create disposition to store the results.

#time_partitioning_expiration=

def time_partitioning_expiration=(expiration)

Sets the partition expiration for the destination table. See Partitioned Tables.

The destination table must also be partitioned. See #time_partitioning_type=.

Parameter

expiration (Integer) — An expiration time, in seconds, for data in partitions.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = dataset.query_job "SELECT * FROM UNNEST(" \
                        "GENERATE_TIMESTAMP_ARRAY(" \
                        "'2018-10-01 00:00:00', " \
                        "'2018-10-10 00:00:00', " \
                        "INTERVAL 1 DAY)) AS dob" do |job|
  job.table = destination_table
  job.time_partitioning_type = "DAY"
  job.time_partitioning_expiration = 86_400
end

job.wait_until_done!
job.done? #=> true

#time_partitioning_field=

def time_partitioning_field=(field)

Sets the field on which to partition the destination table. If not set, the destination table is partitioned by pseudo column _PARTITIONTIME; if set, the table is partitioned by this field. See Partitioned Tables.

The destination table must also be partitioned. See #time_partitioning_type=.

You can only set the partitioning field while creating a table. BigQuery does not allow you to change partitioning on an existing table.

Parameter

field (String) — The partition field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = dataset.query_job "SELECT * FROM UNNEST(" \
                        "GENERATE_TIMESTAMP_ARRAY(" \
                        "'2018-10-01 00:00:00', " \
                        "'2018-10-10 00:00:00', " \
                        "INTERVAL 1 DAY)) AS dob" do |job|
  job.table = destination_table
  job.time_partitioning_type  = "DAY"
  job.time_partitioning_field = "dob"
end

job.wait_until_done!
job.done? #=> true

#time_partitioning_require_filter=

def time_partitioning_require_filter=(val)

If set to true, queries over the destination table must specify a partition filter that can be used for partition elimination. See Partitioned Tables.

Parameter

val (Boolean) — Indicates if queries over the destination table will require a partition filter. The default value is false.
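
The time partitioning setters are usually applied together. The sketch below groups them in one helper and converts a retention period in days to the seconds that #time_partitioning_expiration= expects. The names apply_daily_partitioning and SECONDS_PER_DAY are hypothetical, and the helper assumes only an object that responds to the four setters documented on this page.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
SECONDS_PER_DAY = 86_400

# `job` is assumed to respond to the time_partitioning_* setters.
def apply_daily_partitioning(job, field:, keep_days:)
  job.time_partitioning_type = "DAY"                # one partition per day
  job.time_partitioning_field = field               # top-level TIMESTAMP or DATE column
  job.time_partitioning_expiration = keep_days * SECONDS_PER_DAY # value is in seconds
  job.time_partitioning_require_filter = true       # queries must filter on the partition
  job
end
```

Called as apply_daily_partitioning(job, field: "dob", keep_days: 7) inside a query_job block, this sets an expiration of 604,800 seconds.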

#time_partitioning_type=

def time_partitioning_type=(type)

Sets the partitioning for the destination table. See Partitioned Tables. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

You can only set the partitioning field while creating a table. BigQuery does not allow you to change partitioning on an existing table.

Parameter

type (String) — The partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

Example

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
destination_table = dataset.table "my_destination_table",
                                  skip_lookup: true

job = dataset.query_job "SELECT * FROM UNNEST(" \
                        "GENERATE_TIMESTAMP_ARRAY(" \
                        "'2018-10-01 00:00:00', " \
                        "'2018-10-10 00:00:00', " \
                        "INTERVAL 1 DAY)) AS dob" do |job|
  job.table = destination_table
  job.time_partitioning_type = "DAY"
end

job.wait_until_done!
job.done? #=> true

#udfs=

def udfs=(value)

Sets user defined functions for the query.

Parameter

value (Array<String>, String) — User-defined function resources used in the query. May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code. See User-Defined Functions.

#wait_until_done!

def wait_until_done!()

#write=

def write=(value)

Sets the write disposition for when the query results table exists.

Parameter

value (String) —
Specifies the action that occurs if the destination table already exists. The default value is empty.

The following values are supported:
- truncate - BigQuery overwrites the table data.
- append - BigQuery appends the data to the table.
- empty - A 'duplicate' error is returned in the job result if the table exists and contains data.
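
The create and write dispositions are often chosen as a pair. The following sketch sets both and rejects anything other than the values listed on this page. The helper apply_dispositions and the two constants are hypothetical, the check is deliberately limited to the values documented here, and the helper assumes only an object that responds to #create= and #write=.

```ruby
# Hypothetical helper, not part of google-cloud-bigquery.
# Only the values documented on this page are accepted by this sketch.
CREATE_DISPOSITIONS = %w[needed never].freeze
WRITE_DISPOSITIONS  = %w[truncate append empty].freeze

# `job` is assumed to respond to create= and write=.
def apply_dispositions(job, create: "needed", write: "empty")
  unless CREATE_DISPOSITIONS.include?(create)
    raise ArgumentError, "unsupported create value: #{create.inspect}"
  end
  unless WRITE_DISPOSITIONS.include?(write)
    raise ArgumentError, "unsupported write value: #{write.inspect}"
  end

  job.create = create # whether the job may create the destination table
  job.write  = write  # what happens if the destination table already exists
  job
end
```

For example, apply_dispositions(job, create: "never", write: "append") describes a job that adds rows to a table that must already exist.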