Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::QueryJob::Updater.
Yielded to a block to accumulate changes for a patch request.
Inherits
Methods
#cache=
def cache=(value)
Specifies to look in the query cache for results.
- value (Boolean) — Whether to look for the result in the query cache. The query cache is a best-effort cache that will be flushed whenever tables in the query are modified. The default value is true. For more information, see query caching.
#cancel
def cancel()
#clustering_fields=
def clustering_fields=(fields)
Sets the list of fields on which data should be clustered.
Only top-level, non-repeated, simple-type fields are supported. When you cluster a table using multiple columns, the order of columns you specify is important. The order of the specified columns determines the sort order of the data.
BigQuery supports clustering for both partitioned and non-partitioned tables.
See #clustering_fields, Table#clustering_fields and Table#clustering_fields=.
- fields (Array<String>) — The clustering fields. Only top-level, non-repeated, simple-type fields are supported.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = dataset.query_job "SELECT * FROM my_table" do |job|
      job.table = destination_table
      job.time_partitioning_type = "DAY"
      job.time_partitioning_field = "dob"
      job.clustering_fields = ["last_name", "first_name"]
    end

    job.wait_until_done!
    job.done? #=> true
#create=
def create=(value)
Sets the create disposition for creating the query results table.
- value (String) — Specifies whether the job is allowed to create new tables. The default value is needed.
  The following values are supported:
  - needed - Create the table if it does not exist.
  - never - The table must already exist. A 'notFound' error is raised if the table does not exist.
#create_session=
def create_session=(value)
Sets the create_session property. If true, creates a new session, where the session ID is a server-generated random ID. If false, runs the query with an existing #session_id= if one is set; otherwise runs the query in non-session mode. The default value is false.
- value (Boolean) — The create_session property. The default value is false.
#dataset=
def dataset=(value)
Sets the default dataset of tables referenced in the query.
- value (Dataset) — The default dataset to use for unqualified table names in the query.
#dry_run=
def dry_run=(value)
Sets the dry run flag for the query job.
- value (Boolean) — If set, don't actually run this job. A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run.
#dryrun=
def dryrun=(value)
Sets the dry run flag for the query job.
- value (Boolean) — If set, don't actually run this job. A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run.
#encryption=
def encryption=(val)
Sets the encryption configuration of the destination table.
- val (Google::Cloud::Bigquery::EncryptionConfiguration) — Custom encryption configuration (e.g., Cloud KMS keys).

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"

    key_name = "projects/a/locations/b/keyRings/c/cryptoKeys/d"
    encrypt_config = bigquery.encryption kms_key: key_name

    job = bigquery.query_job "SELECT 1;" do |job|
      job.table = dataset.table "my_table", skip_lookup: true
      job.encryption = encrypt_config
    end
#external=
def external=(value)
Sets definitions for external tables used in the query.
- value (Hash<String|Symbol, External::DataSource>) — A Hash that represents the mapping of the external tables to the table names used in the SQL query. The hash keys are the table names, and the hash values are the external table objects.
#flatten=
def flatten=(value)
Flatten nested and repeated fields in legacy SQL queries.
- value (Boolean) — This option is specific to Legacy SQL. Flattens all nested and repeated fields in the query results. The default value is true. The large_results parameter must be true if this is set to false.
#labels=
def labels=(value)
Sets the labels to use for the job.
- value (Hash) — A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.
  The labels applied to a resource must meet the following requirements:
  - Each resource can have multiple labels, up to a maximum of 64.
  - Each label must be a key-value pair.
  - Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
  - Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
  - The key portion of a label must be unique. However, you can use the same key with multiple resources.
  - Keys must start with a lowercase letter or international character.
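The label requirements above can be checked before a job is submitted. The following is a minimal sketch, not part of the library: `label_problems`, `MAX_LABELS`, and the two patterns are hypothetical names, and "international characters" is approximated here by the Unicode categories Ll and Lo, which is an assumption.

```ruby
# Hypothetical pre-flight check for job labels, based only on the listed requirements.
# Assumption: "international characters" is approximated by Unicode categories
# Ll (lowercase letter) and Lo (other letter).
MAX_LABELS = 64
LABEL_KEY_PATTERN = /\A[\p{Ll}\p{Lo}][\p{Ll}\p{Lo}\p{N}_-]{0,62}\z/
LABEL_VALUE_PATTERN = /\A[\p{Ll}\p{Lo}\p{N}_-]{0,63}\z/

# Returns an array of human-readable problems; an empty array means no problem was found.
def label_problems(labels)
  problems = []
  problems << "too many labels (#{labels.size} > #{MAX_LABELS})" if labels.size > MAX_LABELS
  labels.each do |key, value|
    key = key.to_s
    value = value.to_s
    problems << "invalid key: #{key.inspect}" unless key.match?(LABEL_KEY_PATTERN)
    unless value.match?(LABEL_VALUE_PATTERN)
      problems << "invalid value for #{key.inspect}: #{value.inspect}"
    end
  end
  problems
end

labels = { "env" => "prod", "team" => "data-eng", "owner" => "" }
puts label_problems(labels).empty?
puts label_problems({ "Env" => "prod" }).inspect
```

A hash that passes this check could then be assigned with `job.labels = labels` inside the block; the service remains the authority on what it accepts.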
#large_results=
def large_results=(value)
Allow large results for a legacy SQL query.
- value (Boolean) — This option is specific to Legacy SQL. If true, allows the query to produce arbitrarily large result tables at a slight cost in performance. Requires the table parameter to be set.
#legacy_sql=
def legacy_sql=(value)
Sets the query syntax to legacy SQL.
- value (Boolean) — Specifies whether to use BigQuery's legacy SQL dialect for this query. If set to false, the query will use BigQuery's standard SQL dialect. Optional. The default value is false.
#location=
def location=(value)
Sets the geographic location where the job should run. Required except for US and EU.
- value (String) — A geographic location, such as "US", "EU" or "asia-northeast1". Required except for US and EU.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"

    job = bigquery.query_job "SELECT 1;" do |query|
      query.table = dataset.table "my_table", skip_lookup: true
      query.location = "EU"
    end
#maximum_bytes_billed=
def maximum_bytes_billed=(value)
Sets the maximum bytes billed for the query.
- value (Integer) — Limits the bytes billed for this job. Queries that will have bytes billed beyond this limit will fail (without incurring a charge). Optional. If unspecified, this will be set to your project default.
#params=
def params=(params)
Sets the query parameters. Standard SQL only.
Use #set_params_and_types to set both params and types.
- params (Array, Hash) — Standard SQL only. Used to pass query arguments when the query string contains either positional (?) or named (@myparam) query parameters. If the value passed is an array ["foo"], the query must use positional query parameters. If the value passed is a hash { myparam: "foo" }, the query must use named query parameters. When set, legacy_sql will automatically be set to false and standard_sql to true.

  BigQuery types are converted from Ruby types as follows:

  | BigQuery   | Ruby                           | Notes                                         |
  |------------|--------------------------------|-----------------------------------------------|
  | BOOL       | true/false                     |                                               |
  | INT64      | Integer                        |                                               |
  | FLOAT64    | Float                          |                                               |
  | NUMERIC    | BigDecimal                     | BigDecimal values will be rounded to scale 9. |
  | BIGNUMERIC | BigDecimal                     | NOT AUTOMATIC: Must be mapped using types.    |
  | STRING     | String                         |                                               |
  | DATETIME   | DateTime                       | DATETIME does not support time zone.          |
  | DATE       | Date                           |                                               |
  | GEOGRAPHY  | String (WKT or GeoJSON)        | NOT AUTOMATIC: Must be mapped using types.    |
  | TIMESTAMP  | Time                           |                                               |
  | TIME       | Google::Cloud::Bigquery::Time  |                                               |
  | BYTES      | File, IO, StringIO, or similar |                                               |
  | ARRAY      | Array                          | Nested arrays, nil values are not supported.  |
  | STRUCT     | Hash                           | Hash keys may be strings or symbols.          |

  See Data Types for an overview of each BigQuery data type, including allowed values. For the GEOGRAPHY type, see Working with BigQuery GIS data.
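The rule that an array goes with positional (?) parameters and a hash goes with named (@myparam) parameters can be illustrated with a small check. This is a rough sketch and not library code: `placeholder_style` and `params_match_query?` are hypothetical helpers, and the detection is deliberately naive (it does not skip string literals, comments, or system variables).

```ruby
# Naive placeholder detection, for illustration only.
# It does not skip string literals, comments, or @@system_variables.
def placeholder_style(sql)
  named = sql.match?(/@[A-Za-z_]\w*/)
  positional = sql.include?("?")
  return :mixed if named && positional
  return :named if named
  return :positional if positional

  :none
end

# True when the container type of +params+ fits the placeholder style of +sql+:
# Array for positional parameters, Hash for named parameters.
def params_match_query?(sql, params)
  case placeholder_style(sql)
  when :positional then params.is_a?(Array)
  when :named      then params.is_a?(Hash)
  else false
  end
end

puts params_match_query?("SELECT name FROM people WHERE age > ?", [21])
puts params_match_query?("SELECT name FROM people WHERE age > @min_age", { min_age: 21 })
puts params_match_query?("SELECT name FROM people WHERE age > ?", { min_age: 21 })
```

Inside the block, the matching assignment would be `job.params = [21]` for the first query and `job.params = { min_age: 21 }` for the second.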
#priority=
def priority=(value)
Sets the priority of the query.
- value (String) — Specifies a priority for the query. Possible values include INTERACTIVE and BATCH.
#range_partitioning_end=
def range_partitioning_end=(range_end)
Sets the end of range partitioning, exclusive, for the destination table. See Creating and using integer range partitioned tables.
You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.
See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_field=.
- range_end (Integer) — The end of range partitioning, exclusive.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
      job.table = destination_table
      job.range_partitioning_field = "num"
      job.range_partitioning_start = 0
      job.range_partitioning_interval = 10
      job.range_partitioning_end = 100
    end

    job.wait_until_done!
    job.done? #=> true
#range_partitioning_field=
def range_partitioning_field=(field)
Sets the field on which to range partition the table. See Creating and using integer range partitioned tables.
See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_end=.
You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.
- field (String) — The range partition field. The destination table is partitioned by this field. The field must be a top-level NULLABLE/REQUIRED field. The only supported type is INTEGER/INT64.

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
      job.table = destination_table
      job.range_partitioning_field = "num"
      job.range_partitioning_start = 0
      job.range_partitioning_interval = 10
      job.range_partitioning_end = 100
    end

    job.wait_until_done!
    job.done? #=> true
#range_partitioning_interval=
def range_partitioning_interval=(range_interval)
Sets the width of each interval for data in range partitions. See Creating and using integer range partitioned tables.
You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.
See #range_partitioning_field=, #range_partitioning_start= and #range_partitioning_end=.
- range_interval (Integer) — The width of each interval, for data in partitions.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
      job.table = destination_table
      job.range_partitioning_field = "num"
      job.range_partitioning_start = 0
      job.range_partitioning_interval = 10
      job.range_partitioning_end = 100
    end

    job.wait_until_done!
    job.done? #=> true
#range_partitioning_start=
def range_partitioning_start=(range_start)
Sets the start of range partitioning, inclusive, for the destination table. See Creating and using integer range partitioned tables.
You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.
See #range_partitioning_field=, #range_partitioning_interval= and #range_partitioning_end=.
- range_start (Integer) — The start of range partitioning, inclusive.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = bigquery.query_job "SELECT num FROM UNNEST(GENERATE_ARRAY(0, 99)) AS num" do |job|
      job.table = destination_table
      job.range_partitioning_field = "num"
      job.range_partitioning_start = 0
      job.range_partitioning_interval = 10
      job.range_partitioning_end = 100
    end

    job.wait_until_done!
    job.done? #=> true
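To make the inclusive start, exclusive end, and interval width concrete, the sketch below computes which range a given integer would fall into. It is plain Ruby arithmetic, not a library call: `range_partition_for` is a hypothetical helper, and clipping the last range at the end value is an assumption about how a non-divisible span behaves.

```ruby
# Hypothetical helper: returns the half-open range [lower, upper) that +value+
# falls into, or nil when the value lies outside [range_start, range_end).
# Assumption: the last range is clipped at range_end when the span is not an
# exact multiple of the interval.
def range_partition_for(value, range_start:, range_interval:, range_end:)
  return nil if value < range_start || value >= range_end

  index = (value - range_start) / range_interval # integer division
  lower = range_start + (index * range_interval)
  upper = [lower + range_interval, range_end].min
  lower...upper
end

# Same settings as the example above: start 0, interval 10, end 100.
settings = { range_start: 0, range_interval: 10, range_end: 100 }
puts range_partition_for(37, **settings).inspect
puts range_partition_for(100, **settings).inspect
```

With these settings, 37 lands in the range starting at 30, while 100 is outside the partitioned range because the end is exclusive.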
#refresh!
def refresh!()
#reload!
def reload!()
#rerun!
def rerun!()
#session_id=
def session_id=(value)
Sets the session ID for a query run in session mode. See #create_session=.
- value (String) — The session ID. The default value is nil.
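The interaction between #create_session= and #session_id= amounts to a three-way decision. The following sketch restates it as plain Ruby; `session_mode` and its return symbols are hypothetical and only summarize the descriptions in this reference.

```ruby
# Hypothetical summary of the documented behavior:
# - create_session true                 -> a new session is created
# - create_session false, session_id set -> the query runs in that existing session
# - neither                              -> the query runs in non-session mode
def session_mode(create_session: false, session_id: nil)
  return :new_session if create_session
  return :existing_session if session_id

  :no_session
end

puts session_mode(create_session: true)
puts session_mode(session_id: "my-session-id") # placeholder ID, not a real session
puts session_mode
```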
#set_params_and_types
def set_params_and_types(params, types = nil)
Sets the query parameters. Standard SQL only.
- params (Array, Hash) — Standard SQL only. Used to pass query arguments when the query string contains either positional (?) or named (@myparam) query parameters. If the value passed is an array ["foo"], the query must use positional query parameters. If the value passed is a hash { myparam: "foo" }, the query must use named query parameters. When set, legacy_sql will automatically be set to false and standard_sql to true.

  BigQuery types are converted from Ruby types as follows:

  | BigQuery   | Ruby                           | Notes                                         |
  |------------|--------------------------------|-----------------------------------------------|
  | BOOL       | true/false                     |                                               |
  | INT64      | Integer                        |                                               |
  | FLOAT64    | Float                          |                                               |
  | NUMERIC    | BigDecimal                     | BigDecimal values will be rounded to scale 9. |
  | BIGNUMERIC | BigDecimal                     | NOT AUTOMATIC: Must be mapped using types.    |
  | STRING     | String                         |                                               |
  | DATETIME   | DateTime                       | DATETIME does not support time zone.          |
  | DATE       | Date                           |                                               |
  | GEOGRAPHY  | String (WKT or GeoJSON)        | NOT AUTOMATIC: Must be mapped using types.    |
  | TIMESTAMP  | Time                           |                                               |
  | TIME       | Google::Cloud::Bigquery::Time  |                                               |
  | BYTES      | File, IO, StringIO, or similar |                                               |
  | ARRAY      | Array                          | Nested arrays, nil values are not supported.  |
  | STRUCT     | Hash                           | Hash keys may be strings or symbols.          |

  See Data Types for an overview of each BigQuery data type, including allowed values. For the GEOGRAPHY type, see Working with BigQuery GIS data.
- types (Array, Hash) — Standard SQL only. Types of the SQL parameters in params. It is not always possible to infer the right SQL type from a value in params. In these cases, types must be used to specify the SQL type for these values.

  Arguments must match the value type passed to params. This must be an Array when the query uses positional query parameters. This must be a Hash when the query uses named query parameters. The values should be BigQuery type codes from the following list:
  - :BOOL
  - :INT64
  - :FLOAT64
  - :NUMERIC
  - :BIGNUMERIC
  - :STRING
  - :DATETIME
  - :DATE
  - :GEOGRAPHY
  - :TIMESTAMP
  - :TIME
  - :BYTES
  - Array - Lists are specified by providing the type code in an array. For example, an array of integers is specified as [:INT64].
  - Hash - Types for STRUCT values (Hash objects) are specified using a Hash object, where the keys match the params hash, and the values are the type codes that match the data.

  Types are optional.
- (ArgumentError)
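As a rough illustration of how params and types line up, the sketch below passes a GEOGRAPHY value, which the table marks as not automatic, together with an explicit types hash. `ParamsUpdaterStub`, `types_shape_matches?`, and `apply_params` are hypothetical; the stub merely stands in for the updater yielded to the block so the example runs without a BigQuery project, and the shape check is an assumption rather than the library's own validation.

```ruby
# Stand-in for the updater yielded by query_job; it only records the call.
ParamsUpdaterStub = Struct.new(:params, :types) do
  def set_params_and_types(params, types = nil)
    self.params = params
    self.types = types
  end
end

# Assumed shape check: types must use the same container as params
# (Array for positional, Hash for named) and must not name unknown keys.
def types_shape_matches?(params, types)
  return true if types.nil?

  case params
  when Array then types.is_a?(Array) && types.size <= params.size
  when Hash  then types.is_a?(Hash) && (types.keys - params.keys).empty?
  else false
  end
end

def apply_params(job, params, types = nil)
  raise ArgumentError, "types must mirror params" unless types_shape_matches?(params, types)

  job.set_params_and_types params, types
end

# GEOGRAPHY cannot be inferred from a String, so it is named explicitly.
params = { location: "POINT(-122.35 47.62)", ids: [1, 2, 3] }
types  = { location: :GEOGRAPHY, ids: [:INT64] }

job = ParamsUpdaterStub.new
apply_params job, params, types
puts job.types.inspect
```

In real use, `job` would be the block argument of `query_job`, and the query string would reference `@location` and `@ids`.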
#standard_sql=
def standard_sql=(value)
Sets the query syntax to standard SQL.
- value (Boolean) — Specifies whether to use BigQuery's standard SQL dialect for this query. If set to true, the query will use standard SQL rather than the legacy SQL dialect. Optional. The default value is true.
#table=
def table=(value)
Sets the destination for the query results table.
- value (Table) — The destination table where the query results should be stored. If not present, a new table will be created according to the create disposition to store the results.
#time_partitioning_expiration=
def time_partitioning_expiration=(expiration)
Sets the partition expiration for the destination table. See Partitioned Tables.
The destination table must also be partitioned. See #time_partitioning_type=.
- expiration (Integer) — An expiration time, in seconds, for data in partitions.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = dataset.query_job "SELECT * FROM UNNEST(" \
                            "GENERATE_TIMESTAMP_ARRAY(" \
                            "'2018-10-01 00:00:00', " \
                            "'2018-10-10 00:00:00', " \
                            "INTERVAL 1 DAY)) AS dob" do |job|
      job.table = destination_table
      job.time_partitioning_type = "DAY"
      job.time_partitioning_expiration = 86_400
    end

    job.wait_until_done!
    job.done? #=> true
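Because the expiration is given in seconds, a small wrapper can make intent clearer when thinking in days. This is a sketch under stated assumptions: `apply_time_partitioning`, `SECONDS_PER_DAY`, `PARTITION_TYPES`, and `PartitionUpdaterStub` are hypothetical names, and the stub replaces the real updater so the example runs on its own. Only the setters documented on this page are called.

```ruby
SECONDS_PER_DAY = 86_400
PARTITION_TYPES = %w[DAY HOUR MONTH YEAR].freeze

# Stand-in for the updater yielded by query_job.
PartitionUpdaterStub = Struct.new(
  :time_partitioning_type,
  :time_partitioning_field,
  :time_partitioning_expiration
)

# Sets the partition type, and optionally the field and an expiration in days.
def apply_time_partitioning(job, type:, field: nil, expiration_days: nil)
  unless PARTITION_TYPES.include?(type)
    raise ArgumentError, "unsupported partition type: #{type.inspect}"
  end

  job.time_partitioning_type = type
  job.time_partitioning_field = field if field
  job.time_partitioning_expiration = expiration_days * SECONDS_PER_DAY if expiration_days
  job
end

job = PartitionUpdaterStub.new
apply_time_partitioning job, type: "DAY", field: "dob", expiration_days: 1
puts job.time_partitioning_expiration
```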
#time_partitioning_field=
def time_partitioning_field=(field)
Sets the field on which to partition the destination table. If not set, the destination table is partitioned by pseudo column _PARTITIONTIME; if set, the table is partitioned by this field. See Partitioned Tables.
The destination table must also be partitioned. See #time_partitioning_type=.
You can only set the partitioning field while creating a table. BigQuery does not allow you to change partitioning on an existing table.
- field (String) — The partition field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = dataset.query_job "SELECT * FROM UNNEST(" \
                            "GENERATE_TIMESTAMP_ARRAY(" \
                            "'2018-10-01 00:00:00', " \
                            "'2018-10-10 00:00:00', " \
                            "INTERVAL 1 DAY)) AS dob" do |job|
      job.table = destination_table
      job.time_partitioning_type = "DAY"
      job.time_partitioning_field = "dob"
    end

    job.wait_until_done!
    job.done? #=> true
#time_partitioning_require_filter=
def time_partitioning_require_filter=(val)
If set to true, queries over the destination table must specify a partition filter that can be used for partition elimination. See Partitioned Tables.
- val (Boolean) — Indicates if queries over the destination table will require a partition filter. The default value is false.
#time_partitioning_type=
def time_partitioning_type=(type)
Sets the partitioning for the destination table. See Partitioned Tables.
The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.
You can only set the partitioning field while creating a table. BigQuery does not allow you to change partitioning on an existing table.
- type (String) — The partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    destination_table = dataset.table "my_destination_table", skip_lookup: true

    job = dataset.query_job "SELECT * FROM UNNEST(" \
                            "GENERATE_TIMESTAMP_ARRAY(" \
                            "'2018-10-01 00:00:00', " \
                            "'2018-10-10 00:00:00', " \
                            "INTERVAL 1 DAY)) AS dob" do |job|
      job.table = destination_table
      job.time_partitioning_type = "DAY"
    end

    job.wait_until_done!
    job.done? #=> true
#udfs=
def udfs=(value)
Sets user defined functions for the query.
- value (Array<String>, String) — User-defined function resources used in the query. May be either a code resource to load from a Google Cloud Storage URI (gs://bucket/path), or an inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code. See User-Defined Functions.
#wait_until_done!
def wait_until_done!()
#write=
def write=(value)
Sets the write disposition for when the query results table exists.
- value (String) — Specifies the action that occurs if the destination table already exists. The default value is empty.
  The following values are supported:
  - truncate - BigQuery overwrites the table data.
  - append - BigQuery appends the data to the table.
  - empty - A 'duplicate' error is returned in the job result if the table exists and contains data.
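The create and write dispositions are usually chosen together with the destination table. The sketch below groups the three setters and rejects values outside the lists documented for #create= and #write=. `configure_destination`, the two constants, and `DestinationUpdaterStub` are hypothetical; the check is intentionally limited to the values listed on this page, and the library itself may accept other spellings.

```ruby
CREATE_DISPOSITIONS = %w[needed never].freeze
WRITE_DISPOSITIONS = %w[truncate append empty].freeze

# Stand-in for the updater yielded by query_job.
DestinationUpdaterStub = Struct.new(:table, :create, :write)

# Defaults mirror the documented defaults: create "needed", write "empty".
def configure_destination(job, table, create: "needed", write: "empty")
  unless CREATE_DISPOSITIONS.include?(create)
    raise ArgumentError, "unknown create disposition: #{create.inspect}"
  end
  unless WRITE_DISPOSITIONS.include?(write)
    raise ArgumentError, "unknown write disposition: #{write.inspect}"
  end

  job.table = table
  job.create = create
  job.write = write
  job
end

job = DestinationUpdaterStub.new
# :my_destination_table is a placeholder for a Table object.
configure_destination job, :my_destination_table, write: "truncate"
puts job.create
puts job.write
```

In real use, the second argument would be a Table such as the one returned by `dataset.table "my_destination_table", skip_lookup: true`.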