BigQuery API - Class Google::Cloud::Bigquery::ExtractJob (v1.49.0)

Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::ExtractJob.

ExtractJob

A Job subclass representing an export operation that may be performed on a Table or Model. An ExtractJob instance is returned when you call Project#extract_job, Table#extract_job, or Model#extract_job.

Examples

Export table data

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true

Export a model

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

extract_job = model.extract_job "gs://my-bucket/#{model.model_id}"

extract_job.wait_until_done!
extract_job.done? #=> true

Methods

#avro?

def avro?() -> Boolean

Checks if the destination format for the table data is Avro. The default is false. Not applicable when extracting models.

Returns
  • (Boolean) — true when AVRO, false if not AVRO or not a table extraction.

#compression?

def compression?() -> Boolean

Checks if the export operation compresses the data using gzip. The default is false. Not applicable when extracting models.

Returns
  • (Boolean) — true when GZIP, false if not GZIP or not a table extraction.
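
When #compression? is true, the exported files are gzip-compressed, so a consumer has to decompress them before parsing. The following is a minimal sketch, assuming the file bytes have already been downloaded from Cloud Storage. read_export is a hypothetical helper (not part of the library), and its compressed argument stands in for the value of extract_job.compression?.

```ruby
require "zlib"

# Hypothetical helper: returns the text of one downloaded export file.
# `compressed` is assumed to be the value of extract_job.compression?.
def read_export(bytes, compressed)
  compressed ? Zlib.gunzip(bytes) : bytes
end

# Stand-in for a downloaded file; a real export would come from Cloud Storage.
sample = Zlib.gzip("name,score\nalice,10\n")

puts read_export(sample, true)
```

Passing false for compressed returns the bytes unchanged, which matches the default (uncompressed) export.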

#csv?

def csv?() -> Boolean

Checks if the destination format for the table data is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default is true for tables. Not applicable when extracting models.

Returns
  • (Boolean) — true when CSV, or false if not CSV or not a table extraction.

#delimiter

def delimiter() -> String, nil

The character or symbol the operation uses to delimit fields in the exported data. The default is a comma (,) for tables. Not applicable when extracting models.

Returns
  • (String, nil) — A string containing the character, such as ",", or nil if not a table extraction.

#destinations

def destinations()

The URI or URIs representing the Google Cloud Storage files to which the data is exported.

#destinations_counts

def destinations_counts() -> Hash<String, Integer>

A hash containing the URI or URI pattern specified in #destinations mapped to the counts of files per destination.

Returns
  • (Hash<String, Integer>) — A Hash with the URI patterns as keys and the counts as values.

#destinations_file_counts

def destinations_file_counts() -> Array<Integer>

The number of files per destination URI or URI pattern specified in #destinations.

Returns
  • (Array<Integer>) — An array of values in the same order as the URI patterns.
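
#destinations_file_counts and #destinations_counts describe the same data: the first is positional, the second keys each count by its URI. The sketch below illustrates how the two relate, using plain arrays as stand-ins for the values a finished job would return. The URIs and counts are made up for illustration.

```ruby
# Stand-in values; on a real job these would come from
# extract_job.destinations and extract_job.destinations_file_counts.
destinations = ["gs://my-bucket/part-a-*.json", "gs://my-bucket/part-b-*.json"]
file_counts  = [3, 1]

# Pair each URI pattern with the count at the same position.
counts_by_uri = destinations.zip(file_counts).to_h

counts_by_uri.each do |uri, count|
  puts "#{uri}: #{count} file(s)"
end

# Total number of files written across all destinations.
total_files = file_counts.sum
puts "total: #{total_files}"
```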

#json?

def json?() -> Boolean

Checks if the destination format for the table data is newline-delimited JSON. The default is false. Not applicable when extracting models.

Returns
  • (Boolean) — true when NEWLINE_DELIMITED_JSON, false if not NEWLINE_DELIMITED_JSON or not a table extraction.

#ml_tf_saved_model?

def ml_tf_saved_model?() -> Boolean

Checks if the destination format for the model is TensorFlow SavedModel. The default is true for models. Not applicable when extracting tables.

Returns
  • (Boolean) — true when ML_TF_SAVED_MODEL, false if not ML_TF_SAVED_MODEL or not a model extraction.

#ml_xgboost_booster?

def ml_xgboost_booster?() -> Boolean

Checks if the destination format for the model is XGBoost. The default is false. Not applicable when extracting tables.

Returns
  • (Boolean) — true when ML_XGBOOST_BOOSTER, false if not ML_XGBOOST_BOOSTER or not a model extraction.
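
Because each format predicate tests for a single destination format, a caller can collapse them into one label. The sketch below uses a hypothetical format_label helper that accepts any object responding to the predicates documented here. FakeJob is a stand-in Struct, not a library class, so the snippet runs without credentials.

```ruby
# Predicate names paired with the format constants named in this reference.
PREDICATE_FORMATS = {
  csv?:                "CSV",
  json?:               "NEWLINE_DELIMITED_JSON",
  avro?:               "AVRO",
  ml_tf_saved_model?:  "ML_TF_SAVED_MODEL",
  ml_xgboost_booster?: "ML_XGBOOST_BOOSTER"
}.freeze

# Hypothetical stand-in exposing the same predicates as ExtractJob.
FakeJob = Struct.new(:destination_format) do
  PREDICATE_FORMATS.each do |predicate, name|
    define_method(predicate) { destination_format == name }
  end
end

# Hypothetical helper: returns the first format whose predicate is true.
def format_label(job)
  match = PREDICATE_FORMATS.find { |predicate, _name| job.public_send(predicate) }
  match ? match.last : "unknown"
end

puts format_label(FakeJob.new("AVRO"))
```

With a real job, format_label(extract_job) would be called the same way, since only the predicate methods are used.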

#model?

def model?() -> Boolean

Whether the source of the export job is a model. See #source.

Returns
  • (Boolean) — true when the source is a model, false otherwise.

#print_header?

def print_header?() -> Boolean

Checks if the exported data contains a header row. The default is true for tables. Not applicable when extracting models.

Returns
  • (Boolean) — true when the print header setting is enabled or unset (nil), false if disabled or not a table extraction.
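
Together, #delimiter and #print_header? describe how to read a CSV export back. The following sketch uses Ruby's csv library. parse_export is a hypothetical helper, and its delimiter and header arguments are assumed to carry the values of extract_job.delimiter and extract_job.print_header?.

```ruby
require "csv"

# Hypothetical helper: parses the text of one exported CSV file.
# With a header row, each record becomes a Hash keyed by column name;
# without one, each record stays an Array of field values.
def parse_export(text, delimiter:, header:)
  CSV.parse(text, col_sep: delimiter, headers: header).map do |row|
    header ? row.to_h : row
  end
end

# Stand-in for file contents exported with a "|" delimiter and a header row.
sample = "name|score\nalice|10\n"

p parse_export(sample, delimiter: "|", header: true)
```

All values come back as strings; converting them to the table's column types is left to the caller.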

#source

def source(view: nil) -> Table, Model, nil

The table or model which is exported.

Parameter
  • view (String) (defaults to: nil) — Specifies the view that determines which table information is returned. By default, basic table information and storage statistics (STORAGE_STATS) are returned. Accepted values include :unspecified, :basic, :storage, and :full. For more information, see BigQuery Classes. The default value is the :unspecified view type.

Returns
  • (Table, Model, nil) — A table or model instance, or nil.

#table?

def table?() -> Boolean

Whether the source of the export job is a table. See #source.

Returns
  • (Boolean) — true when the source is a table, false otherwise.
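
Most predicates on this class apply to only one kind of source, so it can help to branch on #table? and #model? before consulting them. The sketch below uses a hypothetical applicable_settings helper. FakeSourceJob is a stand-in Struct, not a library class.

```ruby
# Hypothetical stand-in exposing #table? and #model? like ExtractJob.
FakeSourceJob = Struct.new(:kind) do
  def table?
    kind == :table
  end

  def model?
    kind == :model
  end
end

# Hypothetical helper: lists the methods from this reference that are
# meaningful for the job's source type.
def applicable_settings(job)
  if job.table?
    %i[csv? json? avro? compression? delimiter print_header? use_avro_logical_types?]
  elsif job.model?
    %i[ml_tf_saved_model? ml_xgboost_booster?]
  else
    []
  end
end

p applicable_settings(FakeSourceJob.new(:model))
```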

#use_avro_logical_types?

def use_avro_logical_types?() -> Boolean

When #avro? is true (the destination format is "AVRO"), this flag indicates whether applicable column types (such as TIMESTAMP) are extracted to their corresponding AVRO logical types (timestamp-micros) instead of only their raw types (avro-long). Not applicable when extracting models.

Returns
  • (Boolean) — true when applicable column types will use their corresponding AVRO logical types, false if not enabled or not a table extraction.
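
The timestamp-micros logical type annotates an Avro long holding microseconds since the Unix epoch. A reader that receives only the raw long has to convert it itself. The following sketch shows one such conversion. micros_to_time is a hypothetical helper, and the sample value is made up.

```ruby
# Hypothetical helper: converts microseconds since the Unix epoch
# (the raw long behind timestamp-micros) into a UTC Time.
def micros_to_time(micros)
  seconds, remainder = micros.divmod(1_000_000)
  Time.at(seconds, remainder, :usec).utc
end

# Made-up sample value standing in for a raw long read from an Avro file.
t = micros_to_time(1_700_000_000_123_456)
puts t.strftime("%Y-%m-%d %H:%M:%S.%6N UTC")
```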