Mainframe Connector command-line reference

This document describes the syntax, commands, flags, and arguments for the Mainframe Connector command-line tool.

Commands

bq export

Export a table from BigQuery.

Synopsis

bq export [options]

Flags and arguments

To run this command, you must provide a SQL query (see the --sql and --query_dsn flags) and a copybook (see the --cobDsn flag). You can run this command in local, remote, and standalone modes. For remote mode, see the --bucket, --remoteHost, --remotePort, and --remoteUrl flags. The bq export command uses the following flags and arguments:

--project_id=ID
Specify the project to use to execute this command.
--allow_large_results
(Optional) Use large destination table sizes for legacy SQL queries.
--batch
(Optional) Run the query in batch mode.
--bucket=BUCKET
(Optional) Write the output of the command to a location within a Cloud Storage bucket. The output files are written to the destination path gs://BUCKET/EXPORT/. This argument is required for remote mode.
--cobDsn=DSN
(Optional) Specify the copybook DSN that you want to use. If you don't provide a value, Mainframe Connector reads from DD COPYBOOK.
--dataset_id=ID
(Optional) Specify the default dataset to use with the command. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used.
--destination_table=TABLE
(Optional) Specify the destination table that you want to write the query results to.
--dry_run
(Optional) Validate the query without running it.
--encoding=ENCODING
(Optional) Specify the character set to use for encoding and decoding character fields. When provided, this value overrides the default set by the ENCODING environment variable.
--exporter_thread_count=COUNT
(Optional) Set the number of exporter threads. The default value is 4.
--help or -h
Display this help text.
--keepAliveTimeInSeconds=SECONDS
(Optional) Specify the keep alive timeout in seconds for an HTTP channel. The default value is 480 seconds.
--location=LOCATION
(Optional) Specify a region or multi-region location to execute the command. The default value is US.
--max_read_queue=NUMBER
(Optional) Set the maximum size of the Avro records queue. The default value is twice the number of threads.
--max_read_streams=NUMBER
(Optional) Set the maximum number of read stream threads. The default value is 4.
--maximum_bytes_billed=BYTES
(Optional) Limit the bytes billed for the query.
--order_response
(Optional) Keep the response ordered as returned from BigQuery.
--outDD=OUTPUT
(Optional) Write the output records to the specified dataset in z/OS. The default value is DD OUTFILE.
--parser_type=TYPE
(Optional) Set the configuration parser to legacy, copybook, or auto. The default value is auto.
--query_dsn=DSN
(Optional) Read a query from the specified dataset in z/OS. Use the format HLQ.MEMBER or HLQ.PDS(MEMBER). If you don't provide a value, Mainframe Connector reads from DD QUERY.
--remoteHost=HOST
(Optional) Specify the IP address of the remote host. To run Mainframe Connector in remote mode, set the --bucket flag.
--remotePort=PORT
(Optional) Specify the remote port. The default value is 51770. To run Mainframe Connector in remote mode, set the --bucket flag.
--remoteUrl=URL
(Optional) Specify the remote URL. To run Mainframe Connector in remote mode, set the --bucket flag.
--run_mode=MODE
(Optional) Select the export implementation. You can use one of the following options:
  • directstorage: the binary file is saved locally (default)
  • gcsoutput: the binary file is saved in Cloud Storage
--sql=SQL
(Optional) Specify the BigQuery SQL query to execute.
--stats_table=TABLE
(Optional) Specify the table to insert statistics into.
--timeOutMinutes=MINUTES
(Optional) Set the timeout in minutes for the remote gRPC call. The default value is 90 minutes.
--transcoding_buffer=BUFFER
(Optional) Set the size of the transcoding buffer per thread, in MB. The default value is 20.
--use_cache={true|false}
(Optional) To cache the query results, set to true.
--use_legacy_sql
(Optional) Use legacy SQL instead of standard SQL.
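
Example

The following invocation is a minimal sketch of a local-mode export; example-project and HLQ.QRY(EXPORT) are placeholder values. Because --cobDsn and --outDD aren't set, Mainframe Connector reads the copybook from DD COPYBOOK and writes the exported records to DD OUTFILE:

bq export \
  --project_id=example-project \
  --location=US \
  --query_dsn=HLQ.QRY(EXPORT) \
  --run_mode=directstorage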

bq load

Load data into a BigQuery table.

Synopsis

bq load [options] tablespec path

Flags and arguments

The bq load command uses the following flags and arguments:

path
Specify a comma-separated list of source file URIs. The supported format is gs://bucket/path. Example: gs://my-bucket/data.orc,gs://my-bucket/more-data.orc.
tablespec
Specify the destination table for the data. The supported format is [PROJECT]:[DATASET].[TABLE].
--project_id=ID
Specify the project to use to execute this command.
--allow_jagged_rows
(Optional) Allow missing trailing optional columns in CSV data.
--allow_quoted_newlines
(Optional) Allow quoted newlines within CSV data.
--append_table
(Optional) Append the loaded data to the existing data in the destination table.
--autodetect
(Optional) Enable automatic schema detection for CSV and JSON data.
--clustering_fields=FIELDS
(Optional) If specified, a comma-separated list of columns is used to cluster the destination table in a query. This flag must be used with the time partitioning flags to create either an ingestion-time partitioned table or a table partitioned on a DATE or TIMESTAMP column. When specified, the table is first partitioned, and then it is clustered using the supplied columns.
--dataset_id=ID
(Optional) Specify the default dataset to use with the command. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used.
--debug_mode={true|false}
(Optional) Set logging level to debug.
--destination_kms_key=KEY
(Optional) The Cloud KMS key for encryption of the destination table data.
--encoding or -E=ENCODING
(Optional) Specify the character set to use for encoding and decoding character fields. When specified, this value overrides the default set by the ENCODING environment variable.
--field_delimiter or -F=FIELD
(Optional) Specify the column delimiter in the CSV data. Use \t or tab for tab delimiters.
--help or -h
Display this help text.
--ignore_unknown_values=VALUES
(Optional) Ignore extra unrecognized values in CSV or JSON data.
--location=LOCATION
(Optional) Specify a region or multi-region location to execute the command. The default value is US.
--max_bad_records=RECORDS
(Optional) Set the maximum number of invalid records allowed before the job fails. A maximum of five errors of any type are returned regardless of the --max_bad_records value. This flag applies for loading CSV, JSON, and Google Sheets data only. The default value is 0.
--max_polling_interval_ms=MILLISECONDS
(Optional) Specify the maximum wait time, in milliseconds, for a BigQuery job.
--null_marker=MARKER
(Optional) Specify a custom string that represents a NULL value in CSV data.
--projection_fields=FIELDS
(Optional) If you set --source_format to DATASTORE_BACKUP then this flag indicates the entity properties to load from a datastore export. Specify the property names in a comma-separated list. Property names are case sensitive and must refer to top-level properties. You can also use this flag with Firestore exports.
--quote=QUOTE
(Optional) Specify a quote character to surround fields in the CSV data. You can specify any one-byte character as the argument. The default value is a double quote ("). To specify that there is no quote character, use an empty string.
--replace
(Optional) Replace existing data in the destination table with the loaded data.
--require_partition_filter={true|false}
(Optional) To have a partition filter for queries over the supplied table, set to true. This argument applies only to partitioned tables, and only if the --time_partitioning_field flag is set. The default value is false.
--schema=SCHEMA
(Optional) Define the schema of the destination table. Specify the value as a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE]. Example: name:STRING,age:INTEGER,city:STRING
--schema_update_option=OPTION
(Optional) When appending data to a table (in a load job or a query job), or when overwriting a table partition, specify how to update the schema of the destination table. Use one of the following values:
  • ALLOW_FIELD_ADDITION: Allow new fields to be added
  • ALLOW_FIELD_RELAXATION: Allow relaxing REQUIRED fields to NULLABLE
Repeat this flag to specify multiple schema update options.
--skip_leading_rows=NUMBER
(Optional) Specify the number of rows to skip at the beginning of the source file. The default value is 0.
--source_format=FORMAT
(Optional) Specify the format of the source data. You can use one of the following values: CSV, NEWLINE_DELIMITED_JSON, AVRO, DATASTORE_BACKUP (use this value for Firestore), PARQUET, ORC. The default value is ORC.
--stats_table=TABLE
(Optional) Specify the table to insert statistics into.
--time_partitioning_expiration=SECONDS
(Optional) Specify when a time-based partition should be deleted, in seconds. The expiration time evaluates to the partition's UTC date plus the specified value. If you provide a negative number, the time-based partition never expires.
--time_partitioning_field=FIELD
(Optional) Specify the field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, then the table is partitioned based on the load time.
--time_partitioning_type=TYPE
(Optional) Enable time-based partitioning on a table and set the partition type using the following value: DAY.
--use_avro_logical_types={true|false}
(Optional) If --source_format is set to AVRO, then set this flag to true to convert logical types into their corresponding types (such as TIMESTAMP) instead of only using their raw types (such as INTEGER). The default value is false.
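
Example

The following invocation is a minimal sketch; the project, dataset, table, and bucket names are placeholder values. It loads an ORC file from Cloud Storage (the default --source_format) and replaces any existing data in the destination table:

bq load \
  --project_id=example-project \
  --location=US \
  --replace \
  example-project:example_dataset.example_table \
  gs://example-bucket/data.orc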

bq mk

Create BigQuery resources, such as built-in tables or external tables that need partitioning and clustering to be set up. You can also use the bq mk command to generate a BigQuery table directly by parsing COBOL copybooks using the --schema_from_copybook flag.

Synopsis

bq mk [options]

Flags and arguments

The bq mk command uses the following flags and arguments:

--project_id=ID
Specify the project to use to execute this command.
--tablespec=TABLE
Specify the destination table for the data. The supported format is [PROJECT]:[DATASET].[TABLE].
--clustering_fields=FIELDS
(Optional) Specify a comma-separated list of up to four column names that specify the fields to use for table clustering.
--dataset_id=ID
(Optional) Specify the default dataset to use with the command. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used.
--description=DESCRIPTION
(Optional) Provide a description for the dataset or table.
--dry_run
(Optional) Print the table's Data Definition Language (DDL) statement.
--encoding=ENCODING
(Optional) Specify the character set to use for encoding and decoding character fields. When specified, this value overrides the default set by the ENCODING environment variable.
--expiration=EXPIRATION
(Optional) Specify the lifetime for the table. If you don't specify a value, BigQuery creates the table with the dataset's default table lifetime, or the table doesn't expire.
--external_table_definition or -e=TABLE
(Optional) Specify a name and schema definition to create an external table. Example: ORC=gs://bucket/table_part1.orc/,gs://bucket/table_part2.orc/.
--help or -h
Display this help text.
--location=LOCATION
(Optional) Specify a region or multi-region location to execute the command. The default value is US.
--parser_type=TYPE
(Optional) Set the configuration parser to legacy, copybook, or auto. The default value is auto.
--require_partition_filter={true|false}
(Optional) To have a partition filter for queries over the supplied table, set to true. This argument applies only to partitioned tables, and only if the --time_partitioning_field flag is set. The default value is true.
--schema=SCHEMA
(Optional) Specify either the path to a local JSON schema file or a comma-separated list of column definitions in the format FIELD:DATA_TYPE, FIELD:DATA_TYPE and so on.
--schema_from_copybook=SCHEMA
(Optional) Generate the schema from a copybook.
--table or -t=TABLE
(Optional) Create a table.
--time_partitioning_expiration=SECONDS
(Optional) Specify when a time-based partition should be deleted, in seconds. The expiration time evaluates to the partition's UTC date plus the specified value. If you provide a negative number, the time-based partition never expires.
--time_partitioning_field=FIELD
(Optional) Specify the field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, then the table is partitioned based on the load time.
--view
(Optional) Create a view.
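
Example

The following invocation is a minimal sketch; the project, dataset, table, and column names are placeholder values. It creates a table with an inline schema definition:

bq mk \
  --project_id=example-project \
  --tablespec=example-project:example_dataset.example_table \
  --schema=id:INTEGER,name:STRING,created_at:TIMESTAMP \
  --table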

bq query

Execute a BigQuery query.

Synopsis

bq query [options]

Flags and arguments

You can run this command in local and remote modes. For remote mode, see the --remoteHost, --remotePort, and --remoteUrl flags, and the BQ_QUERY_REMOTE_EXECUTION environment variable. The bq query command uses the following flags and arguments:

--project_id=ID
Specify the project to use to execute this command.
--allow_large_results
(Optional) Use large destination table sizes for legacy SQL queries.
--append_table
(Optional) Append the loaded data to the existing data in the destination table.
--batch
(Optional) Run the query in batch mode.
--clustering_fields=FIELDS
(Optional) Specify a comma-separated list of up to four column names that specify the fields to use for table clustering. If you specify this value with partitioning, then the table is first partitioned, and then each partition is clustered using the supplied columns.
--create_if_needed
(Optional) Create destination table if it doesn't exist.
--dataset_id=ID
(Optional) Specify the default dataset to use with the command. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used.
--destination_table=TABLE
(Optional) Specify the destination table that you want to write the query results to.
--dry_run
(Optional) Validate the query without running it.
--follow={true|false}
(Optional) To track individual query steps or the script as a whole, set to true. The default value is false.
--help or -h
Display this help text.
--location=LOCATION
(Optional) Specify a region or multi-region location to execute the command. The default value is US.
--maximum_bytes_billed=BYTES
(Optional) Specify the limit of the bytes billed for the query.
--parameters=PARAMETERS
(Optional) Specify comma-separated query parameters in the format [NAME]:[TYPE]:[VALUE]. An empty name creates a positional parameter. You can omit [TYPE] to assume a STRING value in the format name::value or ::value. NULL produces a null value.
--query_dsn=DSN
(Optional) Specify the DSN to read the query from, in the format HLQ.MEMBER or HLQ.PDS(MEMBER). If you don't provide a value, Mainframe Connector reads from DD QUERY.
--remoteHost=HOST
(Optional) Specify the IP address of the remote host. To run the query in remote mode, set the BQ_QUERY_REMOTE_EXECUTION environment variable.
--remotePort=PORT
(Optional) Specify the remote port. The default value is 51770. To run the query in remote mode, set the BQ_QUERY_REMOTE_EXECUTION environment variable.
--remoteUrl=URL
(Optional) Specify the remote URL. To run the query in remote mode, set the BQ_QUERY_REMOTE_EXECUTION environment variable.
--replace
(Optional) Overwrite the destination table with the query results.
--report_row_limit=LIMIT
(Optional) Specify the maximum rows to print in the audit report. The default value is 30.
--require_partition_filter={true|false}
(Optional) To have a partition filter for queries over the supplied table, set to true. The default value is true.
--schema_update_option=OPTION
(Optional) Update the schema of the destination table when appending data. Use the following values:
  • ALLOW_FIELD_ADDITION: Allows new fields to be added.
  • ALLOW_FIELD_RELAXATION: Allows relaxing REQUIRED fields to NULLABLE.
--split_sql={true|false}
(Optional) To split the input SQL script into individual queries, set to true. The default value is true.
--stats_table=TABLE
(Optional) Specify the table to insert statistics into.
--sync={true|false}
(Optional) Run the command in synchronous mode.
--synchronous_mode={true|false}
(Optional) An alternative to --sync.
--timeOutMinutes=MINUTES
(Optional) Specify the timeout in minutes for a BigQuery job response. The default value is 240 minutes.
--time_partitioning_expiration=SECONDS
(Optional) Specify when a time-based partition should be deleted, in seconds. The expiration time evaluates to the partition's UTC date plus the specified value. If you provide a negative number, the time-based partition never expires.
--time_partitioning_field=FIELD
(Optional) Specify the field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, then the table is partitioned based on the load time.
--time_partitioning_type=TYPE
(Optional) Enable time-based partitioning on a table and set the partition type using one of the following values: DAY, HOUR, MONTH, YEAR.
--use_cache={true|false}
(Optional) To cache the query results, set to true. The default value is true.
--use_legacy_sql
(Optional) Use legacy SQL instead of standard SQL.
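
Example

The following invocation is a minimal sketch of a local-mode query; the project, dataset, and table names are placeholder values. Because --query_dsn isn't set, Mainframe Connector reads the SQL statement from DD QUERY and overwrites the destination table with the results:

bq query \
  --project_id=example-project \
  --location=US \
  --destination_table=example-project:example_dataset.query_results \
  --replace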

gsutil cp

Transcode data from your Mainframe to a Cloud Storage bucket.

Synopsis

gsutil cp [options] gcsUri [dest]

Flags and arguments

You can use this command for the following purposes:

  • Copy and transcode a file from a Mainframe or a Linux environment to Cloud Storage.
    • Source: --inDsn. If you don't provide a value, the source is read from DD INFILE.
    • Destination: gcsUri
  • Copy and transcode a file within Cloud Storage
    • Source: gcsUri
    • Destination: --destPath
  • Copy a file from Cloud Storage to a Mainframe.
    • Source: gcsUri
    • Destination: --destDSN
    • Relevant flags: --lrecl, --blksize, --recfm, --noseek.
  • Copy a file from Cloud Storage to a Linux environment.
    • Source: gcsUri
    • Destination: --destPath
This command can run in local, remote, and standalone modes. For remote mode, see the flags --remote, --remoteHost, --remotePort, and --remoteUrl. The gsutil cp command uses the following flags and arguments:

dest
(Optional) The local path or data source name (DSN). Example formats: /path/to/file, DATASET.MEMBER
gcsUri
The Cloud Storage URI in the format gs://bucket/path. It can represent either the source or the destination location, depending on usage.
--project_id=ID
Specify the project to use to execute this command.
--batchSize=SIZE
(Optional) Specify the number of blocks to use per batch. The default value is 1000.
--blksize=SIZE
(Optional) Specify the block size of the file that you want to copy to the Mainframe. If blksize=0 and the recfm is not U, the mainframe system determines the optimal block size for the file.
--cobDsn=DSN
(Optional) Specify the copybook DSN that you want to use. If you don't provide a value, Mainframe Connector reads from DD COPYBOOK.
--connections=NUMBER
(Optional) Specify the number of connections that can be made to the remote receiver. The default value is 10.
--dataset_id=ID
(Optional) Specify the default dataset to use with the command. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used.
--destDSN=OUTPUT
(Optional) Specify the destination DSN.
--destPath=OUTPUT
(Optional) Specify the destination path.
--dry_run
(Optional) Test copybook parsing and decoding of the QSAM file.
--encoding=ENCODING
(Optional) Specify the character set to use for encoding and decoding character fields. When specified, this value overrides the default set by the ENCODING environment variable.
--help or -h
Display this help text.
--inDsn=DSN
(Optional) Specify the infile DSN that you want to use. If you don't provide a value, Mainframe Connector reads from DD INFILE.
--keepAliveTimeInSeconds=SECONDS
(Optional) Specify the keep alive timeout in seconds for an HTTP channel. The default value is 480 seconds.
--location=LOCATION
(Optional) Specify a region or multi-region location to execute the command. The default value is US.
--lowerCaseColumnNames
(Optional) Create lowercase column names for copybook fields.
--lrecl=LRECL
(Optional) Specify the logical record length (lrecl) of the file that you want to copy to the Mainframe.
--maxChunkSize=SIZE
(Optional) Specify the maximum chunk size per batch. You should use K, KiB, KB, M, MiB, MB, G, GiB, GB, T, TiB, or TB to describe the size. The default value is 128MiB.
--max_error_pct=PCT
(Optional) Specify the job failure threshold for row decoding errors. Valid values are within the range [0.0, 1.0]. The default value is 0.0.
--noseek
(Optional) Improve download performance from Cloud Storage to the Mainframe.
--parallel or -m
(Optional) Set the number of concurrent writers to 4.
--parallelism or -p=NUMBER
(Optional) Specify the number of concurrent writers. The default value is 4.
--parser_type=TYPE
(Optional) Set the configuration parser to legacy, copybook, or auto. The default value is auto.
--preload_chunk_count=NUMBER
(Optional) Specify the number of chunks to preload from disks while all workers are occupied. The default value is 2.
--recfm=RECFM
(Optional) Specify the recfm of the file that you want to copy to the Mainframe. You can use one of the following values: F, FB, V, VB, U. The default value is FB.
--remote
(Optional) Use a remote decoder.
--remoteHost=HOST
(Optional) Specify the IP address of the remote host. To run Mainframe Connector in remote mode, set the --remote flag.
--remotePort=PORT
(Optional) Specify the remote port to be used. The default value is 51770. To run Mainframe Connector in remote mode, set the --remote flag.
--remoteUrl=URL
(Optional) Specify the remote URL. To run Mainframe Connector in remote mode, set the --remote flag.
--replace
(Optional) Delete the destination recursively before uploading.
--stats_table=TABLE
(Optional) Specify the table to insert statistics into.
--tfDSN=DSN
(Optional) Specify the transformations from a DSN, DATASET.MEMBER, or PDS(MBR).
--tfGCS=GCS
(Optional) Specify the transformations file from Cloud Storage.
--timeOutMinutes=MINUTES
(Optional) Specify the timeout in minutes for a remote gRPC call. The default value is 90 minutes for Cloud Storage and 50 minutes for a Mainframe.
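
Example

The following invocation is a minimal sketch of a local-mode copy and transcode from a Mainframe to Cloud Storage; the DSN, bucket, and object names are placeholder values. Because --cobDsn isn't set, Mainframe Connector reads the copybook from DD COPYBOOK:

gsutil cp \
  --project_id=example-project \
  --inDsn=HLQ.DATA.FILE \
  --replace \
  gs://example-bucket/table/data.orc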

gsutil rm

Remove Cloud Storage objects.

Synopsis

gsutil rm [-hR] url...

Flags and arguments

The gsutil rm command uses the following flags and arguments:

url
Specify the Cloud Storage location in the format gs://bucket/prefix.
--help or -h
Display this help message.
-R or -r
Recursively delete the contents of directories or objects that match the path expression.
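
Example

The following invocation is a minimal sketch; the bucket and prefix are placeholder values. It recursively deletes all objects under the given prefix:

gsutil rm -r gs://example-bucket/export/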

scp

Copy files to Cloud Storage.

Synopsis

scp [options] [input] [output]

Flags and arguments

To use this command, you must ensure the following:

  • Set one unique input value through input, --inDD, or --inDsn.
  • Set one unique output value through output or --gcsOutUri.

The scp command uses the following flags and arguments:

input
(Optional) Specify the DD or DSN to be copied. You can use --inDD or --inDsn instead.
output
(Optional) Specify the URI of the output using the format gs://[BUCKET]/[PREFIX]. You can use --gcsOutUri instead.
--compress
(Optional) Compress output with gzip.
--count or -n=RECORDS
(Optional) Specify the number of records to copy. The default is unlimited.
--encoding=ENCODING
(Optional) Specify the input character encoding. The default value is CP037.
--gcsOutUri=URI
(Optional) Specify the destination Cloud Storage URI of the file copy.
--help or -h
Display this help text.
--inDD=INPUT
(Optional) Specify the DD file to be copied. The default value is DD INFILE.
--inDsn=INPUT
(Optional) Specify the DSN to be copied.
--noConvert
(Optional) Disable conversion of character input to ASCII. Character conversion is enabled by default.
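
Example

The following invocation is a minimal sketch; the bucket and object names are placeholder values. It copies the dataset referenced by DD INFILE to Cloud Storage and compresses the output with gzip:

scp --inDD=INFILE --gcsOutUri=gs://example-bucket/infile.csv.gz --compress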

systemreport

Provide a system report.

Synopsis

systemreport [-h] [--available_security_providers] [--supported_ciphers]

Flags and arguments

The systemreport command uses the following flags and arguments:

--available_security_providers
Print supported security providers.
--help or -h
Display this help message.
--supported_ciphers
Print supported ciphers.
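
Example

The following invocation prints both the supported security providers and the supported ciphers:

systemreport --available_security_providers --supported_ciphers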