gcloud dataproc jobs submit spark-sql

NAME
gcloud dataproc jobs submit spark-sql - submit a Spark SQL job to a cluster
SYNOPSIS
gcloud dataproc jobs submit spark-sql (--cluster=CLUSTER | --cluster-labels=[KEY=VALUE,…]) (--execute=QUERY, -e QUERY | --file=FILE, -f FILE) [--async] [--bucket=BUCKET] [--driver-log-levels=[PACKAGE=LEVEL,…]] [--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB] [--driver-required-vcores=DRIVER_REQUIRED_VCORES] [--jars=[JAR,…]] [--labels=[KEY=VALUE,…]] [--max-failures-per-hour=MAX_FAILURES_PER_HOUR] [--max-failures-total=MAX_FAILURES_TOTAL] [--params=[PARAM=VALUE,…]] [--properties=[PROPERTY=VALUE,…]] [--properties-file=PROPERTIES_FILE] [--region=REGION] [GCLOUD_WIDE_FLAG]
DESCRIPTION
Submit a Spark SQL job to a cluster.
EXAMPLES
To submit a Spark SQL job with a local script, run:

gcloud dataproc jobs submit spark-sql --cluster=my-cluster --file=my_queries.ql

To submit a Spark SQL job with inline queries, run:

gcloud dataproc jobs submit spark-sql --cluster=my-cluster -e="CREATE EXTERNAL TABLE foo(bar int) LOCATION 'gs://my_bucket/'" -e="SELECT * FROM foo WHERE bar > 2"
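
The examples above select the cluster by name. As a further sketch, the script below selects the cluster by labels with --cluster-labels and sets the region explicitly. The label values (env=staging, team=analytics) and the region us-central1 are hypothetical placeholders, not values taken from this page. The script only prints the command unless RUN=1 is set, so it can be reviewed before anything is submitted.

```shell
#!/bin/sh
# Hypothetical values: replace the labels and region with your own.
LABELS="env=staging,team=analytics"
REGION="us-central1"
QUERY="SELECT * FROM foo WHERE bar > 2"

# Keep each argument as a single word by storing them as positional parameters.
set -- dataproc jobs submit spark-sql \
  --cluster-labels="$LABELS" \
  --region="$REGION" \
  --execute="$QUERY"

# Preview only: the printed line does not show shell quoting.
CMD="gcloud $*"
echo "$CMD"

# Submit the job only when explicitly requested.
if [ "${RUN:-0}" = "1" ]; then
  gcloud "$@"
fi
```

Exactly one of --cluster or --cluster-labels may be given, so the sketch omits --cluster.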
REQUIRED FLAGS
Exactly one of these must be specified:
--cluster=CLUSTER
The Dataproc cluster to submit the job to.
--cluster-labels=[KEY=VALUE,…]
Labels of the Dataproc cluster on which to place the job, specified as a list of KEY=VALUE pairs.

Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.

Exactly one of these must be specified:
--execute=QUERY, -e QUERY
A Spark SQL query to execute as part of the job.
--file=FILE, -f FILE
HCFS URI of file containing Spark SQL script to execute as the job.
OPTIONAL FLAGS
--async
Return immediately, without waiting for the operation in progress to complete.
--bucket=BUCKET
The Cloud Storage bucket to stage files in. Defaults to the cluster's configured bucket.
--driver-log-levels=[PACKAGE=LEVEL,…]
A list of PACKAGE=LEVEL pairs that map a package to its log4j log level, used to configure driver logging. For example: root=FATAL,com.example=INFO
--driver-required-memory-mb=DRIVER_REQUIRED_MEMORY_MB
The memory allocation requested by the job driver in megabytes (MB) for execution on the driver node group (it is used only by clusters with a driver node group).
--driver-required-vcores=DRIVER_REQUIRED_VCORES
The vCPU allocation requested by the job driver for execution on the driver node group (it is used only by clusters with a driver node group).
--jars=[JAR,…]
Comma-separated list of JAR files to be provided to the executor and driver classpaths. May contain UDFs.
--labels=[KEY=VALUE,…]
List of label KEY=VALUE pairs to add to the job.

Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.

--max-failures-per-hour=MAX_FAILURES_PER_HOUR
Specifies the maximum number of times a job can be restarted per hour in the event of failure. Default is 0 (no retries after job failure).
--max-failures-total=MAX_FAILURES_TOTAL
Specifies the maximum total number of times a job can be restarted after the job fails. Default is 0 (no retries after job failure).
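
As an illustration of the two restart limits used together, the sketch below allows up to 3 restarts per hour and 10 restarts in total, and adds --async so the command returns without waiting for the job. The limit values are arbitrary examples, and my-cluster and my_queries.ql are the placeholder names from the EXAMPLES section. The command is only printed unless RUN=1 is set.

```shell
#!/bin/sh
# Example limits only; choose values that suit your workload.
PER_HOUR=3
TOTAL=10

set -- dataproc jobs submit spark-sql \
  --cluster=my-cluster \
  --file=my_queries.ql \
  --max-failures-per-hour="$PER_HOUR" \
  --max-failures-total="$TOTAL" \
  --async

# Preview only: the printed line does not show shell quoting.
CMD="gcloud $*"
echo "$CMD"

# Submit the job only when explicitly requested.
if [ "${RUN:-0}" = "1" ]; then
  gcloud "$@"
fi
```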
--params=[PARAM=VALUE,…]
A list of key-value pairs to set variables in the Spark SQL queries.
--properties=[PROPERTY=VALUE,…]
A list of key-value pairs to configure Spark SQL.
--properties-file=PROPERTIES_FILE
Path to a local file or a file in a Cloud Storage bucket containing configuration properties for the job. The client machine running this command must have read permission to the file.

Specify properties in the form of property=value in the text file. For example:

  # Properties to set for the job:
  key1=value1
  key2=value2
  # Comment out properties not used.
  # key3=value3

If a property is set in both --properties and --properties-file, the value defined in --properties takes precedence.
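
The precedence rule can be sketched as follows. The script writes a temporary properties file and then passes the same property on the command line with a different value. The property names spark.executor.memory and spark.sql.shuffle.partitions are standard Spark settings chosen for illustration, and the values are assumptions. Under the rule stated above, the job would be expected to use spark.executor.memory=8g from --properties rather than 4g from the file. The command is only printed unless RUN=1 is set.

```shell
#!/bin/sh
# Write an illustrative properties file (values are examples only).
PROPS_FILE="$(mktemp)"
cat > "$PROPS_FILE" <<'EOF'
# Properties to set for the job:
spark.executor.memory=4g
spark.sql.shuffle.partitions=200
EOF

# spark.executor.memory appears in both places; --properties should win.
set -- dataproc jobs submit spark-sql \
  --cluster=my-cluster \
  --file=my_queries.ql \
  --properties-file="$PROPS_FILE" \
  --properties=spark.executor.memory=8g

# Preview only: the printed line does not show shell quoting.
CMD="gcloud $*"
echo "$CMD"

# Submit the job only when explicitly requested.
if [ "${RUN:-0}" = "1" ]; then
  gcloud "$@"
fi
```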

--region=REGION
Dataproc region to use. Each Dataproc region constitutes an independent resource namespace constrained to deploying instances into Compute Engine zones inside the region. Overrides the default dataproc/region property value for this command invocation.
GCLOUD WIDE FLAGS
These flags are available to all commands: --access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.

Run $ gcloud help for details.

NOTES
These variants are also available:
gcloud alpha dataproc jobs submit spark-sql
gcloud beta dataproc jobs submit spark-sql