A Dataproc job for running Apache Spark applications on YARN.
JSON representation
{
  "args": [string],
  "jarFileUris": [string],
  "fileUris": [string],
  "archiveUris": [string],
  "properties": {string: string, ...},
  "loggingConfig": {object (LoggingConfig)},

  // Union field driver can be only one of the following:
  "mainJarFileUri": string,
  "mainClass": string
  // End of list of possible types for union field driver.
}
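As an illustrative sketch (the bucket path and argument values below are hypothetical placeholders, not values from this reference), a minimal SparkJob that names its driver with mainJarFileUri might look like:

```json
{
  "mainJarFileUri": "gs://example-bucket/jars/spark-app.jar",
  "args": ["--input", "gs://example-bucket/data/"],
  "properties": {
    "spark.executor.memory": "4g"
  }
}
```

Note that Spark configuration such as executor memory goes in properties rather than as a --conf argument in args, per the field descriptions below.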
Fields
args[]
string
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
jarFileUris[]
string
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Spark driver and tasks.
fileUris[]
string
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
archiveUris[]
string
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
properties
map (key: string, value: string)
Optional. A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
loggingConfig
object (LoggingConfig)
Optional. The runtime log config for job execution.
Union field driver. Required. The specification of the main method to call to drive the job. Specify either the jar file that contains the main class or the main class name. To pass both a main jar and a main class in that jar, add the jar to jarFileUris, and then specify the main class name in mainClass. driver can be only one of the following:
mainJarFileUri
string
The HCFS URI of the jar file that contains the main class.
mainClass
string
The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in SparkJob.jar_file_uris.
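The union-field guidance above (add the jar to jarFileUris, then name the entry point in mainClass) can be sketched as follows. All URIs, class names, and log levels here are hypothetical placeholders:

```json
{
  "mainClass": "com.example.WordCount",
  "jarFileUris": ["gs://example-bucket/jars/wordcount.jar"],
  "args": ["gs://example-bucket/input/", "gs://example-bucket/output/"],
  "properties": {
    "spark.executor.cores": "2"
  },
  "loggingConfig": {
    "driverLogLevels": {
      "root": "INFO"
    }
  }
}
```

Because driver is a union field, this job sets mainClass only; it would be invalid to set mainJarFileUri in the same job.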
Last updated 2025-06-20 UTC.