Google Cloud Dataproc V1 Client - Class PySparkBatch (2.2.0)

Reference documentation and code samples for the Google Cloud Dataproc V1 Client class PySparkBatch.

A configuration for running an Apache PySpark batch workload.

Generated from protobuf message google.cloud.dataproc.v1.PySparkBatch

Namespace

Google\Cloud\Dataproc\V1

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ main_python_file_uri string

Required. The HCFS (Hadoop Compatible File System) URI of the main Python file to use as the Spark driver. Must be a .py file.

↳ args array

Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.

↳ python_file_uris array

Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

↳ jar_file_uris array

Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.

↳ file_uris array

Optional. HCFS URIs of files to be placed in the working directory of each executor.

↳ archive_uris array

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
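
Example: a minimal sketch of building a PySparkBatch from the data array, assuming the google/cloud-dataproc package is installed; the gs:// URIs are placeholders for objects in your own bucket.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    // Field keys match the constructor options documented above.
    $pysparkBatch = new PySparkBatch([
        'main_python_file_uri' => 'gs://example-bucket/jobs/word_count.py',
        'args' => ['--input=gs://example-bucket/data/input.txt'],
        'python_file_uris' => ['gs://example-bucket/libs/helpers.zip'],
        'jar_file_uris' => ['gs://example-bucket/jars/spark-bigquery-connector.jar'],
    ]);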

getMainPythonFileUri

Required. The HCFS URI of the main Python file to use as the Spark driver.

Must be a .py file.

Returns
Type Description
string

setMainPythonFileUri

Required. The HCFS URI of the main Python file to use as the Spark driver.

Must be a .py file.

Parameter
Name Description
var string
Returns
Type Description
$this
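
Example: setting and reading back the driver file URI; the gs:// path is a placeholder.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    // The driver must be a .py file.
    $pysparkBatch->setMainPythonFileUri('gs://example-bucket/jobs/word_count.py');
    echo $pysparkBatch->getMainPythonFileUri(); // gs://example-bucket/jobs/word_count.py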

getArgs

Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setArgs

Optional. The arguments to pass to the driver. Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.

Parameter
Name Description
var string[]
Returns
Type Description
$this
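
Example: passing driver arguments; the values shown are illustrative only.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    // Do not include arguments that can be set as batch properties (e.g. --conf).
    $pysparkBatch->setArgs(['--input=gs://example-bucket/data/input.txt', '--verbose']);
    foreach ($pysparkBatch->getArgs() as $arg) {
        echo $arg, PHP_EOL;
    }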

getPythonFileUris

Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setPythonFileUris

Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

Parameter
Name Description
var string[]
Returns
Type Description
$this
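
Example: adding Python dependencies; the URIs are placeholders, and the getter returns a RepeatedField that can be iterated or converted to a plain array.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    // Supported file types: .py, .egg, and .zip.
    $pysparkBatch->setPythonFileUris([
        'gs://example-bucket/libs/helpers.py',
        'gs://example-bucket/libs/deps.zip',
    ]);
    $uris = iterator_to_array($pysparkBatch->getPythonFileUris());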

getJarFileUris

Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setJarFileUris

Optional. HCFS URIs of jar files to add to the classpath of the Spark driver and tasks.

Parameter
Name Description
var string[]
Returns
Type Description
$this
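
Example: adding a jar to the classpath of the driver and tasks; the connector jar URI is a placeholder.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    $pysparkBatch->setJarFileUris([
        'gs://example-bucket/jars/spark-bigquery-connector.jar',
    ]);
    echo count($pysparkBatch->getJarFileUris()); // 1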

getFileUris

Optional. HCFS URIs of files to be placed in the working directory of each executor.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setFileUris

Optional. HCFS URIs of files to be placed in the working directory of each executor.

Parameter
Name Description
var string[]
Returns
Type Description
$this
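
Example: staging a file into the working directory of each executor; the URI is a placeholder.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    $pysparkBatch->setFileUris(['gs://example-bucket/config/settings.ini']);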

getArchiveUris

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setArchiveUris

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

Parameter
Name Description
var string[]
Returns
Type Description
$this
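
Example: distributing an archive that is extracted into each executor's working directory; the URI is a placeholder.

    use Google\Cloud\Dataproc\V1\PySparkBatch;

    $pysparkBatch = new PySparkBatch();
    // Supported archive types: .jar, .tar, .tar.gz, .tgz, and .zip.
    $pysparkBatch->setArchiveUris(['gs://example-bucket/archives/site-packages.tar.gz']);
    foreach ($pysparkBatch->getArchiveUris() as $uri) {
        echo $uri, PHP_EOL;
    }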