Reference documentation and code samples for the Google Cloud Dataproc V1 Client class PySparkJob.
A Dataproc job for running Apache PySpark applications on YARN.
Generated from protobuf message google.cloud.dataproc.v1.PySparkJob
Namespace
Google \ Cloud \ Dataproc \ V1
Methods
__construct
Constructor.
Parameters | |
---|---|
Name | Description |
data | array. Optional. Data for populating the Message object. |
↳ main_python_file_uri | string. Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file. |
↳ args | array. Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission. |
↳ python_file_uris | array. Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip. |
↳ jar_file_uris | array. Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks. |
↳ file_uris | array. Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks. |
↳ archive_uris | array. Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip. |
↳ properties | array|Google\Protobuf\Internal\MapField. Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code. |
↳ logging_config | Google\Cloud\Dataproc\V1\LoggingConfig. Optional. The runtime log config for job execution. |
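The constructor keys listed above can be passed together in a single array. The following is a minimal sketch, assuming the google/cloud-dataproc package is installed through Composer; the bucket and object names are hypothetical placeholders, not values from this reference.

```php
<?php
// Sketch only: assumes Composer's autoloader and the google/cloud-dataproc package.
require 'vendor/autoload.php';

use Google\Cloud\Dataproc\V1\PySparkJob;

// All gs:// URIs below are hypothetical placeholders.
$job = new PySparkJob([
    'main_python_file_uri' => 'gs://example-bucket/jobs/word_count.py',
    'args' => ['gs://example-bucket/input/', 'gs://example-bucket/output/'],
    'python_file_uris' => ['gs://example-bucket/libs/helpers.zip'],
    'properties' => ['spark.executor.memory' => '4g'],
]);

// Scalar fields come back as plain PHP values; repeated fields are countable.
echo $job->getMainPythonFileUri(), PHP_EOL;
echo count($job->getArgs()), PHP_EOL;
```

The array keys use the snake_case field names from the table, while the accessor methods use camelCase.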
getMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
Returns | |
---|---|
Type | Description |
string |
setMainPythonFileUri
Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
Parameter | |
---|---|
Name | Description |
var | string |
Returns | |
---|---|
Type | Description |
$this |
getArgs
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setArgs
Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
Parameter | |
---|---|
Name | Description |
var | string[] |
Returns | |
---|---|
Type | Description |
$this |
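One way to follow the guidance on args is to keep application-level arguments in setArgs and move Spark settings into setProperties. The sketch below assumes the google/cloud-dataproc package is installed; the URIs, argument names, and property value are hypothetical and chosen only for illustration.

```php
<?php
// Sketch only: assumes Composer's autoloader and the google/cloud-dataproc package.
require 'vendor/autoload.php';

use Google\Cloud\Dataproc\V1\PySparkJob;

// Each setter returns $this, so calls can be chained.
$job = (new PySparkJob())
    ->setMainPythonFileUri('gs://example-bucket/jobs/etl.py') // hypothetical URI
    // Application arguments only; no --conf entries here.
    ->setArgs(['--input', 'gs://example-bucket/raw/', '--date', '2024-01-01'])
    // A Spark setting that might otherwise be passed with --conf goes in properties.
    ->setProperties(['spark.sql.shuffle.partitions' => '200']);

// getArgs() returns a RepeatedField, which can be iterated like an array.
foreach ($job->getArgs() as $arg) {
    echo $arg, PHP_EOL;
}
```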
getPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setPythonFileUris
Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
Parameter | |
---|---|
Name | Description |
var | string[] |
Returns | |
---|---|
Type | Description |
$this |
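Because only .py, .egg, and .zip files are listed as supported, it can help to check a list of URIs before calling setPythonFileUris. The helper below is a hypothetical, client-side sketch in plain PHP and is not part of the Dataproc library; treating extensions as case-insensitive is an assumption made here.

```php
<?php
/**
 * Hypothetical helper: returns the URIs whose extension is not one of
 * the file types listed as supported for python_file_uris.
 */
function unsupportedPythonFileUris(array $uris): array
{
    $supported = ['py', 'egg', 'zip'];

    return array_values(array_filter($uris, function (string $uri) use ($supported): bool {
        // Assumption: compare extensions case-insensitively.
        $extension = strtolower(pathinfo($uri, PATHINFO_EXTENSION));
        return !in_array($extension, $supported, true);
    }));
}

// Hypothetical URIs for illustration.
$rejected = unsupportedPythonFileUris([
    'gs://example-bucket/libs/helpers.zip',
    'gs://example-bucket/libs/utils.py',
    'gs://example-bucket/libs/notes.txt',
]);

print_r($rejected);
```

An empty result means every URI has one of the listed extensions; the service may still apply its own validation.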
getJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setJarFileUris
Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
Parameter | |
---|---|
Name | Description |
var | string[] |
Returns | |
---|---|
Type | Description |
$this |
getFileUris
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setFileUris
Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
Parameter | |
---|---|
Name | Description |
var | string[] |
Returns | |
---|---|
Type | Description |
$this |
getArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setArchiveUris
Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
Parameter | |
---|---|
Name | Description |
var | string[] |
Returns | |
---|---|
Type | Description |
$this |
getProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\MapField |
setProperties
Optional. A mapping of property names to values, used to configure PySpark.
Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.
Parameter | |
---|---|
Name | Description |
var | array|Google\Protobuf\Internal\MapField |
Returns | |
---|---|
Type | Description |
$this |
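setProperties accepts a plain associative array, while getProperties returns a MapField. The following sketch, which assumes the google/cloud-dataproc package is installed, shows one way to read the map back; the property values are illustrative only.

```php
<?php
// Sketch only: assumes Composer's autoloader and the google/cloud-dataproc package.
require 'vendor/autoload.php';

use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob();
$job->setProperties([
    'spark.executor.cores' => '2',   // illustrative values
    'spark.executor.memory' => '4g',
]);

// MapField supports iteration and array-style reads by key.
$properties = $job->getProperties();
foreach ($properties as $name => $value) {
    printf("%s=%s\n", $name, $value);
}

// Copy into a plain PHP array when array functions are needed.
$asArray = iterator_to_array($properties);
ksort($asArray);
```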
getLoggingConfig
Optional. The runtime log config for job execution.
Returns | |
---|---|
Type | Description |
Google\Cloud\Dataproc\V1\LoggingConfig|null |
hasLoggingConfig
clearLoggingConfig
setLoggingConfig
Optional. The runtime log config for job execution.
Parameter | |
---|---|
Name | Description |
var | Google\Cloud\Dataproc\V1\LoggingConfig |
Returns | |
---|---|
Type | Description |
$this |
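The logging accessors can be combined as in the sketch below, which assumes the google/cloud-dataproc package is installed. The logger names and levels are illustrative, and the presence checks reflect how generated protobuf message classes usually behave rather than anything stated in this reference.

```php
<?php
// Sketch only: assumes Composer's autoloader and the google/cloud-dataproc package.
require 'vendor/autoload.php';

use Google\Cloud\Dataproc\V1\LoggingConfig;
use Google\Cloud\Dataproc\V1\LoggingConfig\Level;
use Google\Cloud\Dataproc\V1\PySparkJob;

$job = new PySparkJob([
    'main_python_file_uri' => 'gs://example-bucket/jobs/etl.py', // hypothetical URI
]);

// No logging config has been set yet.
$before = $job->hasLoggingConfig();

// Illustrative per-package driver log levels.
$logging = (new LoggingConfig())->setDriverLogLevels([
    'root' => Level::INFO,
    'org.apache' => Level::WARN,
]);
$job->setLoggingConfig($logging);
$afterSet = $job->hasLoggingConfig();

// Clearing removes the sub-message, so the getter returns null again.
$job->clearLoggingConfig();
$afterClear = $job->getLoggingConfig();

var_dump($before, $afterSet, $afterClear);
```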