REST Resource: projects.regions.jobs

Resource: Job

A Dataproc job resource.

JSON representation
{
  "reference": {
    object (JobReference)
  },
  "placement": {
    object (JobPlacement)
  },
  "status": {
    object (JobStatus)
  },
  "statusHistory": [
    {
      object (JobStatus)
    }
  ],
  "yarnApplications": [
    {
      object (YarnApplication)
    }
  ],
  "driverOutputResourceUri": string,
  "driverControlFilesUri": string,
  "labels": {
    string: string,
    ...
  },
  "scheduling": {
    object (JobScheduling)
  },
  "jobUuid": string,
  "done": boolean,
  "driverSchedulingConfig": {
    object (DriverSchedulingConfig)
  },

  // Union field type_job can be only one of the following:
  "hadoopJob": {
    object (HadoopJob)
  },
  "sparkJob": {
    object (SparkJob)
  },
  "pysparkJob": {
    object (PySparkJob)
  },
  "hiveJob": {
    object (HiveJob)
  },
  "pigJob": {
    object (PigJob)
  },
  "sparkRJob": {
    object (SparkRJob)
  },
  "sparkSqlJob": {
    object (SparkSqlJob)
  },
  "prestoJob": {
    object (PrestoJob)
  },
  "flinkJob": {
    object (FlinkJob)
  }
  // End of list of possible types for union field type_job.
}
Fields
reference

object (JobReference)

Optional. The fully qualified reference to the job, which can be used to obtain the equivalent REST path of the job resource. If this property is not specified when a job is created, the server generates a jobId.

placement

object (JobPlacement)

Required. Job information, including how, when, and where to run the job.

status

object (JobStatus)

Output only. The job status. Additional application-specific status information might be contained in the type_job and yarnApplications fields.

statusHistory[]

object (JobStatus)

Output only. The previous job status.

yarnApplications[]

object (YarnApplication)

Output only. The collection of YARN applications spun up by this job.

Beta Feature: This report is available for testing purposes only. It might be changed before final release.

driverOutputResourceUri

string

Output only. A URI pointing to the location of the stdout of the job's driver program.

driverControlFilesUri

string

Output only. If present, the location of miscellaneous control files that can be used as part of job setup and handling. If not present, control files might be placed in the same location as driverOutputResourceUri.

labels

map (key: string, value: string)

Optional. The labels to associate with this job. Label keys must contain 1 to 63 characters, and must conform to RFC 1035. Label values can be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035. No more than 32 labels can be associated with a job.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
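The label constraints above can be checked on the client before a job is submitted. The following is a minimal sketch, assuming that "conform to RFC 1035" refers to the label grammar in RFC 1035 section 2.3.1 (a letter first, then letters, digits, or hyphens, ending in a letter or digit). The function names are hypothetical and the server remains the authority on what it accepts.

```python
import re

# Assumption: RFC 1035 conformance is read as the <label> grammar of
# RFC 1035 section 2.3.1. The service may apply different or stricter rules.
_RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

MAX_LABELS = 32   # "No more than 32 labels can be associated with a job."
MAX_LENGTH = 63   # Keys and non-empty values contain 1 to 63 characters.


def _is_rfc1035(text):
    """True if text is 1 to 63 characters long and matches the label grammar."""
    return len(text) <= MAX_LENGTH and _RFC1035_LABEL.fullmatch(text) is not None


def validate_labels(labels):
    """Return a list of problems found in a labels dict; empty means OK."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not _is_rfc1035(key):
            problems.append(f"invalid key: {key!r}")
        # Values may be empty; non-empty values follow the same rule as keys.
        if value != "" and not _is_rfc1035(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every offending label at once.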

scheduling

object (JobScheduling)

Optional. Job scheduling configuration.

jobUuid

string

Output only. A UUID that uniquely identifies a job within the project over time. This is in contrast to the user-settable reference.jobId, which might be reused over time.

done

boolean

Output only. Indicates whether the job is completed. If the value is false, the job is still in progress. If the value is true, the job is completed, and the status.state field indicates whether it succeeded, failed, or was cancelled.
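As an illustration of how done and status.state combine, here is a minimal sketch that summarizes a Job that has already been parsed into a Python dict. The function name and the summary strings are hypothetical. Treating a missing done field as false is an assumption about how the JSON is serialized.

```python
def job_outcome(job):
    """Summarize a Job dict using its `done` flag and `status.state`."""
    # Assumption: an absent `done` field means the job is not done yet.
    if not job.get("done", False):
        return "in progress"
    state = job.get("status", {}).get("state", "STATE_UNSPECIFIED")
    # State names are taken from the State enum of JobStatus.
    outcomes = {
        "DONE": "succeeded",
        "ERROR": "failed",
        "CANCELLED": "cancelled",
    }
    return outcomes.get(state, f"completed with unexpected state {state}")
```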

driverSchedulingConfig

object (DriverSchedulingConfig)

Optional. Driver scheduling configuration.

Union field type_job. Required. The application/framework-specific portion of the job. type_job can be only one of the following:
hadoopJob

object (HadoopJob)

Optional. Job is a Hadoop job.

sparkJob

object (SparkJob)

Optional. Job is a Spark job.

pysparkJob

object (PySparkJob)

Optional. Job is a PySpark job.

hiveJob

object (HiveJob)

Optional. Job is a Hive job.

pigJob

object (PigJob)

Optional. Job is a Pig job.

sparkRJob

object (SparkRJob)

Optional. Job is a SparkR job.

sparkSqlJob

object (SparkSqlJob)

Optional. Job is a SparkSql job.

prestoJob

object (PrestoJob)

Optional. Job is a Presto job.

flinkJob

object (FlinkJob)

Optional. Job is a Flink job.
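Because type_job is a union field, a request body should carry exactly one of the job-type keys shown in the JSON representation. The sketch below checks that rule on a plain dict before submission. The function name is hypothetical, and the contents of the pysparkJob object in the usage lines are illustrative only, since this section does not describe them.

```python
# Job-type keys listed in the JSON representation of Job.
TYPE_JOB_FIELDS = (
    "hadoopJob", "sparkJob", "pysparkJob", "hiveJob", "pigJob",
    "sparkRJob", "sparkSqlJob", "prestoJob", "flinkJob",
)


def selected_job_type(job):
    """Return the single type_job key present in `job`, or raise ValueError."""
    present = [name for name in TYPE_JOB_FIELDS if name in job]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one type_job field, found {len(present)}: {present}"
        )
    return present[0]


# Illustrative request body; bucket and cluster names are placeholders.
example_job = {
    "placement": {"clusterName": "example-cluster"},
    "pysparkJob": {"mainPythonFileUri": "gs://example-bucket/main.py"},
}
example_type = selected_job_type(example_job)
```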

JobReference

Encapsulates the full scoping used to reference a job.

JSON representation
{
  "projectId": string,
  "jobId": string
}
Fields
projectId

string

Optional. The ID of the Google Cloud Platform project that the job belongs to. If specified, must match the request project ID.

jobId

string

Optional. The job ID, which must be unique within the project.

The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or hyphens (-). The maximum length is 100 characters.

If not specified by the caller, the job ID will be provided by the server.
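The jobId rules translate directly into a pattern check. This is a minimal sketch with hypothetical helper names. The minimum length of one character is an assumption, because the text states only the maximum. When no jobId is given, the key is left out so that the server can generate one.

```python
import re

# Letters, digits, underscores, or hyphens; at most 100 characters.
# Assumption: an ID must contain at least one character.
_JOB_ID = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_job_id(job_id):
    """True if job_id uses only the allowed characters and length."""
    return _JOB_ID.fullmatch(job_id) is not None


def make_job_reference(project_id, job_id=None):
    """Build a JobReference dict, omitting jobId when none is supplied."""
    reference = {"projectId": project_id}
    if job_id is not None:
        if not is_valid_job_id(job_id):
            raise ValueError(f"invalid jobId: {job_id!r}")
        reference["jobId"] = job_id
    return reference
```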

JobPlacement

Dataproc job config.

JSON representation
{
  "clusterName": string,
  "clusterUuid": string,
  "clusterLabels": {
    string: string,
    ...
  }
}
Fields
clusterName

string

Required. The name of the cluster where the job will be submitted.

clusterUuid

string

Output only. A cluster UUID generated by the Dataproc service when the job is submitted.

clusterLabels

map (key: string, value: string)

Optional. Cluster labels to identify a cluster where the job will be submitted.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

JobStatus

Dataproc job status.

JSON representation
{
  "state": enum (State),
  "details": string,
  "stateStartTime": string,
  "substate": enum (Substate)
}
Fields
state

enum (State)

Output only. A state message specifying the overall job state.

details

string

Optional. Output only. Job state details, such as an error description if the state is ERROR.

stateStartTime

string (Timestamp format)

Output only. The time when this state was entered.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
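Python's datetime type stores microseconds, so a timestamp with nine fractional digits loses precision if it is parsed directly into a datetime. The sketch below is one possible approach, not a prescribed one: it returns the whole-second datetime in UTC together with the nanosecond part as a separate integer. The function name is hypothetical, and only the "Z" suffix form described above is accepted.

```python
import re
from datetime import datetime, timezone

# Matches the "Zulu" form with zero to nine fractional digits.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_timestamp(value):
    """Return (datetime in UTC truncated to seconds, nanoseconds as int)."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 UTC timestamp: {value!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    whole = whole.replace(tzinfo=timezone.utc)
    fraction = match.group(2) or ""
    # Right-pad to nine digits so that ".5" means 500,000,000 ns.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return whole, nanos
```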

substate

enum (Substate)

Output only. Additional state information, which includes status reported by the agent.

State

The job state.

Enums
STATE_UNSPECIFIED The job state is unknown.
PENDING The job is pending; it has been submitted, but is not yet running.
SETUP_DONE Job has been received by the service and completed initial setup; it will soon be submitted to the cluster.
RUNNING The job is running on the cluster.
CANCEL_PENDING A jobs.cancel request has been received, but is pending.
CANCEL_STARTED Transient in-flight resources have been canceled, and the request to cancel the running job has been issued to the cluster.
CANCELLED The job cancellation was successful.
DONE The job has completed successfully.
ERROR The job has completed, but encountered an error.
ATTEMPT_FAILURE Job attempt has failed. The details field contains failure details for this attempt. Applies to restartable jobs only.

Substate

The job substate.

Enums
UNSPECIFIED The job substate is unknown.
SUBMITTED The job is submitted to the agent. Applies to RUNNING state.
QUEUED The job has been received and is awaiting execution (it might be waiting for a condition to be met). See the "details" field for the reason for the delay. Applies to RUNNING state.
STALE_STATUS The agent-reported status is out of date, which can be caused by a loss of communication between the agent and Dataproc. If the agent does not send a timely update, the job will fail. Applies to RUNNING state.
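Since each named substate applies to the RUNNING state, a client can sanity-check a JobStatus it has received. A minimal sketch follows, with a hypothetical function name, assuming the status is a plain dict and that a missing substate is equivalent to UNSPECIFIED.

```python
# Substates documented as applying to the RUNNING state.
RUNNING_ONLY_SUBSTATES = {"SUBMITTED", "QUEUED", "STALE_STATUS"}


def substate_matches_state(status):
    """True if the substate, when set, is consistent with the state."""
    # Assumption: an absent substate means UNSPECIFIED.
    substate = status.get("substate", "UNSPECIFIED")
    if substate in RUNNING_ONLY_SUBSTATES:
        return status.get("state") == "RUNNING"
    return True
```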

YarnApplication

A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto.

Beta Feature: This report is available for testing purposes only. It may be changed before final release.

JSON representation
{
  "name": string,
  "state": enum (State),
  "progress": number,
  "trackingUrl": string
}
Fields
name

string

Required. The application name.

state

enum (State)

Required. The application state.

progress

number

Required. The numerical progress of the application, from 1 to 100.

trackingUrl

string

Optional. The HTTP URL of the ApplicationMaster, HistoryServer, or TimelineServer that provides application-specific information. The URL uses the internal hostname, and requires a proxy server for resolution and, possibly, access.

State

The application state, corresponding to YarnProtos.YarnApplicationStateProto.

Enums
STATE_UNSPECIFIED Status is unspecified.
NEW Status is NEW.
NEW_SAVING Status is NEW_SAVING.
SUBMITTED Status is SUBMITTED.
ACCEPTED Status is ACCEPTED.
RUNNING Status is RUNNING.
FINISHED Status is FINISHED.
FAILED Status is FAILED.
KILLED Status is KILLED.

DriverSchedulingConfig

Driver scheduling configuration.

JSON representation
{
  "memoryMb": integer,
  "vcores": integer
}
Fields
memoryMb

integer

Required. The amount of memory in MB the driver is requesting.

vcores

integer

Required. The number of vCPUs the driver is requesting.

Methods

cancel

Starts a job cancellation request.

delete

Deletes the job from the project.

get

Gets the resource representation for a job in a project.

getIamPolicy

Gets the access control policy for a resource.

list

Lists regions/{region}/jobs in a project.

patch

Updates a job in a project.

setIamPolicy

Sets the access control policy on the specified resource.

submit

Submits a job to a cluster.

submitAsOperation

Submits a job to a cluster.

testIamPermissions

Returns permissions that a caller has on the specified resource.
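This section lists the methods without their HTTP mappings. The sketch below shows how several of them might map to an HTTP verb and URL. The base URL, the v1 version segment, the path layout, and the ":cancel", ":submit", and ":submitAsOperation" suffixes are assumptions that are not stated here and should be checked against each method's own reference page. The IAM methods are left out, and the helper name is hypothetical.

```python
# Assumption: service endpoint and API version; not given in this section.
BASE_URL = "https://dataproc.googleapis.com/v1"

# Methods that act on the jobs collection: (HTTP verb, URL suffix).
_COLLECTION_METHODS = {
    "list": ("GET", ""),
    "submit": ("POST", ":submit"),
    "submitAsOperation": ("POST", ":submitAsOperation"),
}

# Methods that act on one job: (HTTP verb, URL suffix).
_JOB_METHODS = {
    "get": ("GET", ""),
    "patch": ("PATCH", ""),
    "delete": ("DELETE", ""),
    "cancel": ("POST", ":cancel"),
}


def method_request(method, project_id, region, job_id=None):
    """Return (HTTP verb, URL) for a jobs method under the assumed layout."""
    collection = f"{BASE_URL}/projects/{project_id}/regions/{region}/jobs"
    if method in _COLLECTION_METHODS:
        verb, suffix = _COLLECTION_METHODS[method]
        return verb, collection + suffix
    if method in _JOB_METHODS:
        if job_id is None:
            raise ValueError(f"{method} requires a job_id")
        verb, suffix = _JOB_METHODS[method]
        return verb, f"{collection}/{job_id}{suffix}"
    raise ValueError(f"unsupported method: {method}")
```

Keeping the mapping in two small tables makes it easy to correct a verb or suffix if the method reference differs from these assumptions.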