REST Resource: projects.regions.jobs

Resource: Job

A Cloud Dataproc job resource.

JSON representation
{
  "reference": {
    object(JobReference)
  },
  "placement": {
    object(JobPlacement)
  },
  "status": {
    object(JobStatus)
  },
  "statusHistory": [
    {
      object(JobStatus)
    }
  ],
  "yarnApplications": [
    {
      object(YarnApplication)
    }
  ],
  "driverOutputResourceUri": string,
  "driverControlFilesUri": string,
  "labels": {
    string: string,
    ...
  },
  "scheduling": {
    object(JobScheduling)
  },

  // Union field type_job can be only one of the following:
  "hadoopJob": {
    object(HadoopJob)
  },
  "sparkJob": {
    object(SparkJob)
  },
  "pysparkJob": {
    object(PySparkJob)
  },
  "hiveJob": {
    object(HiveJob)
  },
  "pigJob": {
    object(PigJob)
  },
  "sparkSqlJob": {
    object(SparkSqlJob)
  }
  // End of list of possible types for union field type_job.
}
Fields
reference

object(JobReference)

Optional. The fully qualified reference to the job, which can be used to obtain the equivalent REST path of the job resource. If this property is not specified when a job is created, the server generates a jobId.

placement

object(JobPlacement)

Required. Job information, including how, when, and where to run the job.

status

object(JobStatus)

Output only. The job status. Additional application-specific status information may be contained in the type_job and yarnApplications fields.

statusHistory[]

object(JobStatus)

Output only. The previous job status.

yarnApplications[]

object(YarnApplication)

Output only. The collection of YARN applications spun up by this job.

Beta Feature: This report is available for testing purposes only. It may be changed before final release.

driverOutputResourceUri

string

Output only. A URI pointing to the location of the stdout of the job's driver program.

driverControlFilesUri

string

Output only. If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driverOutputResourceUri.

labels

map (key: string, value: string)

Optional. The labels to associate with this job. Label keys must contain 1 to 63 characters, and must conform to RFC 1035. Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035. No more than 32 labels can be associated with a job.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

scheduling

object(JobScheduling)

Optional. Job scheduling configuration.

Union field type_job. Required. The application/framework-specific portion of the job. type_job can be only one of the following:
hadoopJob

object(HadoopJob)

Job is a Hadoop job.

sparkJob

object(SparkJob)

Job is a Spark job.

pysparkJob

object(PySparkJob)

Job is a PySpark job.

hiveJob

object(HiveJob)

Job is a Hive job.

pigJob

object(PigJob)

Job is a Pig job.

sparkSqlJob

object(SparkSqlJob)

Job is a Spark SQL job.
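The constraints above (a required placement, exactly one type_job member, and the label limits) can be checked on the client before a job is submitted. The following is a minimal sketch, not the service's own validation. The function name check_job_body is hypothetical, and the regular expression assumes that "conform to RFC 1035" means the RFC 1035 label grammar; the service may apply stricter or different rules.

```python
import re

# Field names of the type_job union, taken from the JSON representation above.
TYPE_JOB_FIELDS = ("hadoopJob", "sparkJob", "pysparkJob",
                   "hiveJob", "pigJob", "sparkSqlJob")

# Assumption: RFC 1035 label grammar, i.e. starts with a letter, ends with a
# letter or digit, with letters, digits, or hyphens in between (1-63 chars).
_RFC1035_LABEL = re.compile(r"[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
MAX_LABELS = 32


def check_job_body(job):
    """Return a list of problems found in a Job dict (empty if none found)."""
    problems = []

    # placement is Required, and so is its clusterName.
    if not job.get("placement", {}).get("clusterName"):
        problems.append("placement.clusterName is required")

    # Exactly one member of the type_job union must be set.
    present = [name for name in TYPE_JOB_FIELDS if name in job]
    if len(present) != 1:
        problems.append(f"expected exactly one type_job field, found {present}")

    labels = job.get("labels", {})
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not _RFC1035_LABEL.fullmatch(key):
            problems.append(f"invalid label key: {key!r}")
        # Values may be empty; non-empty values follow the same rule as keys.
        if value and not _RFC1035_LABEL.fullmatch(value):
            problems.append(f"invalid label value for {key!r}: {value!r}")
    return problems


# Usage: the cluster name is a placeholder, and the SparkJob body is left
# empty because its fields are documented on a separate page.
job = {"placement": {"clusterName": "example-cluster"},
       "sparkJob": {},
       "labels": {"env": "test"}}
print(check_job_body(job))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.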

JobReference

Encapsulates the full scoping used to reference a job.

JSON representation
{
  "projectId": string,
  "jobId": string
}
Fields
projectId

string

Required. The ID of the Google Cloud Platform project that the job belongs to.

jobId

string

Optional. The job ID, which must be unique within the project. The job ID is generated by the server upon job submission or provided by the user as a means to perform retries without creating duplicate jobs. The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or hyphens (-). The maximum length is 100 characters.
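Because a caller-supplied jobId is what makes a retried submission idempotent, it can help to generate and validate the ID on the client. The sketch below is illustrative only. The helper names are hypothetical, the UUID-based naming scheme is just one possible choice, and the minimum length of one character is an assumption (the text above states only the maximum).

```python
import re
import uuid

# Letters, digits, underscores, and hyphens; at most 100 characters.
# Assumption: an empty string is not a usable ID, so at least 1 character.
_JOB_ID = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_job_id(job_id):
    """Check a candidate jobId against the documented character and length rules."""
    return _JOB_ID.fullmatch(job_id) is not None


def make_job_reference(project_id, prefix="job"):
    """Build a JobReference dict with a client-chosen, random jobId.

    Sending the same reference again on a retry is what lets the retry
    avoid creating a duplicate job.
    """
    job_id = f"{prefix}-{uuid.uuid4().hex}"
    if not is_valid_job_id(job_id):
        raise ValueError(f"invalid jobId: {job_id!r}")
    return {"projectId": project_id, "jobId": job_id}


# Usage: "example-project" is a placeholder project ID.
reference = make_job_reference("example-project")
print(reference)
```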

JobPlacement

Cloud Dataproc job config.

JSON representation
{
  "clusterName": string,
  "clusterUuid": string
}
Fields
clusterName

string

Required. The name of the cluster where the job will be submitted.

clusterUuid

string

Output only. A cluster UUID generated by the Cloud Dataproc service when the job is submitted.

JobStatus

Cloud Dataproc job status.

JSON representation
{
  "state": enum(State),
  "details": string,
  "stateStartTime": string,
  "substate": enum(Substate)
}
Fields
state

enum(State)

Output only. A state message specifying the overall job state.

details

string

Output only. Optional job state details, such as an error description if the state is ERROR.

stateStartTime

string (Timestamp format)

Output only. The time when this state was entered.

A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".

substate

enum(Substate)

Output only. Additional state information, which includes status reported by the agent.
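The stateStartTime value carries up to nine fractional digits, which is more precision than Python's datetime can store. One way to handle this, sketched below under that assumption, is to parse the timestamp into a microsecond-precision datetime and return the full nanosecond count separately. The function name parse_state_start_time is hypothetical.

```python
import re
from datetime import datetime, timezone

# RFC 3339 UTC "Zulu" form with an optional fraction of 1 to 9 digits.
_RFC3339_UTC = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z")


def parse_state_start_time(value):
    """Parse a Zulu timestamp into (aware datetime, nanoseconds within the second).

    The datetime is truncated to microseconds; the second element keeps the
    full nanosecond fraction.
    """
    match = _RFC3339_UTC.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 UTC timestamp: {value!r}")
    seconds_part, fraction = match.groups()
    # Right-pad the fraction to nine digits so it can be read as nanoseconds.
    nanos = int((fraction or "").ljust(9, "0"))
    moment = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S").replace(
        microsecond=nanos // 1000, tzinfo=timezone.utc)
    return moment, nanos


moment, nanos = parse_state_start_time("2014-10-02T15:01:23.045123456Z")
print(moment.isoformat(), nanos)
```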

State

The job state.

Enums
STATE_UNSPECIFIED The job state is unknown.
PENDING The job is pending; it has been submitted, but is not yet running.
SETUP_DONE Job has been received by the service and completed initial setup; it will soon be submitted to the cluster.
RUNNING The job is running on the cluster.
CANCEL_PENDING A jobs.cancel request has been received, but is pending.
CANCEL_STARTED Transient in-flight resources have been canceled, and the request to cancel the running job has been issued to the cluster.
CANCELLED The job cancellation was successful.
DONE The job has completed successfully.
ERROR The job has completed, but encountered an error.
ATTEMPT_FAILURE Job attempt has failed. The details field contains failure details for this attempt. Applies to restartable jobs only.
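A client that waits for a job to finish needs to decide which states end the wait. The sketch below treats CANCELLED, DONE, and ERROR as terminal, which is an interpretation of the descriptions above rather than something this page states outright. ATTEMPT_FAILURE is treated as non-terminal on the assumption that a restartable job may be attempted again. The fetch_job callable is hypothetical and stands for whatever code retrieves the current Job (for example, a wrapper around the get method).

```python
import time

# Assumption: these three states mean the job will not change state again.
TERMINAL_STATES = frozenset({"CANCELLED", "DONE", "ERROR"})


def is_terminal(job):
    """Return True if the Job dict's status.state is one of TERMINAL_STATES."""
    return job.get("status", {}).get("state") in TERMINAL_STATES


def wait_for_job(fetch_job, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_job() until the job is terminal, then return the final Job.

    fetch_job is a caller-supplied function returning the current Job dict.
    Raises TimeoutError if max_polls polls pass without a terminal state.
    """
    for _ in range(max_polls):
        job = fetch_job()
        if is_terminal(job):
            return job
        sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state")
```

Passing sleep as a parameter keeps the loop easy to test without real delays.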

Substate

The job substate.

Enums
UNSPECIFIED The job substate is unknown.
SUBMITTED The job is submitted to the agent. Applies to RUNNING state.
QUEUED The job has been received and is awaiting execution (it may be waiting for a condition to be met). See the "details" field for the reason for the delay. Applies to RUNNING state.
STALE_STATUS The agent-reported status is out of date, which may be caused by a loss of communication between the agent and Cloud Dataproc. If the agent does not send a timely update, the job will fail. Applies to RUNNING state.

YarnApplication

A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto.

Beta Feature: This report is available for testing purposes only. It may be changed before final release.

JSON representation
{
  "name": string,
  "state": enum(State),
  "progress": number,
  "trackingUrl": string
}
Fields
name

string

Required. The application name.

state

enum(State)

Required. The application state.

progress

number

Required. The numerical progress of the application, from 1 to 100.

trackingUrl

string

Optional. The HTTP URL of the ApplicationMaster, HistoryServer, or TimelineServer that provides application-specific information. The URL uses the internal hostname, and requires a proxy server for resolution and, possibly, access.

State

The application state, corresponding to YarnProtos.YarnApplicationStateProto.

Enums
STATE_UNSPECIFIED Status is unspecified.
NEW Status is NEW.
NEW_SAVING Status is NEW_SAVING.
SUBMITTED Status is SUBMITTED.
ACCEPTED Status is ACCEPTED.
RUNNING Status is RUNNING.
FINISHED Status is FINISHED.
FAILED Status is FAILED.
KILLED Status is KILLED.
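As an illustration of reading the yarnApplications field of a Job, the sketch below lists the applications that have not yet finished, together with their reported progress. The function name is hypothetical, and treating FINISHED, FAILED, and KILLED as the final states is an assumption based on how YARN uses these values, not something stated on this page.

```python
# Assumption: these YARN application states are final.
FINAL_YARN_STATES = frozenset({"FINISHED", "FAILED", "KILLED"})


def unfinished_applications(job):
    """Return (name, state, progress) for each YARN application still in flight.

    yarnApplications is output only and may be absent, so a missing field
    is treated as an empty list.
    """
    return [
        (app["name"], app["state"], app["progress"])
        for app in job.get("yarnApplications", [])
        if app["state"] not in FINAL_YARN_STATES
    ]


# Usage with made-up application data.
job = {"yarnApplications": [
    {"name": "stage-one", "state": "FINISHED", "progress": 100},
    {"name": "stage-two", "state": "RUNNING", "progress": 40},
]}
print(unfinished_applications(job))
```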

Methods

cancel

Starts a job cancellation request.

delete

Deletes the job from the project.

get

Gets the resource representation for a job in a project.

getIamPolicy

Gets the access control policy for a resource.

list

Lists regions/{region}/jobs in a project.

patch

Updates a job in a project.

setIamPolicy

Sets the access control policy on the specified resource.

submit

Submits a job to a cluster.

testIamPermissions

Returns permissions that a caller has on the specified resource.
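The resource name projects.regions.jobs and the list description above suggest that a job is addressed by project, region, and job ID. The sketch below builds the relative resource paths on that assumption. The helper names are hypothetical, and the host name and API version prefix are deliberately left out because this page does not show them; consult the individual method pages for the exact request URLs.

```python
from urllib.parse import quote


def jobs_collection_path(project_id, region):
    """Relative path of the jobs collection in a project and region."""
    return (f"projects/{quote(project_id, safe='')}"
            f"/regions/{quote(region, safe='')}/jobs")


def job_path(reference, region):
    """Relative path of one job, built from a JobReference dict and a region."""
    collection = jobs_collection_path(reference["projectId"], region)
    return f"{collection}/{quote(reference['jobId'], safe='')}"


# Usage: project, region, and job ID are placeholders.
reference = {"projectId": "example-project", "jobId": "job-0001"}
print(job_path(reference, "us-central1"))
```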