Life of a Cloud Dataproc Job

This page describes the sequence of steps involved in the submission, execution, and completion of a Cloud Dataproc job. It also discusses job throttling and debugging.

Cloud Dataproc jobs flow

  1. User submits job to Cloud Dataproc.
  2. Job waits to be acquired by the dataproc agent.
    • If the job is acquired, JobStatus.State is marked as RUNNING.
    • If the job is not acquired due to agent failure, Compute Engine network failure, or other cause, the job is marked ERROR.
  3. Once a job is acquired by the agent, the agent verifies that there are sufficient resources available on the Cloud Dataproc cluster's master node to start the driver.
  4. If sufficient resources are available, the dataproc agent starts the job driver process.
    • At this stage, typically one or more applications are running in Apache Hadoop YARN. However, YARN applications may not start until the driver finishes scanning Cloud Storage directories or performing other start-up job tasks.
  5. The dataproc agent periodically sends updates to Cloud Dataproc on job progress, cluster metrics, and YARN applications associated with the job (see Job Monitoring and Debugging).
  6. YARN application(s) complete.
    • Job continues to be reported as RUNNING while driver performs any job completion tasks, such as materializing collections.
    • An unhandled or uncaught failure in the Main thread can leave the driver in a zombie state (marked as RUNNING without information as to the cause of the failure).
  7. Driver exits. dataproc agent reports completion to Cloud Dataproc.
    • Cloud Dataproc reports job as DONE.
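The flow above can be sketched as a simple state machine. The sketch below is illustrative only: the state names mirror the JobStatus.State values used on this page, but the transition table is a simplification and the helper function is not part of any Dataproc API.

```python
# Illustrative model of the Cloud Dataproc job lifecycle described above.
# State names mirror JobStatus.State; the transitions are a simplification.

ALLOWED = {
    "PENDING": {"SETUP_DONE", "ERROR"},   # waiting to be acquired by the agent
    "SETUP_DONE": {"RUNNING", "ERROR"},   # acquired; driver being started
    "RUNNING": {"DONE", "ERROR"},         # driver (and YARN apps) executing
    "DONE": set(),                        # terminal
    "ERROR": set(),                       # terminal
}

def advance(state, next_state):
    """Validate a lifecycle transition and return the new state."""
    if next_state not in ALLOWED[state]:
        raise ValueError(f"illegal transition {state} -> {next_state}")
    return next_state

# A successful job walks PENDING -> SETUP_DONE -> RUNNING -> DONE.
state = "PENDING"
for nxt in ("SETUP_DONE", "RUNNING", "DONE"):
    state = advance(state, nxt)
print(state)  # DONE
```

Note that a job that is never acquired (step 2) or whose driver fails goes straight to ERROR, which, like DONE, is terminal.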

Job monitoring and debugging

Use the gcloud command-line tool, Cloud Dataproc REST API, and Google Cloud Platform Console to analyze and debug Cloud Dataproc jobs.

gcloud command

To examine a running job's status:

gcloud dataproc jobs describe job-id

To view job driver output, see Accessing job driver output.
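If you capture the `describe` output in a script, top-level fields such as `state` can be pulled out of it programmatically. A minimal sketch, assuming YAML-style output like the example later on this page; `extract_field` is a hypothetical helper using a simple line scan, not a full YAML parser:

```python
def extract_field(describe_output, field):
    """Scan `gcloud dataproc jobs describe` style output for a top-level field.

    Hypothetical helper: a naive line scan, not a YAML parser.
    """
    for line in describe_output.splitlines():
        if line.startswith(field + ":"):
            return line.split(":", 1)[1].strip()
    return None

# Sample output fragment, modeled on the example later on this page.
sample = """\
jobId: job-uuid
state: DONE
submittedBy: user@domain
"""
print(extract_field(sample, "state"))  # DONE
```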


REST API

Call jobs.get to examine a job's JobStatus.State, JobStatus.Substate, JobStatus.details, and YarnApplication fields.


To view job driver output, see Accessing job driver output.
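The jobs.get response is JSON; the snippet below shows one way to read the fields named above from a decoded response. The sample payload is illustrative (hand-written to follow the Dataproc v1 Job resource shape), not captured from a real API call:

```python
import json

# Hypothetical jobs.get response body, following the Dataproc v1 Job
# resource shape (status.state, status.substate, yarnApplications).
response = json.loads("""
{
  "status": {"state": "RUNNING", "substate": "SUBMITTED"},
  "yarnApplications": [
    {"name": "spark-app", "state": "RUNNING", "progress": 0.42}
  ]
}
""")

state = response["status"]["state"]
substate = response["status"].get("substate")
apps = response.get("yarnApplications", [])
print(state, substate, len(apps))  # RUNNING SUBMITTED 1
```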

Console

To view the dataproc agent log in Stackdriver Logging, select Cloud Dataproc Cluster→Cluster Name→Cluster UUID from the logs viewer cluster selector. Then use the logs selector to select google.dataproc.agent logs.

View Job logs in Stackdriver

If a job fails, you can access job logs in Stackdriver Logging.

Determining who submitted a job

Looking up the details of a job shows who submitted it in the submittedBy field. For example, this job output shows that user@domain submitted the example job to a cluster:

  clusterName: cluster-name
  clusterUuid: cluster-uuid
  jobId: job-uuid
  projectId: project
  status:
    state: DONE
    stateStartTime: '2018-11-01T00:53:37.599Z'
  statusHistory:
  - state: PENDING
    stateStartTime: '2018-11-01T00:33:41.387Z'
  - state: SETUP_DONE
    stateStartTime: '2018-11-01T00:33:41.765Z'
  - details: Agent reported job success
    state: RUNNING
    stateStartTime: '2018-11-01T00:33:42.146Z'
  submittedBy: user@domain
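The stateStartTime values in the status history also let you compute how long the job spent in each state. A small sketch using the timestamps from the example output above (RFC 3339 format):

```python
from datetime import datetime, timezone

# Timestamps taken from the example job output above.
history = [
    ("PENDING", "2018-11-01T00:33:41.387Z"),
    ("SETUP_DONE", "2018-11-01T00:33:41.765Z"),
    ("RUNNING", "2018-11-01T00:33:42.146Z"),
    ("DONE", "2018-11-01T00:53:37.599Z"),
]

def parse(ts):
    """Parse an RFC 3339 UTC timestamp like the stateStartTime values."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

# Duration of each state = next state's start time minus this state's start time.
for (state, start), (_, end) in zip(history, history[1:]):
    secs = (parse(end) - parse(start)).total_seconds()
    print(f"{state}: {secs:.3f}s")
# PENDING: 0.378s
# SETUP_DONE: 0.381s
# RUNNING: 1195.453s
```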